[PATCH v2] exfat: check disk status during buffer write

Dongliang Cui posted 1 patch 1 month, 2 weeks ago
There is a newer version of this series
[PATCH v2] exfat: check disk status during buffer write
Posted by Dongliang Cui 1 month, 2 weeks ago
We found that when writing a large file with buffered writes, exFAT does
not return an error if the disk becomes inaccessible, so the writing
process does not stop as it should.

To easily reproduce this issue, you can follow the steps below:

1. format a device to exFAT and then mount (with a full disk erase)
2. dd if=/dev/zero of=/exfat_mount/test.img bs=1M count=8192
3. eject the device

You may find that the dd process does not stop immediately and may
continue for a long time.

The root cause is that, during a buffered write, exFAT does not need to
access the disk to look up directory entries or the FAT table every time
data is written (whereas FAT does). Instead, exFAT simply marks the
buffer as dirty and returns, delegating the actual I/O to the writeback
process.

If the disk cannot be accessed at this time, the error is returned only
to the writeback process. The original process does not receive it, so
it cannot be reported to user space.

When the disk cannot be accessed normally, an error should be returned
to stop the writing process.

Signed-off-by: Dongliang Cui <dongliang.cui@unisoc.com>
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
---
Changes in v2:
 - Follow block_device_ejected() in ext4 to determine the device
   status.
 - Move the disk check into exfat_get_block() to cover all buffered
   write scenarios.
---
---
 fs/exfat/inode.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/exfat/inode.c b/fs/exfat/inode.c
index dd894e558c91..463cebb19852 100644
--- a/fs/exfat/inode.c
+++ b/fs/exfat/inode.c
@@ -8,6 +8,7 @@
 #include <linux/mpage.h>
 #include <linux/bio.h>
 #include <linux/blkdev.h>
+#include <linux/backing-dev-defs.h>
 #include <linux/time.h>
 #include <linux/writeback.h>
 #include <linux/uio.h>
@@ -275,6 +276,13 @@ static int exfat_map_new_buffer(struct exfat_inode_info *ei,
 	return 0;
 }
 
+static int exfat_block_device_ejected(struct super_block *sb)
+{
+	struct backing_dev_info *bdi = sb->s_bdi;
+
+	return bdi->dev == NULL;
+}
+
 static int exfat_get_block(struct inode *inode, sector_t iblock,
 		struct buffer_head *bh_result, int create)
 {
@@ -290,6 +298,9 @@ static int exfat_get_block(struct inode *inode, sector_t iblock,
 	sector_t valid_blks;
 	loff_t pos;
 
+	if (exfat_block_device_ejected(sb))
+		return -ENODEV;
+
 	mutex_lock(&sbi->s_lock);
 	last_block = EXFAT_B_TO_BLK_ROUND_UP(i_size_read(inode), sb);
 	if (iblock >= last_block && !create)
-- 
2.25.1
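
The behaviour described in the commit message above can also be seen from
user space: a successful write() only means the data reached the page
cache, and a device failure may first show up when writeback is forced.
The following is a minimal user-space sketch, not part of the patch; the
helper name write_and_sync() is made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Write len bytes from buf to path through the page cache, then force
 * writeback with fsync(). Returns 0 on success or a negative errno.
 * (write_and_sync() is a hypothetical helper, not a kernel or libc API.)
 */
int write_and_sync(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd, err = 0;

	assert(path && buf);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -errno;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			err = -errno;
			break;
		}
		if (n == 0) {
			/* No progress; give up instead of looping forever. */
			err = -EIO;
			break;
		}
		p += n;
		len -= (size_t)n;
	}

	/* Writeback errors are reported here, not necessarily by write(). */
	if (!err && fsync(fd) < 0)
		err = -errno;

	if (close(fd) < 0 && !err)
		err = -errno;

	return err;
}
```

A caller that skips the fsync() step, as dd does by default, only sees
errors that the write path itself returns, which is why the check in the
patch matters for the reproduction steps.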
Re: [PATCH v2] exfat: check disk status during buffer write
Posted by Christoph Hellwig 1 month, 2 weeks ago
> +static int exfat_block_device_ejected(struct super_block *sb)
> +{
> +	struct backing_dev_info *bdi = sb->s_bdi;
> +
> +	return bdi->dev == NULL;
> +}

NAK, file systems have no business looking at this.  What you probably
really want is to implement the ->shutdown method for exfat so it gets
called on device removal.
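
To illustrate the shape of the suggestion above, here is a small
user-space model, not kernel code: instead of the write path inspecting
device state itself, a callback invoked on device removal sets a flag,
and the block-mapping step only checks that flag. All names (model_sb,
model_shutdown, model_get_block) are hypothetical and do not correspond
to real exfat or VFS symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-mount filesystem state. */
struct model_sb {
	atomic_bool shutdown;	/* set once the backing device is gone */
};

void model_sb_init(struct model_sb *sb)
{
	assert(sb);
	atomic_init(&sb->shutdown, false);
}

/* Models a shutdown callback: invoked when the device is removed. */
void model_shutdown(struct model_sb *sb)
{
	assert(sb);
	atomic_store(&sb->shutdown, true);
}

/* Models the block-mapping step of a buffered write. */
int model_get_block(struct model_sb *sb)
{
	assert(sb);
	if (atomic_load(&sb->shutdown))
		return -EIO;
	return 0;	/* real code would map the block here */
}
```

In this model the write path fails as soon as the flag is set, without
the filesystem ever looking at backing-device internals.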
RE: [PATCH v2] exfat: check disk status during buffer write
Posted by Sungjong Seo 1 month, 2 weeks ago
> > +static int exfat_block_device_ejected(struct super_block *sb)
> > +{
> > +	struct backing_dev_info *bdi = sb->s_bdi;
> > +
> > +	return bdi->dev == NULL;
> > +}
> 
> NAK, file systems have no business looking at this.  What you probably
> really want is to implement the ->shutdown method for exfat so it gets
> called on device removal.

Oh! Thank you for your additional comments. I completely missed this part.
I agree with what you said. Implementing ->shutdown seems to be the
right decision.
Re: [PATCH v2] exfat: check disk status during buffer write
Posted by dongliang cui 1 month, 1 week ago
On Thu, Jul 25, 2024 at 2:00 PM Sungjong Seo <sj1557.seo@samsung.com> wrote:
>
> > > +static int exfat_block_device_ejected(struct super_block *sb)
> > > +{
> > > +   struct backing_dev_info *bdi = sb->s_bdi;
> > > +
> > > +   return bdi->dev == NULL;
> > > +}
> >
> > NAK, file systems have no business looking at this.  What you probably
> > really want is to implement the ->shutdown method for exfat so it gets
> > called on device removal.
>
> Oh! Thank you for your additional comments. I completely missed this part.
> I agree with what you said. Implementing ->shutdown seems to be the
> right decision.
>
Thank you for your suggestions. I'll test it out this way.
RE: [PATCH v2] exfat: check disk status during buffer write
Posted by Sungjong Seo 1 month, 2 weeks ago
> We found that when writing a large file through buffer write, if the
> disk is inaccessible, exFAT does not return an error normally, which
> leads to the writing process not stopping properly.
> 
> To easily reproduce this issue, you can follow the steps below:
> 
> 1. format a device to exFAT and then mount (with a full disk erase)
> 2. dd if=/dev/zero of=/exfat_mount/test.img bs=1M count=8192
> 3. eject the device
> 
> You may find that the dd process does not stop immediately and may
> continue for a long time.
> 
> The root cause of this issue is that during buffer write process,
> exFAT does not need to access the disk to look up directory entries
> or the FAT table (whereas FAT would do) every time data is written.
> Instead, exFAT simply marks the buffer as dirty and returns,
> delegating the writeback operation to the writeback process.
> 
> If the disk cannot be accessed at this time, the error will only be
> returned to the writeback process, and the original process will not
> receive the error, so it cannot be returned to the user side.
> 
> When the disk cannot be accessed normally, an error should be returned
> to stop the writing process.
> 
> Signed-off-by: Dongliang Cui <dongliang.cui@unisoc.com>
> Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
> ---
> Changes in v2:
>  - Refer to the block_device_ejected in ext4 for determining the
>    device status.
>  - Change the disk_check process to exfat_get_block to cover all
>    buffer write scenarios.
> ---
> ---
>  fs/exfat/inode.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/fs/exfat/inode.c b/fs/exfat/inode.c
> index dd894e558c91..463cebb19852 100644
> --- a/fs/exfat/inode.c
> +++ b/fs/exfat/inode.c
> @@ -8,6 +8,7 @@
>  #include <linux/mpage.h>
>  #include <linux/bio.h>
>  #include <linux/blkdev.h>
> +#include <linux/backing-dev-defs.h>
>  #include <linux/time.h>
>  #include <linux/writeback.h>
>  #include <linux/uio.h>
> @@ -275,6 +276,13 @@ static int exfat_map_new_buffer(struct
> exfat_inode_info *ei,
>  	return 0;
>  }
> 
> +static int exfat_block_device_ejected(struct super_block *sb)
> +{
> +	struct backing_dev_info *bdi = sb->s_bdi;
> +
> +	return bdi->dev == NULL;
> +}
Have you tested with this again?

> +
>  static int exfat_get_block(struct inode *inode, sector_t iblock,
>  		struct buffer_head *bh_result, int create)
>  {
> @@ -290,6 +298,9 @@ static int exfat_get_block(struct inode *inode,
> sector_t iblock,
>  	sector_t valid_blks;
>  	loff_t pos;
> 
> +	if (exfat_block_device_ejected(sb))
This looks better than the modified location in the last patch.
However, the caller of this function may not be interested in exfat
error handling, so here we should call exfat_fs_error_ratelimit()
with an appropriate error message.

> +		return -ENODEV;
> +
>  	mutex_lock(&sbi->s_lock);
>  	last_block = EXFAT_B_TO_BLK_ROUND_UP(i_size_read(inode), sb);
>  	if (iblock >= last_block && !create)
> --
> 2.25.1
Re: [PATCH v2] exfat: check disk status during buffer write
Posted by dongliang cui 1 month, 2 weeks ago
On Wed, Jul 24, 2024 at 3:03 PM Sungjong Seo <sj1557.seo@samsung.com> wrote:
>
> > We found that when writing a large file through buffer write, if the
> > disk is inaccessible, exFAT does not return an error normally, which
> > leads to the writing process not stopping properly.
> >
> > To easily reproduce this issue, you can follow the steps below:
> >
> > 1. format a device to exFAT and then mount (with a full disk erase)
> > 2. dd if=/dev/zero of=/exfat_mount/test.img bs=1M count=8192
> > 3. eject the device
> >
> > You may find that the dd process does not stop immediately and may
> > continue for a long time.
> >
> > The root cause of this issue is that during buffer write process,
> > exFAT does not need to access the disk to look up directory entries
> > or the FAT table (whereas FAT would do) every time data is written.
> > Instead, exFAT simply marks the buffer as dirty and returns,
> > delegating the writeback operation to the writeback process.
> >
> > If the disk cannot be accessed at this time, the error will only be
> > returned to the writeback process, and the original process will not
> > receive the error, so it cannot be returned to the user side.
> >
> > When the disk cannot be accessed normally, an error should be returned
> > to stop the writing process.
> >
> > Signed-off-by: Dongliang Cui <dongliang.cui@unisoc.com>
> > Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
> > ---
> > Changes in v2:
> >  - Refer to the block_device_ejected in ext4 for determining the
> >    device status.
> >  - Change the disk_check process to exfat_get_block to cover all
> >    buffer write scenarios.
> > ---
> > ---
> >  fs/exfat/inode.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/fs/exfat/inode.c b/fs/exfat/inode.c
> > index dd894e558c91..463cebb19852 100644
> > --- a/fs/exfat/inode.c
> > +++ b/fs/exfat/inode.c
> > @@ -8,6 +8,7 @@
> >  #include <linux/mpage.h>
> >  #include <linux/bio.h>
> >  #include <linux/blkdev.h>
> > +#include <linux/backing-dev-defs.h>
> >  #include <linux/time.h>
> >  #include <linux/writeback.h>
> >  #include <linux/uio.h>
> > @@ -275,6 +276,13 @@ static int exfat_map_new_buffer(struct
> > exfat_inode_info *ei,
> >       return 0;
> >  }
> >
> > +static int exfat_block_device_ejected(struct super_block *sb)
> > +{
> > +     struct backing_dev_info *bdi = sb->s_bdi;
> > +
> > +     return bdi->dev == NULL;
> > +}
> Have you tested with this again?
Yes, I tested it in this way. The user side can receive the -ENODEV error
after the device is ejected.
dongliang.cui@deivice:/data/tmp # dd if=/dev/zero of=test.img bs=1M count=10240
dd: test.img: write error: No such device
1274+0 records in
1273+1 records out
1335635968 bytes (1.2 G) copied, 8.060 s, 158 M/s

>
> > +
> >  static int exfat_get_block(struct inode *inode, sector_t iblock,
> >               struct buffer_head *bh_result, int create)
> >  {
> > @@ -290,6 +298,9 @@ static int exfat_get_block(struct inode *inode,
> > sector_t iblock,
> >       sector_t valid_blks;
> >       loff_t pos;
> >
> > +     if (exfat_block_device_ejected(sb))
> This looks better than the modified location in the last patch.
> However, the caller of this function may not be interested in exfat
> error handling, so here we should call exfat_fs_error_ratelimit()
> with an appropriate error message.
Thank you for the reminder. I will make the changes in the next version.

>
> > +             return -ENODEV;
> > +
> >       mutex_lock(&sbi->s_lock);
> >       last_block = EXFAT_B_TO_BLK_ROUND_UP(i_size_read(inode), sb);
> >       if (iblock >= last_block && !create)
> > --
> > 2.25.1
>
>
RE: [PATCH v2] exfat: check disk status during buffer write
Posted by Sungjong Seo 1 month, 2 weeks ago
> On Wed, Jul 24, 2024 at 3:03 PM Sungjong Seo <sj1557.seo@samsung.com>
> wrote:
> >
[snip]
> > >
> > > +static int exfat_block_device_ejected(struct super_block *sb)
> > > +{
> > > +     struct backing_dev_info *bdi = sb->s_bdi;
> > > +
> > > +     return bdi->dev == NULL;
> > > +}
> > Have you tested with this again?
> Yes, I tested it in this way. The user side can receive the -ENODEV error
> after the device is ejected.
> dongliang.cui@deivice:/data/tmp # dd if=/dev/zero of=test.img bs=1M
> count=10240
> dd: test.img: write error: No such device
> 1274+0 records in
> 1273+1 records out
> 1335635968 bytes (1.2 G) copied, 8.060 s, 158 M/s
Oops! write() seems to return ENODEV, which its man page does not list.
In exfat_map_cluster it was necessary to distinguish the error values
being returned, but now explicitly differentiated error messages will be
printed. So why not return EIO again? It seems appropriate to return EIO
instead of ENODEV from the read/write syscalls.

> 
> >
> > > +
> > >  static int exfat_get_block(struct inode *inode, sector_t iblock,
> > >               struct buffer_head *bh_result, int create)
> > >  {
> > > @@ -290,6 +298,9 @@ static int exfat_get_block(struct inode *inode,
> > > sector_t iblock,
> > >       sector_t valid_blks;
> > >       loff_t pos;
> > >
> > > +     if (exfat_block_device_ejected(sb))
> > This looks better than the modified location in the last patch.
> > However, the caller of this function may not be interested in exfat
> > error handling, so here we should call exfat_fs_error_ratelimit()
> > with an appropriate error message.
> Thank you for the reminder. I will make the changes in the next version.
Sounds good!

> 
> >
> > > +             return -ENODEV;
> > > +
> > >       mutex_lock(&sbi->s_lock);
> > >       last_block = EXFAT_B_TO_BLK_ROUND_UP(i_size_read(inode), sb);
> > >       if (iblock >= last_block && !create)
> > > --
> > > 2.25.1
> >
> >
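
The two review points in this message (print a rate-limited, specific
error message, but return the generic EIO to the caller) can be sketched
together in user space. This is only a model under stated assumptions:
the limiter below is a simple fixed-window counter written for
illustration, the time value is passed in by the caller, and none of the
names are kernel or exfat symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical limiter: at most `burst` messages per `interval` seconds. */
struct model_ratelimit {
	long interval;
	int burst;
	long window_start;
	int printed;
};

/* Returns true if a message may be printed at time `now` (in seconds). */
bool model_ratelimit_ok(struct model_ratelimit *rl, long now)
{
	assert(rl && rl->interval > 0);

	/* Start a new window once the current one has expired. */
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->printed = 0;
	}
	if (rl->printed >= rl->burst)
		return false;
	rl->printed++;
	return true;
}

/*
 * Report a device-gone condition: log the specific cause (rate limited),
 * but hand the generic -EIO back to the caller.
 */
int model_report_ejected(struct model_ratelimit *rl, long now)
{
	if (model_ratelimit_ok(rl, now))
		fprintf(stderr, "model-fs: block device is ejected\n");
	return -EIO;
}
```

The specific cause stays visible in the log, while the caller sees an
error code that read() and write() are documented to return.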

Re: [PATCH v2] exfat: check disk status during buffer write
Posted by dongliang cui 1 month, 2 weeks ago
On Wed, Jul 24, 2024 at 3:50 PM Sungjong Seo <sj1557.seo@samsung.com> wrote:
>
> > On Wed, Jul 24, 2024 at 3:03 PM Sungjong Seo <sj1557.seo@samsung.com>
> > wrote:
> > >
> [snip]
> > > >
> > > > +static int exfat_block_device_ejected(struct super_block *sb)
> > > > +{
> > > > +     struct backing_dev_info *bdi = sb->s_bdi;
> > > > +
> > > > +     return bdi->dev == NULL;
> > > > +}
> > > Have you tested with this again?
> > Yes, I tested it in this way. The user side can receive the -ENODEV error
> > after the device is ejected.
> > dongliang.cui@deivice:/data/tmp # dd if=/dev/zero of=test.img bs=1M
> > count=10240
> > dd: test.img: write error: No such device
> > 1274+0 records in
> > 1273+1 records out
> > 1335635968 bytes (1.2 G) copied, 8.060 s, 158 M/s
> Oops!, write() seems to return ENODEV that man page does not have.
> In exfat_map_cluster, it was necessary to distinguish and return error
> values, but now that explicitly differentiated error messages will be
> printed. So, why not return EIO again? It seem appropriate to return EIO
> instead of ENODEV from the read/write syscall.
Yes, indeed.
I will make the changes all together in the next version.
Thanks!
>
> >
> > >
> > > > +
> > > >  static int exfat_get_block(struct inode *inode, sector_t iblock,
> > > >               struct buffer_head *bh_result, int create)
> > > >  {
> > > > @@ -290,6 +298,9 @@ static int exfat_get_block(struct inode *inode,
> > > > sector_t iblock,
> > > >       sector_t valid_blks;
> > > >       loff_t pos;
> > > >
> > > > +     if (exfat_block_device_ejected(sb))
> > > This looks better than the modified location in the last patch.
> > > However, the caller of this function may not be interested in exfat
> > > error handling, so here we should call exfat_fs_error_ratelimit()
> > > with an appropriate error message.
> > Thank you for the reminder. I will make the changes in the next version.
> Sounds good!
>
> >
> > >
> > > > +             return -ENODEV;
> > > > +
> > > >       mutex_lock(&sbi->s_lock);
> > > >       last_block = EXFAT_B_TO_BLK_ROUND_UP(i_size_read(inode), sb);
> > > >       if (iblock >= last_block && !create)
> > > > --
> > > > 2.25.1
> > >
> > >
>
>