[PATCH] ext4: goto right label 'out_mmap_sem' in ext4_setattr()

libaokun@huaweicloud.com posted 1 patch 10 months, 1 week ago
There is a newer version of this series
[PATCH] ext4: goto right label 'out_mmap_sem' in ext4_setattr()
Posted by libaokun@huaweicloud.com 10 months, 1 week ago
From: Baokun Li <libaokun1@huawei.com>

Otherwise, if ext4_inode_attach_jinode() fails, a hung task will
happen because filemap_invalidate_unlock() isn't called to unlock
mapping->invalidate_lock. Like this:

EXT4-fs error (device sda) in ext4_setattr:5557: Out of memory
INFO: task fsstress:374 blocked for more than 122 seconds.
      Not tainted 6.14.0-rc1-next-20250206-xfstests-dirty #726
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:fsstress state:D stack:0     pid:374   tgid:374   ppid:373
                                  task_flags:0x440140 flags:0x00000000
Call Trace:
 <TASK>
 __schedule+0x2c9/0x7f0
 schedule+0x27/0xa0
 schedule_preempt_disabled+0x15/0x30
 rwsem_down_read_slowpath+0x278/0x4c0
 down_read+0x59/0xb0
 page_cache_ra_unbounded+0x65/0x1b0
 filemap_get_pages+0x124/0x3e0
 filemap_read+0x114/0x3d0
 vfs_read+0x297/0x360
 ksys_read+0x6c/0xe0
 do_syscall_64+0x4b/0x110
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: c7fc0366c656 ("ext4: partial zero eof block on unaligned inode size extension")
Signed-off-by: Baokun Li <libaokun1@huawei.com>
---
 fs/ext4/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3cc8da6357aa..04ffd802dbde 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5452,7 +5452,7 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
 			    oldsize & (inode->i_sb->s_blocksize - 1)) {
 				error = ext4_inode_attach_jinode(inode);
 				if (error)
-					goto err_out;
+					goto out_mmap_sem;
 			}
 
 			handle = ext4_journal_start(inode, EXT4_HT_INODE, 3);
-- 
2.39.2
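
To make the failure mode concrete, here is a minimal userspace sketch of the
bug class the patch fixes: an error path that jumps past the unlock. It is not
the ext4 code. The names fake_lock, attach_fails() and setattr_lock_sketch()
are hypothetical stand-ins, and the "lock" is just a flag, so the leak can be
observed without actually hanging a task.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for mapping->invalidate_lock: just a flag. */
struct fake_lock {
	bool held;
};

static void fake_lock_acquire(struct fake_lock *l)
{
	assert(!l->held);	/* a real rwsem would block here instead */
	l->held = true;
}

static void fake_lock_release(struct fake_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical stand-in for ext4_inode_attach_jinode() failing. */
static int attach_fails(void)
{
	return -ENOMEM;
}

/*
 * fixed == false models the pre-patch jump (goto err_out), which skips
 * the unlock; fixed == true models the patched jump (goto out_mmap_sem).
 */
static int setattr_lock_sketch(struct fake_lock *l, bool fixed)
{
	int error;

	fake_lock_acquire(l);

	error = attach_fails();
	if (error) {
		if (fixed)
			goto out_mmap_sem;
		goto err_out;	/* lock is still held on return */
	}

	/* the truncate work would go here */

out_mmap_sem:
	fake_lock_release(l);
err_out:
	return error;
}
```

With fixed == false the function returns -ENOMEM with the flag still set. In
the kernel, the equivalent leaked invalidate_lock is what leaves a later
reader stuck in down_read(), as in the trace above.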
Re: [PATCH] ext4: goto right label 'out_mmap_sem' in ext4_setattr()
Posted by Theodore Ts'o 9 months ago
On Thu, 13 Feb 2025 19:22:47 +0800, libaokun@huaweicloud.com wrote:
> Otherwise, if ext4_inode_attach_jinode() fails, a hung task will
> happen because filemap_invalidate_unlock() isn't called to unlock
> mapping->invalidate_lock. Like this:
> 
> EXT4-fs error (device sda) in ext4_setattr:5557: Out of memory
> INFO: task fsstress:374 blocked for more than 122 seconds.
>       Not tainted 6.14.0-rc1-next-20250206-xfstests-dirty #726
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:fsstress state:D stack:0     pid:374   tgid:374   ppid:373
>                                   task_flags:0x440140 flags:0x00000000
> Call Trace:
>  <TASK>
>  __schedule+0x2c9/0x7f0
>  schedule+0x27/0xa0
>  schedule_preempt_disabled+0x15/0x30
>  rwsem_down_read_slowpath+0x278/0x4c0
>  down_read+0x59/0xb0
>  page_cache_ra_unbounded+0x65/0x1b0
>  filemap_get_pages+0x124/0x3e0
>  filemap_read+0x114/0x3d0
>  vfs_read+0x297/0x360
>  ksys_read+0x6c/0xe0
>  do_syscall_64+0x4b/0x110
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> [...]

Applied, thanks!

[1/1] ext4: goto right label 'out_mmap_sem' in ext4_setattr()
      commit: 6b7e17cd4534688c341e900b9a2e42f307a3ff9c

Best regards,
-- 
Theodore Ts'o <tytso@mit.edu>
Re: [PATCH] ext4: goto right label 'out_mmap_sem' in ext4_setattr()
Posted by Brian Foster 10 months, 1 week ago
On Thu, Feb 13, 2025 at 07:22:47PM +0800, libaokun@huaweicloud.com wrote:
> From: Baokun Li <libaokun1@huawei.com>
> 
> Otherwise, if ext4_inode_attach_jinode() fails, a hung task will
> happen because filemap_invalidate_unlock() isn't called to unlock
> mapping->invalidate_lock. Like this:
> 
> EXT4-fs error (device sda) in ext4_setattr:5557: Out of memory
> INFO: task fsstress:374 blocked for more than 122 seconds.
>       Not tainted 6.14.0-rc1-next-20250206-xfstests-dirty #726
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:fsstress state:D stack:0     pid:374   tgid:374   ppid:373
>                                   task_flags:0x440140 flags:0x00000000
> Call Trace:
>  <TASK>
>  __schedule+0x2c9/0x7f0
>  schedule+0x27/0xa0
>  schedule_preempt_disabled+0x15/0x30
>  rwsem_down_read_slowpath+0x278/0x4c0
>  down_read+0x59/0xb0
>  page_cache_ra_unbounded+0x65/0x1b0
>  filemap_get_pages+0x124/0x3e0
>  filemap_read+0x114/0x3d0
>  vfs_read+0x297/0x360
>  ksys_read+0x6c/0xe0
>  do_syscall_64+0x4b/0x110
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Fixes: c7fc0366c656 ("ext4: partial zero eof block on unaligned inode size extension")
> Signed-off-by: Baokun Li <libaokun1@huawei.com>
> ---

First off, thank you for catching this. :)

>  fs/ext4/inode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 3cc8da6357aa..04ffd802dbde 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5452,7 +5452,7 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
>  			    oldsize & (inode->i_sb->s_blocksize - 1)) {
>  				error = ext4_inode_attach_jinode(inode);
>  				if (error)
> -					goto err_out;
> +					goto out_mmap_sem;
>  			}

This looks reasonable to me, but I notice that the immediate previous
error check looks like this:

		...
                rc = ext4_break_layouts(inode);
                if (rc) {
                        filemap_invalidate_unlock(inode->i_mapping);
                        goto err_out;
                }
		...

... and then the following after the broken logic uses out_mmap_sem.
Could we be a little more consistent here one way or the other? The
change looks functionally correct to me either way:

Reviewed-by: Brian Foster <bfoster@redhat.com>

Brian

>  
>  			handle = ext4_journal_start(inode, EXT4_HT_INODE, 3);
> -- 
> 2.39.2
> 
>
Re: [PATCH] ext4: goto right label 'out_mmap_sem' in ext4_setattr()
Posted by Baokun Li 10 months, 1 week ago
Hi,

On 2025/2/13 20:51, Brian Foster wrote:
> On Thu, Feb 13, 2025 at 07:22:47PM +0800, libaokun@huaweicloud.com wrote:
>> From: Baokun Li <libaokun1@huawei.com>
>>
>> Otherwise, if ext4_inode_attach_jinode() fails, a hung task will
>> happen because filemap_invalidate_unlock() isn't called to unlock
>> mapping->invalidate_lock. Like this:
>>
>> EXT4-fs error (device sda) in ext4_setattr:5557: Out of memory
>> INFO: task fsstress:374 blocked for more than 122 seconds.
>>        Not tainted 6.14.0-rc1-next-20250206-xfstests-dirty #726
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> task:fsstress state:D stack:0     pid:374   tgid:374   ppid:373
>>                                    task_flags:0x440140 flags:0x00000000
>> Call Trace:
>>   <TASK>
>>   __schedule+0x2c9/0x7f0
>>   schedule+0x27/0xa0
>>   schedule_preempt_disabled+0x15/0x30
>>   rwsem_down_read_slowpath+0x278/0x4c0
>>   down_read+0x59/0xb0
>>   page_cache_ra_unbounded+0x65/0x1b0
>>   filemap_get_pages+0x124/0x3e0
>>   filemap_read+0x114/0x3d0
>>   vfs_read+0x297/0x360
>>   ksys_read+0x6c/0xe0
>>   do_syscall_64+0x4b/0x110
>>   entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>
>> Fixes: c7fc0366c656 ("ext4: partial zero eof block on unaligned inode size extension")
>> Signed-off-by: Baokun Li <libaokun1@huawei.com>
>> ---
> First off, thank you for catching this. :)
Thanks for your review!
>
>>   fs/ext4/inode.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 3cc8da6357aa..04ffd802dbde 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -5452,7 +5452,7 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
>>   			    oldsize & (inode->i_sb->s_blocksize - 1)) {
>>   				error = ext4_inode_attach_jinode(inode);
>>   				if (error)
>> -					goto err_out;
>> +					goto out_mmap_sem;
>>   			}
> This looks reasonable to me, but I notice that the immediate previous
> error check looks like this:
>
> 		...
>                  rc = ext4_break_layouts(inode);
>                  if (rc) {
>                          filemap_invalidate_unlock(inode->i_mapping);
>                          goto err_out;
>                  }
> 		...
>
> ... and then the following after the broken logic uses out_mmap_sem.
> Could we be a little more consistent here one way or the other? The
> change looks functionally correct to me either way:
>
> Reviewed-by: Brian Foster <bfoster@redhat.com>
>
> Brian
Indeed, this is confusing.

The reason is that we don't want to call ext4_std_error() when
ext4_break_layouts() fails. So we first store the error in 'rc', and then
pass the error to 'error' at the end. (See b9c1c26739ec
("ext4: gracefully handle ext4_break_layouts() failure during truncate"))

However, because 'error' is not assigned in that case, jumping to the
out_mmap_sem label would execute some code that shouldn't be executed.
Therefore, in the error handling of ext4_break_layouts(), we unlock and then
goto the err_out label.

Under normal error conditions, by contrast, 'error' is assigned, and the code
should enter the out_mmap_sem label. Therefore, in the error handling of
ext4_inode_attach_jinode(), we goto the out_mmap_sem label directly.

The handling of 'rc' in this function is indeed very subtle.


Cheers,
Baokun
>>   
>>   			handle = ext4_journal_start(inode, EXT4_HT_INODE, 3);
>> -- 
>> 2.39.2
>>
>>
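
Baokun's explanation above of why the two error paths use different labels can
be modelled with a small userspace sketch. This is a simplified model and not
the real ext4_setattr(). The struct, the enum, the -EBUSY value and the
counters are hypothetical, and the code between the labels is reduced to a
single "apply attributes on success" step plus a counter standing in for
ext4_std_error().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bookkeeping so the effect of each path can be observed. */
struct sketch_state {
	bool lock_held;		/* stands in for mapping->invalidate_lock */
	int attrs_applied;	/* success-only code after out_mmap_sem */
	int std_error_calls;	/* stands in for ext4_std_error() */
};

enum fail_point { FAIL_NONE, FAIL_BREAK_LAYOUTS, FAIL_ATTACH_JINODE };

static int setattr_rc_sketch(struct sketch_state *s, enum fail_point fail)
{
	int error = 0, rc = 0;

	assert(!s->lock_held);
	s->lock_held = true;

	/* Layout-break failure is kept in 'rc'; 'error' stays 0. */
	rc = (fail == FAIL_BREAK_LAYOUTS) ? -EBUSY : 0;
	if (rc) {
		/*
		 * Jumping to out_mmap_sem here would see error == 0 and
		 * run the success-only code, so unlock by hand instead.
		 */
		s->lock_held = false;
		goto err_out;
	}

	/* An ordinary failure is stored in 'error'. */
	error = (fail == FAIL_ATTACH_JINODE) ? -ENOMEM : 0;
	if (error)
		goto out_mmap_sem;

	/* the truncate work would go here */

out_mmap_sem:
	s->lock_held = false;
	if (!error)
		s->attrs_applied++;	/* must not run when only rc is set */
err_out:
	if (error)
		s->std_error_calls++;	/* skipped for the rc-only failure */
	if (!error)
		error = rc;
	return error;
}
```

In this model the FAIL_BREAK_LAYOUTS path returns the value held in 'rc'
without bumping either counter, while the FAIL_ATTACH_JINODE path drops the
lock through out_mmap_sem and is reported once.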
Re: [PATCH] ext4: goto right label 'out_mmap_sem' in ext4_setattr()
Posted by Brian Foster 10 months, 1 week ago
On Thu, Feb 13, 2025 at 09:20:21PM +0800, Baokun Li wrote:
> Hi,
> 
> On 2025/2/13 20:51, Brian Foster wrote:
> > On Thu, Feb 13, 2025 at 07:22:47PM +0800, libaokun@huaweicloud.com wrote:
> > > From: Baokun Li <libaokun1@huawei.com>
> > > 
> > > Otherwise, if ext4_inode_attach_jinode() fails, a hung task will
> > > happen because filemap_invalidate_unlock() isn't called to unlock
> > > mapping->invalidate_lock. Like this:
> > > 
> > > EXT4-fs error (device sda) in ext4_setattr:5557: Out of memory
> > > INFO: task fsstress:374 blocked for more than 122 seconds.
> > >        Not tainted 6.14.0-rc1-next-20250206-xfstests-dirty #726
> > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > task:fsstress state:D stack:0     pid:374   tgid:374   ppid:373
> > >                                    task_flags:0x440140 flags:0x00000000
> > > Call Trace:
> > >   <TASK>
> > >   __schedule+0x2c9/0x7f0
> > >   schedule+0x27/0xa0
> > >   schedule_preempt_disabled+0x15/0x30
> > >   rwsem_down_read_slowpath+0x278/0x4c0
> > >   down_read+0x59/0xb0
> > >   page_cache_ra_unbounded+0x65/0x1b0
> > >   filemap_get_pages+0x124/0x3e0
> > >   filemap_read+0x114/0x3d0
> > >   vfs_read+0x297/0x360
> > >   ksys_read+0x6c/0xe0
> > >   do_syscall_64+0x4b/0x110
> > >   entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > 
> > > Fixes: c7fc0366c656 ("ext4: partial zero eof block on unaligned inode size extension")
> > > Signed-off-by: Baokun Li <libaokun1@huawei.com>
> > > ---
> > First off, thank you for catching this. :)
> Thanks for your review!
> > 
> > >   fs/ext4/inode.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > index 3cc8da6357aa..04ffd802dbde 100644
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -5452,7 +5452,7 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
> > >   			    oldsize & (inode->i_sb->s_blocksize - 1)) {
> > >   				error = ext4_inode_attach_jinode(inode);
> > >   				if (error)
> > > -					goto err_out;
> > > +					goto out_mmap_sem;
> > >   			}
> > This looks reasonable to me, but I notice that the immediate previous
> > error check looks like this:
> > 
> > 		...
> >                  rc = ext4_break_layouts(inode);
> >                  if (rc) {
> >                          filemap_invalidate_unlock(inode->i_mapping);
> >                          goto err_out;
> >                  }
> > 		...
> > 
> > ... and then the following after the broken logic uses out_mmap_sem.
> > Could we be a little more consistent here one way or the other? The
> > change looks functionally correct to me either way:
> > 
> > Reviewed-by: Brian Foster <bfoster@redhat.com>
> > 
> > Brian
> Indeed, this is confusing.
> 
> The reason is that we don't want to call ext4_std_error() when
> ext4_break_layouts() fails. So we first store the error in 'rc', and then
> pass the error to 'error' at the end. (See b9c1c26739ec
> ("ext4: gracefully handle ext4_break_layouts() failure during truncate"))
> 
> However, because 'error' is not assigned, the goto out_mmap_sem label will
> execute some code that shouldn't be executed. Therefore, in the error
> handling of ext4_break_layouts(), we unlock and then goto err_out label.
> 
> While under normal error conditions, 'error' is assigned, and it should
> enter the out_mmap_sem label. Therefore, in the error handling of
> ext4_inode_attach_jinode(), we directly goto out_mmap_sem label.
> 
> The handling of 'rc' in this function is indeed very subtle.
> 

Ah, indeed.. I glossed over the use of rc in there on my quick read.
Thanks for the clarification!

Brian

> 
> Cheers,
> Baokun
> > >   			handle = ext4_journal_start(inode, EXT4_HT_INODE, 3);
> > > -- 
> > > 2.39.2
> > > 
> > > 
>
Re: [PATCH] ext4: goto right label 'out_mmap_sem' in ext4_setattr()
Posted by Jan Kara 10 months, 1 week ago
On Thu 13-02-25 19:22:47, libaokun@huaweicloud.com wrote:
> From: Baokun Li <libaokun1@huawei.com>
> 
> Otherwise, if ext4_inode_attach_jinode() fails, a hung task will
> happen because filemap_invalidate_unlock() isn't called to unlock
> mapping->invalidate_lock. Like this:
> 
> EXT4-fs error (device sda) in ext4_setattr:5557: Out of memory
> INFO: task fsstress:374 blocked for more than 122 seconds.
>       Not tainted 6.14.0-rc1-next-20250206-xfstests-dirty #726
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:fsstress state:D stack:0     pid:374   tgid:374   ppid:373
>                                   task_flags:0x440140 flags:0x00000000
> Call Trace:
>  <TASK>
>  __schedule+0x2c9/0x7f0
>  schedule+0x27/0xa0
>  schedule_preempt_disabled+0x15/0x30
>  rwsem_down_read_slowpath+0x278/0x4c0
>  down_read+0x59/0xb0
>  page_cache_ra_unbounded+0x65/0x1b0
>  filemap_get_pages+0x124/0x3e0
>  filemap_read+0x114/0x3d0
>  vfs_read+0x297/0x360
>  ksys_read+0x6c/0xe0
>  do_syscall_64+0x4b/0x110
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> Fixes: c7fc0366c656 ("ext4: partial zero eof block on unaligned inode size extension")
> Signed-off-by: Baokun Li <libaokun1@huawei.com>

Indeed. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/ext4/inode.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 3cc8da6357aa..04ffd802dbde 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5452,7 +5452,7 @@ int ext4_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
>  			    oldsize & (inode->i_sb->s_blocksize - 1)) {
>  				error = ext4_inode_attach_jinode(inode);
>  				if (error)
> -					goto err_out;
> +					goto out_mmap_sem;
>  			}
>  
>  			handle = ext4_journal_start(inode, EXT4_HT_INODE, 3);
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR