Hi,
I've a following syzkaller report (no reproducer); the report is
against 5.15, but the same call-chain seems possible in current
upstream as well. So I suspect that maybe ext4_xattr_inode_create()
should take nested inode_lock (I_MUTEX_XATTR) instead. Does the
patch below make any sense?
======================================================
WARNING: possible circular locking dependency detected
5.15.168-syzkaller-23766-g3f37c55c6291 #0 Not tainted
------------------------------------------------------
syz-executor297/1452 is trying to acquire lock:
ffff888120b5e750 (&ea_inode->i_rwsem#8/1){+.+.}-{3:3}, at: inode_lock
ffff888120b5e750 (&ea_inode->i_rwsem#8/1){+.+.}-{3:3}, at: ext4_xattr_inode_create
ffff888120b5e750 (&ea_inode->i_rwsem#8/1){+.+.}-{3:3}, at: ext4_xattr_inode_lookup_create
ffff888120b5e750 (&ea_inode->i_rwsem#8/1){+.+.}-{3:3}, at: ext4_xattr_set_entry+0x2aeb/0x3200
but task is already holding lock:
ffff888120b58c68 (&ei->i_data_sem/3){++++}-{3:3}, at: ext4_setattr+0x12b5/0x1950
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&ei->i_data_sem/3){++++}-{3:3}:
down_write+0x38/0x60
ext4_update_i_disksize
ext4_xattr_inode_write
ext4_xattr_inode_lookup_create
ext4_xattr_set_entry+0x2839/0x3200
ext4_xattr_ibody_set+0x113/0x320
ext4_xattr_set_handle+0xa31/0x1440
ext4_xattr_set+0x266/0x3d0
__vfs_setxattr+0x15e/0x1c0
__vfs_setxattr_noperm+0x128/0x5e0
vfs_setxattr+0x1c6/0x410
setxattr+0x1d6/0x270
path_setxattr+0x1cc/0x2b0
__do_sys_lsetxattr
__se_sys_lsetxattr
__x64_sys_lsetxattr+0xb4/0xd0
do_syscall_x64
do_syscall_64+0x69/0xc0
entry_SYSCALL_64_after_hwframe+0x66/0xd0
-> #0 (&ea_inode->i_rwsem#8/1){+.+.}-{3:3}:
check_prev_add
check_prevs_add
validate_chain
__lock_acquire+0x2c95/0x7850
lock_acquire+0x1d2/0x4e0
down_write+0x38/0x60
inode_lock
ext4_xattr_inode_create
ext4_xattr_inode_lookup_create
ext4_xattr_set_entry+0x2aeb/0x3200
ext4_xattr_block_set+0xdc1/0x2de0
ext4_xattr_move_to_block
ext4_xattr_make_inode_space
ext4_expand_extra_isize_ea+0xe58/0x19c0
__ext4_expand_extra_isize+0x2fd/0x400
ext4_try_to_expand_extra_isize
__ext4_mark_inode_dirty+0x58b/0x840
ext4_setattr+0x1341/0x1950
notify_change+0xafb/0xd80
do_truncate+0x218/0x2f0
handle_truncate
do_open
path_openat+0x27d3/0x2e10
do_filp_open+0x23a/0x360
do_sys_openat2+0x188/0x720
do_sys_open+0x1d1/0x220
do_syscall_x64
do_syscall_64+0x69/0xc0
entry_SYSCALL_64_after_hwframe+0x66/0xd0
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&ei->i_data_sem/3);
                               lock(&ea_inode->i_rwsem#8/1);
                               lock(&ei->i_data_sem/3);
  lock(&ea_inode->i_rwsem#8/1);
*** DEADLOCK ***
5 locks held by syz-executor297/1452:
#0: ffff88811231c460 (sb_writers#5){.+.+}-{0:0}, at: mnt_want_write+0x3b/0x80
#1: ffff888120b58de0 (&sb->s_type->i_mutex_key#8){++++}-{3:3}, at: inode_lock
#1: ffff888120b58de0 (&sb->s_type->i_mutex_key#8){++++}-{3:3}, at: do_truncate+0x204/0x2f0
#2: ffff888120b58f80 (mapping.invalidate_lock){++++}-{3:3}, at: filemap_invalidate_lock
#2: ffff888120b58f80 (mapping.invalidate_lock){++++}-{3:3}, at: ext4_setattr+0xd49/0x1950
#3: ffff888120b58c68 (&ei->i_data_sem/3){++++}-{3:3}, at: ext4_setattr+0x12b5/0x1950
#4: ffff888120b58ab8 (&ei->xattr_sem){++++}-{3:3}, at: ext4_write_trylock_xattr
#4: ffff888120b58ab8 (&ei->xattr_sem){++++}-{3:3}, at: ext4_try_to_expand_extra_isize
#4: ffff888120b58ab8 (&ei->xattr_sem){++++}-{3:3}, at: __ext4_mark_inode_dirty+0x4f7/0x840
stack backtrace:
Call Trace:
<TASK>
__dump_stack
dump_stack_lvl+0x1e3/0x2d0
check_noncircular+0x2f3/0x3a0
check_prev_add
check_prevs_add
validate_chain
__lock_acquire+0x2c95/0x7850
lock_acquire+0x1d2/0x4e0
down_write+0x38/0x60
inode_lock
ext4_xattr_inode_create
ext4_xattr_inode_lookup_create
ext4_xattr_set_entry+0x2aeb/0x3200
ext4_xattr_block_set+0xdc1/0x2de0
ext4_xattr_move_to_block
ext4_xattr_make_inode_space
ext4_expand_extra_isize_ea+0xe58/0x19c0
__ext4_expand_extra_isize+0x2fd/0x400
ext4_try_to_expand_extra_isize
__ext4_mark_inode_dirty+0x58b/0x840
ext4_setattr+0x1341/0x1950
notify_change+0xafb/0xd80
do_truncate+0x218/0x2f0
handle_truncate
do_open
path_openat+0x27d3/0x2e10
do_filp_open+0x23a/0x360
do_sys_openat2+0x188/0x720
do_sys_open+0x1d1/0x220
do_syscall_x64
do_syscall_64+0x69/0xc0
entry_SYSCALL_64_after_hwframe+0x66/0xd0
---
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 7647e9f6e190..db3c68fbbadf 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1511,7 +1511,7 @@ static struct inode *ext4_xattr_inode_create(handle_t *handle,
 		 */
 		dquot_free_inode(ea_inode);
 		dquot_drop(ea_inode);
-		inode_lock(ea_inode);
+		inode_lock_nested(ea_inode, I_MUTEX_XATTR);
 		ea_inode->i_flags |= S_NOQUOTA;
 		inode_unlock(ea_inode);
 	}
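
For readers following the report, the inversion it describes can be modeled in user space. The block below is only a toy sketch, not lockdep's implementation: it treats every (lock, subclass) pair as a graph node, records an edge each time one lock is acquired while another is held, and refuses an acquisition that would close a cycle. All names in it (toy_graph, toy_acquire, toy_replay_report, and the enum values) are hypothetical and exist only for this illustration. It replays chain #1 and chain #0 from the report, and optionally takes the EA inode lock under a distinct subclass node, which is what the one-liner proposes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock classes; the names mirror the report, the numbering is arbitrary. */
enum toy_class {
	EA_I_RWSEM_SUB1,	/* &ea_inode->i_rwsem#8/1 in the report */
	EA_I_RWSEM_XATTR,	/* same lock, taken under a distinct subclass */
	I_DATA_SEM_SUB3,	/* &ei->i_data_sem/3 in the report */
	TOY_NR_CLASSES
};

/* edge[a][b] is true once b has been acquired while a was held. */
struct toy_graph {
	bool edge[TOY_NR_CLASSES][TOY_NR_CLASSES];
};

/* Depth-first search: is 'to' already reachable from 'from'? */
static bool toy_reachable(const struct toy_graph *g, int from, int to,
			  bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < TOY_NR_CLASSES; next++) {
		if (g->edge[from][next] && !seen[next] &&
		    toy_reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquired was taken while held was held".
 * Returns false (and records nothing) if that edge would close a cycle.
 */
static bool toy_acquire(struct toy_graph *g, int held, int acquired)
{
	bool seen[TOY_NR_CLASSES] = { false };

	assert(held >= 0 && held < TOY_NR_CLASSES);
	assert(acquired >= 0 && acquired < TOY_NR_CLASSES);
	if (toy_reachable(g, acquired, held, seen))
		return false;
	g->edge[held][acquired] = true;
	return true;
}

/* Replay both chains; returns true if the second one trips the check. */
static bool toy_replay_report(bool distinct_subclass)
{
	struct toy_graph g = { 0 };
	int create_class = distinct_subclass ? EA_I_RWSEM_XATTR
					     : EA_I_RWSEM_SUB1;
	bool ok;

	/* Chain #1 (lsetxattr): i_data_sem/3 depends on i_rwsem/1. */
	ok = toy_acquire(&g, EA_I_RWSEM_SUB1, I_DATA_SEM_SUB3);
	assert(ok);
	(void)ok;

	/* Chain #0 (truncate): EA inode lock taken under i_data_sem/3. */
	return !toy_acquire(&g, I_DATA_SEM_SUB3, create_class);
}
```

In this model toy_replay_report(false) returns true (the cycle from the report) and toy_replay_report(true) returns false, because the second acquisition lands on a different node. That only shows why a separate subclass would quiet the checker; it says nothing about whether the two paths can really deadlock on a sane file system.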

On Tue, Nov 12, 2024 at 04:34:21PM +0900, Sergey Senozhatsky wrote:
>
> I've a following syzkaller report (no reproducer); the report is
> against 5.15, but the same call-chain seems possible in current
> upstream as well. So I suspect that maybe ext4_xattr_inode_create()
> should take nested inode_lock (I_MUTEX_XATTR) instead. Does the
> patch below make any sense?

These syzkaller reports result from mounting a corrupted (fuzzed) file
system typically when an inode is used in multiple contexts (e.g., as
a directory and an EA inode, etc.) at the same time.

I'd have to take a closer look to see if it makes sense, but in
general, very often whenever we try to fix one of these it ends up
triggering some other syzkaller failure.

And, these sorts of things don't actually result in actual security
problems (at worst, a hang / denial of service attack), and the right
thing to do is to just run fsck on the !@#?!? file system before
mounting the thing.

The best way to protect systems against the threat model of users
picking up a random USB stick dropped in a parking lot that contains a
maliciously fuzzed file system is to either (a) run fsck before
allowing the file system to be mounted, (b) enable the enterprise
policy that prohibits USB thumb drives from being automounted, or (c)
mount the USB stick in some kind of VM (e.g., CrosVM) and then use a
reverse virtiofs / 9pfs / fuse to make the file system available in
the host system.

The last would be the best solution, but it would require development
work.  So I mention it in the hopes that at some point I can convince
some company to pick it up, since it would significantly improve
security for all desktops, laptops, and mobile systems that want to
support mounting removable storage.

In any case, trying to fix these sorts of syzkaller warnings is
essentially playing whack-a-mole, and so while I don't have objections
to these sorts of fixes, if it causes any kind of regression or worse,
*two* new syzkaller failures, it just makes life harder for overworked
ext4 developers.  :-)

Cheers,

					- Ted

Hi Ted,

On (24/11/12 10:29), Theodore Ts'o wrote:
> > I've a following syzkaller report (no reproducer); the report is
> > against 5.15, but the same call-chain seems possible in current
> > upstream as well. So I suspect that maybe ext4_xattr_inode_create()
> > should take nested inode_lock (I_MUTEX_XATTR) instead. Does the
> > patch below make any sense?
>
> These syzkaller reports result from mounting a corrupted (fuzzed) file
> system typically when an inode is used in multiple contexts (e.g., as
> a directory and an EA inode, etc.) at the same time.

I certainly see your point, and I don't argue.

> I'd have to take a closer look to see if it makes sense, but in
> general, very often whenever we try to fix one of these it ends up
> triggering some other syzkaller failure.

I see, the one-liner that I posted sort of looks like an addition to
d1bc560e9a9c7 which landed in ext4 recently.

> And, these sorts of things don't actually result in actual security
> problems (at worst, a hang / denial of service attack), and the right
> thing to do is to just run fsck on the !@#?!? file system before
> mounting the thing.

So in our particular case reboot is a bad scenario.  Looking at reports
from the fleet I see a bunch of hung-task reboots with ext4 frames, e.g.
ext4_update_i_disksize()->down_write()->schedule() /* forever */, but I
can't claim that this is the deadlock that syzkaller has reported, it
very well might not be.