ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr

[PATCH] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr

Posted by Theodore Ts'o 2 months, 3 weeks ago

On Thu, Jul 10, 2025 at 01:01:04AM -0700, Moon Hee Lee wrote:
> From 4c910ac989e7a6d97565a67677a1ee88e2d1a9ad Mon Sep 17 00:00:00 2001
> From: Moon Hee Lee <moonhee.lee.ca@gmail.com>
> Date: Thu, 10 Jul 2025 00:36:59 -0700
> Subject: [PATCH] ext4: bail out when INLINE_DATA_FL lacks system.data xattr
> 
> A syzbot fuzzed image triggered a BUG_ON in ext4_update_inline_data()
> when an inode had the INLINE_DATA_FL flag set but was missing the
> system.data extended attribute.
> 
> ext4_prepare_inline_data() now checks for the presence of that xattr
> and returns -EFSCORRUPTED if it is missing, preventing corrupted inodes
> from reaching the update path and triggering a crash.

Thanks ofor the patch!  However, instead of doing an xattr lookup in
ext4_prepare_inline_data(), we can more simply and more efficiently
just not BUG in ext4_update_inline_data, like this:


From eedfada9eb51541fe72f19350503890da522212d Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <tytso@mit.edu>
Date: Thu, 17 Jul 2025 10:54:34 -0400
Subject: [PATCH] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr

A syzbot fuzzed image triggered a BUG_ON in ext4_update_inline_data()
when an inode had the INLINE_DATA_FL flag set but was missing the
system.data extended attribute.

Since this can happen due to a maiciouly fuzzed file system, we
shouldn't BUG, but rather, report it as a corrupted file system.

Reported-by: syzbot+544248a761451c0df72f@syzkaller.appspotmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/ext4/inline.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index d5b32d242495..424c40c768de 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -354,6 +354,12 @@ static int ext4_update_inline_data(handle_t *handle, struct inode *inode,
 	if (error)
 		goto out;
 
+	if (is.s.not_found) {
+		EXT4_ERROR_INODE(inode, "missing inline data xattr");
+		error = -EFSCORRUPTED;
+		goto out;
+	}
+
 	BUG_ON(is.s.not_found);
 
 	len -= EXT4_MIN_INLINE_DATA_SIZE;
-- 
2.47.2

Re: [PATCH] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr

Posted by Moon Hee Lee 2 months, 3 weeks ago

>
> Thanks ofor the patch!  However, instead of doing an xattr lookup in
> ext4_prepare_inline_data(), we can more simply and more efficiently
> just not BUG in ext4_update_inline_data, like this:

Thanks for the response and for taking the time to address the issue.

Just to clarify the intent behind the earlier patch [1]: it was meant to
catch the missing system.data xattr early in ext4_prepare_inline_data(),
before branching into paths that assume the xattr is present.

> @@ -354,6 +354,12 @@ static int ext4_update_inline_data(handle_t *handle, struct inode *inode,
>         if (error)
>                 goto out;
>
> +       if (is.s.not_found) {
> +               EXT4_ERROR_INODE(inode, "missing inline data xattr");
> +               error = -EFSCORRUPTED;
> +               goto out;
> +       }
> +
>         BUG_ON(is.s.not_found);

The current patch addresses ext4_update_inline_data() directly, but the
same condition also leads to a BUG_ON in ext4_create_inline_data() [2],
which the earlier approach intended to prevent as well.

Later, a third instance was found in ext4_inline_data_truncate() [3],
which also contains a similar BUG_ON and might need the same kind of
check.

Reducing duplicated checks across these sites would be beneficial, though
fixing each case directly also looks reasonable and straightforward.

[1] https://lore.kernel.org/all/20250710175837.29822-2-moonhee.lee.ca@gmail.com/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/ext4/inline.c?h=v6.16-rc6#n306
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/ext4/inline.c?h=v6.16-rc6#n1906

Re: [PATCH] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr

Posted by Theodore Ts'o 2 months, 2 weeks ago

On Thu, Jul 17, 2025 at 09:59:13AM -0700, Moon Hee Lee wrote:
> The current patch addresses ext4_update_inline_data() directly, but the
> same condition also leads to a BUG_ON in ext4_create_inline_data() [2],
> which the earlier approach intended to prevent as well.

Actually, the two conditions are opposite to each other.  The one in
ext4_update_inline_data() was:

         BUG_ON(is.s.not_found);

while te one in ext4_create_inline_data() was:

	BUG_ON(!is.s.not_found);

So your patch would not only cause an extra xattr lookup in
ext4_prepare_inline_data(), but it would actually cause problems by
causing spurious failures when first writing to an inline data file.
(Which makes me suspect that you hadn't run other test on your patich
other than just vaidating that the syzkaller reproduce was no longer
reproducing.)   

Also, having taking a closer look at te code paths, I became
suspicious that there is something about the syzkaller reproducer is
doing which might be a bit sus.  That's because whether we call
ext4_update_inline_data() or ext4_create_inline_data() is based on
whether i_inline off is set or not:

	if (ei->i_inline_off)
		ret = ext4_update_inline_data(handle, inode, len);
	else
		ret = ext4_create_inline_data(handle, inode, len);

But how is ei->i_inline_off set?  It's set from a former call to
ext4_xattr_ibody_find():

	error = ext4_xattr_ibody_find(inode, &i, &is);
	if (error)
		goto out;

	if (!is.s.not_found) {
		if (is.s.here->e_value_inum) {
			EXT4_ERROR_INODE(inode, "inline data xattr refers "
					 "to an external xattr inode");
			error = -EFSCORRUPTED;
			goto out;
		}
		EXT4_I(inode)->i_inline_off = (u16)((void *)is.s.here -
					(void *)ext4_raw_inode(&is.iloc));
		EXT4_I(inode)->i_inline_size = EXT4_MIN_INLINE_DATA_SIZE +
				le32_to_cpu(is.s.here->e_value_size);
	}

So the whole *reason* why i_inline_off exists is because we're caching
the result of calling ext4_xattr_ibody_find().  So if i_inline_off is
non-zero, and then when we call ext4_ibody_find() later on, and we
find that xattr has suddenly disappeared, there is something weird
going on.   That's why the BUG_ON was added orginally.

When I took a look at the reproduer, I found that indeed, it is
calling LOOP_CLR_FD and LOOP_SET_STATUS64 to reconfigure the loop
device out from under the mounted file system.  This is smashing the
file system, and is therefore corrupting the block device.  As it
turns out, Jan Kara recently sent out a patch, and it has been
accepted in the block tree, to prevent a similar Syzkaller issue using
LOOP_SET_BLOCK_SIZE[1].

[1] https://lore.kernel.org/r/20250711163202.19623-2-jack@suse.cz

We need to do something similar for LOOP_CLR_FD, LOOP_SET_STATUS,
LOOP_SET_STATUS64, LOOP_CHANGE_FD, and LOOP_SET_CAPACITY ioctls.

Cheers,

						- Ted

Re: [PATCH] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr

Posted by Jan Kara 2 months, 2 weeks ago

On Thu 17-07-25 21:05:21, Theodore Ts'o wrote:
> On Thu, Jul 17, 2025 at 09:59:13AM -0700, Moon Hee Lee wrote:
> > The current patch addresses ext4_update_inline_data() directly, but the
> > same condition also leads to a BUG_ON in ext4_create_inline_data() [2],
> > which the earlier approach intended to prevent as well.
> 
> Actually, the two conditions are opposite to each other.  The one in
> ext4_update_inline_data() was:
> 
>          BUG_ON(is.s.not_found);
> 
> while te one in ext4_create_inline_data() was:
> 
> 	BUG_ON(!is.s.not_found);
> 
> So your patch would not only cause an extra xattr lookup in
> ext4_prepare_inline_data(), but it would actually cause problems by
> causing spurious failures when first writing to an inline data file.
> (Which makes me suspect that you hadn't run other test on your patich
> other than just vaidating that the syzkaller reproduce was no longer
> reproducing.)   
> 
> Also, having taking a closer look at te code paths, I became
> suspicious that there is something about the syzkaller reproducer is
> doing which might be a bit sus.  That's because whether we call
> ext4_update_inline_data() or ext4_create_inline_data() is based on
> whether i_inline off is set or not:
> 
> 	if (ei->i_inline_off)
> 		ret = ext4_update_inline_data(handle, inode, len);
> 	else
> 		ret = ext4_create_inline_data(handle, inode, len);
> 
> 
> But how is ei->i_inline_off set?  It's set from a former call to
> ext4_xattr_ibody_find():
> 
> 	error = ext4_xattr_ibody_find(inode, &i, &is);
> 	if (error)
> 		goto out;
> 
> 	if (!is.s.not_found) {
> 		if (is.s.here->e_value_inum) {
> 			EXT4_ERROR_INODE(inode, "inline data xattr refers "
> 					 "to an external xattr inode");
> 			error = -EFSCORRUPTED;
> 			goto out;
> 		}
> 		EXT4_I(inode)->i_inline_off = (u16)((void *)is.s.here -
> 					(void *)ext4_raw_inode(&is.iloc));
> 		EXT4_I(inode)->i_inline_size = EXT4_MIN_INLINE_DATA_SIZE +
> 				le32_to_cpu(is.s.here->e_value_size);
> 	}
> 
> So the whole *reason* why i_inline_off exists is because we're caching
> the result of calling ext4_xattr_ibody_find().  So if i_inline_off is
> non-zero, and then when we call ext4_ibody_find() later on, and we
> find that xattr has suddenly disappeared, there is something weird
> going on.   That's why the BUG_ON was added orginally.
> 
> When I took a look at the reproduer, I found that indeed, it is
> calling LOOP_CLR_FD and LOOP_SET_STATUS64 to reconfigure the loop
> device out from under the mounted file system.  This is smashing the
> file system, and is therefore corrupting the block device.  As it
> turns out, Jan Kara recently sent out a patch, and it has been
> accepted in the block tree, to prevent a similar Syzkaller issue using
> LOOP_SET_BLOCK_SIZE[1].
> 
> [1] https://lore.kernel.org/r/20250711163202.19623-2-jack@suse.cz
> 
> We need to do something similar for LOOP_CLR_FD, LOOP_SET_STATUS,
> LOOP_SET_STATUS64, LOOP_CHANGE_FD, and LOOP_SET_CAPACITY ioctls.

Well, careful here. Changing loop device underneath mounted filesystem is a
valid usecase in active use (similarly as changing DM device underneath a
filesystem). So don't think we can play similar tricks as with
LOOP_SET_BLOCK_SIZE where changing block device block size just doesn't
make sense while the device is in use. Similarly LOOP_CLR_FD is an
equivalent of device going away. LOOP_CHANGE_FD is a legacy of the past but
it was *designed* to be used to swap backing file under a life filesystem
(old days of Wild West :)) during boot. We may get away with dropping that
these days but so far I'm not convinced it's worth the risk. So in this case
I don't see anything here that couldn't happen with say DM device and thus
I wouldn't really restrict the loop device functionality...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

Re: [PATCH] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr

Posted by Theodore Ts'o 2 months, 2 weeks ago

On Mon, Jul 21, 2025 at 02:49:57PM +0200, Jan Kara wrote:
> > We need to do something similar for LOOP_CLR_FD, LOOP_SET_STATUS,
> > LOOP_SET_STATUS64, LOOP_CHANGE_FD, and LOOP_SET_CAPACITY ioctls.
> 
> Well, careful here. Changing loop device underneath mounted filesystem is a
> valid usecase in active use (similarly as changing DM device underneath a
> filesystem). So don't think we can play similar tricks as with
> LOOP_SET_BLOCK_SIZE where changing block device block size just doesn't
> make sense while the device is in use. Similarly LOOP_CLR_FD is an
> equivalent of device going away. LOOP_CHANGE_FD is a legacy of the past but
> it was *designed* to be used to swap backing file under a life filesystem
> (old days of Wild West :)) during boot. We may get away with dropping that
> these days but so far I'm not convinced it's worth the risk. So in this case
> I don't see anything here that couldn't happen with say DM device and thus
> I wouldn't really restrict the loop device functionality...

Sure, and LOOP_SET_CAPACITY might be used to grow a file system image
which the file system could then grow into.  Fair.

This is related to BLK_DEV_WRITE_MOUNTED=n which the syzkaller folks
have agreed to use to prevent noisy syzkaller reports.  We're seeing a
bunch of syzkaller reports now that syzkaller has been taught how to
use these loop ioctls and so we're seeing loop device hijinks.  Which
is fine; I can just start filtering any syzkaller report that uses
loop device ioctls as false positive noise and call it a day.
Unfortunately, that won't help deal with researchers that are taking
the syzkaller code and then sending reports without any dashboards or
reproducers.  :-(

However, I do think that if the file system has advertised that they
don't support random underlying block device hijinks, such as XFS for
example, we should honor this and disallow those "wild west" loop
device operations.  And perhaps we should similarly disallow them if
BLK_DEV_WRITE_MOUNTED=n.

						- Ted

Re: [PATCH] ext4: do not BUG when INLINE_DATA_FL lacks system.data xattr

Posted by Moon Hee Lee 2 months, 2 weeks ago

On Thu, Jul 17, 2025 at 9:59 AM Moon Hee Lee <moonhee.lee.ca@gmail.com> wrote:

Just a quick follow-up to close the loop:

>
> The current patch addresses ext4_update_inline_data() directly, but the
> same condition also leads to a BUG_ON in ext4_create_inline_data() [2],
> which the earlier approach intended to prevent as well.

I missed that ext4_create_inline_data expects the xattr to be absent at
that point, since it's about to create it. The BUG_ON(!is.s.not_found)
enforces that expectation.

The patch I sent earlier didn’t account for this correctly. Returning an
error when the xattr was not found would have broken valid behavior in
the create path.

Thanks again for resolving this with a simple and correct fix. Per-site
handling makes sense here, as each path has different expectations about
the xattr state.

Best regards,
Moonhee