[PATCH] f2fs: don't BUG on node footer mismatch in f2fs_write_end_io

Posted by Deepanshu Kartikey 1 week, 1 day ago

Syzbot reports a recurrence of the kernel BUG in f2fs_write_end_io:

  kernel BUG at fs/f2fs/data.c:388!
  Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
  CPU: 0 UID: 0 PID: 15 Comm: ksoftirqd/0 PREEMPT_{RT,(full)}
  RIP: 0010:f2fs_write_end_io+0x16df/0x1740
  Call Trace:
   blk_update_request+0x57e/0xe60
   blk_mq_end_request+0x3e/0x70
   blk_done_softirq+0x10a/0x160
   handle_softirqs+0x1de/0x6d0
   run_ksoftirqd+0x52/0x180

Commit 50ac3ecd8e05 ("f2fs: fix to do sanity check on node footer
in {read,write}_end_io") added f2fs_sanity_check_node_footer() to
both end_io paths to catch corrupted node footers reachable from
fuzzed on-disk images. In f2fs_write_end_io(), however, the
existing

  f2fs_bug_on(sbi, folio->index != nid_of_node(folio));

was left in place immediately after the new helper call. The
helper detects the mismatch, sets SBI_NEED_FSCK and emits a
ratelimited warning, but its return value is discarded and the
following f2fs_bug_on() panics on the exact same condition.

Tracing the reproducer confirms the failure path. A node folio
with index=11 is looked up via __get_node_folio(), the
synchronous sanity check at page_hit fails with -EFSCORRUPTED
and out_err clears uptodate but leaves the dirty bit set from
the folio's earlier lifecycle. A subsequent read_node_folio()
fails with the same error (footer_nid=0, ino=0), and
folio_end_read(folio, false) does not clear dirty either. The
writeback iterator then finds the still-dirty folio via the
PAGECACHE_TAG_DIRTY tag and submits it. f2fs_write_end_io()
observes folio->index=11 with nid_of_node(folio)=0 and panics
from softirq context via blk_done_softirq, even though
f2fs_sanity_check_node_footer() has already correctly identified
the corruption and would have signalled it via its return value.

A filesystem inconsistency reachable from a mounted image must
not panic the kernel. Mirror the handling already used in
f2fs_finish_read_bio(): capture the helper's return value and
mark the bio with BLK_STS_IOERR on mismatch instead of issuing
BUG_ON. SBI_NEED_FSCK is set by the helper, so fsck.f2fs will
repair the inconsistency on the next mount.

Reported-by: syzbot+4af46ee83100e99bce09@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4af46ee83100e99bce09
Fixes: 50ac3ecd8e05 ("f2fs: fix to do sanity check on node footer in {read,write}_end_io")
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 fs/f2fs/data.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 8d4f1e75dee3..c149b0ccf22d 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -382,11 +382,11 @@ static void f2fs_write_end_io(struct bio *bio)
 						STOP_CP_REASON_WRITE_FAIL);
 		}
 
-		if (is_node_folio(folio)) {
-			f2fs_sanity_check_node_footer(sbi, folio,
-				folio->index, NODE_TYPE_REGULAR, true);
-			f2fs_bug_on(sbi, folio->index != nid_of_node(folio));
-		}
+		if (is_node_folio(folio) &&
+		    f2fs_sanity_check_node_footer(sbi, folio,
+						  folio->index, NODE_TYPE_REGULAR, true))
+			bio->bi_status = BLK_STS_IOERR;
+
 		if (f2fs_in_warm_node_list(folio))
 			f2fs_del_fsync_node_entry(sbi, folio);
 
-- 
2.43.0
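
For readers following the thread outside a kernel tree, the control-flow change in the hunk above can be modelled in plain userspace C. This is only a sketch: every type and function prefixed with model_ is a hypothetical stand-in, not an f2fs or block-layer definition, and the helper returns a bool where the real f2fs_sanity_check_node_footer() return type is not shown in this thread. It illustrates one point only: the completion path records the mismatch as an I/O error and flags the filesystem for fsck instead of asserting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are real kernel definitions. */
enum model_status { MODEL_STS_OK = 0, MODEL_STS_IOERR = 1 };

struct model_sb {
	bool need_fsck;		/* models the SBI_NEED_FSCK flag */
};

struct model_folio {
	uint32_t index;		/* models folio->index */
	uint32_t footer_nid;	/* models nid_of_node(folio) */
	bool is_node;		/* models is_node_folio(folio) */
};

struct model_bio {
	enum model_status status;	/* models bio->bi_status */
};

/*
 * Models the sanity-check helper: returns true on a footer mismatch
 * and flags the filesystem for fsck. Assumption: the real helper also
 * emits a ratelimited warning, which is omitted here.
 */
static bool model_footer_mismatch(struct model_sb *sb,
				  const struct model_folio *folio)
{
	if (folio->index == folio->footer_nid)
		return false;

	sb->need_fsck = true;
	return true;
}

/*
 * Models the patched completion path: a mismatch on a node folio is
 * reported through the bio status rather than through an assertion.
 */
static void model_write_end_io(struct model_sb *sb, struct model_bio *bio,
			       const struct model_folio *folio)
{
	if (folio->is_node && model_footer_mismatch(sb, folio))
		bio->status = MODEL_STS_IOERR;
}
```

Feeding the values from the trace (index=11, footer nid 0) through model_write_end_io() leaves the bio marked MODEL_STS_IOERR and need_fsck set, while a folio whose index matches its footer leaves both untouched.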

Re: [PATCH] f2fs: don't BUG on node footer mismatch in f2fs_write_end_io

Posted by Chao Yu 1 week ago

On 5/17/2026 8:52 AM, Deepanshu Kartikey wrote:
> Syzbot reports a recurrence of the kernel BUG in f2fs_write_end_io:
> 
>    kernel BUG at fs/f2fs/data.c:388!
>    Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
>    CPU: 0 UID: 0 PID: 15 Comm: ksoftirqd/0 PREEMPT_{RT,(full)}
>    RIP: 0010:f2fs_write_end_io+0x16df/0x1740
>    Call Trace:
>     blk_update_request+0x57e/0xe60
>     blk_mq_end_request+0x3e/0x70
>     blk_done_softirq+0x10a/0x160
>     handle_softirqs+0x1de/0x6d0
>     run_ksoftirqd+0x52/0x180
> 
> Commit 50ac3ecd8e05 ("f2fs: fix to do sanity check on node footer
> in {read,write}_end_io") added f2fs_sanity_check_node_footer() to
> both end_io paths to catch corrupted node footers reachable from
> fuzzed on-disk images. In f2fs_write_end_io(), however, the
> existing
> 
>    f2fs_bug_on(sbi, folio->index != nid_of_node(folio));
> 
> was left in place immediately after the new helper call. The
> helper detects the mismatch, sets SBI_NEED_FSCK and emits a
> ratelimited warning, but its return value is discarded and the
> following f2fs_bug_on() panics on the exact same condition.
> 
> Tracing the reproducer confirms the failure path. A node folio
> with index=11 is looked up via __get_node_folio(), the
> synchronous sanity check at page_hit fails with -EFSCORRUPTED
> and out_err clears uptodate but leaves the dirty bit set from
> the folio's earlier lifecycle. A subsequent read_node_folio()
> fails with the same error (footer_nid=0, ino=0), and
> folio_end_read(folio, false) does not clear dirty either. The
> writeback iterator then finds the still-dirty folio via the
> PAGECACHE_TAG_DIRTY tag and submits it. f2fs_write_end_io()
> observes folio->index=11 with nid_of_node(folio)=0 and panics
> from softirq context via blk_done_softirq, even though
> f2fs_sanity_check_node_footer() has already correctly identified
> the corruption and would have signalled it via its return value.
> 
> A filesystem inconsistency reachable from a mounted image must
> not panic the kernel. Mirror the handling already used in
> f2fs_finish_read_bio(): capture the helper's return value and
> mark the bio with BLK_STS_IOERR on mismatch instead of issuing
> BUG_ON. SBI_NEED_FSCK is set by the helper, so fsck.f2fs will
> repair the inconsistency on the next mount.
> 
> Reported-by: syzbot+4af46ee83100e99bce09@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=4af46ee83100e99bce09
> Fixes: 50ac3ecd8e05 ("f2fs: fix to do sanity check on node footer in {read,write}_end_io")
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> ---
>   fs/f2fs/data.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 8d4f1e75dee3..c149b0ccf22d 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -382,11 +382,11 @@ static void f2fs_write_end_io(struct bio *bio)
>   						STOP_CP_REASON_WRITE_FAIL);
>   		}
>   
> -		if (is_node_folio(folio)) {
> -			f2fs_sanity_check_node_footer(sbi, folio,
> -				folio->index, NODE_TYPE_REGULAR, true);
> -			f2fs_bug_on(sbi, folio->index != nid_of_node(folio));

Well, I don't think removing the f2fs_bug_on() is the right way to fix this,
because without f2fs_bug_on() we may lose the chance to detect real f2fs bugs.

The real question is why such an inconsistent node footer was not detected
before the node folio was written back.

I found a missing case, please take a look:

https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/commit/?h=syzbot&id=20e7d40cfa8d2afdc16ed2d3df24ef68ebda71ba

Thanks,

> -		}
> +		if (is_node_folio(folio) &&
> +		    f2fs_sanity_check_node_footer(sbi, folio,
> +						  folio->index, NODE_TYPE_REGULAR, true))
> +			bio->bi_status = BLK_STS_IOERR;
> +
>   		if (f2fs_in_warm_node_list(folio))
>   			f2fs_del_fsync_node_entry(sbi, folio);
>
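
The two positions in this thread differ on where the check belongs. A rough userspace model of that difference follows. It is a sketch under stated assumptions and is not taken from the linked commit, whose contents are not reproduced here: the model_ names are hypothetical, and dropping the dirty state of a rejected folio is an assumption made only to keep the model small. It replays the sequence from the commit message (a failed lookup clears uptodate but leaves dirty set) and counts how many mismatched folios reach submission, which is the point where a completion-time assertion would fire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cached node folio; not a kernel type. */
struct model_folio {
	uint32_t index;		/* models folio->index */
	uint32_t footer_nid;	/* models nid_of_node(folio) */
	bool dirty;
	bool uptodate;
};

/*
 * Models the failed lookup described in the commit message: the error
 * path clears uptodate but does not touch the dirty state.
 */
static void model_failed_lookup(struct model_folio *folio)
{
	folio->uptodate = false;
}

/*
 * Models one writeback pass over the cached folios. Returns how many
 * folios with a mismatched footer were submitted, i.e. how many would
 * trip an assertion in the completion handler.
 *
 * check_before_submit == true models validating the footer before
 * writeback, so the completion-time assertion can stay in place as a
 * detector for genuine logic bugs.
 */
static size_t model_writeback(struct model_folio *folios, size_t count,
			      bool check_before_submit)
{
	size_t bad_submitted = 0;

	for (size_t i = 0; i < count; i++) {
		struct model_folio *folio = &folios[i];
		bool mismatch = folio->index != folio->footer_nid;

		if (!folio->dirty)
			continue;

		/* Assumption: a rejected folio is simply dropped from writeback. */
		if (check_before_submit && mismatch) {
			folio->dirty = false;
			continue;
		}

		folio->dirty = false;	/* submitted for I/O */
		if (mismatch)
			bad_submitted++;
	}

	return bad_submitted;
}
```

With the index=11, footer nid 0 folio from the trace left dirty after model_failed_lookup(), a pass with check_before_submit set to false submits one mismatched folio, while a pass with it set to true submits none.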