xfs: reject CRC validation when the log header cannot be retrieved

[PATCH] xfs: reject CRC validation when the log header cannot be retrieved

Posted by Edward Adam Davis 2 months, 1 week ago

When the traditional algorithm fails to locate the log header, it triggers
the uninitialized-value issue regarding tmp_rhead_blk reported in [1],
continuing with the subsequent CRC verification traversal in such a
scenario is futile.

A check has been added to detect the absence of the log header and prevent
the execution of the subsequent CRC verification traversal.

[1]
BUG: KMSAN: uninit-value in xlog_verify_head+0x6c3/0x910 fs/xfs/xfs_log_recover.c:1058
 xlog_verify_head+0x6c3/0x910 fs/xfs/xfs_log_recover.c:1058
 xlog_find_tail+0xc2e/0x1a50 fs/xfs/xfs_log_recover.c:1315
 xlog_recover+0x6d/0x800 fs/xfs/xfs_log_recover.c:3426
 xfs_log_mount+0x4da/0x880 fs/xfs/xfs_log.c:617

Local variable tmp_rhead_blk created at:
 xlog_verify_head+0x81/0x910 fs/xfs/xfs_log_recover.c:1032

Reported-by: syzbot+b7dfbed0c6c2b5e9fd34@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b7dfbed0c6c2b5e9fd34
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 fs/xfs/xfs_log_recover.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 09e6678ca487..0d1b4bddd193 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1050,6 +1050,9 @@ xlog_verify_head(
 	if (error < 0)
 		return error;
 
+	if (!error)
+		return -EIO;
+
 	/*
 	 * Now run a CRC verification pass over the records starting at the
 	 * block found above to the current head. If a CRC failure occurs, the
-- 
2.43.0

Re: [PATCH] xfs: reject CRC validation when the log header cannot be retrieved

Posted by Brian Foster 2 months, 1 week ago

On Fri, Apr 03, 2026 at 09:43:39AM +0800, Edward Adam Davis wrote:
> When the traditional algorithm fails to locate the log header, it triggers
> the uninitialized-value issue regarding tmp_rhead_blk reported in [1],
> continuing with the subsequent CRC verification traversal in such a
> scenario is futile.
> 
> A check has been added to detect the absence of the log header and prevent
> the execution of the subsequent CRC verification traversal.
> 
> [1]
> BUG: KMSAN: uninit-value in xlog_verify_head+0x6c3/0x910 fs/xfs/xfs_log_recover.c:1058
>  xlog_verify_head+0x6c3/0x910 fs/xfs/xfs_log_recover.c:1058
>  xlog_find_tail+0xc2e/0x1a50 fs/xfs/xfs_log_recover.c:1315
>  xlog_recover+0x6d/0x800 fs/xfs/xfs_log_recover.c:3426
>  xfs_log_mount+0x4da/0x880 fs/xfs/xfs_log.c:617
> 
> Local variable tmp_rhead_blk created at:
>  xlog_verify_head+0x81/0x910 fs/xfs/xfs_log_recover.c:1032
> 
> Reported-by: syzbot+b7dfbed0c6c2b5e9fd34@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b7dfbed0c6c2b5e9fd34
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
>  fs/xfs/xfs_log_recover.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 09e6678ca487..0d1b4bddd193 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -1050,6 +1050,9 @@ xlog_verify_head(
>  	if (error < 0)
>  		return error;
>  
> +	if (!error)
> +		return -EIO;
> +

Hmm.. at this point we've located the head block, pulled the tail block
from the log record header and are attempting to find the last written
log record that could have potentially been torn based on max iclogs so
we can verify it with a CRC pass.

Have you dug into how syzbot triggers this issue? The tweak seems
reasonable at a glance as I'm not sure why we wouldn't find at least one
log record header in the head/tail range, but at minimum the patch
should provide some analysis on why we should make that assumption here
and how this happens. It would also be ideal to know what's going on to
help determine whether there isn't some other issue here that might need
to be addressed.

For example, are we returning 0 here for the head verification pass and
aside from the uninit variable issue, falling into an otherwise
functional log recovery? Or does log recovery ultimately fail further
along? I'd be hesitant to blindly add an error return into a functioning
recovery situation as that might imply there's something wrong with the
verification logic, whereas maybe it's a different story if there's some
corruption or something that we're not handling gracefully enough.

FWIW, I did some LLM prodding at if/how something like this might happen
and it threw out some ideas based on records wrapping the log, but TBH
given the difficulty it has processing the details and layers of
complexity here I'm not really sure I trust it. The best bet is probably
to dig more into what the log looks like and why it triggers the issue.

Brian

>  	/*
>  	 * Now run a CRC verification pass over the records starting at the
>  	 * block found above to the current head. If a CRC failure occurs, the
> -- 
> 2.43.0
> 
>

Re: [PATCH] xfs: reject CRC validation when the log header cannot be retrieved

Posted by Edward Adam Davis 2 months, 1 week ago

On Mon, 6 Apr 2026 10:12:22 -0400, Brian Foster wrote:
> > When the traditional algorithm fails to locate the log header, it triggers
> > the uninitialized-value issue regarding tmp_rhead_blk reported in [1],
> > continuing with the subsequent CRC verification traversal in such a
> > scenario is futile.
> >
> > A check has been added to detect the absence of the log header and prevent
> > the execution of the subsequent CRC verification traversal.
> >
> > [1]
> > BUG: KMSAN: uninit-value in xlog_verify_head+0x6c3/0x910 fs/xfs/xfs_log_recover.c:1058
> >  xlog_verify_head+0x6c3/0x910 fs/xfs/xfs_log_recover.c:1058
> >  xlog_find_tail+0xc2e/0x1a50 fs/xfs/xfs_log_recover.c:1315
> >  xlog_recover+0x6d/0x800 fs/xfs/xfs_log_recover.c:3426
> >  xfs_log_mount+0x4da/0x880 fs/xfs/xfs_log.c:617
> >
> > Local variable tmp_rhead_blk created at:
> >  xlog_verify_head+0x81/0x910 fs/xfs/xfs_log_recover.c:1032
> >
> > Reported-by: syzbot+b7dfbed0c6c2b5e9fd34@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=b7dfbed0c6c2b5e9fd34
> > Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> > ---
> >  fs/xfs/xfs_log_recover.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> > index 09e6678ca487..0d1b4bddd193 100644
> > --- a/fs/xfs/xfs_log_recover.c
> > +++ b/fs/xfs/xfs_log_recover.c
> > @@ -1050,6 +1050,9 @@ xlog_verify_head(
> >  	if (error < 0)
> >  		return error;
> >
> > +	if (!error)
> > +		return -EIO;
> > +
> 
> Hmm.. at this point we've located the head block, pulled the tail block
> from the log record header and are attempting to find the last written
> log record that could have potentially been torn based on max iclogs so
> we can verify it with a CRC pass.
> 
> Have you dug into how syzbot triggers this issue? The tweak seems
> reasonable at a glance as I'm not sure why we wouldn't find at least one
> log record header in the head/tail range, but at minimum the patch
> should provide some analysis on why we should make that assumption here
> and how this happens. It would also be ideal to know what's going on to
> help determine whether there isn't some other issue here that might need
> to be addressed.
> 
> For example, are we returning 0 here for the head verification pass and
> aside from the uninit variable issue, falling into an otherwise
> functional log recovery? Or does log recovery ultimately fail further
> along? I'd be hesitant to blindly add an error return into a functioning
> recovery situation as that might imply there's something wrong with the
> verification logic, whereas maybe it's a different story if there's some
> corruption or something that we're not handling gracefully enough.
> 
> FWIW, I did some LLM prodding at if/how something like this might happen
> and it threw out some ideas based on records wrapping the log, but TBH
> given the difficulty it has processing the details and layers of
> complexity here I'm not really sure I trust it. The best bet is probably
> to dig more into what the log looks like and why it triggers the issue.
I have reviewed the logs but was unable to find sufficient evidence to
substantiate the existence of records between the start and tail points.
Perhaps once syzbot generates a relevant reproducer, I will be able to
investigate the root cause further.

Edward

Re: [PATCH] xfs: reject CRC validation when the log header cannot be retrieved

Posted by Christoph Hellwig 2 months, 1 week ago

On Mon, Apr 06, 2026 at 10:12:22AM -0400, Brian Foster wrote:
> Have you dug into how syzbot triggers this issue?

There is no syzbot reproducer yet.  Unfortunately the patch submitter
has a history of jumping on unverified syzbot reports and then
submitting speculative patches based on wrong conclusions.  I'd
recommend to ignore him when he does that, which means basically
always.  Hopefully we'll eventually get a syzbot reproducer and can
then do an actual analysis.