[PATCH] nilfs2: no longer save to shadow map if the num of members is too small

Edward Adam Davis posted 1 patch 2 weeks, 6 days ago
fs/nilfs2/segment.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
[PATCH] nilfs2: no longer save to shadow map if the num of members is too small
Posted by Edward Adam Davis 2 weeks, 6 days ago
The value of argv[0].v_nmembs passed from userspace is 0. This prevents
nilfs_iget_for_gc() from being called to initialize the gcinode during
the execution of nilfs_ioctl_move_blocks(). Consequently, this triggers
a null-ptr-deref involving ii->i_assoc_inode within the subsequent call
sequence: nilfs_clean_segments()->nilfs_mdt_save_to_shadow_map() [1].

A check for argv[0].v_nmembs has been added to nilfs_clean_segments()
to prevent this potential null-ptr-deref of ii->i_assoc_inode.

[1]
KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
Call Trace:
 nilfs_clean_segments+0x162/0xa50 fs/nilfs2/segment.c:2521
 nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:916 [inline]
 nilfs_ioctl+0x261f/0x2780 fs/nilfs2/ioctl.c:1346

Reported-by: syzbot+4b4093b1f24ad789bf37@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4b4093b1f24ad789bf37
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 fs/nilfs2/segment.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 1491a4d4b1e1..7e0b24361d0b 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2518,9 +2518,11 @@ int nilfs_clean_segments(struct super_block *sb, struct nilfs_argv *argv,
 
 	nilfs_transaction_lock(sb, &ti, 1);
 
-	err = nilfs_mdt_save_to_shadow_map(nilfs->ns_dat);
-	if (unlikely(err))
-		goto out_unlock;
+	if (argv[0].v_nmembs > 0) {
+		err = nilfs_mdt_save_to_shadow_map(nilfs->ns_dat);
+		if (unlikely(err))
+			goto out_unlock;
+	}
 
 	err = nilfs_ioctl_prepare_clean_segments(nilfs, argv, kbufs);
 	if (unlikely(err)) {
-- 
2.43.0
Re: [PATCH] nilfs2: no longer save to shadow map if the num of members is too small
Posted by Deepanshu Kartikey 2 weeks, 6 days ago
Hi Edward,

On Mon, 17 Mar 2026, Edward Adam Davis wrote:

> The value of argv0.v_nmembs passed from userspace is 0. This prevents
> nilfs_iget_for_gc() from being called to initialize the gcinode during
> the execution of nilfs_ioctl_move_blocks(). Consequently, this triggers
> a null-ptr-deref involving ii->i_assoc_inode within the subsequent call
> sequence: nilfs_clean_segments()->nilfs_mdt_save_to_shadow_map() [1].

This analysis is incorrect. The null-ptr-deref is not caused by
nilfs_iget_for_gc() not being called. The real problem is that
ns_dat->i_assoc_inode (the DAT inode's btree node cache) is never
initialized at mount time.

> A check for argv[0].v_nmembs has been added to nilfs_clean_segments()
> to prevent this potential null-ptr-deref of ii->i_assoc_inode.

This fixes the symptom but not the root cause. Also note that in
the original syzkaller reproducer:

    argv[0].v_nmembs = 0xd = 13 > 0

Your check would NOT prevent the crash with the original reproducer.
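
The arithmetic above can be checked with a small userspace sketch. The struct and function below are hypothetical stand-ins (not the kernel's struct nilfs_argv or any real nilfs2 function); they only mirror the `argv[0].v_nmembs > 0` condition from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in carrying only the field the patch tests. */
struct model_argv {
	uint32_t v_nmembs;
};

/*
 * Mirrors the guard added by the patch: returns true when the
 * shadow-map save would still be reached.
 */
static bool model_guard_reaches_save(const struct model_argv *argv)
{
	assert(argv != NULL);
	return argv[0].v_nmembs > 0;
}
```

With v_nmembs set to 0xd (13), the guard evaluates to true, so the guarded call is still made and the patch does not change the path taken by that reproducer.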

The correct fix is to initialize the btnode cache eagerly in
nilfs_dat_read() at mount time, since i_assoc_inode is only
initialized lazily during btree operations. When
NILFS_IOCTL_CLEAN_SEGMENTS is called before any btree operation
has occurred, i_assoc_inode is NULL.
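
As a rough illustration of the lazy versus eager difference described above, here is a minimal userspace sketch. All names prefixed with model_ are assumptions made for this example; they are simplified stand-ins and not the kernel's structures or functions. A return value of -1 marks the point where the kernel would dereference NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an inode with an associated node cache. */
struct model_inode {
	struct model_inode *i_assoc_inode;	/* NULL until a cache is attached */
	int nr_dirty;				/* dirty pages held by this mapping */
};

/* Idempotent attach: does nothing if a cache is already present. */
static void model_attach_btnc(struct model_inode *inode,
			      struct model_inode *btnc)
{
	if (!inode->i_assoc_inode)
		inode->i_assoc_inode = btnc;
}

/*
 * Models the step that faults: the save reads the node cache
 * unconditionally. Returns -1 where the kernel would oops.
 */
static int model_save_to_shadow_map(const struct model_inode *dat)
{
	if (!dat->i_assoc_inode)
		return -1;
	return dat->i_assoc_inode->nr_dirty;
}

/* Lazy case: nothing attached a cache before the ioctl runs. */
static int model_ioctl_lazy(void)
{
	struct model_inode dat = { NULL, 0 };

	return model_save_to_shadow_map(&dat);
}

/* Eager case: the cache is attached while the DAT is read at mount. */
static int model_ioctl_eager(void)
{
	struct model_inode btnc = { NULL, 0 };
	struct model_inode dat = { NULL, 0 };

	model_attach_btnc(&dat, &btnc);
	assert(dat.i_assoc_inode == &btnc);
	return model_save_to_shadow_map(&dat);
}
```

In this model the lazy case reports the fault and the eager case returns the (empty) dirty count, which is the behaviour the proposed fix aims for.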

I have already submitted this fix and syzbot confirmed it as fixed:

https://lore.kernel.org/all/20260317090109.878401-1-kartikey406@gmail.com/T/

Regards,
Deepanshu Kartikey
Re: [PATCH] nilfs2: no longer save to shadow map if the num of members is too small
Posted by Ryusuke Konishi 2 weeks, 6 days ago
Hi Deepanshu and Edward,

On Wed, Mar 18, 2026 at 12:15 AM Deepanshu Kartikey wrote:
>
> Hi Edward,
>
> On Mon, 17 Mar 2026, Edward Adam Davis wrote:
>
> > The value of argv0.v_nmembs passed from userspace is 0. This prevents
> > nilfs_iget_for_gc() from being called to initialize the gcinode during
> > the execution of nilfs_ioctl_move_blocks(). Consequently, this triggers
> > a null-ptr-deref involving ii->i_assoc_inode within the subsequent call
> > sequence: nilfs_clean_segments()->nilfs_mdt_save_to_shadow_map() [1].
>
> This analysis is incorrect. The null-ptr-deref is not caused by
> nilfs_iget_for_gc() not being called. The real problem is that
> ns_dat->i_assoc_inode (the DAT inode's btree node cache) is never
> initialized at mount time.
>
> > A check for argv[0].v_nmembs has been added to nilfs_clean_segments()
> > to prevent this potential null-ptr-deref of ii->i_assoc_inode.
>
> This fixes the symptom but not the root cause. Also note that in
> the original syzkaller reproducer:
>
>     argv[0].v_nmembs = 0xd = 13 > 0
>
> Your check would NOT prevent the crash with the original reproducer.
>
> The correct fix is to initialize the btnode cache eagerly in
> nilfs_dat_read() at mount time, since i_assoc_inode is only
> initialized lazily during btree operations. When
> NILFS_IOCTL_CLEAN_SEGMENTS is called before any btree operation
> has occurred, i_assoc_inode is NULL.
>
> I have already submitted this fix and syzbot confirmed it as fixed:
>
> https://lore.kernel.org/all/20260317090109.878401-1-kartikey406@gmail.com/T/
>
> Regards,
> Deepanshu Kartikey

Deepanshu's suggestion seems close to the answer, but I think there's
a slight leap in the root cause analysis.

When the DAT uses b-tree mapping, nilfs_dat_read() normally calls
nilfs_attach_btree_node_cache() via nilfs_read_inode_common() ->
nilfs_bmap_read() -> nilfs_btree_init().

Therefore, the problem seems to be one of the following two:
(1) nilfs_mdt_save_to_shadow_map(), called from a GC ioctl specifying
the dat, calls nilfs_copy_dirty_pages() assuming a b-tree node cache
exists, regardless of whether the DAT is direct mapping or b-tree
mapping.
(The DAT mapping method switching is not considered.)

(2) The DAT is in b-tree mapping mode, but nilfs_btree_init() is not
being called because the i_mode of the DAT inode is corrupt.

Both appear to be potential bugs, but their fixes are different.
Have you determined which of these is causing this bug?
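
The two candidate causes above can be sketched in userspace as follows. This is only a model under stated assumptions: the enum, struct, and model_ functions are hypothetical and heavily simplified, and the run_init flag stands in for "b-tree initialization was skipped" in case (2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_mapping { MODEL_DIRECT, MODEL_BTREE };

/* Simplified stand-in for an inode and its mapping mode. */
struct model_inode {
	enum model_mapping mapping;
	struct model_inode *i_assoc_inode;	/* node cache, may be NULL */
};

/* Stand-in for the read path: only b-tree init attaches a cache. */
static void model_bmap_read(struct model_inode *inode,
			    struct model_inode *btnc)
{
	assert(inode != NULL && btnc != NULL);
	if (inode->mapping == MODEL_BTREE)
		inode->i_assoc_inode = btnc;
	/* direct mapping: nothing is attached */
}

/* Would an unconditional copy from the node cache hit NULL? */
static bool model_copy_would_fault(const struct model_inode *dat)
{
	return dat->i_assoc_inode == NULL;
}

/*
 * Case (1): mapping == MODEL_DIRECT with run_init == true.
 * Case (2): mapping == MODEL_BTREE with run_init == false.
 */
static bool model_fault_for(enum model_mapping mapping, bool run_init)
{
	struct model_inode btnc = { MODEL_DIRECT, NULL };
	struct model_inode dat = { mapping, NULL };

	if (run_init)
		model_bmap_read(&dat, &btnc);
	return model_copy_would_fault(&dat);
}
```

Both cases end in the same NULL state at the copy step, which is why the crash site alone does not tell them apart.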

Regards,
Ryusuke Konishi
Re: [PATCH] nilfs2: no longer save to shadow map if the num of members is too small
Posted by Ryusuke Konishi 2 weeks, 6 days ago
On Wed, Mar 18, 2026 at 1:51 AM Ryusuke Konishi wrote:
>
> Hi Deepanshu and Edward,
>
> On Wed, Mar 18, 2026 at 12:15 AM Deepanshu Kartikey wrote:
> >
> > Hi Edward,
> >
> > On Mon, 17 Mar 2026, Edward Adam Davis wrote:
> >
> > > The value of argv0.v_nmembs passed from userspace is 0. This prevents
> > > nilfs_iget_for_gc() from being called to initialize the gcinode during
> > > the execution of nilfs_ioctl_move_blocks(). Consequently, this triggers
> > > a null-ptr-deref involving ii->i_assoc_inode within the subsequent call
> > > sequence: nilfs_clean_segments()->nilfs_mdt_save_to_shadow_map() [1].
> >
> > This analysis is incorrect. The null-ptr-deref is not caused by
> > nilfs_iget_for_gc() not being called. The real problem is that
> > ns_dat->i_assoc_inode (the DAT inode's btree node cache) is never
> > initialized at mount time.
> >
> > > A check for argv[0].v_nmembs has been added to nilfs_clean_segments()
> > > to prevent this potential null-ptr-deref of ii->i_assoc_inode.
> >
> > This fixes the symptom but not the root cause. Also note that in
> > the original syzkaller reproducer:
> >
> >     argv[0].v_nmembs = 0xd = 13 > 0
> >
> > Your check would NOT prevent the crash with the original reproducer.
> >
> > The correct fix is to initialize the btnode cache eagerly in
> > nilfs_dat_read() at mount time, since i_assoc_inode is only
> > initialized lazily during btree operations. When
> > NILFS_IOCTL_CLEAN_SEGMENTS is called before any btree operation
> > has occurred, i_assoc_inode is NULL.
> >
> > I have already submitted this fix and syzbot confirmed it as fixed:
> >
> > https://lore.kernel.org/all/20260317090109.878401-1-kartikey406@gmail.com/T/
> >
> > Regards,
> > Deepanshu Kartikey
>
> Deepanshu's suggestion seems close to the answer, but I think there's
> a slight leap in the root cause analysis.
>
> When nilfs_dat_read() is in a b-tree configuration, it normally calls
> nilfs_attach_btree_node_cache() via nilfs_read_inode_common() ->
> nilfs_bmap_read() -> nilfs_btree_init().
>
> Therefore, the problem seems to be one of the following two:
> (1) nilfs_mdt_save_to_shadow_map(), called from a GC ioctl specifying
> the dat, calls nilfs_copy_dirty_pages() assuming a b-tree node cache
> exists, regardless of whether the DAT is direct mapping or b-tree
> mapping.
> (The DAT mapping method switching is not considered.)
>
> (2) The DAT is in b-tree mapping mode, but nilfs_btree_init() is not
> being called because the i_mode of the DAT inode is corrupt.
>
> Both appear to be potential bugs, but their fixes are different.
> Have you determined which of these is causing this bug?
>
> Regards,
> Ryusuke Konishi

Okay, I'll do a few more checks to make sure it's alright, but I'm
going to pick Deepanshu's fix as the solution to this problem.

The reason is that pre-allocating the b-tree node cache inode to
i_assoc_inode, as nilfs_iget_for_shadow() does for shadow mapping, has
no side effects and seems like a comprehensive stabilization method
that covers both potential issues.

The nilfs_detach_btree_node_cache() function, which detaches the b-tree
node cache inode, is currently only called from nilfs_clear_inode(), so
once allocated, the cache stays attached regardless of whether the inode
uses direct mapping or b-tree mapping.  Therefore, the approach of
pre-allocating the b-tree node cache inode is safe.

And also, this isn't overkill.  The DAT file typically grows very
quickly, so it almost always uses b-tree mapping (which is why this
hasn't been found before).  I think it's a good fix.

Thanks,
Ryusuke Konishi
Re: [PATCH] nilfs2: no longer save to shadow map if the num of members is too small
Posted by Edward Adam Davis 2 weeks, 5 days ago
Have you been following the path below? If the argv[0].v_nmembs value
passed from userspace is greater than 0, everything will function normally.

nilfs_ioctl_clean_segments()->
  nilfs_ioctl_move_blocks()->
    nilfs_iget_for_gc()->
      nilfs_init_gcinode()->
        nilfs_attach_btree_node_cache()

Causing the mount to fail solely because the dat B-tree wasn't initialized
is an excessive fix.

BR,
Edward
Re: [PATCH] nilfs2: no longer save to shadow map if the num of members is too small
Posted by Ryusuke Konishi 2 weeks, 5 days ago
Hi Edward, thank you as always.

On Wed, Mar 18, 2026 at 9:08 AM Edward Adam Davis wrote:
>
> Have you been following the path below? If the argv0.v_nmembs value
> passed from userspace is greater than 0, everything will function normally.
>
> nilfs_ioctl_clean_segments()->
>   nilfs_ioctl_move_blocks()->
>     nilfs_iget_for_gc()->
>       nilfs_init_gcinode()->
>         nilfs_attach_btree_node_cache()
>
> Causing the mount to fail solely because the dat B-tree wasn't initialized
> is an excessive fix.
>
> BR,
> Edward

Are you perhaps confusing the regular inode's GC cache (gc inode) with
the DAT's shadow mapping inode?

Calling nilfs_attach_btree_node_cache() from nilfs_init_gcinode() does
not allocate the b-tree node cache for the DAT inode where the problem
is occurring in nilfs_mdt_save_to_shadow_map().
Therefore, it does not fix the root cause of the issue.
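
The distinction can be illustrated with a minimal userspace sketch. The struct and model_ functions are hypothetical stand-ins, not kernel code; the only point shown is that attaching a cache to one inode object leaves a different inode object untouched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for an inode with an associated node cache. */
struct model_inode {
	struct model_inode *i_assoc_inode;	/* NULL until attached */
};

/* Idempotent attach on the given inode only. */
static void model_attach_btnc(struct model_inode *inode,
			      struct model_inode *btnc)
{
	assert(inode != NULL);
	if (!inode->i_assoc_inode)
		inode->i_assoc_inode = btnc;
}

/*
 * GC path in the model: the cache goes to the gc inode.
 * Returns whether the DAT inode ended up with a cache.
 */
static bool model_dat_has_cache_after_gc_attach(void)
{
	struct model_inode dat = { NULL };
	struct model_inode gc_inode = { NULL };
	struct model_inode gc_btnc = { NULL };

	model_attach_btnc(&gc_inode, &gc_btnc);
	assert(gc_inode.i_assoc_inode == &gc_btnc);
	return dat.i_assoc_inode != NULL;
}
```

In the model the function returns false: the gc inode gains a cache while the DAT inode's i_assoc_inode stays NULL.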

As can be seen from the call trace below, the issue arises when the
i_assoc_inode of the DAT inode (or potentially its shadow mapping
inode) in nilfs_mdt_save_to_shadow_map() is dereferenced.

  RIP: 0010:nilfs_mdt_save_to_shadow_map+0x141/0x1c0 fs/nilfs2/mdt.c:559

Your fix eliminates the reproducibility conditions for the reproducer,
so it might pass the tests, but it doesn't fix the original problem,
does it?

The essential flaw is that it attempts to copy dirty pages from the
b-tree node cache to the shadow mapping, even though the DAT might not
have a btree node cache while remaining in direct mapping mode.

As I mentioned earlier, DAT usually grows quickly and switches to
btree mapping, so I don't think it's excessive to pre-allocate a btree
node cache as Deepanshu proposed.

Regards,
Ryusuke Konishi
Re: [PATCH] nilfs2: no longer save to shadow map if the num of members is too small
Posted by Edward Adam Davis 2 weeks, 5 days ago
On Wed, 18 Mar 2026 09:54:12 +0900, Ryusuke Konishi wrote:
> Are you perhaps confusing the regular inode's GC cache (gc inode) with
> the DAT's shadow mapping inode?
My fault.