gfs2: fix duplicate kmem_cache name on concurrent mounts

[PATCH] gfs2: fix duplicate kmem_cache name on concurrent mounts

Posted by Deepanshu Kartikey 1 week, 1 day ago

When gfs2_fill_super() creates the per-filesystem bufdata cache, it
uses sd_fsname as part of the cache name:

  sdp->sd_bufdata = kmem_cache_create("gfs2-bufdata/<fsname>", ...);

At this point sd_fsname is set from sd_table_name (e.g. "syz:syz"),
which is the same for all concurrent mounts of the same device.
The unique suffix (.s for spectator, .N for journal ID) is only
added to sd_fsname AFTER wait_on_journal() completes, since the
journal ID is assigned by the locking subsystem during mount.

With multiple concurrent mounts of the same device (as triggered by
syzkaller with procs:5), multiple calls to gfs2_fill_super() all try
to create a cache named "gfs2-bufdata/syz:syz" before any of them
reaches the point where sd_fsname becomes unique, triggering:

  kmem_cache of name 'gfs2-bufdata/syz:syz' already exists
  WARNING: mm/slab_common.c:112

Fix this by moving the bufdata cache creation to after
wait_on_journal() completes and sd_fsname has been updated with its
unique suffix. Also move the fail_bufdata cleanup label accordingly
in the error unwind path.

Reported-by: syzbot+b441db1854c360b83221@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b441db1854c360b83221
Fixes: 767e4de3ffce ("gfs2: per-filesystem bufdata cache")
Tested-by: syzbot+b441db1854c360b83221@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 fs/gfs2/ops_fstype.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
index b44adb40635d..7f8e04c11494 100644
--- a/fs/gfs2/ops_fstype.c
+++ b/fs/gfs2/ops_fstype.c
@@ -1198,17 +1198,9 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
 	if (!sdp->sd_delete_wq)
 		goto fail_glock_wq;
 
-	char *bufdata_name = kasprintf(GFP_KERNEL, "gfs2-bufdata/%s", sdp->sd_fsname);
-	sdp->sd_bufdata = kmem_cache_create(bufdata_name,
-					    sizeof(struct gfs2_bufdata),
-					    0, 0, NULL);
-	kfree(bufdata_name);
-	if (!sdp->sd_bufdata)
-		goto fail_delete_wq;
-
 	error = gfs2_sys_fs_add(sdp);
 	if (error)
-		goto fail_bufdata;
+		goto fail_delete_wq;
 
 	gfs2_create_debugfs_file(sdp);
 
@@ -1255,6 +1247,15 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
 		snprintf(sdp->sd_fsname, sizeof(sdp->sd_fsname), "%s.%u",
 			 sdp->sd_table_name, sdp->sd_lockstruct.ls_jid);
 
+	char *bufdata_name = kasprintf(GFP_KERNEL, "gfs2-bufdata/%s", sdp->sd_fsname);
+
+	sdp->sd_bufdata = kmem_cache_create(bufdata_name,
+					    sizeof(struct gfs2_bufdata),
+					    0, 0, NULL);
+	kfree(bufdata_name);
+	if (!sdp->sd_bufdata)
+		goto fail_sb;
+
 	error = init_inodes(sdp, DO);
 	if (error)
 		goto fail_sb;
@@ -1304,6 +1305,7 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
 	if (sb->s_root)
 		dput(sb->s_root);
 	sb->s_root = NULL;
+	kmem_cache_destroy(sdp->sd_bufdata);
 fail_locking:
 	init_locking(sdp, &mount_gh, UNDO);
 fail_lm:
@@ -1313,8 +1315,6 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
 fail_debug:
 	gfs2_delete_debugfs_file(sdp);
 	gfs2_sys_fs_del(sdp);
-fail_bufdata:
-	kmem_cache_destroy(sdp->sd_bufdata);
 fail_delete_wq:
 	destroy_workqueue(sdp->sd_delete_wq);
 fail_glock_wq:
-- 
2.43.0

Re: [PATCH] gfs2: fix duplicate kmem_cache name on concurrent mounts

Posted by Andrew Price 1 week, 1 day ago

On 25/03/2026 06:05, Deepanshu Kartikey wrote:
> When gfs2_fill_super() creates the per-filesystem bufdata cache, it
> uses sd_fsname as part of the cache name:
> 
>   sdp->sd_bufdata = kmem_cache_create("gfs2-bufdata/<fsname>", ...);
> 
> At this point sd_fsname is set from sd_table_name (e.g. "syz:syz"),
> which is the same for all concurrent mounts of the same device.
> The unique suffix (.s for spectator, .N for journal ID) is only
> added to sd_fsname AFTER wait_on_journal() completes, since the
> journal ID is assigned by the locking subsystem during mount.
> 
> With multiple concurrent mounts of the same device (as triggered by
> syzkaller with procs:5), multiple calls to gfs2_fill_super() all try
> to create a cache named "gfs2-bufdata/syz:syz" before any of them
> reaches the point where sd_fsname becomes unique, triggering:
> 
>   kmem_cache of name 'gfs2-bufdata/syz:syz' already exists
>   WARNING: mm/slab_common.c:112
> 
> Fix this by moving the bufdata cache creation to after
> wait_on_journal() completes and sd_fsname has been updated with its
> unique suffix. Also move the fail_bufdata cleanup label accordingly
> in the error unwind path.

I don't think the description is accurate: the patch moves the kmem_cache_create() call to after gfs2_sys_fs_add(), which fails the mount cleanly on a duplicate lock table instead of triggering a warning.

That said, 767e4de3ffce ("gfs2: per-filesystem bufdata cache") is likely to get pulled out of for-next, as it was more of a debugging aid, so it's probably not worth a v2.

Andy

> 
> Reported-by: syzbot+b441db1854c360b83221@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b441db1854c360b83221
> Fixes: 767e4de3ffce ("gfs2: per-filesystem bufdata cache")
> Tested-by: syzbot+b441db1854c360b83221@syzkaller.appspotmail.com
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> ---
>  fs/gfs2/ops_fstype.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/gfs2/ops_fstype.c b/fs/gfs2/ops_fstype.c
> index b44adb40635d..7f8e04c11494 100644
> --- a/fs/gfs2/ops_fstype.c
> +++ b/fs/gfs2/ops_fstype.c
> @@ -1198,17 +1198,9 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
>  	if (!sdp->sd_delete_wq)
>  		goto fail_glock_wq;
>  
> -	char *bufdata_name = kasprintf(GFP_KERNEL, "gfs2-bufdata/%s", sdp->sd_fsname);
> -	sdp->sd_bufdata = kmem_cache_create(bufdata_name,
> -					    sizeof(struct gfs2_bufdata),
> -					    0, 0, NULL);
> -	kfree(bufdata_name);
> -	if (!sdp->sd_bufdata)
> -		goto fail_delete_wq;
> -
>  	error = gfs2_sys_fs_add(sdp);
>  	if (error)
> -		goto fail_bufdata;
> +		goto fail_delete_wq;
>  
>  	gfs2_create_debugfs_file(sdp);
>  
> @@ -1255,6 +1247,15 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
>  		snprintf(sdp->sd_fsname, sizeof(sdp->sd_fsname), "%s.%u",
>  			 sdp->sd_table_name, sdp->sd_lockstruct.ls_jid);
>  
> +	char *bufdata_name = kasprintf(GFP_KERNEL, "gfs2-bufdata/%s", sdp->sd_fsname);
> +
> +	sdp->sd_bufdata = kmem_cache_create(bufdata_name,
> +					    sizeof(struct gfs2_bufdata),
> +					     0, 0, NULL);
> +	kfree(bufdata_name);
> +	if (!sdp->sd_bufdata)
> +		goto fail_sb;
> +
>  	error = init_inodes(sdp, DO);
>  	if (error)
>  		goto fail_sb;
> @@ -1304,6 +1305,7 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
>  	if (sb->s_root)
>  		dput(sb->s_root);
>  	sb->s_root = NULL;
> +	kmem_cache_destroy(sdp->sd_bufdata);
>  fail_locking:
>  	init_locking(sdp, &mount_gh, UNDO);
>  fail_lm:
> @@ -1313,8 +1315,6 @@ static int gfs2_fill_super(struct super_block *sb, struct fs_context *fc)
>  fail_debug:
>  	gfs2_delete_debugfs_file(sdp);
>  	gfs2_sys_fs_del(sdp);
> -fail_bufdata:
> -	kmem_cache_destroy(sdp->sd_bufdata);
>  fail_delete_wq:
>  	destroy_workqueue(sdp->sd_delete_wq);
>  fail_glock_wq:

Re: [PATCH] gfs2: fix duplicate kmem_cache name on concurrent mounts

Posted by Deepanshu Kartikey 1 week, 1 day ago

On Wed, Mar 25, 2026 at 11:28 PM Andrew Price <anprice@redhat.com> wrote:
>
> On 25/03/2026 06:05, Deepanshu Kartikey wrote:
> > When gfs2_fill_super() creates the per-filesystem bufdata cache, it
> > uses sd_fsname as part of the cache name:
> >
> >   sdp->sd_bufdata = kmem_cache_create("gfs2-bufdata/<fsname>", ...);
> >
> > At this point sd_fsname is set from sd_table_name (e.g. "syz:syz"),
> > which is the same for all concurrent mounts of the same device.
> > The unique suffix (.s for spectator, .N for journal ID) is only
> > added to sd_fsname AFTER wait_on_journal() completes, since the
> > journal ID is assigned by the locking subsystem during mount.
> >
> > With multiple concurrent mounts of the same device (as triggered by
> > syzkaller with procs:5), multiple calls to gfs2_fill_super() all try
> > to create a cache named "gfs2-bufdata/syz:syz" before any of them
> > reaches the point where sd_fsname becomes unique, triggering:
> >
> >   kmem_cache of name 'gfs2-bufdata/syz:syz' already exists
> >   WARNING: mm/slab_common.c:112
> >
> > Fix this by moving the bufdata cache creation to after
> > wait_on_journal() completes and sd_fsname has been updated with its
> > unique suffix. Also move the fail_bufdata cleanup label accordingly
> > in the error unwind path.
>
> I don't think the description is accurate as the patch moves the kmem_cache_create() to after the gfs2_sys_fs_add(), which fails the mount cleanly on a duplicate lock table instead of triggering a warning.
>
> That said, 767e4de3ffce ("gfs2: per-filesystem bufdata cache") is likely to get pulled out of for-next as it was more of a debugging aid so it's probably not worth a v2.
>
> Andy
>

Hi Andy,

Thank you for the explanation.

I went back and traced through the code more carefully. The real
protection comes from gfs2_sys_fs_add(), which calls
kobject_init_and_add() with sd_table_name. That call fails cleanly
when a duplicate mount is attempted, since the sysfs path already
exists.

The bug was that kmem_cache_create() was called BEFORE
gfs2_sys_fs_add(), so concurrent mounts both created a cache
with the same name before either reached the duplicate check.
Moving the cache creation to after gfs2_sys_fs_add() means
duplicate mounts are rejected cleanly there first.
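
As a rough illustration of why the ordering matters, here is a
user-space toy, not kernel code. Everything in it (struct registry,
strict_add(), warn_add(), and the two mount_*() helpers) is a
hypothetical stand-in: strict_add() models a registration that refuses
duplicates, as gfs2_sys_fs_add() does, and warn_add() models one that
is assumed to warn on a duplicate name but carry on, as the slab
warning above suggests. The race is simplified to two sequential mount
attempts and the error unwind is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAMES 8
#define NAME_LEN  64

/* Toy global namespace (stand-in for sysfs or the slab cache list). */
struct registry {
	char names[MAX_NAMES][NAME_LEN];
	int count;
	int warnings;	/* duplicates that were noticed but not refused */
};

static bool registry_has(const struct registry *r, const char *name)
{
	for (int i = 0; i < r->count; i++)
		if (strcmp(r->names[i], name) == 0)
			return true;
	return false;
}

static void registry_put(struct registry *r, const char *name)
{
	assert(r->count < MAX_NAMES);
	snprintf(r->names[r->count], NAME_LEN, "%s", name);
	r->count++;
}

/* Refuses duplicates: models the clean failure of the sysfs add. */
static int strict_add(struct registry *r, const char *name)
{
	if (registry_has(r, name))
		return -1;
	registry_put(r, name);
	return 0;
}

/* Assumption: warns on a duplicate name, then registers it anyway. */
static void warn_add(struct registry *r, const char *name)
{
	if (registry_has(r, name))
		r->warnings++;
	registry_put(r, name);
}

/* Old order: create the cache first, run the duplicate check second. */
static int mount_cache_first(struct registry *caches, struct registry *sysfs,
			     const char *table)
{
	warn_add(caches, table);
	return strict_add(sysfs, table);
}

/* New order: the duplicate check runs before the cache is created. */
static int mount_check_first(struct registry *caches, struct registry *sysfs,
			     const char *table)
{
	if (strict_add(sysfs, table) < 0)
		return -1;
	warn_add(caches, table);
	return 0;
}
```

In both orders the second mount of the same table name fails. The
difference is that only the cache-first order reaches the warning path
before that failure.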

I also understand now why sd_fsname uniqueness is not the real
fix: with lock_nolock, ls_jid comes from the mount options and
can be the same (jid=0) for all concurrent mounts, so
sd_fsname would still be "syz:syz.0" for all of them.
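
A small user-space sketch of that point, reusing the "%s.%u" and
"gfs2-bufdata/%s" format strings from the diff. The helpers
format_fsname() and format_cache_name() are hypothetical and exist
only for this illustration; the jid value 0 is the example from above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: same "%s.%u" format the diff uses for sd_fsname. */
static void format_fsname(char *buf, size_t len,
			  const char *table_name, unsigned int jid)
{
	snprintf(buf, len, "%s.%u", table_name, jid);
}

/* Hypothetical helper: same "gfs2-bufdata/%s" format as the cache name. */
static void format_cache_name(char *buf, size_t len, const char *fsname)
{
	snprintf(buf, len, "gfs2-bufdata/%s", fsname);
}
```

Calling format_fsname() twice with the same table name and the same
jid gives identical strings, so the cache names derived from them
collide as well.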

Thanks for your time,
Deepanshu Kartikey