[PATCH RFC] fs: cache-align lock_class_keys in struct file_system_type

Eric Sandeen posted 1 patch 1 month, 1 week ago
[PATCH RFC] fs: cache-align lock_class_keys in struct file_system_type
Posted by Eric Sandeen 1 month, 1 week ago
LKP reported that one of their tests was failing to even boot with my
"old mount API code" removal patch. The test was booting an i386 kernel
under QEMU, with lockdep enabled. Rather than a functional failure, it
seemed to have been slowed to a crawl and eventually timed out.

I narrowed the problem down to the removal of the ->mount op from
file_system_type, which changed structure alignment and seems to have
caused cacheline issues with this structure. Annotating the alignment
fixes the problem for me.

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202512230315.1717476b-lkp@intel.com
Fixes: 51a146e05 ("fs: Remove internal old mount API code")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

RFC because I honestly don't understand why this should be so critical,
especially the structure was not explicitly (or even very well) aligned
before. I would welcome insights from folks who are smarter than me!

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9949d253e5aa..b3d8cad15de1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2279,7 +2279,7 @@ struct file_system_type {
 	struct file_system_type * next;
 	struct hlist_head fs_supers;
 
-	struct lock_class_key s_lock_key;
+	struct lock_class_key s_lock_key ____cacheline_aligned;
 	struct lock_class_key s_umount_key;
 	struct lock_class_key s_vfs_rename_key;
 	struct lock_class_key s_writers_key[SB_FREEZE_LEVELS];
Re: [PATCH RFC] fs: cache-align lock_class_keys in struct file_system_type
Posted by Oliver Sang 1 month, 1 week ago
hi, Eric Sandeen,

On Tue, Dec 30, 2025 at 03:07:10PM -0600, Eric Sandeen wrote:
> LKP reported that one of their tests was failing to even boot with my
> "old mount API code" removal patch. The test was booting an i386 kernel
> under QEMU, with lockdep enabled. Rather than a functional failure, it
> seemed to have been slowed to a crawl and eventually timed out.
> 
> I narrowed the problem down to the removal of the ->mount op from
> file_system_type, which changed structure alignment and seems to have
> caused cacheline issues with this structure. Annotating the alignment
> fixes the problem for me.
> 
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202512230315.1717476b-lkp@intel.com
> Fixes: 51a146e05 ("fs: Remove internal old mount API code")
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>

just FYI. we tried this patch, and think it can fix the issue we observed.

we applied it on top of 51a146e0595c6, so:

* a474b9ba68671 fs: cache-align lock_class_keys in struct
* 51a146e0595c6 fs: Remove internal old mount API code
*   d5bc4e31f2a3f Merge patch series "statmount: accept fd as a parameter"

results:

=========================================================================================
compiler/kconfig/rootfs/sleep/tbox_group/testcase:
  gcc-14/i386-randconfig-2006-20250804/debian-11.1-i386-20220923.cgz/1/vm-snb-i386/boot

d5bc4e31f2a3f301 51a146e0595c638c58097a1660f a474b9ba68671230bd0a712a0f9
---------------- --------------------------- ---------------------------
       fail:runs  %reproduction    fail:runs  %reproduction    fail:runs
           |             |             |             |             |
           :200        100%         200:200          0%            :200   dmesg.BUG:kernel_hang_in_boot_stage
           :200        100%         200:200          0%            :200   last_state.booting
           :200        100%         200:200          0%            :200   last_state.is_incomplete_run

> ---
> 
> RFC because I honestly don't understand why this should be so critical,
> especially the structure was not explicitly (or even very well) aligned
> before. I would welcome insights from folks who are smarter than me!
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 9949d253e5aa..b3d8cad15de1 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2279,7 +2279,7 @@ struct file_system_type {
>  	struct file_system_type * next;
>  	struct hlist_head fs_supers;
>  
> -	struct lock_class_key s_lock_key;
> +	struct lock_class_key s_lock_key ____cacheline_aligned;
>  	struct lock_class_key s_umount_key;
>  	struct lock_class_key s_vfs_rename_key;
>  	struct lock_class_key s_writers_key[SB_FREEZE_LEVELS];
> 
>
Re: [PATCH RFC] fs: cache-align lock_class_keys in struct file_system_type
Posted by Mateusz Guzik 1 month, 1 week ago
On Tue, Dec 30, 2025 at 03:07:10PM -0600, Eric Sandeen wrote:
> LKP reported that one of their tests was failing to even boot with my
> "old mount API code" removal patch. The test was booting an i386 kernel
> under QEMU, with lockdep enabled. Rather than a functional failure, it
> seemed to have been slowed to a crawl and eventually timed out.
> 
> I narrowed the problem down to the removal of the ->mount op from
> file_system_type, which changed structure alignment and seems to have
> caused cacheline issues with this structure. Annotating the alignment
> fixes the problem for me.
> 
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202512230315.1717476b-lkp@intel.com
> Fixes: 51a146e05 ("fs: Remove internal old mount API code")
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
> 
> RFC because I honestly don't understand why this should be so critical,
> especially the structure was not explicitly (or even very well) aligned
> before. I would welcome insights from folks who are smarter than me!
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 9949d253e5aa..b3d8cad15de1 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2279,7 +2279,7 @@ struct file_system_type {
>  	struct file_system_type * next;
>  	struct hlist_head fs_supers;
>  
> -	struct lock_class_key s_lock_key;
> +	struct lock_class_key s_lock_key ____cacheline_aligned;
>  	struct lock_class_key s_umount_key;
>  	struct lock_class_key s_vfs_rename_key;
>  	struct lock_class_key s_writers_key[SB_FREEZE_LEVELS];
> 

There is no way is about cacheline bouncing. According to the linked
thread the test vm has only 2 vcpus:
> test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G

Even if the vcpu count was in hundreds and the ping pong was a problem it
still would not have prevented bootup.

Instead something depends on the old layout for correctness.

By any chance is this type-punned somewhere?

While I can't be bothered to investigate, I *suspect* the way to catch
this would patch out all of the lock_class_key vars & uses and boot with
KMSAN (or was it KASAN?). Or whatever mechanism which can tell the
access is oob.
Re: [PATCH RFC] fs: cache-align lock_class_keys in struct file_system_type
Posted by Eric Sandeen 1 month, 1 week ago
On 12/30/25 4:04 PM, Mateusz Guzik wrote:
> On Tue, Dec 30, 2025 at 03:07:10PM -0600, Eric Sandeen wrote:
>> LKP reported that one of their tests was failing to even boot with my
>> "old mount API code" removal patch. The test was booting an i386 kernel
>> under QEMU, with lockdep enabled. Rather than a functional failure, it
>> seemed to have been slowed to a crawl and eventually timed out.
>>
>> I narrowed the problem down to the removal of the ->mount op from
>> file_system_type, which changed structure alignment and seems to have
>> caused cacheline issues with this structure. Annotating the alignment
>> fixes the problem for me.
>>
>> Reported-by: kernel test robot <oliver.sang@intel.com>
>> Closes: https://lore.kernel.org/oe-lkp/202512230315.1717476b-lkp@intel.com
>> Fixes: 51a146e05 ("fs: Remove internal old mount API code")
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> ---
>>
>> RFC because I honestly don't understand why this should be so critical,
>> especially the structure was not explicitly (or even very well) aligned
>> before. I would welcome insights from folks who are smarter than me!
>>
>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index 9949d253e5aa..b3d8cad15de1 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -2279,7 +2279,7 @@ struct file_system_type {
>>  	struct file_system_type * next;
>>  	struct hlist_head fs_supers;
>>  
>> -	struct lock_class_key s_lock_key;
>> +	struct lock_class_key s_lock_key ____cacheline_aligned;
>>  	struct lock_class_key s_umount_key;
>>  	struct lock_class_key s_vfs_rename_key;
>>  	struct lock_class_key s_writers_key[SB_FREEZE_LEVELS];
>>
> 
> There is no way is about cacheline bouncing. According to the linked
> thread the test vm has only 2 vcpus:
>> test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
> 
> Even if the vcpu count was in hundreds and the ping pong was a problem it
> still would not have prevented bootup.

Fair enough, it really didn't make sense to me.

> Instead something depends on the old layout for correctness.

If I add ->mount back to the /end/ of the structure, it boots fine again,
so I guess nothing is expecting specific offsets within the structure
at least.

(If I turn off lockdep in the kernel config, it boots fine again too,
FWIW.)

> By any chance is this type-punned somewhere?

I don't think so... 

> While I can't be bothered to investigate, I *suspect* the way to catch
> this would patch out all of the lock_class_key vars & uses and boot with
> KMSAN (or was it KASAN?). Or whatever mechanism which can tell the
> access is oob.

Can't do KASAN or KMSAN on i386, AFAIK. :(

I suppose I could try such things on x86_64 just in case something shows
up...

Thanks!

-Eric
Re: [PATCH RFC] fs: cache-align lock_class_keys in struct file_system_type
Posted by Mateusz Guzik 1 month ago
On Tue, Dec 30, 2025 at 11:45 PM Eric Sandeen <sandeen@redhat.com> wrote:
>
> On 12/30/25 4:04 PM, Mateusz Guzik wrote:
> > Instead something depends on the old layout for correctness.
>
> If I add ->mount back to the /end/ of the structure, it boots fine again,
> so I guess nothing is expecting specific offsets within the structure
> at least.
>

That's makes it weirder.

> > By any chance is this type-punned somewhere?
>
> I don't think so...
>
> > While I can't be bothered to investigate, I *suspect* the way to catch
> > this would patch out all of the lock_class_key vars & uses and boot with
> > KMSAN (or was it KASAN?). Or whatever mechanism which can tell the
> > access is oob.
>
> Can't do KASAN or KMSAN on i386, AFAIK. :(
>
> I suppose I could try such things on x86_64 just in case something shows
> up...
>

The sanitizer suggestion was to get the kernel to just tell you where
the problem is.

However, you can still use qemu's debug facilities to figure out where
the boot hangs and take it from there. Ultimately this should be
mostly a boring investigation, even if chore-heavy.

I have not used the machinery in over a decade, so no tips from my
end. But obviously it is there. ;)