LKP reported that one of their tests was failing to even boot with my
"old mount API code" removal patch. The test was booting an i386 kernel
under QEMU, with lockdep enabled. Rather than a functional failure, it
seemed to have been slowed to a crawl and eventually timed out.
I narrowed the problem down to the removal of the ->mount op from
file_system_type, which changed structure alignment and seems to have
caused cacheline issues with this structure. Annotating the alignment
fixes the problem for me.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202512230315.1717476b-lkp@intel.com
Fixes: 51a146e05 ("fs: Remove internal old mount API code")
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---
RFC because I honestly don't understand why this should be so critical,
especially the structure was not explicitly (or even very well) aligned
before. I would welcome insights from folks who are smarter than me!
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9949d253e5aa..b3d8cad15de1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2279,7 +2279,7 @@ struct file_system_type {
struct file_system_type * next;
struct hlist_head fs_supers;
- struct lock_class_key s_lock_key;
+ struct lock_class_key s_lock_key ____cacheline_aligned;
struct lock_class_key s_umount_key;
struct lock_class_key s_vfs_rename_key;
struct lock_class_key s_writers_key[SB_FREEZE_LEVELS];
hi, Eric Sandeen,
On Tue, Dec 30, 2025 at 03:07:10PM -0600, Eric Sandeen wrote:
> LKP reported that one of their tests was failing to even boot with my
> "old mount API code" removal patch. The test was booting an i386 kernel
> under QEMU, with lockdep enabled. Rather than a functional failure, it
> seemed to have been slowed to a crawl and eventually timed out.
>
> I narrowed the problem down to the removal of the ->mount op from
> file_system_type, which changed structure alignment and seems to have
> caused cacheline issues with this structure. Annotating the alignment
> fixes the problem for me.
>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202512230315.1717476b-lkp@intel.com
> Fixes: 51a146e05 ("fs: Remove internal old mount API code")
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
just FYI. we tried this patch, and think it can fix the issue we observed.
we applied it on top of 51a146e0595c6, so:
* a474b9ba68671 fs: cache-align lock_class_keys in struct
* 51a146e0595c6 fs: Remove internal old mount API code
* d5bc4e31f2a3f Merge patch series "statmount: accept fd as a parameter"
results:
=========================================================================================
compiler/kconfig/rootfs/sleep/tbox_group/testcase:
gcc-14/i386-randconfig-2006-20250804/debian-11.1-i386-20220923.cgz/1/vm-snb-i386/boot
d5bc4e31f2a3f301 51a146e0595c638c58097a1660f a474b9ba68671230bd0a712a0f9
---------------- --------------------------- ---------------------------
fail:runs %reproduction fail:runs %reproduction fail:runs
| | | | |
:200 100% 200:200 0% :200 dmesg.BUG:kernel_hang_in_boot_stage
:200 100% 200:200 0% :200 last_state.booting
:200 100% 200:200 0% :200 last_state.is_incomplete_run
> ---
>
> RFC because I honestly don't understand why this should be so critical,
> especially the structure was not explicitly (or even very well) aligned
> before. I would welcome insights from folks who are smarter than me!
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 9949d253e5aa..b3d8cad15de1 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2279,7 +2279,7 @@ struct file_system_type {
> struct file_system_type * next;
> struct hlist_head fs_supers;
>
> - struct lock_class_key s_lock_key;
> + struct lock_class_key s_lock_key ____cacheline_aligned;
> struct lock_class_key s_umount_key;
> struct lock_class_key s_vfs_rename_key;
> struct lock_class_key s_writers_key[SB_FREEZE_LEVELS];
>
>
On Tue, Dec 30, 2025 at 03:07:10PM -0600, Eric Sandeen wrote:
> LKP reported that one of their tests was failing to even boot with my
> "old mount API code" removal patch. The test was booting an i386 kernel
> under QEMU, with lockdep enabled. Rather than a functional failure, it
> seemed to have been slowed to a crawl and eventually timed out.
>
> I narrowed the problem down to the removal of the ->mount op from
> file_system_type, which changed structure alignment and seems to have
> caused cacheline issues with this structure. Annotating the alignment
> fixes the problem for me.
>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202512230315.1717476b-lkp@intel.com
> Fixes: 51a146e05 ("fs: Remove internal old mount API code")
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
>
> RFC because I honestly don't understand why this should be so critical,
> especially the structure was not explicitly (or even very well) aligned
> before. I would welcome insights from folks who are smarter than me!
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 9949d253e5aa..b3d8cad15de1 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2279,7 +2279,7 @@ struct file_system_type {
> struct file_system_type * next;
> struct hlist_head fs_supers;
>
> - struct lock_class_key s_lock_key;
> + struct lock_class_key s_lock_key ____cacheline_aligned;
> struct lock_class_key s_umount_key;
> struct lock_class_key s_vfs_rename_key;
> struct lock_class_key s_writers_key[SB_FREEZE_LEVELS];
>
There is no way is about cacheline bouncing. According to the linked
thread the test vm has only 2 vcpus:
> test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
Even if the vcpu count was in hundreds and the ping pong was a problem it
still would not have prevented bootup.
Instead something depends on the old layout for correctness.
By any chance is this type-punned somewhere?
While I can't be bothered to investigate, I *suspect* the way to catch
this would patch out all of the lock_class_key vars & uses and boot with
KMSAN (or was it KASAN?). Or whatever mechanism which can tell the
access is oob.
On 12/30/25 4:04 PM, Mateusz Guzik wrote:
> On Tue, Dec 30, 2025 at 03:07:10PM -0600, Eric Sandeen wrote:
>> LKP reported that one of their tests was failing to even boot with my
>> "old mount API code" removal patch. The test was booting an i386 kernel
>> under QEMU, with lockdep enabled. Rather than a functional failure, it
>> seemed to have been slowed to a crawl and eventually timed out.
>>
>> I narrowed the problem down to the removal of the ->mount op from
>> file_system_type, which changed structure alignment and seems to have
>> caused cacheline issues with this structure. Annotating the alignment
>> fixes the problem for me.
>>
>> Reported-by: kernel test robot <oliver.sang@intel.com>
>> Closes: https://lore.kernel.org/oe-lkp/202512230315.1717476b-lkp@intel.com
>> Fixes: 51a146e05 ("fs: Remove internal old mount API code")
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> ---
>>
>> RFC because I honestly don't understand why this should be so critical,
>> especially the structure was not explicitly (or even very well) aligned
>> before. I would welcome insights from folks who are smarter than me!
>>
>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index 9949d253e5aa..b3d8cad15de1 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -2279,7 +2279,7 @@ struct file_system_type {
>> struct file_system_type * next;
>> struct hlist_head fs_supers;
>>
>> - struct lock_class_key s_lock_key;
>> + struct lock_class_key s_lock_key ____cacheline_aligned;
>> struct lock_class_key s_umount_key;
>> struct lock_class_key s_vfs_rename_key;
>> struct lock_class_key s_writers_key[SB_FREEZE_LEVELS];
>>
>
> There is no way is about cacheline bouncing. According to the linked
> thread the test vm has only 2 vcpus:
>> test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
>
> Even if the vcpu count was in hundreds and the ping pong was a problem it
> still would not have prevented bootup.
Fair enough, it really didn't make sense to me.
> Instead something depends on the old layout for correctness.
If I add ->mount back to the /end/ of the structure, it boots fine again,
so I guess nothing is expecting specific offsets within the structure
at least.
(If I turn off lockdep in the kernel config, it boots fine again too,
FWIW.)
> By any chance is this type-punned somewhere?
I don't think so...
> While I can't be bothered to investigate, I *suspect* the way to catch
> this would patch out all of the lock_class_key vars & uses and boot with
> KMSAN (or was it KASAN?). Or whatever mechanism which can tell the
> access is oob.
Can't do KASAN or KMSAN on i386, AFAIK. :(
I suppose I could try such things on x86_64 just in case something shows
up...
Thanks!
-Eric
On Tue, Dec 30, 2025 at 11:45 PM Eric Sandeen <sandeen@redhat.com> wrote: > > On 12/30/25 4:04 PM, Mateusz Guzik wrote: > > Instead something depends on the old layout for correctness. > > If I add ->mount back to the /end/ of the structure, it boots fine again, > so I guess nothing is expecting specific offsets within the structure > at least. > That's makes it weirder. > > By any chance is this type-punned somewhere? > > I don't think so... > > > While I can't be bothered to investigate, I *suspect* the way to catch > > this would patch out all of the lock_class_key vars & uses and boot with > > KMSAN (or was it KASAN?). Or whatever mechanism which can tell the > > access is oob. > > Can't do KASAN or KMSAN on i386, AFAIK. :( > > I suppose I could try such things on x86_64 just in case something shows > up... > The sanitizer suggestion was to get the kernel to just tell you where the problem is. However, you can still use qemu's debug facilities to figure out where the boot hangs and take it from there. Ultimately this should be mostly a boring investigation, even if chore-heavy. I have not used the machinery in over a decade, so no tips from my end. But obviously it is there. ;)
© 2016 - 2026 Red Hat, Inc.