0Day found a 34.6% regression in stress-ng's 'af-alg' test case [1], and
bisected it to commit b81fac906a8f ("x86/fpu: Move FPU initialization
into arch_cpu_finalize_init()"), which optimizes the FPU init order
and moves the enabling of CR4.OSXSAVE to a later point:

   arch_cpu_finalize_init
       identify_boot_cpu
           identify_cpu
               generic_identify
                   get_cpu_cap --> setup cpu capability
       ...
       fpu__init_cpu
           fpu__init_cpu_xstate
               cr4_set_bits(X86_CR4_OSXSAVE);

As a result, the 'X86_FEATURE_OSXSAVE' feature bit is missed during cpu
capability setup. Many crypto modules such as 'camellia_aesni_avx_x86_64'
depend on this feature and fail to load after that commit, causing the
regression.

So set the X86_FEATURE_OSXSAVE feature bit right after enabling OSXSAVE
to fix it.

[1]. https://lore.kernel.org/lkml/202307192135.203ac24e-oliver.sang@intel.com/
Fixes: b81fac906a8f ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()")
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
---
arch/x86/kernel/fpu/xstate.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 0bab497c9436..8ebea0d522d2 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -173,6 +173,9 @@ void fpu__init_cpu_xstate(void)
 
 	cr4_set_bits(X86_CR4_OSXSAVE);
 
+	if (!boot_cpu_has(X86_FEATURE_OSXSAVE))
+		setup_force_cpu_cap(X86_FEATURE_OSXSAVE);
+
 	/*
 	 * Must happen after CR4 setup and before xsetbv() to allow KVM
 	 * lazy passthrough.  Write independent of the dynamic state static
--
2.27.0
On Wed, Aug 23 2023 at 14:57, Feng Tang wrote:
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index 0bab497c9436..8ebea0d522d2 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -173,6 +173,9 @@ void fpu__init_cpu_xstate(void)
>
> cr4_set_bits(X86_CR4_OSXSAVE);
>
> + if (!boot_cpu_has(X86_FEATURE_OSXSAVE))
> + setup_force_cpu_cap(X86_FEATURE_OSXSAVE);
This is wrong in several aspects:

  1) You force the feature bit _before_ XSAVE is completely
     initialized. fpu__init_system_xstate() has error paths which
     disable XSAVE.

  2) This conditional should have been a red flag for you simply
     because fpu__init_cpu_xstate() is invoked on all CPUs not only
     on the BSP.

I fixed it up and added a proper comment explaining it.
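
To illustrate point 1 above, here is a small userspace sketch of why forcing a capability bit before initialization has finished can leave the bit out of sync with the actual state. It is a simplified model under stated assumptions, not kernel code: `struct model_caps` and every `model_*` function are hypothetical, the "set after init" ordering is only one conceivable alternative and is not the fix-up that was applied, and whether the real error paths would leave the bit set depends on kernel details not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state, not a kernel structure. */
struct model_caps {
	bool osxsave_cap;	/* stand-in for the X86_FEATURE_OSXSAVE bit */
	bool xsave_enabled;	/* whether XSAVE ends up usable */
};

/*
 * Stand-in for a system-wide xstate init that has an error path
 * which disables XSAVE. Returns true on success.
 */
bool model_init_system_xstate(struct model_caps *caps, bool init_fails)
{
	if (init_fails) {
		caps->xsave_enabled = false;
		return false;
	}
	caps->xsave_enabled = true;
	return true;
}

/* Ordering A: force the bit first, then run the init that may fail. */
struct model_caps model_force_early(bool init_fails)
{
	struct model_caps caps = { false, false };

	caps.osxsave_cap = true;	/* forced before init completes */
	model_init_system_xstate(&caps, init_fails);
	return caps;
}

/* Ordering B (assumed alternative): set the bit only after success. */
struct model_caps model_set_after_init(bool init_fails)
{
	struct model_caps caps = { false, false };

	if (model_init_system_xstate(&caps, init_fails))
		caps.osxsave_cap = true;
	return caps;
}

/* True when the capability bit claims more than the actual state. */
bool model_inconsistent(struct model_caps caps)
{
	return caps.osxsave_cap && !caps.xsave_enabled;
}

/* Small self-check showing how the two orderings differ. */
void model_caps_selfcheck(void)
{
	assert(model_inconsistent(model_force_early(true)));
	assert(!model_inconsistent(model_set_after_init(true)));
}
```

In this model only ordering A can end up advertising the feature while XSAVE is disabled; both orderings agree when initialization succeeds.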
Hi Thomas,
On Thu, Aug 24, 2023 at 11:01:18AM +0200, Thomas Gleixner wrote:
> On Wed, Aug 23 2023 at 14:57, Feng Tang wrote:
> > diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> > index 0bab497c9436..8ebea0d522d2 100644
> > --- a/arch/x86/kernel/fpu/xstate.c
> > +++ b/arch/x86/kernel/fpu/xstate.c
> > @@ -173,6 +173,9 @@ void fpu__init_cpu_xstate(void)
> >
> > cr4_set_bits(X86_CR4_OSXSAVE);
> >
> > + if (!boot_cpu_has(X86_FEATURE_OSXSAVE))
> > + setup_force_cpu_cap(X86_FEATURE_OSXSAVE);
>
> This is wrong in several aspects:
>
> 1) You force the feature bit _before_ XSAVE is completely
> initialized. fpu__init_system_xstate() has error paths which
> disable XSAVE.
Yes, I missed the error path in BSP fpu initialization code.
> 2) This conditional should have been a red flag for you simply
> because fpu__init_cpu_xstate() is invoked on all CPUs not only
> on the BSP.
Indeed. When I worked on the patch, I even considered an ugly workaround like:

	if (raw_smp_processor_id() == 0)
		setup_force_cpu_cap(X86_FEATURE_OSXSAVE);

> I fixed it up and added a proper comment explaining it.
Thanks for fixing it up and improving the comments!
- Feng
On Wed, 2023-08-23 at 14:57 +0800, Feng Tang wrote:
> 0Day found a 34.6% regression in stress-ng's 'af-alg' test case [1],
> and
> bisected it to commit b81fac906a8f ("x86/fpu: Move FPU initialization
> into arch_cpu_finalize_init()"), which optimizes the FPU init order,
> and moves the CR4_OSXSAVE enabling into a later place:
>
> arch_cpu_finalize_init
> identify_boot_cpu
> identify_cpu
> generic_identify
> get_cpu_cap --> setup cpu capability
> ...
> fpu__init_cpu
> fpu__init_cpu_xstate
> cr4_set_bits(X86_CR4_OSXSAVE);
>
> And it makes 'X86_FEATURE_OSXSAVE' feature bit missed in cpu
> capability
> setup. Many security module like 'camellia_aesni_avx_x86_64' depends
> on
> this feature, and will fail to be loaded after the commit, causing
> the
> regression.
>
> So set X86_FEATURE_OSXSAVE feature right after OSXSAVE enabling to
> fix it.
Oh, that's unfortunate.
It might help to include a bit more about the problem in the log. The
piece that confused me at first was that X86_FEATURE_OSXSAVE maps to a
CPUID bit that will change once CR4.OSXSAVE is set. So since the CPUID
bits are read before CR4.OSXSAVE is set, it stores the original unset
value of the bit.
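
The ordering effect described above can be modeled in a few lines of userspace C. This is only an illustrative sketch, not kernel code: `struct model_cpu` and the `model_*` functions are hypothetical stand-ins for get_cpu_cap() and boot_cpu_has(), and the model assumes nothing beyond CPUID.01H:ECX bit 27 mirroring CR4 bit 18 (OSXSAVE).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_CR4_OSXSAVE	(1u << 18)	/* CR4.OSXSAVE */
#define MODEL_CPUID_OSXSAVE	(1u << 27)	/* CPUID.01H:ECX.OSXSAVE */

/* Hypothetical model of one CPU, not a kernel structure. */
struct model_cpu {
	uint32_t cr4;		/* live control register value */
	uint32_t cached_ecx;	/* snapshot of CPUID.01H:ECX */
};

/* CPUID reports OSXSAVE according to CR4 at the time it is executed. */
uint32_t model_cpuid_ecx(const struct model_cpu *cpu)
{
	return (cpu->cr4 & MODEL_CR4_OSXSAVE) ? MODEL_CPUID_OSXSAVE : 0;
}

/* Stand-in for get_cpu_cap(): take a one-time snapshot of CPUID. */
void model_get_cpu_cap(struct model_cpu *cpu)
{
	cpu->cached_ecx = model_cpuid_ecx(cpu);
}

/* Stand-in for boot_cpu_has(): consults the snapshot, not CPUID. */
bool model_cpu_has_osxsave(const struct model_cpu *cpu)
{
	return (cpu->cached_ecx & MODEL_CPUID_OSXSAVE) != 0;
}

/*
 * Boot with CR4.OSXSAVE set either before or after the snapshot and
 * return what the cached capability says afterwards.
 */
bool model_boot(bool cr4_before_identify)
{
	struct model_cpu cpu = { 0, 0 };

	if (cr4_before_identify)
		cpu.cr4 |= MODEL_CR4_OSXSAVE;

	model_get_cpu_cap(&cpu);

	if (!cr4_before_identify)
		cpu.cr4 |= MODEL_CR4_OSXSAVE;	/* too late for the snapshot */

	return model_cpu_has_osxsave(&cpu);
}

/* Small self-check showing both orderings. */
void model_order_selfcheck(void)
{
	assert(model_boot(true));	/* CR4 first: bit is cached */
	assert(!model_boot(false));	/* snapshot first: bit is stale */
}
```

When CR4 is set after the snapshot, the cached bit stays clear even though the live register has the bit set, which is the stale value described above.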
>
> [1].
> https://lore.kernel.org/lkml/202307192135.203ac24e-oliver.sang@intel.com/
>
> Fixes: b81fac906a8f ("x86/fpu: Move FPU initialization into
> arch_cpu_finalize_init()")
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Signed-off-by: Feng Tang <feng.tang@intel.com>
> ---
> arch/x86/kernel/fpu/xstate.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kernel/fpu/xstate.c
> b/arch/x86/kernel/fpu/xstate.c
> index 0bab497c9436..8ebea0d522d2 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -173,6 +173,9 @@ void fpu__init_cpu_xstate(void)
>
> cr4_set_bits(X86_CR4_OSXSAVE);
>
> + if (!boot_cpu_has(X86_FEATURE_OSXSAVE))
> + setup_force_cpu_cap(X86_FEATURE_OSXSAVE);
> +
I'd also put a comment here to explain why this is done manually. I'll
toss something out in case it's useful:

	/*
	 * CPUID bit for X86_FEATURE_OSXSAVE value will change once
	 * CR4.OSXSAVE is set, so update it manually.
	 */

Hi Rick,
Thanks for the review!
On Thu, Aug 24, 2023 at 06:16:55AM +0800, Edgecombe, Rick P wrote:
> On Wed, 2023-08-23 at 14:57 +0800, Feng Tang wrote:
> > 0Day found a 34.6% regression in stress-ng's 'af-alg' test case [1],
> > and
> > bisected it to commit b81fac906a8f ("x86/fpu: Move FPU initialization
> > into arch_cpu_finalize_init()"), which optimizes the FPU init order,
> > and moves the CR4_OSXSAVE enabling into a later place:
> >
> > arch_cpu_finalize_init
> > identify_boot_cpu
> > identify_cpu
> > generic_identify
> > get_cpu_cap --> setup cpu capability
> > ...
> > fpu__init_cpu
> > fpu__init_cpu_xstate
> > cr4_set_bits(X86_CR4_OSXSAVE);
> >
> > And it makes 'X86_FEATURE_OSXSAVE' feature bit missed in cpu
> > capability
> > setup. Many security module like 'camellia_aesni_avx_x86_64' depends
> > on
> > this feature, and will fail to be loaded after the commit, causing
> > the
> > regression.
> >
> > So set X86_FEATURE_OSXSAVE feature right after OSXSAVE enabling to
> > fix it.
>
> Oh, that's unfortunate.
>
> It might help to include a bit more about the problem in the log. The
> piece that confused me at first was that X86_FEATURE_OSXSAVE maps to a
> CPUID bit that will change once CR4.OSXSAVE is set. So since the CPUID
> bits are read before CR4.OSXSAVE is set, it stores the original unset
> value of the bit.
Good point. This is also the first time I have learned that this CPUID
bit can change with the runtime CR4 setting.
> >
> > [1].
> > https://lore.kernel.org/lkml/202307192135.203ac24e-oliver.sang@intel.com/
> >
> > Fixes: b81fac906a8f ("x86/fpu: Move FPU initialization into
> > arch_cpu_finalize_init()")
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > Signed-off-by: Feng Tang <feng.tang@intel.com>
> > ---
> > arch/x86/kernel/fpu/xstate.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/x86/kernel/fpu/xstate.c
> > b/arch/x86/kernel/fpu/xstate.c
> > index 0bab497c9436..8ebea0d522d2 100644
> > --- a/arch/x86/kernel/fpu/xstate.c
> > +++ b/arch/x86/kernel/fpu/xstate.c
> > @@ -173,6 +173,9 @@ void fpu__init_cpu_xstate(void)
> >
> > cr4_set_bits(X86_CR4_OSXSAVE);
> >
> > + if (!boot_cpu_has(X86_FEATURE_OSXSAVE))
> > + setup_force_cpu_cap(X86_FEATURE_OSXSAVE);
> > +
>
> I'd also put a comment here to explain why this is done manually. I'll
> toss something out in case it's useful:
> /*
> * CPUID bit for X86_FEATURE_OSXSAVE value will change once
> * CR4.OSXSAVE is set, so update it manually.
> */
Will add, thanks!
- Feng