[PATCH 1/7] x86/xstate: Fix initialisation of XSS cache

Andrew Cooper posted 7 patches 6 months ago
There is a newer version of this series
[PATCH 1/7] x86/xstate: Fix initialisation of XSS cache
Posted by Andrew Cooper 6 months ago
The clobbering of this_cpu(xcr0) and this_cpu(xss) to architecturally invalid
values is to force the subsequent set_xcr0() and set_msr_xss() to reload the
hardware register.

While XCR0 is reloaded in xstate_init(), MSR_XSS isn't.  This causes
get_msr_xss() to return the invalid value, and logic of the form:

  old = get_msr_xss();
  set_msr_xss(new);
  ...
  set_msr_xss(old);

to try and restore the architecturally invalid value.

The architecturally invalid value must be purged from the cache, meaning the
hardware register must be written at least once.  This in turn highlights that
the invalid value must only be used in the case that the hardware register is
available.

Fixes: f7f4a523927f ("x86/xstate: reset cached register values on resume")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>

v3:
 * Split out of later patch
---
 xen/arch/x86/xstate.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 99cedb4f5e24..75788147966a 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -641,13 +641,6 @@ void xstate_init(struct cpuinfo_x86 *c)
         return;
     }
 
-    /*
-     * Zap the cached values to make set_xcr0() and set_msr_xss() really
-     * write it.
-     */
-    this_cpu(xcr0) = 0;
-    this_cpu(xss) = ~0;
-
     cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
     feature_mask = (((u64)edx << 32) | eax) & XCNTXT_MASK;
     BUG_ON(!valid_xcr0(feature_mask));
@@ -657,8 +650,19 @@ void xstate_init(struct cpuinfo_x86 *c)
      * Set CR4_OSXSAVE and run "cpuid" to get xsave_cntxt_size.
      */
     set_in_cr4(X86_CR4_OSXSAVE);
+
+    /*
+     * Zap the cached values to make set_xcr0() and set_msr_xss() really write
+     * the hardware register.
+     */
+    this_cpu(xcr0) = 0;
     if ( !set_xcr0(feature_mask) )
         BUG();
+    if ( cpu_has_xsaves )
+    {
+        this_cpu(xss) = ~0;
+        set_msr_xss(0);
+    }
 
     if ( bsp )
     {
-- 
2.30.2


Re: [PATCH 1/7] x86/xstate: Fix initialisation of XSS cache
Posted by Jan Beulich 6 months ago
On 23.05.2024 13:16, Andrew Cooper wrote:
> The clobbering of this_cpu(xcr0) and this_cpu(xss) to architecturally invalid
> values is to force the subsequent set_xcr0() and set_msr_xss() to reload the
> hardware register.
> 
> While XCR0 is reloaded in xstate_init(), MSR_XSS isn't.  This causes
> get_msr_xss() to return the invalid value, and logic of the form:
> 
>   old = get_msr_xss();
>   set_msr_xss(new);
>   ...
>   set_msr_xss(old);
> 
> to try and restore the architecturally invalid value.
> 
> The architecturally invalid value must be purged from the cache, meaning the
> hardware register must be written at least once.  This in turn highlights that
> the invalid value must only be used in the case that the hardware register is
> available.
> 
> Fixes: f7f4a523927f ("x86/xstate: reset cached register values on resume")
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

However, I view it as pretty unfair that now I will need to re-base
https://lists.xen.org/archives/html/xen-devel/2021-04/msg01336.html
over ...

> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -641,13 +641,6 @@ void xstate_init(struct cpuinfo_x86 *c)
>          return;
>      }
>  
> -    /*
> -     * Zap the cached values to make set_xcr0() and set_msr_xss() really
> -     * write it.
> -     */
> -    this_cpu(xcr0) = 0;
> -    this_cpu(xss) = ~0;
> -
>      cpuid_count(XSTATE_CPUID, 0, &eax, &ebx, &ecx, &edx);
>      feature_mask = (((u64)edx << 32) | eax) & XCNTXT_MASK;
>      BUG_ON(!valid_xcr0(feature_mask));
> @@ -657,8 +650,19 @@ void xstate_init(struct cpuinfo_x86 *c)
>       * Set CR4_OSXSAVE and run "cpuid" to get xsave_cntxt_size.
>       */
>      set_in_cr4(X86_CR4_OSXSAVE);
> +
> +    /*
> +     * Zap the cached values to make set_xcr0() and set_msr_xss() really write
> +     * the hardware register.
> +     */
> +    this_cpu(xcr0) = 0;
>      if ( !set_xcr0(feature_mask) )
>          BUG();
> +    if ( cpu_has_xsaves )
> +    {
> +        this_cpu(xss) = ~0;
> +        set_msr_xss(0);
> +    }

... this change, kind of breaking again your nice arrangement. Seeing for how
long that change has been pending, it _really_ should have gone in ahead of
this one, with you then sorting how you'd like things to be arranged in the
combined result, rather than me re-posting and then either again not getting
any feedback for years, or you disliking what I've done. Oh well ...

Jan