[PATCH v1 4/8] x86/traps: Remove lazy FPU support

Ross Lagerwall posted 8 patches 4 days, 6 hours ago
[PATCH v1 4/8] x86/traps: Remove lazy FPU support
Posted by Ross Lagerwall 4 days, 6 hours ago
From: Wei Liu <wei.liu2@citrix.com>

Remove lazy FPU support from the #DNA exception handler used by PV
guests since fully_eager_fpu is now always true.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/i387.c               | 24 ------------------------
 xen/arch/x86/include/asm/i387.h   |  1 -
 xen/arch/x86/pv/misc-hypercalls.c |  3 +--
 xen/arch/x86/traps.c              | 18 ++++++++++--------
 4 files changed, 11 insertions(+), 35 deletions(-)

diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index 954ba3b1799b..7da731865f73 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -234,30 +234,6 @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts)
         stts();
 }
 
-/* 
- * Restore FPU state when #NM is triggered.
- */
-void vcpu_restore_fpu_lazy(struct vcpu *v)
-{
-    ASSERT(!is_idle_vcpu(v));
-
-    /* Avoid recursion. */
-    clts();
-
-    if ( v->fpu_dirtied )
-        return;
-
-    ASSERT(!v->arch.fully_eager_fpu);
-
-    if ( cpu_has_xsave )
-        fpu_xrstor(v, XSTATE_LAZY);
-    else
-        fpu_fxrstor(v);
-
-    v->fpu_initialised = 1;
-    v->fpu_dirtied = 1;
-}
-
 /* 
  * On each context switch, save the necessary FPU info of VCPU being switch 
  * out. It dispatches saving operation based on CPU's capability.
diff --git a/xen/arch/x86/include/asm/i387.h b/xen/arch/x86/include/asm/i387.h
index 652d7ad2deb6..da0c7e945f95 100644
--- a/xen/arch/x86/include/asm/i387.h
+++ b/xen/arch/x86/include/asm/i387.h
@@ -28,7 +28,6 @@ struct ix87_env {
 };
 
 void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts);
-void vcpu_restore_fpu_lazy(struct vcpu *v);
 void vcpu_save_fpu(struct vcpu *v);
 void save_fpu_enable(void);
 int vcpu_init_fpu(struct vcpu *v);
diff --git a/xen/arch/x86/pv/misc-hypercalls.c b/xen/arch/x86/pv/misc-hypercalls.c
index 7e915d86b724..34a0717540a9 100644
--- a/xen/arch/x86/pv/misc-hypercalls.c
+++ b/xen/arch/x86/pv/misc-hypercalls.c
@@ -42,8 +42,7 @@ long do_fpu_taskswitch(int set)
     else
     {
         v->arch.pv.ctrlreg[0] &= ~X86_CR0_TS;
-        if ( v->fpu_dirtied )
-            clts();
+        clts();
     }
 
     return 0;
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index b6b119769722..fb1b94245850 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2045,15 +2045,17 @@ void asmlinkage do_device_not_available(struct cpu_user_regs *regs)
     }
 
 #ifdef CONFIG_PV
-    vcpu_restore_fpu_lazy(curr);
+    BUG_ON(!(curr->arch.pv.ctrlreg[0] & X86_CR0_TS));
 
-    if ( curr->arch.pv.ctrlreg[0] & X86_CR0_TS )
-    {
-        pv_inject_hw_exception(X86_EXC_NM, X86_EVENT_NO_EC);
-        curr->arch.pv.ctrlreg[0] &= ~X86_CR0_TS;
-    }
-    else
-        TRACE_TIME(TRC_PV_MATH_STATE_RESTORE);
+    /*
+     * PV ABI QUIRK: Classic Xen kernels (2.6.18 and SLES 11 SP4's
+     * 3.0) rely on Xen to clear TS. PVOPS kernels (3.0, 3.16 and 4.15
+     * are checked) always clear TS themselves.
+     */
+    clts();
+
+    pv_inject_hw_exception(X86_EXC_NM, X86_EVENT_NO_EC);
+    curr->arch.pv.ctrlreg[0] &= ~X86_CR0_TS;
 #else
     ASSERT_UNREACHABLE();
 #endif
-- 
2.53.0
Re: [PATCH v1 4/8] x86/traps: Remove lazy FPU support
Posted by Andrew Cooper 4 days, 1 hour ago
On 19/03/2026 1:29 pm, Ross Lagerwall wrote:
> From: Wei Liu <wei.liu2@citrix.com>
>
> Remove lazy FPU support from the #DNA exception handler used by PV

It's the #NM exception handler.

The short name is "No Math[sic]", despite the long name being "Device
Not Available".

> guests since fully_eager_fpu is now always true.
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

These two tags need reversing.  Technically this says that Wei
reviewed/accepted the part of the patch that was your changes.

> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index b6b119769722..fb1b94245850 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -2045,15 +2045,17 @@ void asmlinkage do_device_not_available(struct cpu_user_regs *regs)
>      }
>  
>  #ifdef CONFIG_PV
> -    vcpu_restore_fpu_lazy(curr);
> +    BUG_ON(!(curr->arch.pv.ctrlreg[0] & X86_CR0_TS));

I'm not sure if this is safe.

Firstly, since Wei wrote the patch originally, a new source of #NM
exceptions has come into existence.  AMX shouldn't be enabled for VMs
yet, but this ought to be 

    if ( !... )
    {
        ASSERT_UNREACHABLE();
        domain_crash(...);
    }

to be less fatal.

Also, at this point in the series, cpu_init() still sets TS, as does
vcpu_save_fpu().  It's far from clear that TS is only set when the guest
wants it, although we have at least excluded the Xen paths by this point
in the handler.

> -    if ( curr->arch.pv.ctrlreg[0] & X86_CR0_TS )
> -    {
> -        pv_inject_hw_exception(X86_EXC_NM, X86_EVENT_NO_EC);
> -        curr->arch.pv.ctrlreg[0] &= ~X86_CR0_TS;
> -    }
> -    else
> -        TRACE_TIME(TRC_PV_MATH_STATE_RESTORE);
> +    /*
> +     * PV ABI QUIRK: Classic Xen kernels (2.6.18 and SLES 11 SP4's
> +     * 3.0) rely on Xen to clear TS. PVOPS kernels (3.0, 3.16 and 4.15
> +     * are checked) always clear TS themselves.
> +     */
> +    clts();

I think this wants wording differently.

"For better or worse, Xen's ABI with PV guests always clears TS on an
#NM exception.  Classic-xen Linux depends on this."

The behaviour of obsolete PVOps kernels isn't relevant to the ABI, and
now that Linux is strictly eager too, I doubt this statement is accurate
any more.

> +
> +    pv_inject_hw_exception(X86_EXC_NM, X86_EVENT_NO_EC);
> +    curr->arch.pv.ctrlreg[0] &= ~X86_CR0_TS;

Swap these two operations.  The optimiser will then be able to tailcall
pv_inject_hw_exception().

Clearing the guest TS ought to be tied to the clts() operation anyway.

~Andrew