[PATCH v2 1/9] x86/domain: Ensure a vCPU's FPU is reset early

Ross Lagerwall posted 9 patches 1 week, 4 days ago
[PATCH v2 1/9] x86/domain: Ensure a vCPU's FPU is reset early
Posted by Ross Lagerwall 1 week, 4 days ago
When using eager-fpu, a vCPU's FPU is always marked as initialized on
context switch but it is possible that neither vcpu_reset_fpu() nor
vcpu_setup_fpu() has been called on it. If that happens,
arch_get_info_guest() would return a block of all 0's for the FPU
context claiming it to be valid.

Fix this by calling vcpu_reset_fpu() during vCPU creation.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
---
New in v2

 xen/arch/x86/domain.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 9ba2774762cc..82da1c5d7b38 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -522,6 +522,8 @@ int arch_vcpu_create(struct vcpu *v)
         if ( (rc = vcpu_init_fpu(v)) != 0 )
             return rc;
 
+        vcpu_reset_fpu(v);
+
         vmce_init_vcpu(v);
 
         arch_vcpu_regs_init(v);
-- 
2.53.0
Re: [PATCH v2 1/9] x86/domain: Ensure a vCPU's FPU is reset early
Posted by Andrew Cooper 1 week, 4 days ago
On 24/03/2026 6:19 pm, Ross Lagerwall wrote:
> When using eager-fpu, a vCPU's FPU is always marked as initialized on
> context switch but it is possible that neither vcpu_reset_fpu() nor
> vcpu_setup_fpu() has been called on it.

How?

I don't think a PV vCPU can.  You cannot VCPUOP_up a vCPU for which
v->is_initialised is false, and setting is_initialised involves either
giving a good FPU, or taking the "reset" path.

An HVM use of VCPUOP_initialise only passes basic state, so can be used
to set v->is_initialised without touching the FPU state.

DM_INIT preserves the v->fpu_initialised-ness while not touching the
buffer.  This is correct(ish).  FPU registers are (mostly) not modified
on INIT.

>  If that happens,
> arch_get_info_guest() would return a block of all 0's for the FPU
> context claiming it to be valid.
>
> Fix this by calling vcpu_reset_fpu() during vCPU creation.
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>

The phrasing is a bit awkward, and the function names don't help, but it
is something we're going to have to address properly when doing nested virt.

(A minor tangent which is relevant to where we want to end up)

https://sandpile.org/x86/initial.htm

#RESET and #INIT used to be a physical pins, but are just a message on
the fabric. Either way they're events which alter state in well defined
ways.

From Xen's point of view, vcpu_create() is the only #RESET-like thing
we've got.  If we didn't model crash/reboot as constructing a new
domain, that would be the other place to use #RESET.

#INIT exists explicitly for HVM guests, via the APIC interface.  Xen has
no working model of this because HVM guests were built on PV which
wasn't modelled on how CPUs work.

v->is_initialised is a PV-ism which has infected x86 HVM and non-x86
architectures too.  The key thing which PV vCPUs need that doesn't work
like CPUs in the slightest is the chosen vCR3 (and vCR1 for PV64) need
to refer to a property typed L4/L3 pagetable, and PV guests can't take a
type ref on 0.


Anyway, returning from the tangent ...

> ---
> New in v2
>
>  xen/arch/x86/domain.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 9ba2774762cc..82da1c5d7b38 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -522,6 +522,8 @@ int arch_vcpu_create(struct vcpu *v)
>          if ( (rc = vcpu_init_fpu(v)) != 0 )
>              return rc;
>  
> +        vcpu_reset_fpu(v);
> +

... this really should be part of allocating the memory.

First, we should never have the backing memory in the wrong state, and
second, the idle vCPU doesn't take this path.  i.e. in
xstate_alloc_save_area().

Looking into this asks more questions.

xstate_alloc_save_area() does set some of the backing state, but misses
FXSAVE_FTW_RESET.  That's easy enough to fix, and turns out to address
my original concern.

vcpu_reset_fpu() sets v->fpu_initialised = false.  Doesn't this defeat
the point of this patch?

Maybe it's easiest just to fix FXSAVE_FTW_RESET and then purge the
booleans in the way this series does.  I don't think trying to unpick
any other bugfixes is going to be fruitful.

~Andrew

Re: [PATCH v2 1/9] x86/domain: Ensure a vCPU's FPU is reset early
Posted by Ross Lagerwall 1 week, 4 days ago
On 3/24/26 11:02 PM, Andrew Cooper wrote:
> On 24/03/2026 6:19 pm, Ross Lagerwall wrote:
>> When using eager-fpu, a vCPU's FPU is always marked as initialized on
>> context switch but it is possible that neither vcpu_reset_fpu() nor
>> vcpu_setup_fpu() has been called on it.
> 
> How?
> 
> I don't think a PV vCPU can.  You cannot VCPUOP_up a vCPU for which
> v->is_initialised is false, and setting is_initialised involves either
> giving a good FPU, or taking the "reset" path.
> 
> An HVM use of VCPUOP_initialise only passes basic state, so can be used
> to set v->is_initialised without touching the FPU state.
> 

Yes, then a subsequent VCPUOP_up allows the VCPU to be scheduled.
vcpu_restore_fpu_nonlazy(), called on context switch, always sets
fpu_initialised to 1 in the eager-fpu case. I confirmed this happens
with some debug logging during VM start:

[  106.281001] vcpu_init_fpu vCPU 1 (fpu_initialised is 0)
[  110.088015] vcpu_restore_fpu_nonlazy vCPU 1 (set fpu_initialised to 1)
[  110.088155] vcpu_restore_fpu_nonlazy vCPU 1 (set fpu_initialised to 1)
[  110.088518] vcpu_restore_fpu_nonlazy vCPU 1 (set fpu_initialised to 1)
[  110.356216] vcpu_reset_fpu vCPU 1 (set fpu_initialised to 0)
[  110.356236] vcpu_restore_fpu_nonlazy vCPU 1 (set fpu_initialised to 1)
...

> 
>>   If that happens,
>> arch_get_info_guest() would return a block of all 0's for the FPU
>> context claiming it to be valid.
>>
>> Fix this by calling vcpu_reset_fpu() during vCPU creation.
>>
>> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> The phrasing is a bit awkward, and the function names don't help, but it
> is something we're going to have to address properly when doing nested virt.
> 
> (A minor tangent which is relevant to where we want to end up)
> 
> https://sandpile.org/x86/initial.htm
> 
> #RESET and #INIT used to be a physical pins, but are just a message on
> the fabric. Either way they're events which alter state in well defined
> ways.
> 
>  From Xen's point of view, vcpu_create() is the only #RESET-like thing
> we've got.  If we didn't model crash/reboot as constructing a new
> domain, that would be the other place to use #RESET.
> 
> #INIT exists explicitly for HVM guests, via the APIC interface.  Xen has
> no working model of this because HVM guests were built on PV which
> wasn't modelled on how CPUs work.
> 
> v->is_initialised is a PV-ism which has infected x86 HVM and non-x86
> architectures too.  The key thing which PV vCPUs need that doesn't work
> like CPUs in the slightest is the chosen vCR3 (and vCR1 for PV64) need
> to refer to a property typed L4/L3 pagetable, and PV guests can't take a
> type ref on 0.
> 
> 
> Anyway, returning from the tangent ...
> 
>> ---
>> New in v2
>>
>>   xen/arch/x86/domain.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
>> index 9ba2774762cc..82da1c5d7b38 100644
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -522,6 +522,8 @@ int arch_vcpu_create(struct vcpu *v)
>>           if ( (rc = vcpu_init_fpu(v)) != 0 )
>>               return rc;
>>   
>> +        vcpu_reset_fpu(v);
>> +
> 
> ... this really should be part of allocating the memory.
> 
> First, we should never have the backing memory in the wrong state, and
> second, the idle vCPU doesn't take this path.  i.e. in
> xstate_alloc_save_area().
> 
> Looking into this asks more questions.
> 
> xstate_alloc_save_area() does set some of the backing state, but misses
> FXSAVE_FTW_RESET.  That's easy enough to fix, and turns out to address
> my original concern.
> 
> vcpu_reset_fpu() sets v->fpu_initialised = false.  Doesn't this defeat
> the point of this patch?

No, since it means the backing memory has been properly initialised by
the time fpu_initalised gets unilaterally set in
vcpu_restore_fpu_nonlazy()...

> 
> Maybe it's easiest just to fix FXSAVE_FTW_RESET and then purge the
> booleans in the way this series does.  I don't think trying to unpick
> any other bugfixes is going to be fruitful.
> 

... but yes, that does seem like the best way to fix this.

Ross