In a Xen build with Address Space Isolation the FPU state cannot come from the xenheap, as that would mean the FPU state of vCPU_A may be speculatively accessible from any pCPU running a hypercall on behalf of vCPU_B. This series prepares code that manipulates the FPU state to use wrappers that fetch said state from "elsewhere"[1]. Those wrappers will crystallise into something more than dummy accessors after existing ASI efforts are merged. So far, they are:

a) Remove the directmap (Elias El Yadouzi):
   https://lore.kernel.org/xen-devel/20240513134046.82605-1-eliasely@amazon.com/

   Removes all confidential data pages from the directmap and sets up the
   infrastructure to access them. Its trust boundary is the domain, and it
   builds the foundations of the secret hiding API around
   {un,}map_domain_page().

b) x86: adventures in Address Space Isolation (Roger Pau Monne):
   https://lore.kernel.org/xen-devel/20240726152206.28411-1-roger.pau@citrix.com/

   Extends (a) to put the trust boundary at the vCPU instead, so the threat
   model covers mutually distrustful vCPUs of the same domain. Extends the
   API for secret hiding to provide private pCPU-local resources, as well as
   an efficient means of accessing resources of the "current" vCPU.

In essence, the idea is to stop directly accessing a pointer in the vCPU structure and instead collect it indirectly via a macro invocation. The proposed API is a map/unmap pair, in order to uniformly tame the complexity involved in the various cases (Does the domain run with ASI enabled? Is the vCPU "current"? Are we lazy-switching?). See the sketch after the shortlog below for the intended call-site pattern.

The series is somewhat long, but each patch is fairly trivial. If need be, I can fold a lot of these back onto single commits to make it shorter.

* Patch 1 refreshes a couple of asserts back into something helpful. Can be
  folded onto patches 12 and 13 if deemed too silly for a Fixes tag.
* Patch 2 is the introduction of the wrappers in isolation.
* Patches 3-10 are split for ease of review, but are conceptually the same
  thing over and over (stop using v->arch.xsave_area directly and use the
  wrappers instead).
* Patch 11 cleans the idle vCPU state after using it as a dumping ground.
  It's not strictly required for this series, but I'm bound to forget to do
  it later once we _do_ care, and it does no harm to do it now. It's
  otherwise independent of the other patches (it clashes with 10, but only
  due to both modifying the same code; it's conceptually independent).
* Patches 12 and 13 bite the bullet and enlighten the (f)xsave and
  (f)xrstor abstractions to use the wrappers rather than direct access.
* Patch 14 converts the last remaining direct use of the xsave area. It's
  too tricky to introduce ahead of patches 12 and 13 because it needs state
  passed in that isn't available until those have gone in.

[1] That "elsewhere" will with high likelihood be either the directmap (on non-ASI), some perma-mapped vCPU-local area (see series (b) at the top) or a transient mapping implemented in the style of {un,}map_domain_page() for glacially cold accesses to non-current vCPUs. Importantly, writing the final macros involves the other series going in.
Alejandro Vallejo (14):
  x86/xstate: Update stale assertions in fpu_x{rstor,save}()
  x86/xstate: Create map/unmap primitives for xsave areas
  x86/hvm: Map/unmap xsave area in hvm_save_cpu_ctxt()
  x86/fpu: Map/unmap xsave area in vcpu_{reset,setup}_fpu()
  x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_xsetbv()
  x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu()
  x86/domctl: Map/unmap xsave area in arch_get_info_guest()
  x86/xstate: Map/unmap xsave area in {compress,expand}_xsave_states()
  x86/emulator: Refactor FXSAVE_AREA to use wrappers
  x86/mpx: Map/unmap xsave area in read_bndcfgu()
  x86/mpx: Adjust read_bndcfgu() to clean after itself
  x86/fpu: Pass explicit xsave areas to fpu_(f)xsave()
  x86/fpu: Pass explicit xsave areas to fpu_(f)xrstor()
  x86/xstate: Make xstate_all() and vcpu_xsave_mask() take explicit xstate

 xen/arch/x86/domctl.c             |  9 +++--
 xen/arch/x86/hvm/emulate.c        | 10 ++++-
 xen/arch/x86/hvm/hvm.c            |  8 ++--
 xen/arch/x86/i387.c               | 67 ++++++++++++++++++++-----------
 xen/arch/x86/include/asm/xstate.h | 29 +++++++++++--
 xen/arch/x86/x86_emulate/blk.c    | 10 ++++-
 xen/arch/x86/xstate.c             | 51 ++++++++++++++++-------
 7 files changed, 130 insertions(+), 54 deletions(-)

--
2.47.0
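To make the intended usage concrete, below is a minimal sketch of the call-site pattern the series converts everything to. It assumes only the wrapper names introduced in patch 2; inspect_guest_mxcsr() and the work done between map and unmap are illustrative, not part of the series.

    /* Hypothetical consumer, shown only to illustrate the pattern. */
    static void inspect_guest_mxcsr(struct vcpu *v)
    {
        /*
         * Today this is just v->arch.xsave_area; under ASI it may become
         * a transient per-pCPU mapping.
         */
        struct xsave_struct *xsave_area = vcpu_map_xsave_area(v);

        /* Only dereference the pointer between map and unmap. */
        uint32_t mxcsr = xsave_area->fpu_sse.mxcsr;

        printk("vCPU%u mxcsr=%08x\n", v->vcpu_id, mxcsr);

        /* No-op for now; nullifies the local pointer either way. */
        vcpu_unmap_xsave_area(v, xsave_area);
    }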
The asserts' intent was to establish whether the xsave instruction was usable or not, which at the time was strictly given by the presence of the xsave area. After edb48e76458b ("x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu") that area is always present, so a more relevant assert is that the host supports XSAVE.

Fixes: edb48e76458b ("x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu")
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
I'd also be ok with removing the assertions altogether. They serve very little purpose there after the merge of xsave and fpu_ctxt.
---
 xen/arch/x86/i387.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -XXX,XX +XXX,XX @@ static inline void fpu_xrstor(struct vcpu *v, uint64_t mask)
 {
     bool ok;

-    ASSERT(v->arch.xsave_area);
+    ASSERT(cpu_has_xsave);
     /*
      * XCR0 normally represents what guest OS set. In case of Xen itself,
      * we set the accumulated feature mask before doing save/restore.
@@ -XXX,XX +XXX,XX @@ static inline void fpu_xsave(struct vcpu *v)
     uint64_t mask = vcpu_xsave_mask(v);

     ASSERT(mask);
-    ASSERT(v->arch.xsave_area);
+    ASSERT(cpu_has_xsave);
     /*
      * XCR0 normally represents what guest OS set. In case of Xen itself,
      * we set the accumulated feature mask before doing save/restore.
--
2.47.0
Add infrastructure to simplify ASI handling. With ASI in the picture we'll have several different means of accessing the XSAVE area of a given vCPU, depending on whether a domain is covered by ASI or not and whether the vCPU in question is scheduled on the current pCPU or not. Having these complexities exposed at the call sites becomes unwieldy very fast.

These wrappers are intended to be used in a similar way to map_domain_page() and unmap_domain_page(): the map operation will dispatch the appropriate pointer for each case in a future patch, while unmap will remain a no-op where no unmap is required (e.g. when there's no ASI) and remove the transient mapping if one was required.

Follow-up patches replace all uses of raw v->arch.xsave_area by this mechanism, in preparation for the aforementioned dispatch logic to be added at a later time.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/include/asm/xstate.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/include/asm/xstate.h
+++ b/xen/arch/x86/include/asm/xstate.h
@@ -XXX,XX +XXX,XX @@ static inline bool xstate_all(const struct vcpu *v)
            (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
 }

+/*
+ * Fetch a pointer to the XSAVE area of a vCPU
+ *
+ * If ASI is enabled for the domain, this mapping is pCPU-local.
+ *
+ * @param v Owner of the XSAVE area
+ */
+#define vcpu_map_xsave_area(v) ((v)->arch.xsave_area)
+
+/*
+ * Drops the XSAVE area of a vCPU and nullifies its pointer on exit.
+ *
+ * If ASI is enabled and v is not the currently scheduled vCPU then the
+ * per-pCPU mapping is removed from the address space.
+ *
+ * @param v          vCPU logically owning xsave_area
+ * @param xsave_area XSAVE blob of v
+ */
+#define vcpu_unmap_xsave_area(v, x) ({ (x) = NULL; })
+
 #endif /* __ASM_XSTATE_H */
--
2.47.0
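Worth noting about the unmap stub: even as a no-op it deliberately nullifies the caller's pointer. A minimal sketch of what that buys (the surrounding caller is hypothetical; the field assignment mirrors vcpu_reset_fpu() from patch 4):

    struct xsave_struct *x = vcpu_map_xsave_area(v);

    x->xsave_hdr.xstate_bv = X86_XCR0_X87; /* valid: between map and unmap */

    vcpu_unmap_xsave_area(v, x);           /* expands to ({ (x) = NULL; }) */

    /*
     * Any dereference of x past this point now faults loudly, rather than
     * silently using a pointer that may be stale once ASI makes the
     * mapping transient.
     */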
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/hvm/hvm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -XXX,XX +XXX,XX @@ static int cf_check hvm_save_cpu_ctxt(struct vcpu *v, hvm_domain_context_t *h) if ( v->fpu_initialised ) { - BUILD_BUG_ON(sizeof(ctxt.fpu_regs) != - sizeof(v->arch.xsave_area->fpu_sse)); - memcpy(ctxt.fpu_regs, &v->arch.xsave_area->fpu_sse, - sizeof(ctxt.fpu_regs)); + const struct xsave_struct *xsave_area = vcpu_map_xsave_area(v); + BUILD_BUG_ON(sizeof(ctxt.fpu_regs) != sizeof(xsave_area->fpu_sse)); + memcpy(ctxt.fpu_regs, &xsave_area->fpu_sse, sizeof(ctxt.fpu_regs)); + vcpu_unmap_xsave_area(v, xsave_area); ctxt.flags = XEN_X86_FPU_INITIALISED; } -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/i387.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/i387.c +++ b/xen/arch/x86/i387.c @@ -XXX,XX +XXX,XX @@ int vcpu_init_fpu(struct vcpu *v) void vcpu_reset_fpu(struct vcpu *v) { + struct xsave_struct *xsave_area = vcpu_map_xsave_area(v); + v->fpu_initialised = false; - *v->arch.xsave_area = (struct xsave_struct) { + *xsave_area = (struct xsave_struct) { .fpu_sse = { .mxcsr = MXCSR_DEFAULT, .fcw = FCW_RESET, @@ -XXX,XX +XXX,XX @@ void vcpu_reset_fpu(struct vcpu *v) }, .xsave_hdr.xstate_bv = X86_XCR0_X87, }; + + vcpu_unmap_xsave_area(v, xsave_area); } void vcpu_setup_fpu(struct vcpu *v, const void *data) { + struct xsave_struct *xsave_area = vcpu_map_xsave_area(v); + v->fpu_initialised = true; - *v->arch.xsave_area = (struct xsave_struct) { + *xsave_area = (struct xsave_struct) { .fpu_sse = *(const fpusse_t*)data, .xsave_hdr.xstate_bv = XSTATE_FP_SSE, }; + + vcpu_unmap_xsave_area(v, xsave_area); } /* Free FPU's context save area */ -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/xstate.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ int handle_xsetbv(u32 index, u64 new_bv) clts(); if ( curr->fpu_dirtied ) - asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) ); + { + struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr); + + asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) ); + vcpu_unmap_xsave_area(curr, xsave_area); + } else if ( xstate_all(curr) ) { /* See the comment in i387.c:vcpu_restore_fpu_eager(). */ @@ -XXX,XX +XXX,XX @@ void xstate_set_init(uint64_t mask) unsigned long cr0 = read_cr0(); unsigned long xcr0 = this_cpu(xcr0); struct vcpu *v = idle_vcpu[smp_processor_id()]; - struct xsave_struct *xstate = v->arch.xsave_area; + struct xsave_struct *xstate; if ( ~xfeature_mask & mask ) { @@ -XXX,XX +XXX,XX @@ void xstate_set_init(uint64_t mask) clts(); + xstate = vcpu_map_xsave_area(v); memset(&xstate->xsave_hdr, 0, sizeof(xstate->xsave_hdr)); xrstor(v, mask); + vcpu_unmap_xsave_area(v, xstate); if ( cr0 & X86_CR0_TS ) write_cr0(cr0); -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/hvm/emulate.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -XXX,XX +XXX,XX @@ static int cf_check hvmemul_get_fpu( alternative_vcall(hvm_funcs.fpu_dirty_intercept); else if ( type == X86EMUL_FPU_fpu ) { - const fpusse_t *fpu_ctxt = &curr->arch.xsave_area->fpu_sse; + const struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr); + const fpusse_t *fpu_ctxt = &xsave_area->fpu_sse; /* * Latch current register state so that we can back out changes @@ -XXX,XX +XXX,XX @@ static int cf_check hvmemul_get_fpu( else ASSERT(fcw == fpu_ctxt->fcw); } + + vcpu_unmap_xsave_area(curr, xsave_area); } return X86EMUL_OKAY; @@ -XXX,XX +XXX,XX @@ static void cf_check hvmemul_put_fpu( if ( aux ) { - fpusse_t *fpu_ctxt = &curr->arch.xsave_area->fpu_sse; + struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr); + fpusse_t *fpu_ctxt = &xsave_area->fpu_sse; bool dval = aux->dval; int mode = hvm_guest_x86_mode(curr); @@ -XXX,XX +XXX,XX @@ static void cf_check hvmemul_put_fpu( fpu_ctxt->fop = aux->op; + vcpu_unmap_xsave_area(curr, xsave_area); + /* Re-use backout code below. */ backout = X86EMUL_FPU_fpu; } -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/domctl.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/domctl.c +++ b/xen/arch/x86/domctl.c @@ -XXX,XX +XXX,XX @@ void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c) unsigned int i; const struct domain *d = v->domain; bool compat = is_pv_32bit_domain(d); + const struct xsave_struct *xsave_area; #ifdef CONFIG_COMPAT #define c(fld) (!compat ? (c.nat->fld) : (c.cmp->fld)) #else #define c(fld) (c.nat->fld) #endif - BUILD_BUG_ON(sizeof(c.nat->fpu_ctxt) != - sizeof(v->arch.xsave_area->fpu_sse)); - memcpy(&c.nat->fpu_ctxt, &v->arch.xsave_area->fpu_sse, - sizeof(c.nat->fpu_ctxt)); + xsave_area = vcpu_map_xsave_area(v); + BUILD_BUG_ON(sizeof(c.nat->fpu_ctxt) != sizeof(xsave_area->fpu_sse)); + memcpy(&c.nat->fpu_ctxt, &xsave_area->fpu_sse, sizeof(c.nat->fpu_ctxt)); + vcpu_unmap_xsave_area(v, xsave_area); if ( is_pv_domain(d) ) c(flags = v->arch.pv.vgc_flags & ~(VGCF_i387_valid|VGCF_in_kernel)); -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/xstate.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ static void setup_xstate_comp(uint16_t *comp_offsets, */ void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size) { - const struct xsave_struct *xstate = v->arch.xsave_area; + const struct xsave_struct *xstate = vcpu_map_xsave_area(v); const void *src; uint16_t comp_offsets[sizeof(xfeature_mask)*8]; u64 xstate_bv = xstate->xsave_hdr.xstate_bv; @@ -XXX,XX +XXX,XX @@ void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size) valid &= ~feature; } + + vcpu_unmap_xsave_area(v, xstate); } /* @@ -XXX,XX +XXX,XX @@ void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size) */ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size) { - struct xsave_struct *xstate = v->arch.xsave_area; + struct xsave_struct *xstate = vcpu_map_xsave_area(v); void *dest; uint16_t comp_offsets[sizeof(xfeature_mask)*8]; u64 xstate_bv, valid; @@ -XXX,XX +XXX,XX @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size) valid &= ~feature; } + + vcpu_unmap_xsave_area(v, xstate); } void xsave(struct vcpu *v, uint64_t mask) -- 2.47.0
Adds an UNMAP primitive to make use of vcpu_unmap_xsave_area() when linked into Xen. The unmap is a no-op during tests.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/x86_emulate/blk.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/x86_emulate/blk.c b/xen/arch/x86/x86_emulate/blk.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/x86_emulate/blk.c
+++ b/xen/arch/x86/x86_emulate/blk.c
@@ -XXX,XX +XXX,XX @@
     !defined(X86EMUL_NO_SIMD)
 # ifdef __XEN__
 #  include <asm/xstate.h>
-#  define FXSAVE_AREA ((void *)&current->arch.xsave_area->fpu_sse)
+#  define FXSAVE_AREA ((void *)vcpu_map_xsave_area(current))
+#  define UNMAP_FXSAVE_AREA(x) vcpu_unmap_xsave_area(current, x)
 # else
 #  define FXSAVE_AREA get_fpu_save_area()
+#  define UNMAP_FXSAVE_AREA(x) ((void)x)
 # endif
 #endif

@@ -XXX,XX +XXX,XX @@ int x86_emul_blk(
         }
         else
             asm volatile ( "fxrstor %0" :: "m" (*fxsr) );
+
+        UNMAP_FXSAVE_AREA(fxsr);
+
         break;
     }

@@ -XXX,XX +XXX,XX @@ int x86_emul_blk(
         if ( fxsr != ptr ) /* i.e. s->op_bytes < sizeof(*fxsr) */
             memcpy(ptr, fxsr, s->op_bytes);
+
+        UNMAP_FXSAVE_AREA(fxsr);
+
         break;
     }
--
2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/xstate.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ int handle_xsetbv(u32 index, u64 new_bv) uint64_t read_bndcfgu(void) { + uint64_t ret = 0; unsigned long cr0 = read_cr0(); - struct xsave_struct *xstate - = idle_vcpu[smp_processor_id()]->arch.xsave_area; + struct vcpu *v = idle_vcpu[smp_processor_id()]; + struct xsave_struct *xstate = vcpu_map_xsave_area(v); const struct xstate_bndcsr *bndcsr; ASSERT(cpu_has_mpx); @@ -XXX,XX +XXX,XX @@ uint64_t read_bndcfgu(void) if ( cr0 & X86_CR0_TS ) write_cr0(cr0); - return xstate->xsave_hdr.xstate_bv & X86_XCR0_BNDCSR ? bndcsr->bndcfgu : 0; + if ( xstate->xsave_hdr.xstate_bv & X86_XCR0_BNDCSR ) + ret = bndcsr->bndcfgu; + + vcpu_unmap_xsave_area(v, xstate); + + return ret; } void xstate_set_init(uint64_t mask) -- 2.47.0
Overwrite the MPX data dumped in the idle XSAVE area to avoid leaking it. While it's not very sensitive, better to err on the side of caution. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- Depending on whether the idle domain is considered ASI or non-ASI this might or might not be enough. If the idle domain is not ASI the XSAVE area would be in the directmap, which would render the zap ineffective because it would still be transiently readable from another pCPU. --- xen/arch/x86/xstate.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ uint64_t read_bndcfgu(void) unsigned long cr0 = read_cr0(); struct vcpu *v = idle_vcpu[smp_processor_id()]; struct xsave_struct *xstate = vcpu_map_xsave_area(v); - const struct xstate_bndcsr *bndcsr; + struct xstate_bndcsr *bndcsr; ASSERT(cpu_has_mpx); clts(); @@ -XXX,XX +XXX,XX @@ uint64_t read_bndcfgu(void) write_cr0(cr0); if ( xstate->xsave_hdr.xstate_bv & X86_XCR0_BNDCSR ) + { ret = bndcsr->bndcfgu; + *bndcsr = (struct xstate_bndcsr){}; + } vcpu_unmap_xsave_area(v, xstate); -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/i387.c | 16 ++++++++++------ xen/arch/x86/include/asm/xstate.h | 2 +- xen/arch/x86/xstate.c | 3 +-- 3 files changed, 12 insertions(+), 9 deletions(-) diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/i387.c +++ b/xen/arch/x86/i387.c @@ -XXX,XX +XXX,XX @@ static inline uint64_t vcpu_xsave_mask(const struct vcpu *v) } /* Save x87 extended state */ -static inline void fpu_xsave(struct vcpu *v) +static inline void fpu_xsave(struct vcpu *v, struct xsave_struct *xsave_area) { bool ok; uint64_t mask = vcpu_xsave_mask(v); @@ -XXX,XX +XXX,XX @@ static inline void fpu_xsave(struct vcpu *v) */ ok = set_xcr0(v->arch.xcr0_accum | XSTATE_FP_SSE); ASSERT(ok); - xsave(v, mask); + xsave(v, xsave_area, mask); ok = set_xcr0(v->arch.xcr0 ?: XSTATE_FP_SSE); ASSERT(ok); } /* Save x87 FPU, MMX, SSE and SSE2 state */ -static inline void fpu_fxsave(struct vcpu *v) +static inline void fpu_fxsave(struct vcpu *v, fpusse_t *fpu_ctxt) { - fpusse_t *fpu_ctxt = &v->arch.xsave_area->fpu_sse; unsigned int fip_width = v->domain->arch.x87_fip_width; if ( fip_width != 4 ) @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_lazy(struct vcpu *v) */ static bool _vcpu_save_fpu(struct vcpu *v) { + struct xsave_struct *xsave_area; + if ( !v->fpu_dirtied && !v->arch.nonlazy_xstate_used ) return false; @@ -XXX,XX +XXX,XX @@ static bool _vcpu_save_fpu(struct vcpu *v) /* This can happen, if a paravirtualised guest OS has set its CR0.TS. */ clts(); + xsave_area = vcpu_map_xsave_area(v); + if ( cpu_has_xsave ) - fpu_xsave(v); + fpu_xsave(v, xsave_area); else - fpu_fxsave(v); + fpu_fxsave(v, &xsave_area->fpu_sse); + vcpu_unmap_xsave_area(v, xsave_area); v->fpu_dirtied = 0; return true; diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/include/asm/xstate.h +++ b/xen/arch/x86/include/asm/xstate.h @@ -XXX,XX +XXX,XX @@ uint64_t get_xcr0(void); void set_msr_xss(u64 xss); uint64_t get_msr_xss(void); uint64_t read_bndcfgu(void); -void xsave(struct vcpu *v, uint64_t mask); +void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask); void xrstor(struct vcpu *v, uint64_t mask); void xstate_set_init(uint64_t mask); bool xsave_enabled(const struct vcpu *v); diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size) vcpu_unmap_xsave_area(v, xstate); } -void xsave(struct vcpu *v, uint64_t mask) +void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask) { - struct xsave_struct *ptr = v->arch.xsave_area; uint32_t hmask = mask >> 32; uint32_t lmask = mask; unsigned int fip_width = v->domain->arch.x87_fip_width; -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/i387.c | 26 ++++++++++++++++---------- xen/arch/x86/include/asm/xstate.h | 2 +- xen/arch/x86/xstate.c | 10 ++++++---- 3 files changed, 23 insertions(+), 15 deletions(-) diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/i387.c +++ b/xen/arch/x86/i387.c @@ -XXX,XX +XXX,XX @@ /* FPU Restore Functions */ /*******************************/ /* Restore x87 extended state */ -static inline void fpu_xrstor(struct vcpu *v, uint64_t mask) +static inline void fpu_xrstor(struct vcpu *v, struct xsave_struct *xsave_area, + uint64_t mask) { bool ok; @@ -XXX,XX +XXX,XX @@ static inline void fpu_xrstor(struct vcpu *v, uint64_t mask) */ ok = set_xcr0(v->arch.xcr0_accum | XSTATE_FP_SSE); ASSERT(ok); - xrstor(v, mask); + xrstor(v, xsave_area, mask); ok = set_xcr0(v->arch.xcr0 ?: XSTATE_FP_SSE); ASSERT(ok); } /* Restore x87 FPU, MMX, SSE and SSE2 state */ -static inline void fpu_fxrstor(struct vcpu *v) +static inline void fpu_fxrstor(struct vcpu *v, const fpusse_t *fpu_ctxt) { - const fpusse_t *fpu_ctxt = &v->arch.xsave_area->fpu_sse; - /* * Some CPUs don't save/restore FDP/FIP/FOP unless an exception * is pending. Clear the x87 state here by setting it to fixed @@ -XXX,XX +XXX,XX @@ static inline void fpu_fxsave(struct vcpu *v, fpusse_t *fpu_ctxt) /* Restore FPU state whenever VCPU is schduled in. */ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts) { + struct xsave_struct *xsave_area; + /* Restore nonlazy extended state (i.e. parts not tracked by CR0.TS). */ if ( !v->arch.fully_eager_fpu && !v->arch.nonlazy_xstate_used ) goto maybe_stts; @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts) * above) we also need to restore full state, to prevent subsequently * saving state belonging to another vCPU. */ + xsave_area = vcpu_map_xsave_area(v); if ( v->arch.fully_eager_fpu || xstate_all(v) ) { if ( cpu_has_xsave ) - fpu_xrstor(v, XSTATE_ALL); + fpu_xrstor(v, xsave_area, XSTATE_ALL); else - fpu_fxrstor(v); + fpu_fxrstor(v, &xsave_area->fpu_sse); v->fpu_initialised = 1; v->fpu_dirtied = 1; @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts) } else { - fpu_xrstor(v, XSTATE_NONLAZY); + fpu_xrstor(v, xsave_area, XSTATE_NONLAZY); need_stts = true; } + vcpu_unmap_xsave_area(v, xsave_area); maybe_stts: if ( need_stts ) @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts) */ void vcpu_restore_fpu_lazy(struct vcpu *v) { + struct xsave_struct *xsave_area; ASSERT(!is_idle_vcpu(v)); /* Avoid recursion. 
*/ @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_lazy(struct vcpu *v) ASSERT(!v->arch.fully_eager_fpu); + xsave_area = vcpu_map_xsave_area(v); if ( cpu_has_xsave ) - fpu_xrstor(v, XSTATE_LAZY); + fpu_xrstor(v, xsave_area, XSTATE_LAZY); else - fpu_fxrstor(v); + fpu_fxrstor(v, &xsave_area->fpu_sse); + vcpu_unmap_xsave_area(v, xsave_area); v->fpu_initialised = 1; v->fpu_dirtied = 1; diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/include/asm/xstate.h +++ b/xen/arch/x86/include/asm/xstate.h @@ -XXX,XX +XXX,XX @@ void set_msr_xss(u64 xss); uint64_t get_msr_xss(void); uint64_t read_bndcfgu(void); void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask); -void xrstor(struct vcpu *v, uint64_t mask); +void xrstor(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask); void xstate_set_init(uint64_t mask); bool xsave_enabled(const struct vcpu *v); int __must_check validate_xstate(const struct domain *d, diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ void xsave(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask) ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = fip_width; } -void xrstor(struct vcpu *v, uint64_t mask) +void xrstor(struct vcpu *v, struct xsave_struct *ptr, uint64_t mask) { uint32_t hmask = mask >> 32; uint32_t lmask = mask; - struct xsave_struct *ptr = v->arch.xsave_area; unsigned int faults, prev_faults; /* @@ -XXX,XX +XXX,XX @@ int handle_xsetbv(u32 index, u64 new_bv) mask &= curr->fpu_dirtied ? ~XSTATE_FP_SSE : XSTATE_NONLAZY; if ( mask ) { + struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr); unsigned long cr0 = read_cr0(); clts(); @@ -XXX,XX +XXX,XX @@ int handle_xsetbv(u32 index, u64 new_bv) curr->fpu_dirtied = 1; cr0 &= ~X86_CR0_TS; } - xrstor(curr, mask); + xrstor(curr, xsave_area, mask); + vcpu_unmap_xsave_area(curr, xsave_area); + if ( cr0 & X86_CR0_TS ) write_cr0(cr0); } @@ -XXX,XX +XXX,XX @@ void xstate_set_init(uint64_t mask) xstate = vcpu_map_xsave_area(v); memset(&xstate->xsave_hdr, 0, sizeof(xstate->xsave_hdr)); - xrstor(v, mask); + xrstor(v, xstate, mask); vcpu_unmap_xsave_area(v, xstate); if ( cr0 & X86_CR0_TS ) -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/i387.c | 9 +++++---- xen/arch/x86/include/asm/xstate.h | 5 +++-- xen/arch/x86/xstate.c | 2 +- 3 files changed, 9 insertions(+), 7 deletions(-) diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/i387.c +++ b/xen/arch/x86/i387.c @@ -XXX,XX +XXX,XX @@ static inline void fpu_fxrstor(struct vcpu *v, const fpusse_t *fpu_ctxt) /* FPU Save Functions */ /*******************************/ -static inline uint64_t vcpu_xsave_mask(const struct vcpu *v) +static inline uint64_t vcpu_xsave_mask(const struct vcpu *v, + const struct xsave_struct *xsave_area) { if ( v->fpu_dirtied ) return v->arch.nonlazy_xstate_used ? XSTATE_ALL : XSTATE_LAZY; @@ -XXX,XX +XXX,XX @@ static inline uint64_t vcpu_xsave_mask(const struct vcpu *v) * XSTATE_FP_SSE), vcpu_xsave_mask will return XSTATE_ALL. Otherwise * return XSTATE_NONLAZY. */ - return xstate_all(v) ? XSTATE_ALL : XSTATE_NONLAZY; + return xstate_all(v, xsave_area) ? XSTATE_ALL : XSTATE_NONLAZY; } /* Save x87 extended state */ static inline void fpu_xsave(struct vcpu *v, struct xsave_struct *xsave_area) { bool ok; - uint64_t mask = vcpu_xsave_mask(v); + uint64_t mask = vcpu_xsave_mask(v, xsave_area); ASSERT(mask); ASSERT(cpu_has_xsave); @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts) * saving state belonging to another vCPU. */ xsave_area = vcpu_map_xsave_area(v); - if ( v->arch.fully_eager_fpu || xstate_all(v) ) + if ( v->arch.fully_eager_fpu || xstate_all(v, xsave_area) ) { if ( cpu_has_xsave ) fpu_xrstor(v, xsave_area, XSTATE_ALL); diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/include/asm/xstate.h +++ b/xen/arch/x86/include/asm/xstate.h @@ -XXX,XX +XXX,XX @@ xsave_area_compressed(const struct xsave_struct *xsave_area) return xsave_area->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED; } -static inline bool xstate_all(const struct vcpu *v) +static inline bool xstate_all(const struct vcpu *v, + const struct xsave_struct *xsave_area) { /* * XSTATE_FP_SSE may be excluded, because the offsets of XSTATE_FP_SSE * (in the legacy region of xsave area) are fixed, so saving * XSTATE_FP_SSE will not cause overwriting problem with XSAVES/XSAVEC. */ - return xsave_area_compressed(v->arch.xsave_area) && + return xsave_area_compressed(xsave_area) && (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE); } diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ int handle_xsetbv(u32 index, u64 new_bv) asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) ); vcpu_unmap_xsave_area(curr, xsave_area); } - else if ( xstate_all(curr) ) + else if ( xstate_all(curr, xsave_area) ) { /* See the comment in i387.c:vcpu_restore_fpu_eager(). */ mask |= XSTATE_LAZY; -- 2.47.0
See the original cover letter in v1.

v1: https://lore.kernel.org/xen-devel/20241028154932.6797-1-alejandro.vallejo@cloud.com/

v1->v2:
  * Turned v1/patch1 into an assert removal
  * Dropped v1/patch11: "x86/mpx: Adjust read_bndcfgu() to clean after itself"
  * Other minor changes out of feedback. Explained in each patch.

Alejandro Vallejo (13):
  x86/xstate: Remove stale assertions in fpu_x{rstor,save}()
  x86/xstate: Create map/unmap primitives for xsave areas
  x86/hvm: Map/unmap xsave area in hvm_save_cpu_ctxt()
  x86/fpu: Map/unmap xsave area in vcpu_{reset,setup}_fpu()
  x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_xsetbv()
  x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu()
  x86/domctl: Map/unmap xsave area in arch_get_info_guest()
  x86/xstate: Map/unmap xsave area in {compress,expand}_xsave_states()
  x86/emulator: Refactor FXSAVE_AREA to use wrappers
  x86/mpx: Map/unmap xsave area in read_bndcfgu()
  x86/fpu: Pass explicit xsave areas to fpu_(f)xsave()
  x86/fpu: Pass explicit xsave areas to fpu_(f)xrstor()
  x86/xstate: Make xstate_all() and vcpu_xsave_mask() take explicit xstate

 xen/arch/x86/domctl.c             |  9 +++--
 xen/arch/x86/hvm/emulate.c        | 12 +++++-
 xen/arch/x86/hvm/hvm.c            |  8 ++--
 xen/arch/x86/i387.c               | 65 +++++++++++++++++++------------
 xen/arch/x86/include/asm/xstate.h | 51 ++++++++++++++++++++++--
 xen/arch/x86/x86_emulate/blk.c    | 11 +++++-
 xen/arch/x86/xstate.c             | 47 +++++++++++++++-------
 7 files changed, 150 insertions(+), 53 deletions(-)

--
2.47.0
After edb48e76458b ("x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu"), v->arch.xsave_area is always present and we can just remove these asserts.

Fixes: edb48e76458b ("x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu")
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v2:
  * Remove asserts rather than refactor them.
  * Trimmed and adjusted commit message
---
 xen/arch/x86/i387.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -XXX,XX +XXX,XX @@ static inline void fpu_xrstor(struct vcpu *v, uint64_t mask)
 {
     bool ok;

-    ASSERT(v->arch.xsave_area);
     /*
      * XCR0 normally represents what guest OS set. In case of Xen itself,
      * we set the accumulated feature mask before doing save/restore.
@@ -XXX,XX +XXX,XX @@ static inline void fpu_xsave(struct vcpu *v)
     uint64_t mask = vcpu_xsave_mask(v);

     ASSERT(mask);
-    ASSERT(v->arch.xsave_area);
     /*
      * XCR0 normally represents what guest OS set. In case of Xen itself,
      * we set the accumulated feature mask before doing save/restore.
--
2.47.0
Add infrastructure to simplify ASI handling. With ASI in the picture we'll have several different means of accessing the XSAVE area of a given vCPU, depending on whether a domain is covered by ASI or not and whether the vCPU in question is scheduled on the current pCPU or not. Having these complexities exposed at the call sites becomes unwieldy very fast.

These wrappers are intended to be used in a similar way to map_domain_page() and unmap_domain_page(): the map operation will dispatch the appropriate pointer for each case in a future patch, while unmap will remain a no-op where no unmap is required (e.g. when there's no ASI) and remove the transient mapping if one was required.

Follow-up patches replace all uses of raw v->arch.xsave_area by this mechanism, in preparation for the aforementioned dispatch logic to be added at a later time.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v2:
  * Comment macros more heavily to show their performance characteristics.
  * Addressed various nits in the macro comments.
  * Macro names to uppercase.
---
 xen/arch/x86/include/asm/xstate.h | 42 +++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/include/asm/xstate.h
+++ b/xen/arch/x86/include/asm/xstate.h
@@ -XXX,XX +XXX,XX @@ static inline bool xstate_all(const struct vcpu *v)
            (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE);
 }

+/*
+ * Fetch a pointer to a vCPU's XSAVE area
+ *
+ * TL;DR: If v == current, the mapping is guaranteed to already exist.
+ *
+ * Despite the name, this macro might not actually map anything. The only
+ * case in which a mutation of page tables is strictly required is when
+ * ASI==on && v!=current. For everything else the mapping already exists
+ * and need not be created nor destroyed.
+ *
+ *                +-----------------+--------------+
+ *                |  v == current   | v != current |
+ * +--------------+-----------------+--------------+
+ * | ASI enabled  | per-vCPU fixmap |  actual map  |
+ * +--------------+-----------------+--------------+
+ * | ASI disabled |            directmap           |
+ * +--------------+--------------------------------+
+ *
+ * There MUST NOT be outstanding maps of XSAVE areas of the non-current
+ * vCPU at the point of context switch. Otherwise, the unmap operation
+ * will misbehave.
+ *
+ * TODO: Expand the macro to the ASI cases after infra to do so is in place.
+ *
+ * @param v Owner of the XSAVE area
+ */
+#define VCPU_MAP_XSAVE_AREA(v) ((v)->arch.xsave_area)
+
+/*
+ * Drops the mapping of a vCPU's XSAVE area and nullifies its pointer on
+ * exit
+ *
+ * See VCPU_MAP_XSAVE_AREA() for additional information on the persistence
+ * of these mappings. This macro only tears down the mappings in the
+ * ASI=on && v!=current case.
+ *
+ * TODO: Expand the macro to the ASI cases after infra to do so is in place.
+ *
+ * @param v Owner of the XSAVE area
+ * @param x XSAVE blob of v
+ */
+#define VCPU_UNMAP_XSAVE_AREA(v, x) ({ (x) = NULL; })
+
 #endif /* __ASM_XSTATE_H */
--
2.47.0
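For a sense of where this is headed: once the referenced ASI series lands, the dispatch hinted at by the table above could plausibly take a shape like the sketch below. This is illustrative only; asi_enabled(), percpu_fixmap_xsave_area() and map_xsave_area_transient() are hypothetical names standing in for whatever the ASI infrastructure ends up providing, not APIs from this or any other posted series.

    /* Hypothetical future shape of the map macro; a sketch, not the plan. */
    #define VCPU_MAP_XSAVE_AREA(v)                                          \
        (!asi_enabled((v)->domain) ? (v)->arch.xsave_area /* directmap */   \
         : (v) == current ? percpu_fixmap_xsave_area(v) /* fixmap */        \
                          : map_xsave_area_transient(v) /* actual map */)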
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- v2: * No change --- xen/arch/x86/hvm/hvm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -XXX,XX +XXX,XX @@ static int cf_check hvm_save_cpu_ctxt(struct vcpu *v, hvm_domain_context_t *h) if ( v->fpu_initialised ) { - BUILD_BUG_ON(sizeof(ctxt.fpu_regs) != - sizeof(v->arch.xsave_area->fpu_sse)); - memcpy(ctxt.fpu_regs, &v->arch.xsave_area->fpu_sse, - sizeof(ctxt.fpu_regs)); + const struct xsave_struct *xsave_area = VCPU_MAP_XSAVE_AREA(v); + BUILD_BUG_ON(sizeof(ctxt.fpu_regs) != sizeof(xsave_area->fpu_sse)); + memcpy(ctxt.fpu_regs, &xsave_area->fpu_sse, sizeof(ctxt.fpu_regs)); + VCPU_UNMAP_XSAVE_AREA(v, xsave_area); ctxt.flags = XEN_X86_FPU_INITIALISED; } -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- v2: * No change --- xen/arch/x86/i387.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/i387.c +++ b/xen/arch/x86/i387.c @@ -XXX,XX +XXX,XX @@ int vcpu_init_fpu(struct vcpu *v) void vcpu_reset_fpu(struct vcpu *v) { + struct xsave_struct *xsave_area = VCPU_MAP_XSAVE_AREA(v); + v->fpu_initialised = false; - *v->arch.xsave_area = (struct xsave_struct) { + *xsave_area = (struct xsave_struct) { .fpu_sse = { .mxcsr = MXCSR_DEFAULT, .fcw = FCW_RESET, @@ -XXX,XX +XXX,XX @@ void vcpu_reset_fpu(struct vcpu *v) }, .xsave_hdr.xstate_bv = X86_XCR0_X87, }; + + VCPU_UNMAP_XSAVE_AREA(v, xsave_area); } void vcpu_setup_fpu(struct vcpu *v, const void *data) { + struct xsave_struct *xsave_area = VCPU_MAP_XSAVE_AREA(v); + v->fpu_initialised = true; - *v->arch.xsave_area = (struct xsave_struct) { + *xsave_area = (struct xsave_struct) { .fpu_sse = *(const fpusse_t*)data, .xsave_hdr.xstate_bv = XSTATE_FP_SSE, }; + + VCPU_UNMAP_XSAVE_AREA(v, xsave_area); } /* Free FPU's context save area */ -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- v2: * Added comment highlighting fastpath for current --- xen/arch/x86/xstate.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ int handle_xsetbv(u32 index, u64 new_bv) clts(); if ( curr->fpu_dirtied ) - asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) ); + { + /* has a fastpath for `current`, so there's no actual map */ + struct xsave_struct *xsave_area = VCPU_MAP_XSAVE_AREA(curr); + + asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) ); + VCPU_UNMAP_XSAVE_AREA(curr, xsave_area); + } else if ( xstate_all(curr) ) { /* See the comment in i387.c:vcpu_restore_fpu_eager(). */ @@ -XXX,XX +XXX,XX @@ void xstate_set_init(uint64_t mask) unsigned long cr0 = read_cr0(); unsigned long xcr0 = this_cpu(xcr0); struct vcpu *v = idle_vcpu[smp_processor_id()]; - struct xsave_struct *xstate = v->arch.xsave_area; + struct xsave_struct *xstate; if ( ~xfeature_mask & mask ) { @@ -XXX,XX +XXX,XX @@ void xstate_set_init(uint64_t mask) clts(); + xstate = VCPU_MAP_XSAVE_AREA(v); memset(&xstate->xsave_hdr, 0, sizeof(xstate->xsave_hdr)); xrstor(v, mask); + VCPU_UNMAP_XSAVE_AREA(v, xstate); if ( cr0 & X86_CR0_TS ) write_cr0(cr0); -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- v2: * Added comments highlighting fastpath for current --- xen/arch/x86/hvm/emulate.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/hvm/emulate.c b/xen/arch/x86/hvm/emulate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -XXX,XX +XXX,XX @@ static int cf_check hvmemul_get_fpu( alternative_vcall(hvm_funcs.fpu_dirty_intercept); else if ( type == X86EMUL_FPU_fpu ) { - const fpusse_t *fpu_ctxt = &curr->arch.xsave_area->fpu_sse; + /* has a fastpath for `current`, so there's no actual map */ + const struct xsave_struct *xsave_area = VCPU_MAP_XSAVE_AREA(curr); + const fpusse_t *fpu_ctxt = &xsave_area->fpu_sse; /* * Latch current register state so that we can back out changes @@ -XXX,XX +XXX,XX @@ static int cf_check hvmemul_get_fpu( else ASSERT(fcw == fpu_ctxt->fcw); } + + VCPU_UNMAP_XSAVE_AREA(curr, xsave_area); } return X86EMUL_OKAY; @@ -XXX,XX +XXX,XX @@ static void cf_check hvmemul_put_fpu( if ( aux ) { - fpusse_t *fpu_ctxt = &curr->arch.xsave_area->fpu_sse; + /* has a fastpath for `current`, so there's no actual map */ + struct xsave_struct *xsave_area = VCPU_MAP_XSAVE_AREA(curr); + fpusse_t *fpu_ctxt = &xsave_area->fpu_sse; bool dval = aux->dval; int mode = hvm_guest_x86_mode(curr); @@ -XXX,XX +XXX,XX @@ static void cf_check hvmemul_put_fpu( fpu_ctxt->fop = aux->op; + VCPU_UNMAP_XSAVE_AREA(curr, xsave_area); + /* Re-use backout code below. */ backout = X86EMUL_FPU_fpu; } -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- v2: * No change --- xen/arch/x86/domctl.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/domctl.c +++ b/xen/arch/x86/domctl.c @@ -XXX,XX +XXX,XX @@ void arch_get_info_guest(struct vcpu *v, vcpu_guest_context_u c) unsigned int i; const struct domain *d = v->domain; bool compat = is_pv_32bit_domain(d); + const struct xsave_struct *xsave_area; #ifdef CONFIG_COMPAT #define c(fld) (!compat ? (c.nat->fld) : (c.cmp->fld)) #else #define c(fld) (c.nat->fld) #endif - BUILD_BUG_ON(sizeof(c.nat->fpu_ctxt) != - sizeof(v->arch.xsave_area->fpu_sse)); - memcpy(&c.nat->fpu_ctxt, &v->arch.xsave_area->fpu_sse, - sizeof(c.nat->fpu_ctxt)); + xsave_area = VCPU_MAP_XSAVE_AREA(v); + BUILD_BUG_ON(sizeof(c.nat->fpu_ctxt) != sizeof(xsave_area->fpu_sse)); + memcpy(&c.nat->fpu_ctxt, &xsave_area->fpu_sse, sizeof(c.nat->fpu_ctxt)); + VCPU_UNMAP_XSAVE_AREA(v, xsave_area); if ( is_pv_domain(d) ) c(flags = v->arch.pv.vgc_flags & ~(VGCF_i387_valid|VGCF_in_kernel)); -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- v2: * No change --- xen/arch/x86/xstate.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ static void setup_xstate_comp(uint16_t *comp_offsets, */ void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size) { - const struct xsave_struct *xstate = v->arch.xsave_area; + const struct xsave_struct *xstate = VCPU_MAP_XSAVE_AREA(v); const void *src; uint16_t comp_offsets[sizeof(xfeature_mask)*8]; u64 xstate_bv = xstate->xsave_hdr.xstate_bv; @@ -XXX,XX +XXX,XX @@ void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size) valid &= ~feature; } + + VCPU_UNMAP_XSAVE_AREA(v, xstate); } /* @@ -XXX,XX +XXX,XX @@ void expand_xsave_states(const struct vcpu *v, void *dest, unsigned int size) */ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size) { - struct xsave_struct *xstate = v->arch.xsave_area; + struct xsave_struct *xstate = VCPU_MAP_XSAVE_AREA(v); void *dest; uint16_t comp_offsets[sizeof(xfeature_mask)*8]; u64 xstate_bv, valid; @@ -XXX,XX +XXX,XX @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size) valid &= ~feature; } + + VCPU_UNMAP_XSAVE_AREA(v, xstate); } void xsave(struct vcpu *v, uint64_t mask) -- 2.47.0
Adds an UNMAP primitive to make use of vcpu_unmap_xsave_area() when linked into Xen. The unmap is a no-op during tests.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v2:
  * Added comments highlighting fastpath on `current`
---
 xen/arch/x86/x86_emulate/blk.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/x86_emulate/blk.c b/xen/arch/x86/x86_emulate/blk.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/x86_emulate/blk.c
+++ b/xen/arch/x86/x86_emulate/blk.c
@@ -XXX,XX +XXX,XX @@
     !defined(X86EMUL_NO_SIMD)
 # ifdef __XEN__
 #  include <asm/xstate.h>
-#  define FXSAVE_AREA ((void *)&current->arch.xsave_area->fpu_sse)
+/* has a fastpath for `current`, so there's no actual map */
+#  define FXSAVE_AREA ((void *)VCPU_MAP_XSAVE_AREA(current))
+#  define UNMAP_FXSAVE_AREA(x) VCPU_UNMAP_XSAVE_AREA(current, x)
 # else
 #  define FXSAVE_AREA get_fpu_save_area()
+#  define UNMAP_FXSAVE_AREA(x) ((void)x)
 # endif
 #endif

@@ -XXX,XX +XXX,XX @@ int x86_emul_blk(
         }
         else
             asm volatile ( "fxrstor %0" :: "m" (*fxsr) );
+
+        UNMAP_FXSAVE_AREA(fxsr);
+
         break;
     }

@@ -XXX,XX +XXX,XX @@ int x86_emul_blk(
         if ( fxsr != ptr ) /* i.e. s->op_bytes < sizeof(*fxsr) */
             memcpy(ptr, fxsr, s->op_bytes);
+
+        UNMAP_FXSAVE_AREA(fxsr);
+
         break;
     }
--
2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- v2: * s/ret/bndcfgu --- xen/arch/x86/xstate.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ int handle_xsetbv(u32 index, u64 new_bv) uint64_t read_bndcfgu(void) { + uint64_t bndcfgu = 0; unsigned long cr0 = read_cr0(); - struct xsave_struct *xstate - = idle_vcpu[smp_processor_id()]->arch.xsave_area; + struct vcpu *v = idle_vcpu[smp_processor_id()]; + struct xsave_struct *xstate = VCPU_MAP_XSAVE_AREA(v); const struct xstate_bndcsr *bndcsr; ASSERT(cpu_has_mpx); @@ -XXX,XX +XXX,XX @@ uint64_t read_bndcfgu(void) if ( cr0 & X86_CR0_TS ) write_cr0(cr0); - return xstate->xsave_hdr.xstate_bv & X86_XCR0_BNDCSR ? bndcsr->bndcfgu : 0; + if ( xstate->xsave_hdr.xstate_bv & X86_XCR0_BNDCSR ) + bndcfgu = bndcsr->bndcfgu; + + VCPU_UNMAP_XSAVE_AREA(v, xstate); + + return bndcfgu; } void xstate_set_init(uint64_t mask) -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- v2: * const-ified v --- xen/arch/x86/i387.c | 16 ++++++++++------ xen/arch/x86/include/asm/xstate.h | 2 +- xen/arch/x86/xstate.c | 3 +-- 3 files changed, 12 insertions(+), 9 deletions(-) diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/i387.c +++ b/xen/arch/x86/i387.c @@ -XXX,XX +XXX,XX @@ static inline uint64_t vcpu_xsave_mask(const struct vcpu *v) } /* Save x87 extended state */ -static inline void fpu_xsave(struct vcpu *v) +static inline void fpu_xsave(const struct vcpu *v, struct xsave_struct *xsave_area) { bool ok; uint64_t mask = vcpu_xsave_mask(v); @@ -XXX,XX +XXX,XX @@ static inline void fpu_xsave(struct vcpu *v) */ ok = set_xcr0(v->arch.xcr0_accum | XSTATE_FP_SSE); ASSERT(ok); - xsave(v, mask); + xsave(v, xsave_area, mask); ok = set_xcr0(v->arch.xcr0 ?: XSTATE_FP_SSE); ASSERT(ok); } /* Save x87 FPU, MMX, SSE and SSE2 state */ -static inline void fpu_fxsave(struct vcpu *v) +static inline void fpu_fxsave(struct vcpu *v, fpusse_t *fpu_ctxt) { - fpusse_t *fpu_ctxt = &v->arch.xsave_area->fpu_sse; unsigned int fip_width = v->domain->arch.x87_fip_width; if ( fip_width != 4 ) @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_lazy(struct vcpu *v) */ static bool _vcpu_save_fpu(struct vcpu *v) { + struct xsave_struct *xsave_area; + if ( !v->fpu_dirtied && !v->arch.nonlazy_xstate_used ) return false; @@ -XXX,XX +XXX,XX @@ static bool _vcpu_save_fpu(struct vcpu *v) /* This can happen, if a paravirtualised guest OS has set its CR0.TS. */ clts(); + xsave_area = VCPU_MAP_XSAVE_AREA(v); + if ( cpu_has_xsave ) - fpu_xsave(v); + fpu_xsave(v, xsave_area); else - fpu_fxsave(v); + fpu_fxsave(v, &xsave_area->fpu_sse); + VCPU_UNMAP_XSAVE_AREA(v, xsave_area); v->fpu_dirtied = 0; return true; diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/include/asm/xstate.h +++ b/xen/arch/x86/include/asm/xstate.h @@ -XXX,XX +XXX,XX @@ uint64_t get_xcr0(void); void set_msr_xss(u64 xss); uint64_t get_msr_xss(void); uint64_t read_bndcfgu(void); -void xsave(struct vcpu *v, uint64_t mask); +void xsave(const struct vcpu *v, struct xsave_struct *ptr, uint64_t mask); void xrstor(struct vcpu *v, uint64_t mask); void xstate_set_init(uint64_t mask); bool xsave_enabled(const struct vcpu *v); diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size) VCPU_UNMAP_XSAVE_AREA(v, xstate); } -void xsave(struct vcpu *v, uint64_t mask) +void xsave(const struct vcpu *v, struct xsave_struct *ptr, uint64_t mask) { - struct xsave_struct *ptr = v->arch.xsave_area; uint32_t hmask = mask >> 32; uint32_t lmask = mask; unsigned int fip_width = v->domain->arch.x87_fip_width; -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- v2: * const-ified v in fpu_xrstor() --- xen/arch/x86/i387.c | 26 ++++++++++++++++---------- xen/arch/x86/include/asm/xstate.h | 2 +- xen/arch/x86/xstate.c | 10 ++++++---- 3 files changed, 23 insertions(+), 15 deletions(-) diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/i387.c +++ b/xen/arch/x86/i387.c @@ -XXX,XX +XXX,XX @@ /* FPU Restore Functions */ /*******************************/ /* Restore x87 extended state */ -static inline void fpu_xrstor(struct vcpu *v, uint64_t mask) +static inline void fpu_xrstor(struct vcpu *v, struct xsave_struct *xsave_area, + uint64_t mask) { bool ok; @@ -XXX,XX +XXX,XX @@ static inline void fpu_xrstor(struct vcpu *v, uint64_t mask) */ ok = set_xcr0(v->arch.xcr0_accum | XSTATE_FP_SSE); ASSERT(ok); - xrstor(v, mask); + xrstor(v, xsave_area, mask); ok = set_xcr0(v->arch.xcr0 ?: XSTATE_FP_SSE); ASSERT(ok); } /* Restore x87 FPU, MMX, SSE and SSE2 state */ -static inline void fpu_fxrstor(struct vcpu *v) +static inline void fpu_fxrstor(struct vcpu *v, const fpusse_t *fpu_ctxt) { - const fpusse_t *fpu_ctxt = &v->arch.xsave_area->fpu_sse; - /* * Some CPUs don't save/restore FDP/FIP/FOP unless an exception * is pending. Clear the x87 state here by setting it to fixed @@ -XXX,XX +XXX,XX @@ static inline void fpu_fxsave(struct vcpu *v, fpusse_t *fpu_ctxt) /* Restore FPU state whenever VCPU is schduled in. */ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts) { + struct xsave_struct *xsave_area; + /* Restore nonlazy extended state (i.e. parts not tracked by CR0.TS). */ if ( !v->arch.fully_eager_fpu && !v->arch.nonlazy_xstate_used ) goto maybe_stts; @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts) * above) we also need to restore full state, to prevent subsequently * saving state belonging to another vCPU. */ + xsave_area = VCPU_MAP_XSAVE_AREA(v); if ( v->arch.fully_eager_fpu || xstate_all(v) ) { if ( cpu_has_xsave ) - fpu_xrstor(v, XSTATE_ALL); + fpu_xrstor(v, xsave_area, XSTATE_ALL); else - fpu_fxrstor(v); + fpu_fxrstor(v, &xsave_area->fpu_sse); v->fpu_initialised = 1; v->fpu_dirtied = 1; @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts) } else { - fpu_xrstor(v, XSTATE_NONLAZY); + fpu_xrstor(v, xsave_area, XSTATE_NONLAZY); need_stts = true; } + VCPU_UNMAP_XSAVE_AREA(v, xsave_area); maybe_stts: if ( need_stts ) @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts) */ void vcpu_restore_fpu_lazy(struct vcpu *v) { + struct xsave_struct *xsave_area; ASSERT(!is_idle_vcpu(v)); /* Avoid recursion. 
*/ @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_lazy(struct vcpu *v) ASSERT(!v->arch.fully_eager_fpu); + xsave_area = VCPU_MAP_XSAVE_AREA(v); if ( cpu_has_xsave ) - fpu_xrstor(v, XSTATE_LAZY); + fpu_xrstor(v, xsave_area, XSTATE_LAZY); else - fpu_fxrstor(v); + fpu_fxrstor(v, &xsave_area->fpu_sse); + VCPU_UNMAP_XSAVE_AREA(v, xsave_area); v->fpu_initialised = 1; v->fpu_dirtied = 1; diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/include/asm/xstate.h +++ b/xen/arch/x86/include/asm/xstate.h @@ -XXX,XX +XXX,XX @@ void set_msr_xss(u64 xss); uint64_t get_msr_xss(void); uint64_t read_bndcfgu(void); void xsave(const struct vcpu *v, struct xsave_struct *ptr, uint64_t mask); -void xrstor(struct vcpu *v, uint64_t mask); +void xrstor(const struct vcpu *v, struct xsave_struct *ptr, uint64_t mask); void xstate_set_init(uint64_t mask); bool xsave_enabled(const struct vcpu *v); int __must_check validate_xstate(const struct domain *d, diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ void xsave(const struct vcpu *v, struct xsave_struct *ptr, uint64_t mask) ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = fip_width; } -void xrstor(struct vcpu *v, uint64_t mask) +void xrstor(const struct vcpu *v, struct xsave_struct *ptr, uint64_t mask) { uint32_t hmask = mask >> 32; uint32_t lmask = mask; - struct xsave_struct *ptr = v->arch.xsave_area; unsigned int faults, prev_faults; /* @@ -XXX,XX +XXX,XX @@ int handle_xsetbv(u32 index, u64 new_bv) mask &= curr->fpu_dirtied ? ~XSTATE_FP_SSE : XSTATE_NONLAZY; if ( mask ) { + struct xsave_struct *xsave_area = VCPU_MAP_XSAVE_AREA(curr); unsigned long cr0 = read_cr0(); clts(); @@ -XXX,XX +XXX,XX @@ int handle_xsetbv(u32 index, u64 new_bv) curr->fpu_dirtied = 1; cr0 &= ~X86_CR0_TS; } - xrstor(curr, mask); + xrstor(curr, xsave_area, mask); + VCPU_UNMAP_XSAVE_AREA(curr, xsave_area); + if ( cr0 & X86_CR0_TS ) write_cr0(cr0); } @@ -XXX,XX +XXX,XX @@ void xstate_set_init(uint64_t mask) xstate = VCPU_MAP_XSAVE_AREA(v); memset(&xstate->xsave_hdr, 0, sizeof(xstate->xsave_hdr)); - xrstor(v, mask); + xrstor(v, xstate, mask); VCPU_UNMAP_XSAVE_AREA(v, xstate); if ( cr0 & X86_CR0_TS ) -- 2.47.0
No functional change. Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> --- xen/arch/x86/i387.c | 9 +++++---- xen/arch/x86/include/asm/xstate.h | 5 +++-- xen/arch/x86/xstate.c | 2 +- 3 files changed, 9 insertions(+), 7 deletions(-) diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/i387.c +++ b/xen/arch/x86/i387.c @@ -XXX,XX +XXX,XX @@ static inline void fpu_fxrstor(struct vcpu *v, const fpusse_t *fpu_ctxt) /* FPU Save Functions */ /*******************************/ -static inline uint64_t vcpu_xsave_mask(const struct vcpu *v) +static inline uint64_t vcpu_xsave_mask(const struct vcpu *v, + const struct xsave_struct *xsave_area) { if ( v->fpu_dirtied ) return v->arch.nonlazy_xstate_used ? XSTATE_ALL : XSTATE_LAZY; @@ -XXX,XX +XXX,XX @@ static inline uint64_t vcpu_xsave_mask(const struct vcpu *v) * XSTATE_FP_SSE), vcpu_xsave_mask will return XSTATE_ALL. Otherwise * return XSTATE_NONLAZY. */ - return xstate_all(v) ? XSTATE_ALL : XSTATE_NONLAZY; + return xstate_all(v, xsave_area) ? XSTATE_ALL : XSTATE_NONLAZY; } /* Save x87 extended state */ static inline void fpu_xsave(const struct vcpu *v, struct xsave_struct *xsave_area) { bool ok; - uint64_t mask = vcpu_xsave_mask(v); + uint64_t mask = vcpu_xsave_mask(v, xsave_area); ASSERT(mask); /* @@ -XXX,XX +XXX,XX @@ void vcpu_restore_fpu_nonlazy(struct vcpu *v, bool need_stts) * saving state belonging to another vCPU. */ xsave_area = VCPU_MAP_XSAVE_AREA(v); - if ( v->arch.fully_eager_fpu || xstate_all(v) ) + if ( v->arch.fully_eager_fpu || xstate_all(v, xsave_area) ) { if ( cpu_has_xsave ) fpu_xrstor(v, xsave_area, XSTATE_ALL); diff --git a/xen/arch/x86/include/asm/xstate.h b/xen/arch/x86/include/asm/xstate.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/include/asm/xstate.h +++ b/xen/arch/x86/include/asm/xstate.h @@ -XXX,XX +XXX,XX @@ xsave_area_compressed(const struct xsave_struct *xsave_area) return xsave_area->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED; } -static inline bool xstate_all(const struct vcpu *v) +static inline bool xstate_all(const struct vcpu *v, + const struct xsave_struct *xsave_area) { /* * XSTATE_FP_SSE may be excluded, because the offsets of XSTATE_FP_SSE * (in the legacy region of xsave area) are fixed, so saving * XSTATE_FP_SSE will not cause overwriting problem with XSAVES/XSAVEC. */ - return xsave_area_compressed(v->arch.xsave_area) && + return xsave_area_compressed(xsave_area) && (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE); } diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -XXX,XX +XXX,XX @@ int handle_xsetbv(u32 index, u64 new_bv) asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) ); VCPU_UNMAP_XSAVE_AREA(curr, xsave_area); } - else if ( xstate_all(curr) ) + else if ( xstate_all(curr, xsave_area) ) { /* See the comment in i387.c:vcpu_restore_fpu_eager(). */ mask |= XSTATE_LAZY; -- 2.47.0