arch/arm64/kvm/hyp/include/hyp/switch.h | 2 +- arch/arm64/kvm/hyp/nvhe/hyp-main.c | 12 +++++++----- 2 files changed, 8 insertions(+), 6 deletions(-)
When pKVM saves and restores the host floating point state on a SVE system
it programs the vector length in ZCR_EL2.LEN to be whatever the maximum VL
for the PE is but uses a buffer allocated with kvm_host_sve_max_vl, the
maximum VL shared by all PEs in the system. This means that if we run on a
system where the maximum VLs are not consistent we will overflow the buffer
on PEs which support larger VLs.
Since the host will not currently attempt to make use of non-shared VLs fix
this by explicitly setting the EL2 VL to be the maximum shared VL when we
save and restore. This will enforce the limit on host VL usage. Should we
wish to support asymmetric VLs this code will need to be updated along with
the required changes for the host, patches have previously been posted:
https://lore.kernel.org/r/20240730-kvm-arm64-fix-pkvm-sve-vl-v6-0-cae8a2e0bd66@kernel.org
Fixes: b5b9955617bc ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM")
Signed-off-by: Mark Brown <broonie@kernel.org>
---
Changes in v2:
- Update all places where we constrain the host VL, not just those where
we save and restore host state.
- The value written to the register is 0 based, not 1 based.
- Link to v1: https://lore.kernel.org/r/20240910-kvm-arm64-limit-guest-vl-v1-1-54df40b95ffb@kernel.org
---
arch/arm64/kvm/hyp/include/hyp/switch.h | 2 +-
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 12 +++++++-----
2 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index f59ccfe11ab9..c2cfb4d6dc92 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -339,7 +339,7 @@ static inline void __hyp_sve_save_host(void)
struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
sve_state->zcr_el1 = read_sysreg_el1(SYS_ZCR);
- write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
+ write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
__sve_save_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
&sve_state->fpsr,
true);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index f43d845f3c4e..dd1c6aa907a2 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -33,7 +33,7 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
*/
sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
__sve_save_state(vcpu_sve_pffr(vcpu), &vcpu->arch.ctxt.fp_regs.fpsr, true);
- write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
+ write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
}
static void __hyp_sve_restore_host(void)
@@ -45,10 +45,11 @@ static void __hyp_sve_restore_host(void)
* the host. The layout of the data when saving the sve state depends
* on the VL, so use a consistent (i.e., the maximum) host VL.
*
- * Setting ZCR_EL2 to ZCR_ELx_LEN_MASK sets the effective length
- * supported by the system (or limited at EL3).
+ * Note that this constrains the PE to the maximum shared VL
+ * that was discovered, if we wish to use larger VLs this will
+ * need to be revisited.
*/
- write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
+ write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
__sve_restore_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
&sve_state->fpsr,
true);
@@ -479,7 +480,8 @@ void handle_trap(struct kvm_cpu_context *host_ctxt)
case ESR_ELx_EC_SVE:
cpacr_clear_set(0, CPACR_ELx_ZEN);
isb();
- sve_cond_update_zcr_vq(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2);
+ sve_cond_update_zcr_vq(sve_vq_from_vl(kvm_host_sve_max_vl) - 1,
+ SYS_ZCR_EL2);
break;
case ESR_ELx_EC_IABT_LOW:
case ESR_ELx_EC_DABT_LOW:
---
base-commit: 7c626ce4bae1ac14f60076d00eafe71af30450ba
change-id: 20240910-kvm-arm64-limit-guest-vl-d5fba0c7cc7b
Best regards,
--
Mark Brown <broonie@kernel.org>
On Thu, 12 Sep 2024 12:39:35 +0100, Mark Brown wrote: > When pKVM saves and restores the host floating point state on a SVE system > it programs the vector length in ZCR_EL2.LEN to be whatever the maximum VL > for the PE is but uses a buffer allocated with kvm_host_sve_max_vl, the > maximum VL shared by all PEs in the system. This means that if we run on a > system where the maximum VLs are not consistent we will overflow the buffer > on PEs which support larger VLs. > > [...] Applied to fixes, thanks! [1/1] KVM: arm64: Constrain the host to the maximum shared SVE VL with pKVM commit: a9f41588a902f386b48f021f56a4c14735cd9371 Cheers, M. -- Without deviation from the norm, progress is not possible.
Hi Mark, On Thu, 12 Sept 2024 at 12:39, Mark Brown <broonie@kernel.org> wrote: > > When pKVM saves and restores the host floating point state on a SVE system > it programs the vector length in ZCR_EL2.LEN to be whatever the maximum VL > for the PE is but uses a buffer allocated with kvm_host_sve_max_vl, the > maximum VL shared by all PEs in the system. This means that if we run on a > system where the maximum VLs are not consistent we will overflow the buffer > on PEs which support larger VLs. > > Since the host will not currently attempt to make use of non-shared VLs fix > this by explicitly setting the EL2 VL to be the maximum shared VL when we > save and restore. This will enforce the limit on host VL usage. Should we > wish to support asymmetric VLs this code will need to be updated along with > the required changes for the host, patches have previously been posted: > > https://lore.kernel.org/r/20240730-kvm-arm64-fix-pkvm-sve-vl-v6-0-cae8a2e0bd66@kernel.org Thanks for fixing this. One part that you haven't changed is setting ZCR_EL2 during el2 setup: arch/arm64/include/asm/el2_setup.h: .Linit_sve_ : lines 290/291 I guess at that point it's not straightforward to figure sve_max_vl. Is there a window after el2 setup where we might actually get the VL implied by ZCR_ELx_LEN_MASK, or would it always get set to sve_vq_from_vl(kvm_host_sve_max_vl) - 1 ? That said, this passes the sve-stress test now: tested on qemu using nvhe, pKVM non-protected guest, pKVM protected guest (the latest with patches not upstream yet). Assuming that the current code in el2_setup.h is fine as it is: Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Cheers, /fuad > > Fixes: b5b9955617bc ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM") > Signed-off-by: Mark Brown <broonie@kernel.org> > --- > Changes in v2: > - Update all places where we constrain the host VL, not just those where > we save and restore host state. > - The value written to the register is 0 based, not 1 based. > - Link to v1: https://lore.kernel.org/r/20240910-kvm-arm64-limit-guest-vl-v1-1-54df40b95ffb@kernel.org > --- > arch/arm64/kvm/hyp/include/hyp/switch.h | 2 +- > arch/arm64/kvm/hyp/nvhe/hyp-main.c | 12 +++++++----- > 2 files changed, 8 insertions(+), 6 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h > index f59ccfe11ab9..c2cfb4d6dc92 100644 > --- a/arch/arm64/kvm/hyp/include/hyp/switch.h > +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h > @@ -339,7 +339,7 @@ static inline void __hyp_sve_save_host(void) > struct cpu_sve_state *sve_state = *host_data_ptr(sve_state); > > sve_state->zcr_el1 = read_sysreg_el1(SYS_ZCR); > - write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2); > + write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2); > __sve_save_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl), > &sve_state->fpsr, > true); > diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c > index f43d845f3c4e..dd1c6aa907a2 100644 > --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c > +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c > @@ -33,7 +33,7 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu) > */ > sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2); > __sve_save_state(vcpu_sve_pffr(vcpu), &vcpu->arch.ctxt.fp_regs.fpsr, true); > - write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2); > + write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2); > } > > static void __hyp_sve_restore_host(void) > @@ -45,10 +45,11 @@ static void __hyp_sve_restore_host(void) > * the host. The layout of the data when saving the sve state depends > * on the VL, so use a consistent (i.e., the maximum) host VL. > * > - * Setting ZCR_EL2 to ZCR_ELx_LEN_MASK sets the effective length > - * supported by the system (or limited at EL3). > + * Note that this constrains the PE to the maximum shared VL > + * that was discovered, if we wish to use larger VLs this will > + * need to be revisited. > */ > - write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2); > + write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2); > __sve_restore_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl), > &sve_state->fpsr, > true); > @@ -479,7 +480,8 @@ void handle_trap(struct kvm_cpu_context *host_ctxt) > case ESR_ELx_EC_SVE: > cpacr_clear_set(0, CPACR_ELx_ZEN); > isb(); > - sve_cond_update_zcr_vq(ZCR_ELx_LEN_MASK, SYS_ZCR_EL2); > + sve_cond_update_zcr_vq(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, > + SYS_ZCR_EL2); > break; > case ESR_ELx_EC_IABT_LOW: > case ESR_ELx_EC_DABT_LOW: > > --- > base-commit: 7c626ce4bae1ac14f60076d00eafe71af30450ba > change-id: 20240910-kvm-arm64-limit-guest-vl-d5fba0c7cc7b > > Best regards, > -- > Mark Brown <broonie@kernel.org> >
On Thu, Sep 12, 2024 at 03:21:43PM +0100, Fuad Tabba wrote: > One part that you haven't changed is setting ZCR_EL2 during el2 setup: > arch/arm64/include/asm/el2_setup.h: .Linit_sve_ : lines 290/291 > I guess at that point it's not straightforward to figure sve_max_vl. > Is there a window after el2 setup where we might actually get the VL > implied by ZCR_ELx_LEN_MASK, or would it always get set to > sve_vq_from_vl(kvm_host_sve_max_vl) - 1 ? Yeah, at that point we have no idea what any other cores might look like and there's just generally a bootstrapping issue - we need to see the actual maximum VLs for the PEs to work out what the maximum VL for the system is so we can tell KVM what to enforce.
On Thu, 12 Sep 2024 15:45:27 +0100, Mark Brown <broonie@kernel.org> wrote: > > [1 <text/plain; us-ascii (7bit)>] > On Thu, Sep 12, 2024 at 03:21:43PM +0100, Fuad Tabba wrote: > > > One part that you haven't changed is setting ZCR_EL2 during el2 setup: > > arch/arm64/include/asm/el2_setup.h: .Linit_sve_ : lines 290/291 > > > I guess at that point it's not straightforward to figure sve_max_vl. > > Is there a window after el2 setup where we might actually get the VL > > implied by ZCR_ELx_LEN_MASK, or would it always get set to > > sve_vq_from_vl(kvm_host_sve_max_vl) - 1 ? > > Yeah, at that point we have no idea what any other cores might look like > and there's just generally a bootstrapping issue - we need to see the > actual maximum VLs for the PEs to work out what the maximum VL for the > system is so we can tell KVM what to enforce. That's fine. By the time KVM initialises, all the "early" CPUs will be on, and pKVM actively prevents any late onlining of CPUs. The only gotcha would be if "something" snapshots the "large" VL in between two CPU boots, but we really don't expect that sort of things. M. -- Without deviation from the norm, progress is not possible.
© 2016 - 2024 Red Hat, Inc.