WARN if KVM attempts to allocate a shadow VMCS for vmcs02. KVM emulates
VMCS shadowing but doesn't virtualize it, i.e. KVM should never allocate
a "real" shadow VMCS for L2.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/nested.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f235f77cbc03..92ee0d821a06 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -4851,18 +4851,20 @@ static struct vmcs *alloc_shadow_vmcs(struct kvm_vcpu *vcpu)
 	struct loaded_vmcs *loaded_vmcs = vmx->loaded_vmcs;
 
 	/*
-	 * We should allocate a shadow vmcs for vmcs01 only when L1
-	 * executes VMXON and free it when L1 executes VMXOFF.
-	 * As it is invalid to execute VMXON twice, we shouldn't reach
-	 * here when vmcs01 already have an allocated shadow vmcs.
+	 * KVM allocates a shadow VMCS only when L1 executes VMXON and frees it
+	 * when L1 executes VMXOFF or the vCPU is forced out of nested
+	 * operation. VMXON faults if the CPU is already post-VMXON, so it
+	 * should be impossible to already have an allocated shadow VMCS. KVM
+	 * doesn't support virtualization of VMCS shadowing, so vmcs01 should
+	 * always be the loaded VMCS.
 	 */
-	WARN_ON(loaded_vmcs == &vmx->vmcs01 && loaded_vmcs->shadow_vmcs);
+	if (WARN_ON(loaded_vmcs != &vmx->vmcs01 || loaded_vmcs->shadow_vmcs))
+		return loaded_vmcs->shadow_vmcs;
+
+	loaded_vmcs->shadow_vmcs = alloc_vmcs(true);
+	if (loaded_vmcs->shadow_vmcs)
+		vmcs_clear(loaded_vmcs->shadow_vmcs);
 
-	if (!loaded_vmcs->shadow_vmcs) {
-		loaded_vmcs->shadow_vmcs = alloc_vmcs(true);
-		if (loaded_vmcs->shadow_vmcs)
-			vmcs_clear(loaded_vmcs->shadow_vmcs);
-	}
 	return loaded_vmcs->shadow_vmcs;
 }
base-commit: edb9e50dbe18394d0fc9d0494f5b6046fc912d33
--
2.35.0.rc0.227.g00780c9af4-goog
Sean Christopherson <seanjc@google.com> writes:
> WARN if KVM attempts to allocate a shadow VMCS for vmcs02. KVM emulates
> VMCS shadowing but doesn't virtualize it, i.e. KVM should never allocate
> a "real" shadow VMCS for L2.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/vmx/nested.c | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index f235f77cbc03..92ee0d821a06 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -4851,18 +4851,20 @@ static struct vmcs *alloc_shadow_vmcs(struct kvm_vcpu *vcpu)
>  	struct loaded_vmcs *loaded_vmcs = vmx->loaded_vmcs;
>
>  	/*
> -	 * We should allocate a shadow vmcs for vmcs01 only when L1
> -	 * executes VMXON and free it when L1 executes VMXOFF.
> -	 * As it is invalid to execute VMXON twice, we shouldn't reach
> -	 * here when vmcs01 already have an allocated shadow vmcs.
> +	 * KVM allocates a shadow VMCS only when L1 executes VMXON and frees it
> +	 * when L1 executes VMXOFF or the vCPU is forced out of nested
> +	 * operation. VMXON faults if the CPU is already post-VMXON, so it
> +	 * should be impossible to already have an allocated shadow VMCS. KVM
> +	 * doesn't support virtualization of VMCS shadowing, so vmcs01 should
> +	 * always be the loaded VMCS.
>  	 */
> -	WARN_ON(loaded_vmcs == &vmx->vmcs01 && loaded_vmcs->shadow_vmcs);
> +	if (WARN_ON(loaded_vmcs != &vmx->vmcs01 || loaded_vmcs->shadow_vmcs))
> +		return loaded_vmcs->shadow_vmcs;
Stupid question: why do we want to care about 'loaded_vmcs' at all,
i.e. why can't we hardcode 'vmx->vmcs01' in alloc_shadow_vmcs()? The
only caller is enter_vmx_operation() and AFAIU 'loaded_vmcs' will always
be pointing to 'vmx->vmcs01' (as enter_vmx_operation() allocates
&vmx->nested.vmcs02 so 'loaded_vmcs' can't point there!).
> +
> +	loaded_vmcs->shadow_vmcs = alloc_vmcs(true);
> +	if (loaded_vmcs->shadow_vmcs)
> +		vmcs_clear(loaded_vmcs->shadow_vmcs);
>
> -	if (!loaded_vmcs->shadow_vmcs) {
> -		loaded_vmcs->shadow_vmcs = alloc_vmcs(true);
> -		if (loaded_vmcs->shadow_vmcs)
> -			vmcs_clear(loaded_vmcs->shadow_vmcs);
> -	}
>  	return loaded_vmcs->shadow_vmcs;
>  }
>
>
> base-commit: edb9e50dbe18394d0fc9d0494f5b6046fc912d33
--
Vitaly
On 1/26/22 16:56, Vitaly Kuznetsov wrote:
>> -	WARN_ON(loaded_vmcs == &vmx->vmcs01 && loaded_vmcs->shadow_vmcs);
>> +	if (WARN_ON(loaded_vmcs != &vmx->vmcs01 || loaded_vmcs->shadow_vmcs))
>> +		return loaded_vmcs->shadow_vmcs;
> Stupid question: why do we want to care about 'loaded_vmcs' at all,
> i.e. why can't we hardcode 'vmx->vmcs01' in alloc_shadow_vmcs()? The
> only caller is enter_vmx_operation() and AFAIU 'loaded_vmcs' will always
> be pointing to 'vmx->vmcs01' (as enter_vmx_operation() allocates
> &vmx->nested.vmcs02 so 'loaded_vmcs' can't point there!).
>

Well, that's why the WARN never happens. The idea is that if shadow
VMCS _virtualization_ (not emulation, i.e. running L2 VMREAD/VMWRITE
without even a vmexit to L0) was supported, then you would need a
non-NULL shadow_vmcs in vmx->vmcs02.

Regarding the patch, the old WARN was messy but it was also trying to
avoid a NULL pointer dereference in the caller.

What about:

	if (WARN_ON(loaded_vmcs->shadow_vmcs))
		return loaded_vmcs->shadow_vmcs;

	/* Go ahead anyway. */
	WARN_ON(loaded_vmcs != &vmx->vmcs01);

?

Paolo
Paolo Bonzini <pbonzini@redhat.com> writes:
> On 1/26/22 16:56, Vitaly Kuznetsov wrote:
>>> -	WARN_ON(loaded_vmcs == &vmx->vmcs01 && loaded_vmcs->shadow_vmcs);
>>> +	if (WARN_ON(loaded_vmcs != &vmx->vmcs01 || loaded_vmcs->shadow_vmcs))
>>> +		return loaded_vmcs->shadow_vmcs;
>> Stupid question: why do we want to care about 'loaded_vmcs' at all,
>> i.e. why can't we hardcode 'vmx->vmcs01' in alloc_shadow_vmcs()? The
>> only caller is enter_vmx_operation() and AFAIU 'loaded_vmcs' will always
>> be pointing to 'vmx->vmcs01' (as enter_vmx_operation() allocates
>> &vmx->nested.vmcs02 so 'loaded_vmcs' can't point there!).
>>
>
> Well, that's why the WARN never happens. The idea is that if shadow
> VMCS _virtualization_ (not emulation, i.e. running L2 VMREAD/VMWRITE
> without even a vmexit to L0) was supported, then you would need a
> non-NULL shadow_vmcs in vmx->vmcs02.
>
> Regarding the patch, the old WARN was messy but it was also trying to
> avoid a NULL pointer dereference in the caller.
>
> What about:
>
> 	if (WARN_ON(loaded_vmcs->shadow_vmcs))
> 		return loaded_vmcs->shadow_vmcs;
>
> 	/* Go ahead anyway. */
> 	WARN_ON(loaded_vmcs != &vmx->vmcs01);
>
> ?
>

FWIW, this looks better [to my personal taste].
--
Vitaly
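
For readers following along, here is a rough user-space sketch of how alloc_shadow_vmcs() might look with Paolo's two-WARN suggestion folded in. It is not the kernel code and not necessarily what was merged: the struct layouts, warn_count, warn_on(), and the bodies of alloc_vmcs() and vmcs_clear() are hypothetical stand-ins so the control flow compiles outside the kernel, and the function takes a struct vcpu_vmx pointer directly instead of a struct kvm_vcpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct vmcs { int cleared; };

struct loaded_vmcs { struct vmcs *shadow_vmcs; };

struct vcpu_vmx {
	struct loaded_vmcs vmcs01;
	struct loaded_vmcs vmcs02;
	struct loaded_vmcs *loaded_vmcs;
};

/* Counts fired warnings so the behavior can be observed (assumption). */
static int warn_count;

/* Simplified WARN_ON: log, count, and hand back the condition's truth value. */
static bool warn_on(bool cond, const char *expr)
{
	if (cond) {
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}
#define WARN_ON(cond) warn_on(!!(cond), #cond)

/* Mock allocator and VMCLEAR; the real helpers touch hardware state. */
static struct vmcs *alloc_vmcs(bool shadow)
{
	(void)shadow;
	return calloc(1, sizeof(struct vmcs));
}

static void vmcs_clear(struct vmcs *vmcs)
{
	vmcs->cleared = 1;
}

static struct vmcs *alloc_shadow_vmcs(struct vcpu_vmx *vmx)
{
	struct loaded_vmcs *loaded_vmcs = vmx->loaded_vmcs;

	/* A shadow VMCS already exists: warn, but return it so the caller
	 * never sees NULL and nothing is leaked. */
	if (WARN_ON(loaded_vmcs->shadow_vmcs))
		return loaded_vmcs->shadow_vmcs;

	/* Unexpected loaded VMCS: warn, but go ahead anyway. */
	WARN_ON(loaded_vmcs != &vmx->vmcs01);

	loaded_vmcs->shadow_vmcs = alloc_vmcs(true);
	if (loaded_vmcs->shadow_vmcs)
		vmcs_clear(loaded_vmcs->shadow_vmcs);

	return loaded_vmcs->shadow_vmcs;
}
```

Compared with the single combined WARN in the posted patch, this split still allocates when the loaded VMCS is unexpectedly not vmcs01, so the caller gets a usable pointer instead of NULL in that case.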