[PATCH v3 21/27] KVM: VMX: Invoke vmx_set_cpu_caps() before nested setup

Xin Li (Intel) posted 27 patches 1 year, 4 months ago
There is a newer version of this series
Posted by Xin Li (Intel) 1 year, 4 months ago
From: Xin Li <xin3.li@intel.com>

Set VMX CPU capabilities before initializing nested instead of after,
as nested setup needs to check VMX CPU capabilities to set up the VMX
basic MSR for nested.

Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ef807194ccbd..522ee27a4655 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8774,6 +8774,12 @@ __init int vmx_hardware_setup(void)
 
 	setup_default_sgx_lepubkeyhash();
 
+	/*
+	 * VMX CPU capabilities are required to setup the VMX basic MSR for
+	 * nested, so this must be done before nested_vmx_setup_ctls_msrs().
+	 */
+	vmx_set_cpu_caps();
+
 	if (nested) {
 		nested_vmx_setup_ctls_msrs(&vmcs_config, vmx_capability.ept);
 
@@ -8782,8 +8788,6 @@ __init int vmx_hardware_setup(void)
 			return r;
 	}
 
-	vmx_set_cpu_caps();
-
 	r = alloc_kvm_area();
 	if (r && nested)
 		nested_vmx_hardware_unsetup();
-- 
2.46.2
Re: [PATCH v3 21/27] KVM: VMX: Invoke vmx_set_cpu_caps() before nested setup
Posted by Chao Gao 1 year, 3 months ago
On Mon, Sep 30, 2024 at 10:01:04PM -0700, Xin Li (Intel) wrote:
>From: Xin Li <xin3.li@intel.com>
>
>Set VMX CPU capabilities before initializing nested instead of after,
>as it needs to check VMX CPU capabilities to setup the VMX basic MSR
>for nested.

Which VMX CPU capabilities are needed? After reading patch 25, I still
don't get that.

>
>Signed-off-by: Xin Li <xin3.li@intel.com>
>Signed-off-by: Xin Li (Intel) <xin@zytor.com>
>Tested-by: Shan Kang <shan.kang@intel.com>
>---
> arch/x86/kvm/vmx/vmx.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
>diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>index ef807194ccbd..522ee27a4655 100644
>--- a/arch/x86/kvm/vmx/vmx.c
>+++ b/arch/x86/kvm/vmx/vmx.c
>@@ -8774,6 +8774,12 @@ __init int vmx_hardware_setup(void)
> 
> 	setup_default_sgx_lepubkeyhash();
> 
>+	/*
>+	 * VMX CPU capabilities are required to setup the VMX basic MSR for
>+	 * nested, so this must be done before nested_vmx_setup_ctls_msrs().
>+	 */
>+	vmx_set_cpu_caps();
>+
> 	if (nested) {
> 		nested_vmx_setup_ctls_msrs(&vmcs_config, vmx_capability.ept);
> 
>@@ -8782,8 +8788,6 @@ __init int vmx_hardware_setup(void)
> 			return r;
> 	}
> 
>-	vmx_set_cpu_caps();
>-
> 	r = alloc_kvm_area();
> 	if (r && nested)
> 		nested_vmx_hardware_unsetup();
>-- 
>2.46.2
>
>
Re: [PATCH v3 21/27] KVM: VMX: Invoke vmx_set_cpu_caps() before nested setup
Posted by Xin Li 1 year, 3 months ago
On 10/24/2024 12:49 AM, Chao Gao wrote:
> On Mon, Sep 30, 2024 at 10:01:04PM -0700, Xin Li (Intel) wrote:
>> From: Xin Li <xin3.li@intel.com>
>>
>> Set VMX CPU capabilities before initializing nested instead of after,
>> as it needs to check VMX CPU capabilities to setup the VMX basic MSR
>> for nested.
> 
> Which VMX CPU capabilities are needed? after reading patch 25, I still
> don't get that.

Sigh, in v2 I had 'if (kvm_cpu_cap_has(X86_FEATURE_FRED))' in
nested_vmx_setup_basic(), which was changed to 'if (cpu_has_vmx_fred())'
in v3.  So the reason for the change is gone.  But I think logically
the change is still needed; nested setup should come after VMX setup.
Re: [PATCH v3 21/27] KVM: VMX: Invoke vmx_set_cpu_caps() before nested setup
Posted by Sean Christopherson 11 months, 2 weeks ago
On Fri, Oct 25, 2024, Xin Li wrote:
> On 10/24/2024 12:49 AM, Chao Gao wrote:
> > On Mon, Sep 30, 2024 at 10:01:04PM -0700, Xin Li (Intel) wrote:
> > > From: Xin Li <xin3.li@intel.com>
> > > 
> > > Set VMX CPU capabilities before initializing nested instead of after,
> > > as it needs to check VMX CPU capabilities to setup the VMX basic MSR
> > > for nested.
> > 
> > Which VMX CPU capabilities are needed? after reading patch 25, I still
> > don't get that.

Heh, I had the same question.  I was worried this was fixing a bug.

> Sigh, in v2 I had 'if (kvm_cpu_cap_has(X86_FEATURE_FRED))' in
> nested_vmx_setup_basic(), which is changed to 'if (cpu_has_vmx_fred())'
> in v3.  So the reason for the change is gone.  But I think logically
> the change is still needed; nested setup should be after VMX setup.

Hmm, no, I don't think we want to allow nested_vmx_setup_ctls_msrs() to consume
any "output" from vmx_set_cpu_caps().  vmx_set_cpu_caps() is called only on the
CPU that loads kvm-intel.ko, whereas nested_vmx_setup_ctls_msrs() is called on
all CPUs to check for consistency between CPUs.

And thinking more about the relevant flows, there's a flaw with kvm_cpu_caps and
vendor module reload.  KVM zeroes kvm_cpu_caps during init, but not until
kvm_set_cpu_caps() is called, i.e. quite some time after KVM has started doing
setup.  If KVM had a bug where it checked a feature before kvm_set_cpu_caps()
was called, the bug could potentially go unnoticed until just the "right"
combination of hardware, module params, and/or Kconfig exposed
semi-uninitialized data.

I'll post the below (assuming it actually works) to guard against that.  Ideally,
kvm_cpu_cap_get() would WARN if it's used before caps are finalized, but I don't
think the extra protection would be worth the increase in code footprint.

--
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 97a90689a9dc..8fd48119bd41 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -817,7 +817,8 @@ do {                                                                        \
 
 void kvm_set_cpu_caps(void)
 {
-       memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
+       WARN_ON_ONCE(!bitmap_empty((void *)kvm_cpu_caps,
+                                  sizeof(kvm_cpu_caps) * BITS_PER_BYTE));
 
        BUILD_BUG_ON(sizeof(kvm_cpu_caps) - (NKVMCAPINTS * sizeof(*kvm_cpu_caps)) >
                     sizeof(boot_cpu_data.x86_capability));
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f5685f153e08..075a07412893 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9737,6 +9737,7 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
        }
 
        memset(&kvm_caps, 0, sizeof(kvm_caps));
+       memset(kvm_cpu_caps, 0, sizeof(kvm_cpu_caps));
 
        x86_emulator_cache = kvm_alloc_emulator_cache();
        if (!x86_emulator_cache) {