[PATCH v2 2/2] KVM: nVMX: Remove explicit filtering of GUEST_INTR_STATUS from shadow VMCS fields

Posted by Sean Christopherson 1 month, 1 week ago
Drop KVM's filtering of GUEST_INTR_STATUS when generating the shadow VMCS
bitmap now that KVM drops GUEST_INTR_STATUS from the set of supported
vmcs12 fields if the field isn't supported by hardware.

Note, there is technically a small functional change here, as the vmcs12
filtering only requires support for Virtual Interrupt Delivery, whereas
the shadow VMCS code being removed required "full" APICv support, i.e.
required Virtual Interrupt Delivery *and* APIC Register Virtualization *and*
Posted Interrupt support.

Opportunistically tweak the comment to more precisely explain why the
PML and VMX preemption timer fields need to be explicitly checked.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/nested.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 9d8f84e3f2da..f50d21a6a2d7 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -112,9 +112,10 @@ static void init_vmcs_shadow_fields(void)
 			  "Update vmcs12_write_any() to drop reserved bits from AR_BYTES");
 
 		/*
-		 * PML and the preemption timer can be emulated, but the
-		 * processor cannot vmwrite to fields that don't exist
-		 * on bare metal.
+		 * KVM emulates PML and the VMX preemption timer irrespective
+		 * of hardware support, but shadowing their related VMCS fields
+		 * requires hardware support as the CPU will reject VMWRITEs to
+		 * fields that don't exist.
 		 */
 		switch (field) {
 		case GUEST_PML_INDEX:
@@ -125,10 +126,6 @@ static void init_vmcs_shadow_fields(void)
 			if (!cpu_has_vmx_preemption_timer())
 				continue;
 			break;
-		case GUEST_INTR_STATUS:
-			if (!cpu_has_vmx_apicv())
-				continue;
-			break;
 		default:
 			break;
 		}
-- 
2.52.0.351.gbe84eed79e-goog
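
Editorial illustration of the functional change noted in the commit message: the old shadow VMCS check required all three APICv features, while the vmcs12 filtering requires only Virtual Interrupt Delivery. This is a minimal standalone sketch, not kernel code. The flag names and both helper functions are hypothetical stand-ins for the real capability checks.

```c
#include <stdbool.h>

/* Hypothetical feature flags standing in for the CPU's VMX capabilities. */
enum {
	FEAT_VIRT_INTR_DELIVERY = 1u << 0,
	FEAT_APIC_REGISTER_VIRT = 1u << 1,
	FEAT_POSTED_INTR        = 1u << 2,
};

/* New condition: the vmcs12 filtering needs only Virtual Interrupt Delivery. */
static bool has_virtual_intr_delivery(unsigned int feats)
{
	return (feats & FEAT_VIRT_INTR_DELIVERY) != 0;
}

/* Old condition: "full" APICv, i.e. all three features must be present. */
static bool has_full_apicv(unsigned int feats)
{
	const unsigned int all = FEAT_VIRT_INTR_DELIVERY |
				 FEAT_APIC_REGISTER_VIRT |
				 FEAT_POSTED_INTR;

	return (feats & all) == all;
}
```

On a hypothetical CPU that reports only FEAT_VIRT_INTR_DELIVERY, has_virtual_intr_delivery() is true while has_full_apicv() is false. That is the configuration in which the old and new conditions for GUEST_INTR_STATUS disagree.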
Re: [PATCH v2 2/2] KVM: nVMX: Remove explicit filtering of GUEST_INTR_STATUS from shadow VMCS fields
Posted by Chao Gao 1 month, 1 week ago
On Tue, Dec 30, 2025 at 02:02:20PM -0800, Sean Christopherson wrote:
>Drop KVM's filtering of GUEST_INTR_STATUS when generating the shadow VMCS
>bitmap now that KVM drops GUEST_INTR_STATUS from the set of supported
>vmcs12 fields if the field isn't supported by hardware.

IIUC, the construction of the shadow VMCS bitmap and fields doesn't reference
"the set of supported vmcs12 fields".

So, with the filtering dropped, copy_shadow_to_vmcs12() and
copy_vmcs12_to_shadow() may access GUEST_INTR_STATUS on unsupported hardware.

Do we need something like this (i.e., don't shadow unsupported vmcs12 fields)?

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f50d21a6a2d7..08433b3713d2 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -127,6 +127,8 @@ static void init_vmcs_shadow_fields(void)
				continue;
			break;
		default:
+			if (!cpu_has_vmcs12_field(field))
+				continue;
			break;
		}

>
>Note, there is technically a small functional change here, as the vmcs12
>filtering only requires support for Virtual Interrupt Delivery, whereas
>the shadow VMCS code being removed required "full" APICv support, i.e.
>required Virtual Interrupt Delivery *and* APIC Register Virtualization *and*
>Posted Interrupt support.
>
>Opportunistically tweak the comment to more precisely explain why the
>PML and VMX preemption timer fields need to be explicitly checked.
>
>Signed-off-by: Sean Christopherson <seanjc@google.com>
Re: [PATCH v2 2/2] KVM: nVMX: Remove explicit filtering of GUEST_INTR_STATUS from shadow VMCS fields
Posted by Sean Christopherson 1 month ago
On Wed, Dec 31, 2025, Chao Gao wrote:
> On Tue, Dec 30, 2025 at 02:02:20PM -0800, Sean Christopherson wrote:
> >Drop KVM's filtering of GUEST_INTR_STATUS when generating the shadow VMCS
> >bitmap now that KVM drops GUEST_INTR_STATUS from the set of supported
> >vmcs12 fields if the field isn't supported by hardware.
> 
> IIUC, the construction of the shadow VMCS bitmap and fields doesn't reference
> "the set of supported vmcs12 fields".

Argh, right you are.  I assumed init_vmcs_shadow_fields() would already verify
the field is a valid vmcs12 field, at least as a sanity check, but it doesn't.

> So, with the filtering dropped, copy_shadow_to_vmcs12() and
> copy_vmcs12_to_shadow() may access GUEST_INTR_STATUS on unsupported hardware.
> 
> Do we need something like this (i.e., don't shadow unsupported vmcs12 fields)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index f50d21a6a2d7..08433b3713d2 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -127,6 +127,8 @@ static void init_vmcs_shadow_fields(void)
> 				continue;
> 			break;
> 		default:
> +			if (!cpu_has_vmcs12_field(field))

This can be

			if (get_vmcs12_field_offset(field) < 0)

And I think I'll put it outside the switch statement, because the requirement
applies to all fields, even those that have additional restrictions.

I also think it makes sense to have patch 1 call nested_vmx_setup_vmcs12_fields()
from nested_vmx_hardware_setup(), so that the ordering and dependency between
configuring vmcs12 fields and shadow VMCS fields can be explicitly documented.
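
Editorial illustration of the restructuring suggested above, with the "is this a valid vmcs12 field" check placed outside the switch so that it applies to every field. This is a minimal standalone sketch, not the actual patch. Every `demo_` and `DEMO_` name is hypothetical, and `demo_vmcs12_field_offset()` only mimics the "negative means unsupported" convention implied by the `get_vmcs12_field_offset(field) < 0` check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical field identifiers; the real encodings live in kernel headers. */
enum demo_field {
	DEMO_GUEST_PML_INDEX,
	DEMO_PREEMPTION_TIMER_VALUE,
	DEMO_GUEST_INTR_STATUS,
	DEMO_GUEST_RIP,
	DEMO_NR_FIELDS,
};

/* Hypothetical snapshot of the hardware capabilities relevant here. */
struct demo_hw {
	bool pml;
	bool preemption_timer;
	bool virtual_intr_delivery;
};

/*
 * Stand-in for the vmcs12 offset lookup: returns a negative value when the
 * field is not a supported vmcs12 field.  In this sketch, only
 * GUEST_INTR_STATUS depends on hardware support.
 */
static int demo_vmcs12_field_offset(const struct demo_hw *hw, enum demo_field f)
{
	if (f == DEMO_GUEST_INTR_STATUS && !hw->virtual_intr_delivery)
		return -1;

	return (int)f * 8;	/* arbitrary fake offset */
}

/* Fills out[] with the fields that may be shadowed and returns their count. */
static size_t demo_init_shadow_fields(const struct demo_hw *hw,
				      enum demo_field *out)
{
	size_t n = 0;

	for (int i = 0; i < DEMO_NR_FIELDS; i++) {
		enum demo_field field = (enum demo_field)i;

		/* Applies to all fields, so it sits outside the switch. */
		if (demo_vmcs12_field_offset(hw, field) < 0)
			continue;

		/* Additional per-field restrictions for emulated features. */
		switch (field) {
		case DEMO_GUEST_PML_INDEX:
			if (!hw->pml)
				continue;
			break;
		case DEMO_PREEMPTION_TIMER_VALUE:
			if (!hw->preemption_timer)
				continue;
			break;
		default:
			break;
		}

		out[n++] = field;
	}

	return n;
}
```

In this sketch the offset lookup reads the hardware state directly. In the real code the lookup is only meaningful once the supported vmcs12 fields have been configured, which is the ordering dependency mentioned above.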
Re: [PATCH v2 2/2] KVM: nVMX: Remove explicit filtering of GUEST_INTR_STATUS from shadow VMCS fields
Posted by Chao Gao 1 month ago
>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index f50d21a6a2d7..08433b3713d2 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -127,6 +127,8 @@ static void init_vmcs_shadow_fields(void)
>> 				continue;
>> 			break;
>> 		default:
>> +			if (!cpu_has_vmcs12_field(field))
>
>This can be
>
>			if (get_vmcs12_field_offset(field) < 0)
>
>And I think I'll put it outside the switch statement, because the requirement
>applies to all fields, even those that have additional restrictions.

Agree.

>
>I also think it makes sense to have patch 1 call nested_vmx_setup_vmcs12_fields()
>from nested_vmx_hardware_setup(), so that the ordering and dependency between
>configuring vmcs12 fields and shadow VMCS fields can be explicitly documented.

Looks good to me.