[PATCH v2 2/4] KVM: VMX: Handle #MCs on VM-Enter/TD-Enter outside of the fastpath

Posted by Sean Christopherson 1 week, 6 days ago
Handle Machine Checks (#MC) that happen on VM-Enter (VMX or TDX) outside
of KVM's fastpath so that as much host state as possible is re-loaded
before invoking the kernel's #MC handler.  The only requirement is that
KVM invokes the #MC handler before enabling IRQs (and even that could
_probably_ be relaxed to handling #MCs before enabling preemption).
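
For reference, a heavily trimmed sketch of the ordering in the common x86
run loop (loosely based on vcpu_enter_guest(); not the literal code):

	exit_fastpath = kvm_x86_call(vcpu_run)(vcpu, run_flags);

	/*
	 * IRQs are still disabled here, so deferring #MC handling to the
	 * handle_exit_irqoff() hook still meets the "before IRQs are
	 * enabled" requirement.
	 */
	kvm_x86_call(handle_exit_irqoff)(vcpu);

	local_irq_enable();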

Waiting to handle #MCs until "more" host state is loaded hardens KVM
against flaws in the #MC handler, which has historically been quite
brittle. E.g. prior to commit 5567d11c21a1 ("x86/mce: Send #MC singal from
task work"), the #MC code could trigger a schedule() with IRQs and
preemption disabled.  That led to a KVM hack-a-fix in commit 1811d979c716
("x86/kvm: move kvm_load/put_guest_xcr0 into atomic context").

Note, vmx_handle_exit_irqoff() is common to VMX and TDX guests.

Cc: Tony Lindgren <tony.lindgren@linux.intel.com>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Jon Kohler <jon@nutanix.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/tdx.c |  3 ---
 arch/x86/kvm/vmx/vmx.c | 16 +++++++++++-----
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index e6105a527372..2d7a4d52ccfb 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1110,9 +1110,6 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
 	if (unlikely((tdx->vp_enter_ret & TDX_SW_ERROR) == TDX_SW_ERROR))
 		return EXIT_FASTPATH_NONE;
 
-	if (unlikely(vmx_get_exit_reason(vcpu).basic == EXIT_REASON_MCE_DURING_VMENTRY))
-		kvm_machine_check();
-
 	trace_kvm_exit(vcpu, KVM_ISA_VMX);
 
 	if (unlikely(tdx_failed_vmentry(vcpu)))
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index fdcc519348cd..f369c499b2c3 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7062,10 +7062,19 @@ void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
 	if (to_vt(vcpu)->emulation_required)
 		return;
 
-	if (vmx_get_exit_reason(vcpu).basic == EXIT_REASON_EXTERNAL_INTERRUPT)
+	switch (vmx_get_exit_reason(vcpu).basic) {
+	case EXIT_REASON_EXTERNAL_INTERRUPT:
 		handle_external_interrupt_irqoff(vcpu, vmx_get_intr_info(vcpu));
-	else if (vmx_get_exit_reason(vcpu).basic == EXIT_REASON_EXCEPTION_NMI)
+		break;
+	case EXIT_REASON_EXCEPTION_NMI:
 		handle_exception_irqoff(vcpu, vmx_get_intr_info(vcpu));
+		break;
+	case EXIT_REASON_MCE_DURING_VMENTRY:
+		kvm_machine_check();
+		break;
+	default:
+		break;
+	}
 }
 
 /*
@@ -7528,9 +7537,6 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
 	if (unlikely(vmx->fail))
 		return EXIT_FASTPATH_NONE;
 
-	if (unlikely((u16)vmx_get_exit_reason(vcpu).basic == EXIT_REASON_MCE_DURING_VMENTRY))
-		kvm_machine_check();
-
 	trace_kvm_exit(vcpu, KVM_ISA_VMX);
 
 	if (unlikely(vmx_get_exit_reason(vcpu).failed_vmentry))
-- 
2.52.0.rc1.455.g30608eb744-goog
Re: [PATCH v2 2/4] KVM: VMX: Handle #MCs on VM-Enter/TD-Enter outside of the fastpath
Posted by Binbin Wu 1 week, 3 days ago

On 11/19/2025 6:23 AM, Sean Christopherson wrote:
> Handle Machine Checks (#MC) that happen on VM-Enter (VMX or TDX) outside
> of KVM's fastpath so that as much host state as possible is re-loaded
> before invoking the kernel's #MC handler.  The only requirement is that
> KVM invokes the #MC handler before enabling IRQs (and even that could
> _probably_ be relaxed to handling #MCs before enabling preemption).
>
> Waiting to handle #MCs until "more" host state is loaded hardens KVM
> against flaws in the #MC handler, which has historically been quite
> brittle. E.g. prior to commit 5567d11c21a1 ("x86/mce: Send #MC singal from
> task work"), the #MC code could trigger a schedule() with IRQs and
> preemption disabled.  That led to a KVM hack-a-fix in commit 1811d979c716
> ("x86/kvm: move kvm_load/put_guest_xcr0 into atomic context").
>
> Note, vmx_handle_exit_irqoff() is common to VMX and TDX guests.
>
> Cc: Tony Lindgren <tony.lindgren@linux.intel.com>
> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
> Cc: Jon Kohler <jon@nutanix.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/vmx/tdx.c |  3 ---
>   arch/x86/kvm/vmx/vmx.c | 16 +++++++++++-----
>   2 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index e6105a527372..2d7a4d52ccfb 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -1110,9 +1110,6 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
>   	if (unlikely((tdx->vp_enter_ret & TDX_SW_ERROR) == TDX_SW_ERROR))
>   		return EXIT_FASTPATH_NONE;
>   
> -	if (unlikely(vmx_get_exit_reason(vcpu).basic == EXIT_REASON_MCE_DURING_VMENTRY))
> -		kvm_machine_check();
> -
>   	trace_kvm_exit(vcpu, KVM_ISA_VMX);
>   
>   	if (unlikely(tdx_failed_vmentry(vcpu)))
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index fdcc519348cd..f369c499b2c3 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7062,10 +7062,19 @@ void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)

I think the bug in v1 happened because the fact that the function is the common
path for both VMX and TDX was overlooked. Do you think it is worth adding a
comment to note that the function is the common path for both VMX and TDX?
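
E.g. purely as an illustration (not a concrete proposal), something as
simple as:

	/* Note: this is the common exit-irqoff path for VMX and TDX guests. */
	void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)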

Otherwise,
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>


>   	if (to_vt(vcpu)->emulation_required)
>   		return;
>   
> -	if (vmx_get_exit_reason(vcpu).basic == EXIT_REASON_EXTERNAL_INTERRUPT)
> +	switch (vmx_get_exit_reason(vcpu).basic) {
> +	case EXIT_REASON_EXTERNAL_INTERRUPT:
>   		handle_external_interrupt_irqoff(vcpu, vmx_get_intr_info(vcpu));
> -	else if (vmx_get_exit_reason(vcpu).basic == EXIT_REASON_EXCEPTION_NMI)
> +		break;
> +	case EXIT_REASON_EXCEPTION_NMI:
>   		handle_exception_irqoff(vcpu, vmx_get_intr_info(vcpu));
> +		break;
> +	case EXIT_REASON_MCE_DURING_VMENTRY:
> +		kvm_machine_check();
> +		break;
> +	default:
> +		break;
> +	}
>   }
>   
>   /*
> @@ -7528,9 +7537,6 @@ fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
>   	if (unlikely(vmx->fail))
>   		return EXIT_FASTPATH_NONE;
>   
> -	if (unlikely((u16)vmx_get_exit_reason(vcpu).basic == EXIT_REASON_MCE_DURING_VMENTRY))
> -		kvm_machine_check();
> -
>   	trace_kvm_exit(vcpu, KVM_ISA_VMX);
>   
>   	if (unlikely(vmx_get_exit_reason(vcpu).failed_vmentry))
Re: [PATCH v2 2/4] KVM: VMX: Handle #MCs on VM-Enter/TD-Enter outside of the fastpath
Posted by Sean Christopherson 1 week, 3 days ago
On Fri, Nov 21, 2025, Binbin Wu wrote:
> On 11/19/2025 6:23 AM, Sean Christopherson wrote:
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index fdcc519348cd..f369c499b2c3 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -7062,10 +7062,19 @@ void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
> 
> I think the bug in v1 is because the fact that the function is the common path
> for both VMX and TDX is overlooked. Do you think it is worth a comment to
> tell the function is the common path for both VMX and TDX?

Probably not?  Addressing that quirk crossed my mind as well, but there are so
many dependencies and so much interaction between VMX and TDX code that I think
we just need to get used to things and not assume TDX is isolated.  I.e. trying
to add comments and/or tweak names will probably be a game of whack-a-mole, and
will likely cause confusion, e.g. due to some flows being commented and others
not.
Re: [PATCH v2 2/4] KVM: VMX: Handle #MCs on VM-Enter/TD-Enter outside of the fastpath
Posted by Tony Lindgren 1 week, 5 days ago
On Tue, Nov 18, 2025 at 02:23:26PM -0800, Sean Christopherson wrote:
> Handle Machine Checks (#MC) that happen on VM-Enter (VMX or TDX) outside
> of KVM's fastpath so that as much host state as possible is re-loaded
> before invoking the kernel's #MC handler.  The only requirement is that
> KVM invokes the #MC handler before enabling IRQs (and even that could
> _probably_ be relaxed to handling #MCs before enabling preemption).
> 
> Waiting to handle #MCs until "more" host state is loaded hardens KVM
> against flaws in the #MC handler, which has historically been quite
> brittle. E.g. prior to commit 5567d11c21a1 ("x86/mce: Send #MC singal from
> task work"), the #MC code could trigger a schedule() with IRQs and
> preemption disabled.  That led to a KVM hack-a-fix in commit 1811d979c716
> ("x86/kvm: move kvm_load/put_guest_xcr0 into atomic context").
> 
> Note, vmx_handle_exit_irqoff() is common to VMX and TDX guests.

Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>