KVM: VMX: Fix NMI event loss

[PATCH] KVM: VMX: Fix NMI event loss

Posted by Tianyi Liu 2 years, 5 months ago

Hi, Sean:

I have found that in the latest version of the kernel, some PMU events are
being lost. I bisected and found the breaking commit [1], which
moved the handling of NMI events from `handle_exception_irqoff` to
`vmx_vcpu_enter_exit`.

If I revert this part as done in this patch, it works correctly. However,
I'm not really familiar with KVM, and I'm not sure about the intent behind
the original patch [1]. Could you please take a look at this? Thanks a lot.

My use case is sampling the guest OS instruction pointer (IP) with `perf kvm`:
`perf kvm --guest record -a -g -e instructions -F 10000 -- sleep 1`

When it works correctly, it records about 10000 samples (as requested by
`-F 10000`) and reports:
`[ perf record: Captured and wrote 0.9 MB perf.data.guest (9729 samples) ]`
When it does not, it records only ~100 samples, and sometimes none at all.

In case it's useful for debugging, the call chain of `vmx_vcpu_enter_exit`
(innermost frame first) is:
vmx_vcpu_enter_exit
vmx_vcpu_run
kvm_x86_vcpu_run
vcpu_enter_guest

The call chain of `handle_exception_irqoff` (innermost frame first) is:
handle_exception_irqoff
vmx_handle_exit_irqoff
kvm_x86_handle_exit_irqoff
vcpu_enter_guest

[1] https://lore.kernel.org/all/20221213060912.654668-8-seanjc@google.com/

Signed-off-by: Tianyi Liu <i.pear@outlook.com>
---
 arch/x86/kvm/vmx/vmx.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index df461f387e20..3a0b13867a6b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6955,6 +6955,12 @@ static void handle_exception_irqoff(struct vcpu_vmx *vmx)
 	/* Handle machine checks before interrupts are enabled */
 	else if (is_machine_check(intr_info))
 		kvm_machine_check();
+	/* We need to handle NMIs before interrupts are enabled */
+	else if (is_nmi(intr_info)) {
+		kvm_before_interrupt(&vmx->vcpu, KVM_HANDLING_NMI);
+		vmx_do_nmi_irqoff();
+		kvm_after_interrupt(&vmx->vcpu);
+	}
 }
 
 static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
@@ -7251,13 +7257,6 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	else
 		vmx->exit_reason.full = vmcs_read32(VM_EXIT_REASON);
 
-	if ((u16)vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
-	    is_nmi(vmx_get_intr_info(vcpu))) {
-		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
-		vmx_do_nmi_irqoff();
-		kvm_after_interrupt(vcpu);
-	}
-
 	guest_state_exit_irqoff();
 }
 
-- 
2.41.0

Re: [PATCH] KVM: VMX: Fix NMI event loss

Posted by Sean Christopherson 2 years, 5 months ago

On Mon, Aug 28, 2023, Tianyi Liu wrote:
> Hi, Sean:
> 
> I have found that in the latest version of the kernel, some PMU events are
> being lost. I bisected and found the breaking commit [1], which
> moved the handling of NMI events from `handle_exception_irqoff` to
> `vmx_vcpu_enter_exit`.
> 
> If I revert this part as done in this patch, it works correctly. However,
> I'm not really familiar with KVM, and I'm not sure about the intent behind
> the original patch [1].

FWIW, the goal was to invoke vmx_do_nmi_irqoff() before leaving the "noinstr"
region.  I messed up and forgot that vmx_get_intr_info() relied on metadata being
reset after VM-Exit :-/
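
To illustrate the stale-metadata problem described above, here is a minimal
user-space sketch. It is not kernel code: `struct fake_vcpu`,
`fake_get_intr_info()`, `count_missed_nmis()` and the `FAKE_INTR_INFO_NMI`
encoding are all hypothetical stand-ins, and the model assumes that a lazily
cached field is only re-read from hardware after its "available" flag has been
reset. If the NMI check runs before that reset, it sees the value cached for
the previous exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding for "this exit was an NMI"; not the real VMCS layout. */
#define FAKE_INTR_INFO_NMI 0x1u

/* Toy vCPU: one lazily cached field plus its "available" flag. */
struct fake_vcpu {
	bool     info_available;  /* true while cached_info may be reused */
	uint32_t cached_info;     /* last value read from the "hardware" */
	uint32_t hw_info;         /* value the "hardware" holds for the latest exit */
};

/* Lazy accessor: read the hardware only if the cache is marked unavailable. */
static uint32_t fake_get_intr_info(struct fake_vcpu *v)
{
	assert(v != NULL);
	if (!v->info_available) {
		v->cached_info = v->hw_info;
		v->info_available = true;
	}
	return v->cached_info;
}

/*
 * Replay a fixed sequence of exits and count the NMI exits that the early
 * check fails to notice.
 *   reset_first == false: check, then reset (the check can see a stale cache)
 *   reset_first == true:  reset, then check
 */
static int count_missed_nmis(bool reset_first)
{
	static const uint32_t exits[] = {
		0, FAKE_INTR_INFO_NMI, 0, FAKE_INTR_INFO_NMI, 0
	};
	struct fake_vcpu v = { .info_available = false };
	int missed = 0;

	for (size_t i = 0; i < sizeof(exits) / sizeof(exits[0]); i++) {
		v.hw_info = exits[i];

		if (reset_first)
			v.info_available = false;

		/* Early NMI check, as done right after the exit. */
		bool seen_nmi = fake_get_intr_info(&v) == FAKE_INTR_INFO_NMI;
		if (exits[i] == FAKE_INTR_INFO_NMI && !seen_nmi)
			missed++;

		if (!reset_first)
			v.info_available = false;

		/* Later exit handling reads the field again and refills the cache. */
		(void)fake_get_intr_info(&v);
	}
	return missed;
}
```

In this toy model, `count_missed_nmis(false)` misses both NMI exits because
each check returns the value cached for the preceding exit, while
`count_missed_nmis(true)` misses none. The real code paths are more involved;
the sketch only shows why the ordering of the reset and the check matters.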

> Could you please take a look at this? Thanks a lot.

Please try this patch; it should fix the problem, but I haven't fully tested it
against an affected workload yet.  I'll do that later today.

https://lore.kernel.org/all/20230825014532.2846714-1-seanjc@google.com

Re: [PATCH] KVM: VMX: Fix NMI event loss

Posted by Tianyi Liu 2 years, 5 months ago

On Mon, Aug 28, 2023, Sean Christopherson wrote:
> On Mon, Aug 28, 2023, Tianyi Liu wrote:
> > Hi, Sean:
> > 
> > I have found that in the latest version of the kernel, some PMU events are
> > being lost. I bisected and found the breaking commit [1], which
> > moved the handling of NMI events from `handle_exception_irqoff` to
> > `vmx_vcpu_enter_exit`.
> > 
> > If I revert this part as done in this patch, it works correctly. However,
> > I'm not really familiar with KVM, and I'm not sure about the intent behind
> > the original patch [1].
> 
> FWIW, the goal was to invoke vmx_do_nmi_irqoff() before leaving the "noinstr"
> region.  I messed up and forgot that vmx_get_intr_info() relied on metadata being
> reset after VM-Exit :-/
> 
> > Could you please take a look at this? Thanks a lot.
> 
> Please try this patch; it should fix the problem, but I haven't fully tested it
> against an affected workload yet.  I'll do that later today.
> 
> https://lore.kernel.org/all/20230825014532.2846714-1-seanjc@google.com

This works for me, thanks.