From: ZhuangYanying <ann.zhuangyanying@huawei.com>
Recently I found that an NMI could not be injected into a VM via the libvirt API.

Steps to reproduce the problem:
1. Use a RHEL 7.3 guest.
2. Disable nmi_watchdog and trigger a spinlock deadlock inside the guest;
   check the running vCPU thread and make sure it is not vcpu0.
3. Inject an NMI into the guest via the libvirt API "inject-nmi".
Result:
The NMI is not injected into the guest.
Reason:
1. The KVM_NMI ioctl issued by QEMU sets nmi_queued to 1, and
   do_inject_external_nmi() sets cpu->kvm_vcpu_dirty to true at the same time.
2. Because cpu->kvm_vcpu_dirty is true, process_nmi() runs before entering
   the guest and sets nmi_queued back to 0 (moving it into nmi_pending).
Normally, the vCPU then calls vcpu_enter_guest() and injects the NMI
successfully. In the problematic scenario, however, a guest thread holds a
lock taken with spin_lock_irqsave() for a long time (for example, it enters
an endless loop after spin_lock_irqsave()), so the other vCPUs go to sleep
under the pvspinlock scheme, and the KVM module loops in vcpu_block()
instead of entering the guest.

Deciding whether to stay in vcpu_block() by checking only nmi_queued is not
sufficient; the NMI should be injected immediately even in this situation.
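The blocking loop described above can be sketched as a minimal model.  The
names mirror the kernel's, but the bodies are reduced to bare control flow;
this is an illustration of the pre-patch behaviour, not the real
implementation:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the vcpu_block() loop: the vCPU only becomes
 * runnable again when kvm_vcpu_has_events() reports a wakeup condition. */
struct vcpu_model {
    int nmi_queued;   /* stands in for atomic vcpu->arch.nmi_queued */
    int nmi_pending;  /* NMIs already moved to the injection stage  */
    bool halted;
};

/* The pre-patch predicate: only nmi_queued is consulted. */
static bool kvm_vcpu_has_events_old(struct vcpu_model *v)
{
    return v->nmi_queued != 0;
}

/* One iteration of the blocking loop: stay halted until an event shows. */
static bool vcpu_block_step(struct vcpu_model *v)
{
    if (kvm_vcpu_has_events_old(v)) {
        v->halted = false;   /* runnable: proceed to vcpu_enter_guest() */
        return true;
    }
    return false;            /* keep waiting */
}
```

Once process_nmi() has drained nmi_queued into nmi_pending, this predicate
never fires, so the loop spins forever.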
Solution:
There are two ways to solve the problem:
1. While injecting the NMI, call cpu_synchronize_state_not_set_dirty()
   instead of cpu_synchronize_state() to avoid clearing nmi_queued.
   However, other work items may also set cpu->kvm_vcpu_dirty, so this
   approach is not recommended.
2. Check nmi_pending in addition to nmi_queued in kvm_vcpu_has_events()
   in the KVM module.
qemu_kvm_wait_io_event
  qemu_wait_io_event_common
    flush_queued_work
      do_inject_external_nmi
        cpu_synchronize_state
          kvm_cpu_synchronize_state
            do_kvm_cpu_synchronize_state
              cpu->kvm_vcpu_dirty = true;  /* triggers process_nmi later */
        kvm_vcpu_ioctl(cpu, KVM_NMI)
          kvm_vcpu_ioctl_nmi
            kvm_inject_nmi
              atomic_inc(&vcpu->arch.nmi_queued);
              /* nmi_queued set to 1 by QEMU's KVM_NMI ioctl */
              kvm_make_request(KVM_REQ_NMI, vcpu);

kvm_cpu_exec
  kvm_arch_put_registers(cpu, KVM_PUT_RUNTIME_STATE);
    kvm_put_vcpu_events
      kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
        kvm_vcpu_ioctl_x86_set_vcpu_events
          process_nmi(vcpu);
            vcpu->arch.nmi_pending +=
                atomic_xchg(&vcpu->arch.nmi_queued, 0);
            /* nmi_queued set back to 0, nmi_pending = 1 */
            kvm_make_request(KVM_REQ_EVENT, vcpu);
  kvm_vcpu_ioctl(cpu, KVM_RUN, 0);
    kvm_arch_vcpu_ioctl_run
      vcpu_run(vcpu);
        kvm_vcpu_running(vcpu)
          /* always false, so vcpu_enter_guest() is never called */
        vcpu_block
          kvm_arch_vcpu_runnable
            kvm_vcpu_has_events
              if (atomic_read(&vcpu->arch.nmi_queued))
              /* nmi_queued is 0, so the vcpu thread blocks forever */
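The sequence traced above can be replayed with a small model.  Plain ints
stand in for the kernel's atomics (safe here because the model is
single-threaded); the two predicates contrast the pre-patch check with the
one this patch proposes:

```c
#include <assert.h>

/* Simplified replay of the KVM_NMI -> KVM_SET_VCPU_EVENTS -> KVM_RUN path. */
struct nmi_state { int nmi_queued; int nmi_pending; };

/* KVM_NMI ioctl: queue the NMI. */
static void kvm_inject_nmi_model(struct nmi_state *s)
{
    s->nmi_queued++;                      /* atomic_inc(&...nmi_queued) */
}

/* KVM_SET_VCPU_EVENTS path: drain nmi_queued into nmi_pending. */
static void process_nmi_model(struct nmi_state *s)
{
    int q = s->nmi_queued;                /* atomic_xchg(&...nmi_queued, 0) */
    s->nmi_queued = 0;
    s->nmi_pending += q;
}

static int has_events_old(struct nmi_state *s)
{
    return s->nmi_queued != 0;            /* pre-patch check */
}

static int has_events_fixed(struct nmi_state *s)
{
    return s->nmi_pending || s->nmi_queued;  /* the patched check */
}
```

After process_nmi_model() the old predicate reports no event even though an
NMI is pending, which is exactly the lost wakeup the patch fixes.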
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
---
arch/x86/kvm/x86.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 02363e3..96983dc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8394,7 +8394,8 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
if (vcpu->arch.pv.pv_unhalted)
return true;
- if (atomic_read(&vcpu->arch.nmi_queued))
+ if (vcpu->arch.nmi_pending ||
+ atomic_read(&vcpu->arch.nmi_queued))
return true;
if (kvm_test_request(KVM_REQ_SMI, vcpu))
--
1.8.3.1
Please use tags in patches.  We usually begin the subject with "KVM: x86:"
when touching arch/x86/kvm/x86.c.

2017-05-24 13:48+0800, Zhuangyanying:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> @@ -8394,7 +8394,8 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
> 	if (vcpu->arch.pv.pv_unhalted)
> 		return true;
>
> -	if (atomic_read(&vcpu->arch.nmi_queued))
> +	if (vcpu->arch.nmi_pending ||
> +	    atomic_read(&vcpu->arch.nmi_queued))
> 		return true;

Hm, I think we've been missing '&& kvm_x86_ops->nmi_allowed(vcpu)'.

The undesired resume if we have suppressed NMI is not making it much worse,
but wouldn't "kvm_test_request(KVM_REQ_NMI, vcpu)" also work here?

> 	if (kvm_test_request(KVM_REQ_SMI, vcpu))

Thanks.
> -----Original Message-----
> From: Radim Krčmář [mailto:rkrcmar@redhat.com]
> Sent: Wednesday, May 24, 2017 10:34 PM
> To: Zhuangyanying
> Cc: pbonzini@redhat.com; Herongguang (Stephen); qemu-devel@nongnu.org;
> Gonglei (Arei); Zhangbo (Oscar); kvm@vger.kernel.org
> Subject: Re: [PATCH] Fix nmi injection failure when vcpu got blocked
>
> Please use tags in patches.
> We usually begin the subject with "KVM: x86:" when touching
> arch/x86/kvm/x86.c.

Sorry, I will add that in patch v2.

> 2017-05-24 13:48+0800, Zhuangyanying:
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > @@ -8394,7 +8394,8 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)
> > 	if (vcpu->arch.pv.pv_unhalted)
> > 		return true;
> >
> > -	if (atomic_read(&vcpu->arch.nmi_queued))
> > +	if (vcpu->arch.nmi_pending ||
> > +	    atomic_read(&vcpu->arch.nmi_queued))
> > 		return true;
>
> Hm, I think we've been missing '&& kvm_x86_ops->nmi_allowed(vcpu)'.

Yes, we have been missing it; I will add it in patch v2.

> The undesired resume if we have suppressed NMI is not making it much worse,
> but wouldn't "kvm_test_request(KVM_REQ_NMI, vcpu)" also work here?
>
> > 	if (kvm_test_request(KVM_REQ_SMI, vcpu))
>
> Thanks.

"kvm_test_request(KVM_REQ_NMI, vcpu)" works fine for me.
On 24/05/2017 16:34, Radim Krčmář wrote:
>> -	if (atomic_read(&vcpu->arch.nmi_queued))
>> +	if (vcpu->arch.nmi_pending ||
>> +	    atomic_read(&vcpu->arch.nmi_queued))
>> 		return true;
>
> Hm, I think we've been missing '&& kvm_x86_ops->nmi_allowed(vcpu)'.
>
> The undesired resume if we have suppressed NMI is not making it much
> worse, but wouldn't "kvm_test_request(KVM_REQ_NMI, vcpu)" also work
> here?

Yes, it would be fine (maybe better, considering that we have a
KVM_REQ_SMI check just below).  Ying, please use it for v2.

Thanks,

Paolo
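The alternative Radim and Paolo suggest for v2 relies on the fact that
kvm_inject_nmi() also raises the KVM_REQ_NMI request bit, and that bit is
not cleared by the KVM_SET_VCPU_EVENTS path, so testing it in
kvm_vcpu_has_events() covers both nmi_queued and nmi_pending.  A hedged
sketch of the request-bit mechanics (the bit value below is illustrative,
not the kernel's real layout):

```c
#include <assert.h>

#define KVM_REQ_NMI_BIT  (1u << 9)   /* illustrative bit, not the real value */

/* Minimal model of a vCPU's request bitmap plus the NMI counters. */
struct req_model {
    unsigned long requests;
    int nmi_queued;
    int nmi_pending;
};

/* kvm_make_request(): set a request bit. */
static void kvm_make_request_model(struct req_model *v, unsigned long req)
{
    v->requests |= req;
}

/* kvm_test_request(): test a request bit without clearing it. */
static int kvm_test_request_model(struct req_model *v, unsigned long req)
{
    return (v->requests & req) != 0;
}
```

Because the request bit survives process_nmi(), a
kvm_test_request(KVM_REQ_NMI, vcpu) check in kvm_vcpu_has_events() wakes
the vCPU regardless of which counter currently holds the NMI, which is why
it is arguably cleaner than testing nmi_pending directly.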