From: Sean Christopherson <seanjc@google.com>
Avoid local retries within the TDX EPT violation handler if a retry is
triggered by faulting in an invalid memslot, i.e. a memslot that is
undergoing removal.
This prevents memslot removal from being blocked while it waits for the
VM-Exit handler to release the SRCU read lock.
Opportunistically, export kvm_vcpu_gfn_to_memslot() to allow for per-vCPU
acceleration of the gfn_to_memslot translation.
[Yan: Wrote patch log and comment, fixed a minor error, exported the function]
Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Closes: https://lore.kernel.org/all/20250519023737.30360-1-yan.y.zhao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
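For context, the blocking scenario involves the deletion side of memslot
updates, sketched below.  This is a simplified illustration of the flow in
virt/kvm/kvm_main.c, not the verbatim code: only synchronize_srcu_expedited(),
kvm->srcu, and KVM_MEMSLOT_INVALID are taken from the actual implementation;
the function name and structure are illustrative.

	/*
	 * Sketch: memslot deletion first publishes the slot as invalid,
	 * then waits for all SRCU readers to drain before the slot can
	 * actually be removed and freed.
	 */
	static void delete_memslot_sketch(struct kvm *kvm,
					  struct kvm_memory_slot *slot)
	{
		/* Publish the slot as invalid; new faults on it fail fast. */
		slot->flags |= KVM_MEMSLOT_INVALID;

		/*
		 * Wait for every vCPU currently inside an SRCU read-side
		 * critical section (e.g. a VM-Exit handler) to release
		 * kvm->srcu.  A vCPU spinning in a local fault-retry loop
		 * without dropping SRCU stalls this step indefinitely,
		 * which is the problem this patch fixes.
		 */
		synchronize_srcu_expedited(&kvm->srcu);

		/* Remove the slot for real; subsequent lookups see NULL. */
		/* ... actual removal and freeing ... */
	}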
arch/x86/kvm/vmx/tdx.c | 11 +++++++++++
virt/kvm/kvm_main.c | 1 +
2 files changed, 12 insertions(+)
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 6784aaaced87..de2c4bb36069 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1992,6 +1992,11 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
* blocked by TDs, false positives are inevitable i.e., KVM may re-enter
* the guest even if the IRQ/NMI can't be delivered.
*
+ * Break out of the local retries if a retry is caused by faulting
+ * in an invalid memslot (indicating the slot is under removal), so
+ * that the slot removal is not blocked waiting for the VM-Exit
+ * handler to release the SRCU lock.
+ *
* Note: even without breaking out of local retries, zero-step
* mitigation may still occur due to
* - invoking of TDH.VP.ENTER after KVM_EXIT_MEMORY_FAULT,
@@ -2002,6 +2007,8 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
* handle retries locally in their EPT violation handlers.
*/
while (1) {
+ struct kvm_memory_slot *slot;
+
ret = __vmx_handle_ept_violation(vcpu, gpa, exit_qual);
if (ret != RET_PF_RETRY || !local_retry)
@@ -2015,6 +2022,10 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
break;
}
+ slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));
+ if (slot && slot->flags & KVM_MEMSLOT_INVALID)
+ break;
+
cond_resched();
}
return ret;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6c07dd423458..f769d1dccc21 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2661,6 +2661,7 @@ struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn
return NULL;
}
+EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot);
bool kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn)
{
--
2.43.2
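A note on the export: kvm_vcpu_gfn_to_memslot() is used here instead of
kvm_gfn_to_memslot() because the per-vCPU variant can consult the vCPU's
last-used-slot cache before falling back to a full memslot search.  The
sketch below paraphrases the shape of that lookup; the helper names
try_get_memslot() and search_memslots() follow kvm_main.c, but details
(e.g. generation checks) vary by kernel version.

	/* Sketch of the per-vCPU lookup: fast path via the cached slot. */
	struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu,
							gfn_t gfn)
	{
		struct kvm_memslots *slots = kvm_vcpu_memslots(vcpu);
		struct kvm_memory_slot *slot;

		/* Most faults hit the same slot as the previous fault. */
		slot = try_get_memslot(vcpu->last_used_slot, gfn);
		if (slot)
			return slot;

		/* Slow path: search the active memslot set. */
		slot = search_memslots(slots, gfn, false);
		if (slot) {
			vcpu->last_used_slot = slot;
			return slot;
		}

		return NULL;
	}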
On 8/22/2025 3:05 PM, Yan Zhao wrote:
> From: Sean Christopherson <seanjc@google.com>
>
> Avoid local retries within the TDX EPT violation handler if a retry is
> triggered by faulting in an invalid memslot, i.e. a memslot that is
> undergoing removal.
[...]
> @@ -2015,6 +2022,10 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
>  		break;
>  	}
>
> +	slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));
> +	if (slot && slot->flags & KVM_MEMSLOT_INVALID)

The slot couldn't be NULL here, right? So is the check for slot just to
avoid dereferencing a NULL pointer in case of a bug?

> +		break;
> +
>  	cond_resched();
>  }
On Tue, Sep 09, 2025, Binbin Wu wrote:
> On 8/22/2025 3:05 PM, Yan Zhao wrote:
> > +	slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));
> > +	if (slot && slot->flags & KVM_MEMSLOT_INVALID)
>
> The slot couldn't be NULL here, right?

Uh, hmm.  It could be NULL.  If the memslot deletion starts concurrently
with the S-EPT violation, then the memslot could be transitioned to
INVALID (prepared for deletion) prior to the vCPU acquiring SRCU after
the VM-Exit.  Memslot deletion could then update kvm->memslots with a
NULL memslot.

  vCPU                                 DELETE
  S-EPT Violation
                                       Set KVM_MEMSLOT_INVALID
                                       synchronize_srcu_expedited()
  Acquire SRCU
  __vmx_handle_ept_violation()
  RET_PF_RETRY due to INVALID
                                       Set memslot NULL
  kvm_vcpu_gfn_to_memslot()
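To connect this back to the check the patch adds: two observations are
possible while deletion is in flight, annotated below.  This is the patch's
check with explanatory comments added, not new logic, and the behavior
described for the NULL case is a reading of the surrounding fault path
rather than something this patch changes.

	slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));

	/*
	 * While deletion is in flight, the lookup can observe either:
	 *  - a non-NULL slot with KVM_MEMSLOT_INVALID set: deletion has
	 *    started but not completed, so stop retrying locally and let
	 *    this vCPU drop kvm->srcu, unblocking
	 *    synchronize_srcu_expedited() on the deletion side;
	 *  - a NULL slot: deletion already removed the memslot.  The
	 *    NULL check avoids dereferencing a stale pointer; the loop
	 *    then continues, and a subsequent fault on a slot-less GPA
	 *    is expected to resolve outside this loop (e.g. by exiting
	 *    to userspace) rather than returning RET_PF_RETRY again.
	 */
	if (slot && slot->flags & KVM_MEMSLOT_INVALID)
		break;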
On 9/9/2025 10:18 PM, Sean Christopherson wrote:
> On Tue, Sep 09, 2025, Binbin Wu wrote:
> > The slot couldn't be NULL here, right?
>
> Uh, hmm.  It could be NULL.  If the memslot deletion starts concurrently
> with the S-EPT violation, then the memslot could be transitioned to
> INVALID (prepared for deletion) prior to the vCPU acquiring SRCU after
> the VM-Exit.  Memslot deletion could then update kvm->memslots with a
> NULL memslot.

Got it, thanks!