From: Sean Christopherson <seanjc@google.com>

Avoid local retries within the TDX EPT violation handler if a retry is
triggered by faulting in an invalid memslot, which indicates that the
memslot is being removed.

This prevents memslot removal from being blocked while waiting for the
VM-Exit handler to release the SRCU lock.

Opportunistically, export kvm_vcpu_gfn_to_memslot() to allow for per-vCPU
acceleration of the gfn_to_memslot translation.

[Yan: Wrote patch log, comment, fixed a minor error, function export]

Reported-by: Reinette Chatre <reinette.chatre@intel.com>
Closes: https://lore.kernel.org/all/20250519023737.30360-1-yan.y.zhao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 arch/x86/kvm/vmx/tdx.c | 11 +++++++++++
 virt/kvm/kvm_main.c    |  1 +
 2 files changed, 12 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 6784aaaced87..de2c4bb36069 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1992,6 +1992,11 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
 	 * blocked by TDs, false positives are inevitable i.e., KVM may re-enter
 	 * the guest even if the IRQ/NMI can't be delivered.
 	 *
+	 * Break out of the local retries if a retry is caused by faulting in
+	 * an invalid memslot (indicating the slot is being removed), so that
+	 * slot removal will not be blocked waiting for the VM-Exit handler
+	 * to release the SRCU lock.
+	 *
 	 * Note: even without breaking out of local retries, zero-step
 	 * mitigation may still occur due to
 	 * - invoking of TDH.VP.ENTER after KVM_EXIT_MEMORY_FAULT,
@@ -2002,6 +2007,8 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
 	 * handle retries locally in their EPT violation handlers.
 	 */
 	while (1) {
+		struct kvm_memory_slot *slot;
+
 		ret = __vmx_handle_ept_violation(vcpu, gpa, exit_qual);
 
 		if (ret != RET_PF_RETRY || !local_retry)
@@ -2015,6 +2022,10 @@ static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
 			break;
 		}
 
+		slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));
+		if (slot && slot->flags & KVM_MEMSLOT_INVALID)
+			break;
+
 		cond_resched();
 	}
 	return ret;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6c07dd423458..f769d1dccc21 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2661,6 +2661,7 @@ struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn
 
 	return NULL;
 }
+EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot);
 
 bool kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn)
 {
--
2.43.2
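
For context on the "per-vCPU acceleration" mentioned in the log:
kvm_vcpu_gfn_to_memslot() speeds up repeated lookups by caching the
last-used memslot on the vCPU and falling back to a full memslot search on
a miss. A rough sketch of the idea (paraphrased from virt/kvm/kvm_main.c;
details vary across kernel versions):

struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn)
{
	struct kvm_memslots *slots = kvm_vcpu_memslots(vcpu);
	struct kvm_memory_slot *slot;

	/* Drop the cached slot if the memslot generation has changed. */
	if (slots->generation != vcpu->last_used_slot_gen) {
		vcpu->last_used_slot = NULL;
		vcpu->last_used_slot_gen = slots->generation;
	}

	/* Fast path: re-check the slot that served the previous lookup. */
	slot = try_get_memslot(vcpu->last_used_slot, gfn);
	if (slot)
		return slot;

	/* Slow path: full search; cache any hit for the next lookup. */
	slot = search_memslots(slots, gfn, false);
	if (slot) {
		vcpu->last_used_slot = slot;
		return slot;
	}

	return NULL;
}
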
On 8/22/2025 3:05 PM, Yan Zhao wrote:
[...]
> +		slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));
> +		if (slot && slot->flags & KVM_MEMSLOT_INVALID)

The slot couldn't be NULL here, right?

So is the check for slot just there to avoid dereferencing a NULL pointer
in case of a bug?

On Tue, Sep 09, 2025, Binbin Wu wrote:
> On 8/22/2025 3:05 PM, Yan Zhao wrote:
> > [...]
> > +		slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));
> > +		if (slot && slot->flags & KVM_MEMSLOT_INVALID)
> 
> The slot couldn't be NULL here, right?
Uh, hmm.  It could be NULL.  If memslot deletion starts concurrently with
the S-EPT violation, the memslot can be transitioned to INVALID (in
preparation for deletion) before the vCPU re-acquires SRCU after the
VM-Exit.  Memslot deletion can then complete and update kvm->memslots so
that no slot backs the GPA, at which point kvm_vcpu_gfn_to_memslot()
returns NULL.

  vCPU                                  DELETE
  S-EPT Violation
                                        Set KVM_MEMSLOT_INVALID
                                        synchronize_srcu_expedited()
  Acquire SRCU
  __vmx_handle_ept_violation()
  RET_PF_RETRY due to INVALID
                                        Set memslot NULL
  kvm_vcpu_gfn_to_memslot()
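
In code form (a condensed paraphrase of the retry loop, not the verbatim
source), the window above lands right at the new check:

	while (1) {
		ret = __vmx_handle_ept_violation(vcpu, gpa, exit_qual);

		if (ret != RET_PF_RETRY || !local_retry)
			break;

		/*
		 * RET_PF_RETRY may mean the memslot went INVALID, and the
		 * deletion can complete before the lookup below runs, in
		 * which case no memslot backs the gfn and the lookup
		 * returns NULL.  So the NULL check guards against a real
		 * race, not just a hypothetical KVM bug; a NULL slot is
		 * simply retried and should then exit via the
		 * ret != RET_PF_RETRY path, e.g. KVM_EXIT_MEMORY_FAULT.
		 */
		slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));
		if (slot && slot->flags & KVM_MEMSLOT_INVALID)
			break;

		cond_resched();
	}
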
On 9/9/2025 10:18 PM, Sean Christopherson wrote:
> On Tue, Sep 09, 2025, Binbin Wu wrote:
>> On 8/22/2025 3:05 PM, Yan Zhao wrote:
>>> [...]
>>> +		slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));
>>> +		if (slot && slot->flags & KVM_MEMSLOT_INVALID)
>> The slot couldn't be NULL here, right?
> Uh, hmm.  It could be NULL.  If memslot deletion starts concurrently with
> the S-EPT violation, the memslot can be transitioned to INVALID (in
> preparation for deletion) before the vCPU re-acquires SRCU after the
> VM-Exit.  Memslot deletion can then complete and update kvm->memslots so
> that no slot backs the GPA, at which point kvm_vcpu_gfn_to_memslot()
> returns NULL.
>
>   vCPU                                  DELETE
>   S-EPT Violation
>                                         Set KVM_MEMSLOT_INVALID
>                                         synchronize_srcu_expedited()
>   Acquire SRCU
>   __vmx_handle_ept_violation()
>   RET_PF_RETRY due to INVALID
>                                         Set memslot NULL
>   kvm_vcpu_gfn_to_memslot()

Got it, thanks!