[RFC PATCH 07/12] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte()

Sean Christopherson posted 12 patches 1 month, 1 week ago
There is a newer version of this series
Posted by Sean Christopherson 1 month, 1 week ago
Return -EIO immediately from tdx_sept_zap_private_spte() if the number of
to-be-added pages underflows, so that the following "KVM_BUG_ON(err, kvm)"
isn't also triggered.  Isolating the check from the "is premap error"
if-statement will also allow adding a lockdep assertion that premap errors
are encountered if and only if slots_lock is held.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/tdx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index ef4ffcad131f..88079e2d45fb 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1773,8 +1773,10 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
 		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
 		tdx_no_vcpus_enter_stop(kvm);
 	}
-	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level) &&
-	    !KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm)) {
+	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level)) {
+		if (KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm))
+			return -EIO;
+
 		atomic64_dec(&kvm_tdx->nr_premapped);
 		return 0;
 	}
-- 
2.51.0.268.g9569e192d0-goog
Re: [RFC PATCH 07/12] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte()
Posted by Ira Weiny 1 month ago
Sean Christopherson wrote:
> Return -EIO immediately from tdx_sept_zap_private_spte() if the number of
> to-be-added pages underflows, so that the following "KVM_BUG_ON(err, kvm)"
> isn't also triggered.  Isolating the check from the "is premap error"
> if-statement will also allow adding a lockdep assertion that premap errors
> are encountered if and only if slots_lock is held.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/tdx.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index ef4ffcad131f..88079e2d45fb 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -1773,8 +1773,10 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
>  		err = tdh_mem_range_block(&kvm_tdx->td, gpa, tdx_level, &entry, &level_state);
>  		tdx_no_vcpus_enter_stop(kvm);
>  	}
> -	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level) &&
> -	    !KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm)) {
> +	if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level)) {
> +		if (KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm))
> +			return -EIO;

Won't this -EIO cause the KVM_BUG_ON() in remove_external_spte() to fire too?

static void remove_external_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
                                 int level)
{
	...
	ret = kvm_x86_call(remove_external_spte)(kvm, gfn, level, old_pfn);
	KVM_BUG_ON(ret, kvm);
}


This patch is better than three KVM_BUG_ON()s, but wouldn't it be better to make
both KVM_BUG_ON()s here internal errors or debug messages?

Something like this:

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 4920ee8ad773..83065f3fe605 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1774,14 +1774,16 @@ static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
                tdx_no_vcpus_enter_stop(kvm);
        }
        if (tdx_is_sept_zap_err_due_to_premap(kvm_tdx, err, entry, level)) {
-               if (KVM_BUG_ON(!atomic64_read(&kvm_tdx->nr_premapped), kvm))
+               if (!atomic64_read(&kvm_tdx->nr_premapped)) {
+                       pr_err("nr_premapped underflow\n");
                        return -EIO;
+               }
 
                atomic64_dec(&kvm_tdx->nr_premapped);
                return 0;
        }
 
-       if (KVM_BUG_ON(err, kvm)) {
+       if (err) {
                pr_tdx_error_2(TDH_MEM_RANGE_BLOCK, err, entry, level_state);
                return -EIO;
        }
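
To make the counting above concrete, here is a minimal user-space model of the
three variants. It is only a sketch: struct model_kvm, model_bug_on() and the
zap_*() helpers are hypothetical stand-ins, not kernel code, and it assumes the
-EIO reaches remove_external_spte() unchanged, as the question above supposes.
It counts how many KVM_BUG_ON() sites see a true condition when a premap-style
error arrives while nr_premapped is already zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EIO 5

/* Hypothetical stand-in for struct kvm: only tracks what the model needs. */
struct model_kvm {
	int nr_bug_hits;	/* KVM_BUG_ON() sites that saw a true condition */
	long nr_premapped;	/* models kvm_tdx->nr_premapped */
};

/* Models KVM_BUG_ON(): records the hit and returns the condition. */
static bool model_bug_on(bool cond, struct model_kvm *kvm)
{
	if (cond)
		kvm->nr_bug_hits++;
	return cond;
}

/* Pre-patch: the underflow check is part of the "is premap error" condition. */
static int zap_before_patch(struct model_kvm *kvm, bool premap_err, int err)
{
	if (premap_err && !model_bug_on(!kvm->nr_premapped, kvm)) {
		kvm->nr_premapped--;
		return 0;
	}
	if (model_bug_on(err != 0, kvm))
		return -MODEL_EIO;
	return 0;
}

/* With this patch: return right after the underflow is detected. */
static int zap_with_patch(struct model_kvm *kvm, bool premap_err, int err)
{
	if (premap_err) {
		if (model_bug_on(!kvm->nr_premapped, kvm))
			return -MODEL_EIO;
		kvm->nr_premapped--;
		return 0;
	}
	if (model_bug_on(err != 0, kvm))
		return -MODEL_EIO;
	return 0;
}

/* Proposed follow-up: plain checks in the callee (pr_err() calls omitted). */
static int zap_proposed(struct model_kvm *kvm, bool premap_err, int err)
{
	if (premap_err) {
		if (!kvm->nr_premapped)
			return -MODEL_EIO;
		kvm->nr_premapped--;
		return 0;
	}
	return err ? -MODEL_EIO : 0;
}

/* Models the caller, remove_external_spte(), which BUGs on any error. */
static int count_bug_hits(int (*zap)(struct model_kvm *, bool, int))
{
	struct model_kvm kvm = { .nr_bug_hits = 0, .nr_premapped = 0 };
	int ret;

	assert(zap != NULL);
	/* Underflow case: premap-style error while nr_premapped is zero. */
	ret = zap(&kvm, true, 1);
	model_bug_on(ret != 0, &kvm);
	return kvm.nr_bug_hits;
}
```

Under those assumptions, count_bug_hits() gives 3 for zap_before_patch, 2 for
zap_with_patch, and 1 for zap_proposed, where the single remaining hit is the
one in the caller.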
Re: [RFC PATCH 07/12] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte()
Posted by Edgecombe, Rick P 1 month ago
On Tue, 2025-08-26 at 17:05 -0700, Sean Christopherson wrote:
> Return -EIO immediately from tdx_sept_zap_private_spte() if the number of
> to-be-added pages underflows, so that the following "KVM_BUG_ON(err, kvm)"
> isn't also triggered.  Isolating the check from the "is premap error"
> if-statement will also allow adding a lockdep assertion that premap errors
> are encountered if and only if slots_lock is held.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---

Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Re: [RFC PATCH 07/12] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte()
Posted by Edgecombe, Rick P 1 month ago
On Wed, 2025-08-27 at 19:19 -0700, Rick Edgecombe wrote:
> On Tue, 2025-08-26 at 17:05 -0700, Sean Christopherson wrote:
> > Return -EIO immediately from tdx_sept_zap_private_spte() if the number of
> > to-be-added pages underflows, so that the following "KVM_BUG_ON(err, kvm)"
> > isn't also triggered.  Isolating the check from the "is premap error"
> > if-statement will also allow adding a lockdep assertion that premap errors
> > are encountered if and only if slots_lock is held.
> > 
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> 
> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

There is actually another KVM_BUG_ON() in the path here:

static void remove_external_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
				 int level)
{
	kvm_pfn_t old_pfn = spte_to_pfn(old_spte);
	int ret;

	/*
	 * External (TDX) SPTEs are limited to PG_LEVEL_4K, and external
	 * PTs are removed in a special order, involving free_external_spt().
	 * But remove_external_spte() will be called on non-leaf PTEs via
	 * __tdp_mmu_zap_root(), so avoid the error the former would return
	 * in this case.
	 */
	if (!is_last_spte(old_spte, level))
		return;

	/* Zapping leaf spte is allowed only when write lock is held. */
	lockdep_assert_held_write(&kvm->mmu_lock);
	/* Because write lock is held, operation should success. */
	ret = kvm_x86_call(remove_external_spte)(kvm, gfn, level, old_pfn);
->	KVM_BUG_ON(ret, kvm);

We don't need to do it in this patch, but we could remove the return value in
.remove_external_spte, and the KVM_BUG_ON(). Just let remove_external_spte
handle it internally.
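
A rough user-space sketch of that shape, just to illustrate the idea. All of
the model_* names are hypothetical and this is not the kernel's actual ops
table; the only point is that the hook returns void, the backend owns the one
KVM_BUG_ON(), and the generic caller has nothing left to check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm. */
struct model_kvm {
	int nr_bug_hits;	/* KVM_BUG_ON() sites that saw a true condition */
	bool backend_fails;	/* test knob: force the backend to hit an error */
};

/* Models KVM_BUG_ON(): records the hit and returns the condition. */
static bool model_bug_on(bool cond, struct model_kvm *kvm)
{
	if (cond)
		kvm->nr_bug_hits++;
	return cond;
}

/* Hypothetical ops table: the hook returns void instead of int. */
struct model_x86_ops {
	void (*remove_external_spte)(struct model_kvm *kvm, unsigned long gfn,
				     int level);
};

/* Hypothetical TDX-side implementation: it reports its own failures. */
static void model_tdx_remove_external_spte(struct model_kvm *kvm,
					   unsigned long gfn, int level)
{
	int err = kvm->backend_fails ? -5 : 0;

	(void)gfn;
	(void)level;
	/* The only KVM_BUG_ON() on this path lives in the backend. */
	model_bug_on(err != 0, kvm);
}

static const struct model_x86_ops model_ops = {
	.remove_external_spte = model_tdx_remove_external_spte,
};

/* Models the generic MMU caller: no return value, so no second check. */
static void model_remove_external_spte(struct model_kvm *kvm,
				       unsigned long gfn, int level)
{
	assert(model_ops.remove_external_spte != NULL);
	model_ops.remove_external_spte(kvm, gfn, level);
}

/* Runs one removal and reports how many KVM_BUG_ON() sites fired. */
static int model_bug_hits(bool backend_fails)
{
	struct model_kvm kvm = { .nr_bug_hits = 0, .backend_fails = backend_fails };

	model_remove_external_spte(&kvm, 0x1000, 1);
	return kvm.nr_bug_hits;
}
```

In this model a failing backend produces exactly one hit and a successful one
produces none, regardless of what the generic caller does.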
Re: [RFC PATCH 07/12] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte()
Posted by Yan Zhao 1 month ago
On Thu, Aug 28, 2025 at 10:50:06PM +0800, Edgecombe, Rick P wrote:
> On Wed, 2025-08-27 at 19:19 -0700, Rick Edgecombe wrote:
> > On Tue, 2025-08-26 at 17:05 -0700, Sean Christopherson wrote:
> > > Return -EIO immediately from tdx_sept_zap_private_spte() if the number of
> > > to-be-added pages underflows, so that the following "KVM_BUG_ON(err, kvm)"
> > > isn't also triggered.  Isolating the check from the "is premap error"
> > > if-statement will also allow adding a lockdep assertion that premap errors
> > > are encountered if and only if slots_lock is held.
> > > 
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > 
> > Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> 
> There is actually another KVM_BUG_ON() in the path here:
> 
> static void remove_external_spte(struct kvm *kvm, gfn_t gfn, u64 old_spte,
> 				 int level)
> {
> 	kvm_pfn_t old_pfn = spte_to_pfn(old_spte);
> 	int ret;
> 
> 	/*
> 	 * External (TDX) SPTEs are limited to PG_LEVEL_4K, and external
> 	 * PTs are removed in a special order, involving free_external_spt().
> 	 * But remove_external_spte() will be called on non-leaf PTEs via
> 	 * __tdp_mmu_zap_root(), so avoid the error the former would return
> 	 * in this case.
> 	 */
> 	if (!is_last_spte(old_spte, level))
> 		return;
> 
> 	/* Zapping leaf spte is allowed only when write lock is held. */
> 	lockdep_assert_held_write(&kvm->mmu_lock);
> 	/* Because write lock is held, operation should success. */
> 	ret = kvm_x86_call(remove_external_spte)(kvm, gfn, level, old_pfn);
> ->	KVM_BUG_ON(ret, kvm);
> 
> We don't need to do it in this patch, but we could remove the return value in
> .remove_external_spte, and the KVM_BUG_ON(). Just let remove_external_spte
> handle it internally.
+1. Triggering the KVM_BUG_ON() only inside the TDX code is better.