[PATCH] KVM: x86: Don't read guest CR3 in async pf flow when guest state is protected

Posted by Xiaoyao Li 5 days, 6 hours ago
Don't read guest CR3 when setting up the async pf task and skip comparing
the CR3 value in kvm_arch_async_page_ready() when guest state is protected.

When KVM tries to perform the host-only async page fault for the shared
memory of TDX guests, the following WARNING is triggered:

  WARNING: CPU: 1 PID: 90922 at arch/x86/kvm/vmx/main.c:483 vt_cache_reg+0x16/0x20
  Call Trace:
  __kvm_mmu_faultin_pfn
  kvm_mmu_faultin_pfn
  kvm_tdp_page_fault
  kvm_mmu_do_page_fault
  kvm_mmu_page_fault
  tdx_handle_ept_violation

The WARNING is triggered when kvm_arch_setup_async_pf() calls
kvm_mmu_get_guest_pgd() to cache the guest CR3 for later use in
kvm_arch_async_page_ready(), which compares the cached value against the
current CR3 to determine whether the page fault can be fixed in the
current vCPU context and thereby save one VM exit. However, when guest
state is protected, KVM cannot read the guest CR3.

Check guest_state_protected to avoid calling kvm_mmu_get_guest_pgd() to
read guest CR3 in async page fault flow:
 - In kvm_arch_setup_async_pf(), use dummy 0 when guest state is
   protected.

 - In kvm_arch_async_page_ready(), skip reading CR3 for comparison when
   guest state is protected.

Reported-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
For the AMD SEV-ES and SNP cases, the guest state is also protected. But
unlike TDX, reading guest CR3 doesn't cause an issue, since CR3 is always
marked available for SVM vCPUs; the read simply returns the initial
value 0 set by kvm_vcpu_reset(). Whether to update vcpu->arch.regs_avail
to reflect the correct value for SEV-ES and SNP is another topic.
---
 arch/x86/kvm/mmu/mmu.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 667d66cf76d5..03be521df6b9 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4521,7 +4521,8 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu,
 	arch.gfn = fault->gfn;
 	arch.error_code = fault->error_code;
 	arch.direct_map = vcpu->arch.mmu->root_role.direct;
-	arch.cr3 = kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu);
+	arch.cr3 = vcpu->arch.guest_state_protected ? 0 :
+		   kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu);
 
 	return kvm_setup_async_pf(vcpu, fault->addr,
 				  kvm_vcpu_gfn_to_hva(vcpu, fault->gfn), &arch);
@@ -4543,7 +4544,8 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
 		return;
 
 	if (!vcpu->arch.mmu->root_role.direct &&
-	      work->arch.cr3 != kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu))
+	    (vcpu->arch.guest_state_protected ||
+	     work->arch.cr3 != kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu)))
 		return;
 
 	r = kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, work->arch.error_code,

base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449
-- 
2.43.0
Re: [PATCH] KVM: x86: Don't read guest CR3 in async pf flow when guest state is protected
Posted by Sean Christopherson 4 days, 16 hours ago
On Thu, Dec 11, 2025, Xiaoyao Li wrote:
> ---
> For AMD SEV-ES and SNP cases, the guest state is also protected. But
> unlike TDX, reading guest CR3 doesn't cause issue since CR3 is always
> marked available for svm vCPUs. It always gets the initial value 0,
> set by kvm_vcpu_reset(). Whether to update vcpu->arch.regs_avail to
> reflect the correct value for SEV-ES and SNP is another topic.
> ---
>  arch/x86/kvm/mmu/mmu.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 667d66cf76d5..03be521df6b9 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4521,7 +4521,8 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu,
>  	arch.gfn = fault->gfn;
>  	arch.error_code = fault->error_code;
>  	arch.direct_map = vcpu->arch.mmu->root_role.direct;
> -	arch.cr3 = kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu);
> +	arch.cr3 = vcpu->arch.guest_state_protected ? 0 :
> +		   kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu);
>  
>  	return kvm_setup_async_pf(vcpu, fault->addr,
>  				  kvm_vcpu_gfn_to_hva(vcpu, fault->gfn), &arch);
> @@ -4543,7 +4544,8 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
>  		return;
>  
>  	if (!vcpu->arch.mmu->root_role.direct &&
> -	      work->arch.cr3 != kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu))
> +	    (vcpu->arch.guest_state_protected ||
> +	     work->arch.cr3 != kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu)))
>  		return;

Protected guests aren't compatible with shadow paging, so I'd rather key off the
direct MMU role.  '0' is also a legal address; INVALID_GPA would be better.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 02c450686b4a..446bf2716d08 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4521,7 +4521,10 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu,
        arch.gfn = fault->gfn;
        arch.error_code = fault->error_code;
        arch.direct_map = vcpu->arch.mmu->root_role.direct;
-       arch.cr3 = kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu);
+       if (arch.direct_map)
+               arch.cr3 = INVALID_GPA;
+       else
+               arch.cr3 = kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu);
 
        return kvm_setup_async_pf(vcpu, fault->addr,
                                  kvm_vcpu_gfn_to_hva(vcpu, fault->gfn), &arch);