From nobody Sun May 24 20:34:05 2026 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B3BE9385D9A for ; Fri, 22 May 2026 23:27:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779492425; cv=none; b=udFUceHfsoPeP0Hx+54gPZGyY5bi07n4/BHCQ/l00nFBuAZANAMJOcQqrYN+jxO3uUNaVyP1LCJI8F1iaMIPoMfnyzxDLnplSuHZ+CkGy1yZNV4f+/UtWAFp71T3d8VXQL5Z50cx7yKE16np1pGPSKCOFlieYBYpkAnZtMBxaCE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779492425; c=relaxed/simple; bh=BGqs+7XB3gWUIESu7X0yek1Mhw0+X/DZjjKlREjWL90=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=TO+Pw6vAMHehY52oaszGreMPuXmV2z9RCo5uO6qw54GN0qzatWgMjci2F3W4W2+vbNxXp36wR8GrpFzUedPOzgw0YistOa8WC9cDWFi5mShD7FdGfsASakfuVAUvw+BvTKiVrdfdGmpGnP3xq2Xcm+YXJLmKz8MKMdJP07D65bQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=F+Hga5ar; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="F+Hga5ar" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-366122e01fcso7697283a91.2 for ; Fri, 22 May 2026 16:27:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1779492424; x=1780097224; 
darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=YszyzLPcvTIfPw6dM2pGXcfV2rGCyaVWTXh+SVSGVWc=; b=F+Hga5arj5VQrayHbC/PehReCvaA/hcJZ28QXIydOPYARuoBeh8SA5ck9oTD4dMD8Y 4Z90Llvz2jAL8W+ODUMkSGAMlghORLXB1iz2Q/g9ZlvOJVTR8LhDRCuOhke7opGlhXIh MIOWdiMSGy6ztcQkaCdEbqPzeFosOfdLpCH539+grFDhHTrJDW4Pjs49qBe0YIlyseAB YDFP2qrLMV/8X4JfPZQcwN1d/8REYIzray3u8BT440sZLK8YlLr+KWQQ5au+DpSD9qMm VREHzfoHq7XscVN5CSAjJ1BY4eQyemsYvaHg7DIS1wj1bAwc8+G4xJ2IDZk8VeL0zYn2 Mu5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779492424; x=1780097224; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=YszyzLPcvTIfPw6dM2pGXcfV2rGCyaVWTXh+SVSGVWc=; b=pFE26qxiuLArFSaCpUpAD9ExU2LY7qNyO5RSw4H6NO8XFeSpR+ki6WfmoriyUqtfdN 6BYYmaGhKD5K1tFj0HJ/8uBIA1iuTf3jv8D0xP29Wj7G7EsqHi71+9nRpQgnN7vrUmnI DuqyAC0zsiq6iMdWSS7xtUj/rt8nvCTicIVAR8SCFzJnhD4YcX8itgO2TfYvkBes34aV W11NNEG22yQO+vnYNbyCd3k1W4TWT9z5Ib4yshzCPWIJzR7oZYrpuJEmWhQJ4zZzjrK4 GzOB92xzYcrqAZKy3zJlScRLtADaB654Cjjlg22+PeYjflqJe+pfYowA4xgVtfE2opPK Zsyg== X-Forwarded-Encrypted: i=1; AFNElJ9r6RQLLjtEZSlNoiurdxN6uGDlx6KhRpgNfz7Vm/alq01qP2lT3zF6IFA3yK3alJItc4HDLxzUoDhFp8E=@vger.kernel.org X-Gm-Message-State: AOJu0YxeAtr1h7c9mRuhOXoZetoK92AJ2TkhXqmapVGz/9iqbN3IgB8Q NzzTAdZQEfTO2C6wytCteErGN1yZdRsBVZgnY9bl8bQpOkP2f1vvkzlWwclU0uAHSPNRw4NPpmg ZJyQjrQ== X-Received: from pgne12.prod.google.com ([2002:a63:744c:0:b0:c79:22b6:a345]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:5343:b0:35f:bb33:d72c with SMTP id 98e67ed59e1d1-36a6773cf81mr4485356a91.4.1779492423541; Fri, 22 May 2026 16:27:03 -0700 (PDT) Reply-To: Sean Christopherson Date: Fri, 22 May 2026 16:26:57 -0700 In-Reply-To: <20260522232701.3671446-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: 
List-Unsubscribe: Mime-Version: 1.0 References: <20260522232701.3671446-1-seanjc@google.com> X-Mailer: git-send-email 2.54.0.794.g4f17f83d09-goog Message-ID: <20260522232701.3671446-2-seanjc@google.com> Subject: [PATCH v4 1/5] KVM: x86: Widen x86_exception's error_code to 64 bits From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Kevin Cheng Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kevin Cheng Widen the error_code field in struct x86_exception from u16 to u64 to accommodate AMD's NPF error code, which defines information bits above bit 31, e.g. PFERR_GUEST_FINAL_MASK (bit 32) and PFERR_GUEST_PAGE_MASK (bit 33). Retain the u16 type for the local errcode variable in walk_addr_generic as the walker only synthesizes conventional #PF error code bits, all of which reside in bits 15:0. Signed-off-by: Kevin Cheng Signed-off-by: Sean Christopherson --- arch/x86/kvm/kvm_emulate.h | 2 +- arch/x86/kvm/mmu/paging_tmpl.h | 6 ++++++ 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h index 72aece9ef575..f5df31a52996 100644 --- a/arch/x86/kvm/kvm_emulate.h +++ b/arch/x86/kvm/kvm_emulate.h @@ -22,7 +22,7 @@ enum x86_intercept_stage; struct x86_exception { u8 vector; bool error_code_valid; - u16 error_code; + u64 error_code; bool nested_page_fault; union { u64 address; /* cr2 or nested page fault gpa */ diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 07100bbfc270..51f8b4522314 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -328,6 +328,12 @@ static int FNAME(walk_addr_generic)(struct guest_walke= r *walker, const int write_fault =3D access & PFERR_WRITE_MASK; const int user_fault =3D access & PFERR_USER_MASK; const int fetch_fault =3D access & PFERR_FETCH_MASK; + /* + * Note! 
Track the error_code that's common to legacy shadow paging + * and NPT shadow paging as a u16 to guard against unintentionally + * setting any of bits 63:16. Architecturally, the #PF error code is + * 32 bits, and Intel CPUs don't support setting bits 31:16. + */ u16 errcode =3D 0; gpa_t real_gpa; gfn_t gfn; --=20 2.54.0.794.g4f17f83d09-goog From nobody Sun May 24 20:34:05 2026 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A9A5A3914F8 for ; Fri, 22 May 2026 23:27:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779492427; cv=none; b=L6hlQ/QH6KSa0BlY+PdBHj4N3+Umq4GT9KdJnbISTDOQlpt1tEFby+l8G1IHadX9GDx9Qc9bZh5n3SA1nbnx66R5NTKQfMZqq2hGUCLz7JGOpsu8pBa64Fwx2+QTqIXAltsL7XOy/Bjz0KaOGmKSMC5KO3T1U58AvAFoIaehjhE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779492427; c=relaxed/simple; bh=dKhndIESxOmMOENIbqkxr6Q8qh/5uCmzDxH6K9ofD4Y=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=g4njWAxrzByATzNek+ZK+zGgpSHuMViY4Y6Ij+rnXZfViAkFIVSO51HxnH3zOXymMcmXfupW0U/q9zT0LxhNpU94L29we7bWO91qDOTzhfcOftGXr96eqkmTtyDSzez2C6yRh+vWWz7i2VJ/C5fdedzahSiAu9boDjjM4o+b5yc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=hjk+0538; arc=none smtp.client-ip=209.85.215.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="hjk+0538" Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-c802545ae0eso4843984a12.2 for ; Fri, 22 May 2026 16:27:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1779492425; x=1780097225; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=0kdwtawVGarwPM3X1uzUBTtCZA+NB2eeZjpdD8sKl7w=; b=hjk+0538Uu9gPFHlz2D8cOxPs+0xRY5eStludugr73PA3KL8rK+sGY6v6AKTSB/6E/ TZk0YB7lQV6CWp0DyyE46JLS/V4D0faetinc5JfJIJGNZQ+iiRY0qkUZkfh3Tb2/D0jT 42m1F0cKMgRSAQ4VXr8btUf4cFvMR105+XlqsG5zHLgc3LVKkIE/q6Q+oExBRtcWozed yE6dqZ8nGveVSa9tL3NPVjWf5hfHVzn4/FOAxZQkFTqeT6AUDI21V4KjuWoWNEwpRfy6 YNGkuJ3zj+3u1b+KG4i682giE/M/QHS9nZ08mwADlY5tK96BgTqRYnG11SDqm2CuAB5G 0T8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779492425; x=1780097225; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=0kdwtawVGarwPM3X1uzUBTtCZA+NB2eeZjpdD8sKl7w=; b=s/52GxIzKDeCEy+WAp3YTxFes2nkYP19kQcdg1jd87MOmawZwL00K6SIpcdarXEcgR GbZCFsjuANeE6zuMDhFTsPo11yJkwxTxrHiZr15t5NL2abA07A4+TTsMcMZDXD0q1aBR KtKnuuAvMKmy3Fg6V75yl7LqL0V7UUo5uFOR8nyXayxAT66pP0LkuFlX3YiKFwOtRu6T 0JA9sIn7UTIp5xFX6HW6x/pSjS56rQQvPt+9K13Uz38zImvBfrCd8jz0yh4QXrSCGPaN 6eWkF4AUQzZcVYCXJdi2ovKTwJoThzn3gtt036M4/rjW0LmIOV51bOQHk+Yzx7+0D6C/ LJdw== X-Forwarded-Encrypted: i=1; AFNElJ8FCijH//L4rcvTz+y9zFkMAhi+VJUJ4HGstVO6HEH+RZtqvads+4+jAYACYmAFawrMvJ8QLv4alQ77owY=@vger.kernel.org X-Gm-Message-State: AOJu0YznLJL2/l8T/GpW0P8PbwOw3+/UcG1yZgVQw6mJ8aQJifIq6eAZ cMMeEzpWqo84vpoGqrNwineTasxXu25D/U8dBZw4K/LXdTFF6cmFJMHYKB6bO49h3e2v0SODSXL dki4Z2g== X-Received: from pgt6.prod.google.com ([2002:a63:1346:0:b0:c82:7a06:d5f5]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 
2002:a05:6a20:d505:b0:3b2:b203:7896 with SMTP id adf61e73a8af0-3b3293b8415mr5948848637.40.1779492424645; Fri, 22 May 2026 16:27:04 -0700 (PDT) Reply-To: Sean Christopherson Date: Fri, 22 May 2026 16:26:58 -0700 In-Reply-To: <20260522232701.3671446-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260522232701.3671446-1-seanjc@google.com> X-Mailer: git-send-email 2.54.0.794.g4f17f83d09-goog Message-ID: <20260522232701.3671446-3-seanjc@google.com> Subject: [PATCH v4 2/5] KVM: x86: Tell ->inject_page_fault() whether or not a fault came from hardware From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Kevin Cheng Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When injecting a page fault (including nested TDP faults into L1), tell the injection routine whether or not the fault originated in hardware, i.e. if KVM is effectively forwarding a fault it intercepted. For nested TDP fault injection, KVM needs to grab PAGE_WALK vs. GUEST_FINAL information from the VMCB/VMCS, _if_ the fault originated in hardware. No functional change intended (nothing uses the new param, yet...). 
Signed-off-by: Sean Christopherson --- arch/x86/include/asm/kvm_host.h | 18 ++++++++++++++---- arch/x86/kvm/mmu/paging_tmpl.h | 2 +- arch/x86/kvm/svm/nested.c | 3 ++- arch/x86/kvm/vmx/nested.c | 3 ++- arch/x86/kvm/x86.c | 16 +++++++++------- 5 files changed, 28 insertions(+), 14 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 271bdd109a98..d11063c36f03 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -484,7 +484,8 @@ struct kvm_mmu { u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index); int (*page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault); void (*inject_page_fault)(struct kvm_vcpu *vcpu, - struct x86_exception *fault); + struct x86_exception *fault, + bool from_hardware); gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, gpa_t gva_or_gpa, u64 access, struct x86_exception *exception); @@ -2305,9 +2306,18 @@ void kvm_queue_exception_e(struct kvm_vcpu *vcpu, un= signed nr, u32 error_code); void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, unsigned lo= ng payload); void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned int nr, bool has_error_code, u32 error_code); -void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fa= ult); -void kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu, - struct x86_exception *fault); +void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fa= ult, + bool from_hardware); +void __kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu, + struct x86_exception *fault, + bool from_hardware); + +static inline void kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu, + struct x86_exception *fault) +{ + __kvm_inject_emulated_page_fault(vcpu, fault, false); +} + bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl); bool kvm_require_dr(struct kvm_vcpu *vcpu, int dr); =20 diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 
51f8b4522314..cc9c7deb34bc 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -813,7 +813,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, str= uct kvm_page_fault *fault */ if (!r) { if (!fault->prefetch) - kvm_inject_emulated_page_fault(vcpu, &walker.fault); + __kvm_inject_emulated_page_fault(vcpu, &walker.fault, true); =20 return RET_PF_RETRY; } diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 4ef9bc6a553f..1c1a5e322d18 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -34,7 +34,8 @@ #define CC KVM_NESTED_VMENTER_CONSISTENCY_CHECK =20 static void nested_svm_inject_npf_exit(struct kvm_vcpu *vcpu, - struct x86_exception *fault) + struct x86_exception *fault, + bool from_hardware) { struct vcpu_svm *svm =3D to_svm(vcpu); struct vmcb *vmcb =3D svm->vmcb; diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 4690a4d23709..3bb7eaa7b2a5 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -411,7 +411,8 @@ static void nested_ept_invalidate_addr(struct kvm_vcpu = *vcpu, gpa_t eptp, } =20 static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, - struct x86_exception *fault) + struct x86_exception *fault, + bool from_hardware) { struct vmcs12 *vmcs12 =3D get_vmcs12(vcpu); struct vcpu_vmx *vmx =3D to_vmx(vcpu); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index cecb2f84e5e0..aa2f8f43d94c 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -969,7 +969,8 @@ static int complete_emulated_insn_gp(struct kvm_vcpu *v= cpu, int err) EMULTYPE_COMPLETE_USER_EXIT); } =20 -void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fa= ult) +void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fa= ult, + bool from_hardware) { ++vcpu->stat.pf_guest; =20 @@ -986,8 +987,9 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struc= t x86_exception *fault) fault->address); } =20 -void 
kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu, - struct x86_exception *fault) +void __kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu, + struct x86_exception *fault, + bool from_hardware) { struct kvm_mmu *fault_mmu; WARN_ON_ONCE(fault->vector !=3D PF_VECTOR); @@ -1004,9 +1006,9 @@ void kvm_inject_emulated_page_fault(struct kvm_vcpu *= vcpu, kvm_mmu_invalidate_addr(vcpu, fault_mmu, fault->address, KVM_MMU_ROOT_CURRENT); =20 - fault_mmu->inject_page_fault(vcpu, fault); + fault_mmu->inject_page_fault(vcpu, fault, from_hardware); } -EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_inject_emulated_page_fault); +EXPORT_SYMBOL_FOR_KVM_INTERNAL(__kvm_inject_emulated_page_fault); =20 void kvm_inject_nmi(struct kvm_vcpu *vcpu) { @@ -14065,7 +14067,7 @@ bool kvm_arch_async_page_not_present(struct kvm_vcp= u *vcpu, fault.nested_page_fault =3D false; fault.address =3D work->arch.token; fault.async_page_fault =3D true; - kvm_inject_page_fault(vcpu, &fault); + kvm_inject_page_fault(vcpu, &fault, false); return true; } else { /* @@ -14236,7 +14238,7 @@ void kvm_fixup_and_inject_pf_error(struct kvm_vcpu = *vcpu, gva_t gva, u16 error_c fault.address =3D gva; fault.async_page_fault =3D false; } - vcpu->arch.walk_mmu->inject_page_fault(vcpu, &fault); + vcpu->arch.walk_mmu->inject_page_fault(vcpu, &fault, true); } EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_fixup_and_inject_pf_error); =20 --=20 2.54.0.794.g4f17f83d09-goog From nobody Sun May 24 20:34:05 2026 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D747D39526B for ; Fri, 22 May 2026 23:27:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779492428; cv=none; 
b=Xw9jeqaj2cVKfYPgITMK5jFG7nooPoQR1UbsELCdeHKml4wsRlOUbMCrBjpMa6xvyOacNTxvj3Ts/s5TZ6Na2Yzdwe5vl+AYgvymnh5LiFkonOPKj7pnv29EMYJc68wGCRjCI0nIw1PeC+un0jo3ROa8LRvq9t3mjPJCRxd21Z4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779492428; c=relaxed/simple; bh=0xDc9Sj2KquCH8xt5Xkpdt4RhnhgzgrZ6M7yAba7slc=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Yr5qu55XE+n8S8gKhCncpUT1r6IO3P6V2efq44FXmX/lnESNunH3Nb4gSSseD3G2eJ/1iscKui2j15FbUjQVq9axTvRA4hEpVQ8W70/T7biJFBrkg/sUXAkYLLeeT8pDFVG8TEb7ioOp8SxxdTlEBoHNZs3jU54PQ++A2w+DNEk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=OJq20X0g; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="OJq20X0g" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-36603ad6709so6506003a91.2 for ; Fri, 22 May 2026 16:27:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1779492426; x=1780097226; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=Hu9a/Wr16frEyZPr2UhLBVScqdRmqcTkSaOGh95mR3o=; b=OJq20X0gbi4RGr03CUtXntdRcZp3dOrCYM0Ao4CLUuxSaEuIYkHeT+PtGjztrMSJVM fgq2FqP1DDAfnhP5V7bXdCN0EHjzCbIo51u3Et1Ja+djnSav/lcfsbgfdbgT5ipjLh8j H6cGVRuyjdqyvpshXzf4uRO+4H7ZomoxuetivtBEhU5HmR72VljIvmh4y6hMXLnKn1lC UO4LDX4vy5XLt1UyHzTUyP4YMJJnfGV4FRM6rq0y6PaSn4nZF/p2ek4TN6dxcxwqfyzg 
nhlIYnpN4QzLA1Jt34+rCOYRQEVJgrKiUKqzPzyXSVMY3W2mrW/SxdyU0ESCqSN1eLLU 3I1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779492426; x=1780097226; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=Hu9a/Wr16frEyZPr2UhLBVScqdRmqcTkSaOGh95mR3o=; b=MmXV8AGsPzICw/8s1hO4oB6x5cvTotUhXF3Tq6otuxlA4IAOzJwUYqI1mXhfcBanUB DBi9RlJf1RqW6ziWqTgqs0dH2mecNdm+ZuFdg2FANFrpbzzkQgM0n9/zS+/NiT2QCfDC 2NYeipFE/jVzoWQrb7Kb2qC5O9oeb56L9ImlqGf2ssk1XLMtWwARkAPKIDrFRwojQPRP SZlWiqCPuT5pdJcDjQr099gwJj5Bah5aST8xgJF9Ww2mVoEKhZdCX4Ze39DneBR0Kfgu IsSZXGXNgMWqOH9rscEXwYP8B2xNt+6T+hY6QHkrxOZrKlqRbvtnFIT7a6CiZhijt6sw 0o1w== X-Forwarded-Encrypted: i=1; AFNElJ9DIuVmmNC/Lw067UbVlLKrXZGnJzYtO8SjEvCg9kZ804oZTMFKvAbUt+eyuYUVTS0yFOUX13ATGS/s5Ds=@vger.kernel.org X-Gm-Message-State: AOJu0YwuvVEAIhFVYJCynLs/W2pEptYqNbLW/UDHMBhfoPFo9EAawiIZ 7XLCRiQoPmXo0evcWEewbaVI2cjnzcZZHItKieHiJRLbVcqGUQwcFgmFKI8uJm63huoTS+6RSLC QFtWS4w== X-Received: from pgau15.prod.google.com ([2002:a05:6a02:2d8f:b0:c73:bdbf:6a66]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:2b4e:b0:35e:d015:d675 with SMTP id 98e67ed59e1d1-36a67719220mr5979911a91.7.1779492425806; Fri, 22 May 2026 16:27:05 -0700 (PDT) Reply-To: Sean Christopherson Date: Fri, 22 May 2026 16:26:59 -0700 In-Reply-To: <20260522232701.3671446-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260522232701.3671446-1-seanjc@google.com> X-Mailer: git-send-email 2.54.0.794.g4f17f83d09-goog Message-ID: <20260522232701.3671446-4-seanjc@google.com> Subject: [PATCH v4 3/5] KVM: SVM: Fix nested NPF injection of PFERR_GUEST_{PAGE,FINAL}_MASK bits From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Kevin Cheng Content-Transfer-Encoding: 
quoted-printable Content-Type: text/plain; charset="utf-8" From: Kevin Cheng Fix KVM's generation of PFERR_GUEST_{PAGE,FINAL}_MASK bits when injecting a Nested Page Fault into L1. Currently, KVM blindly stuffs GUEST_FINAL into L1, which is blatantly wrong given that KVM obviously generates NPFs for page table accesses. There are two paths that trigger NPF injection: hardware NPF exits (from L2) and emulation-triggered faults, i.e. when KVM detects an NPF as part of emulating an L2 GVA access. For the hardware case, use the bits verbatim from the VMCB, as KVM is simply forwarding an NPF to L1. For the emulation case, propagate the GUEST_{PAGE,FINAL} bits from the access field (which were recently added for MBEC+GMET support). To differentiate between the two cases, use the "from_hardware" parameter that is now passed to ->inject_page_fault(), which is true when KVM is injecting an NPF in response to an NPF exit from L2. To help guard against future goofs, assert that exactly one of GUEST_PAGE or GUEST_FINAL is set when injecting an NPF. Unlike VMX, there are no (known) cases where hardware doesn't set either bit, and KVM should always set one or the other when emulating a GVA access. 
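For illustration only (not part of the patch), the fault stage selection described above can be modeled in standalone C roughly as follows. The helper names model_hweight64() and model_npf_exit_info_1() and the bare u64 plumbing are made up for this sketch; only the PFERR_* bit positions, the choice of source based on from_hardware, and the fallback to GUEST_FINAL are taken from the diff below. The real code WARNs on a bad stage, the sketch just falls back quietly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in arch/x86/include/asm/kvm_host.h. */
#define PFERR_GUEST_FINAL_MASK	(1ULL << 32)
#define PFERR_GUEST_PAGE_MASK	(1ULL << 33)
#define PFERR_GUEST_FAULT_STAGE_MASK \
	(PFERR_GUEST_FINAL_MASK | PFERR_GUEST_PAGE_MASK)

/* Portable population count, standing in for the kernel's hweight64(). */
static unsigned int model_hweight64(uint64_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/*
 * Hypothetical helper: compute the exit_info_1 value that L1 would see.
 * @hw_exit_info_1 is what hardware reported for the exit from L2, and is
 * consulted only when @from_hardware is true.
 */
static uint64_t model_npf_exit_info_1(uint64_t hw_exit_info_1,
				      uint64_t error_code, bool from_hardware)
{
	uint64_t fault_stage;

	if (from_hardware)
		fault_stage = hw_exit_info_1 & PFERR_GUEST_FAULT_STAGE_MASK;
	else
		fault_stage = error_code & PFERR_GUEST_FAULT_STAGE_MASK;

	/* The patch WARNs here; the sketch silently falls back to "final". */
	if (model_hweight64(fault_stage) != 1)
		fault_stage = PFERR_GUEST_FINAL_MASK;

	/* Exactly one stage bit is set from this point on. */
	assert(model_hweight64(fault_stage) == 1);

	return fault_stage | (error_code & ~PFERR_GUEST_FAULT_STAGE_MASK);
}
```

Note that in the hardware case any stage bits carried in error_code are dropped in favor of the VMCB's, which is the whole point: the guest_mmu walker only sees the faulting GPA and can't know which stage L2 was in.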
Signed-off-by: Kevin Cheng [sean: use plumbed in @access bits, massage changelog] Signed-off-by: Sean Christopherson --- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/mmu/paging_tmpl.h | 15 +++++--------- arch/x86/kvm/svm/nested.c | 35 ++++++++++++++++++++++----------- 3 files changed, 31 insertions(+), 21 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index d11063c36f03..e1c4151d6693 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -284,6 +284,8 @@ enum x86_intercept_stage; #define PFERR_GUEST_RMP_MASK BIT_ULL(31) #define PFERR_GUEST_FINAL_MASK BIT_ULL(32) #define PFERR_GUEST_PAGE_MASK BIT_ULL(33) +#define PFERR_GUEST_FAULT_STAGE_MASK \ + (PFERR_GUEST_FINAL_MASK | PFERR_GUEST_PAGE_MASK) #define PFERR_GUEST_ENC_MASK BIT_ULL(34) #define PFERR_GUEST_SIZEM_MASK BIT_ULL(35) #define PFERR_GUEST_VMPL_MASK BIT_ULL(36) diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index cc9c7deb34bc..66eee6914234 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -397,16 +397,6 @@ static int FNAME(walk_addr_generic)(struct guest_walke= r *walker, nested_access | PFERR_GUEST_PAGE_MASK, &walker->fault, 0); =20 - /* - * FIXME: This can happen if emulation (for of an INS/OUTS - * instruction) triggers a nested page fault. The exit - * qualification / exit info field will incorrectly have - * "guest page access" as the nested page fault's cause, - * instead of "guest page structure access". To fix this, - * the x86_exception struct should be augmented with enough - * information to fix the exit_qualification or exit_info_1 - * fields. 
- */ if (unlikely(real_gpa =3D=3D INVALID_GPA)) return 0; =20 @@ -548,6 +538,11 @@ static int FNAME(walk_addr_generic)(struct guest_walke= r *walker, walker->fault.nested_page_fault =3D mmu !=3D vcpu->arch.walk_mmu; walker->fault.async_page_fault =3D false; =20 +#if PTTYPE !=3D PTTYPE_EPT + if (walker->fault.nested_page_fault) + walker->fault.error_code |=3D access & PFERR_GUEST_FAULT_STAGE_MASK; +#endif + trace_kvm_mmu_walker_error(walker->fault.error_code); return 0; } diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 1c1a5e322d18..28ac5d5c990d 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -39,19 +39,32 @@ static void nested_svm_inject_npf_exit(struct kvm_vcpu = *vcpu, { struct vcpu_svm *svm =3D to_svm(vcpu); struct vmcb *vmcb =3D svm->vmcb; + u64 fault_stage; =20 - if (vmcb->control.exit_code !=3D SVM_EXIT_NPF) { - /* - * TODO: track the cause of the nested page fault, and - * correctly fill in the high bits of exit_info_1. - */ - vmcb->control.exit_code =3D SVM_EXIT_NPF; - vmcb->control.exit_info_1 =3D (1ULL << 32); - vmcb->control.exit_info_2 =3D fault->address; - } + /* + * For hardware NPF exits, the GUEST_FAULT_STAGE bits are only + * available in the hardware exit_info_1, since the guest_mmu + * walker doesn't know whether the faulting GPA was a page table + * page or final page from L2's perspective. + */ + if (from_hardware) + fault_stage =3D vmcb->control.exit_info_1 & + PFERR_GUEST_FAULT_STAGE_MASK; + else + fault_stage =3D fault->error_code & PFERR_GUEST_FAULT_STAGE_MASK; =20 - vmcb->control.exit_info_1 &=3D ~0xffffffffULL; - vmcb->control.exit_info_1 |=3D fault->error_code; + /* + * All nested page faults should be annotated as occurring on the + * final translation *or* the page walk. Arbitrarily choose "final" + * if KVM is buggy and enumerated both or neither. 
+ */ + if (WARN_ON_ONCE(hweight64(fault_stage) !=3D 1)) + fault_stage =3D PFERR_GUEST_FINAL_MASK; + + vmcb->control.exit_code =3D SVM_EXIT_NPF; + vmcb->control.exit_info_1 =3D fault_stage | + (fault->error_code & ~PFERR_GUEST_FAULT_STAGE_MASK); + vmcb->control.exit_info_2 =3D fault->address; =20 nested_svm_vmexit(svm); } --=20 2.54.0.794.g4f17f83d09-goog From nobody Sun May 24 20:34:05 2026 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DD454395AE3 for ; Fri, 22 May 2026 23:27:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779492429; cv=none; b=CKOKgKR6pmDY2dPrljChPVnEtEKyjIXIpELY/taVJu47AOBBabaVvU50ze/7aBPIheItGUKngCVBKM98/7xmF4m95/dSf3UMKg4W4WNsyUyz/NadDOygDOMwFy3EVWvV5ZF07sM6WhRk/JesjHkb1+N/XiDu/t4X02TUY1TUODY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779492429; c=relaxed/simple; bh=z2xNgW/14j2NPgBa0TWtlx4lBFbt/884PL7gTByiVMs=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=amDIHKqrnSc1HBt8Evjt+KTGGaUlmXJcxEUdkc7oYqG/q20Cd65z3pWVJ4aU90JzlkJfgLTXpR2fGZI/lPlMwUVx/SYDUZSTsLxWCnALt5lRc1Gg5yKk8tJxyDRaP6mxcOP/5FeGgvKcQJdbrJt3lwy4Rw9tZunhY9Pi/EEk8y4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=X2D0aXU6; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="X2D0aXU6" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-2b4654f9bb6so80383455ad.2 for ; Fri, 22 May 2026 16:27:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1779492427; x=1780097227; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=i8QjonP9W9SfQQUkeH+34C1UPlAz2jzujqbXDGUfkTA=; b=X2D0aXU6dog99pevYD7smMphIR1YUugLnfmCh6u3+1yTgAbYcnPCR7sZE+kmF/2krP vxaa10qJuXYIaSCHBSZnv3dDebqMtoJuWiCvXmZ6GDDxo0tIhW2OMlplAP9tpPyiiyag fhiEy2hp3tI1b8he+G5BOmW597vIeBJaYl4vLf6Ap2IkrsuZ9KakOohr+SWYF7OMqs7D BOljwB29KE0TP2gJ1w0CLSBU4k8dpZCIvgy4ldF+5yi0f6gCtfZ/8zyZc/28J+P/h7fZ Pe4LDG2+Tbr+kleEn45KUgeUZsyELC/TVw3q1Mfz3QCRDuP3OCemAmuYlaTBPQls7Uyl 718g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779492427; x=1780097227; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=i8QjonP9W9SfQQUkeH+34C1UPlAz2jzujqbXDGUfkTA=; b=qZ8xbJs+DaC1b/+CfgZUdF2SIhUBsGy1WWwg4SqCqMy190zRlCiX3ZP8QEC2TfCTu9 j8LNWF96w+YWKRYgsxtIvdjlIiXKHfg85/TnmnAXH+uYf3C87HnfPwhBsy7+GzXeGfSh GqBOhupUzPbDhxW0U8ioLJGjPLfrvT3VYCTbDq51vlwXcLYVS2geXHZdU6a4QjjbmpkG wB4Juwe/sISkLqwfmckwKg0ZnFZpHOHa1XbrhI/gOztYJOaeXoCrjWQcooEj7M+T1qAG IVoBB/DVDRms00ghUm7qIkXX38AWBLWnlFk1562469nY6Q7k7NNvPq/2oBzO5rdZd5tU OHqw== X-Forwarded-Encrypted: i=1; AFNElJ/GqbzShMmmKsuFkORZVh/gx0qDI74oLsGgqmMGMTkJac4N6guPz3zvVzL7ImOQIHJDUoOy2NQoj8zVVTU=@vger.kernel.org X-Gm-Message-State: AOJu0YzDKktRp30fJgq5Tz9liZavffXLxWF4J3hmlyYE3E3H1YssLscb UbqlJmToCKMvp+QWvdQgGsonrQYQfx/2uySJpdvAURjSHEfUrBmzrhqmu4lyAmE9BH5QK45gxVm ObIH+og== X-Received: from plpf11.prod.google.com ([2002:a17:903:3c4b:b0:2b0:51f0:272d]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) 
by 2002:a17:903:1a27:b0:2b2:4b4e:e4d2 with SMTP id d9443c01a7336-2beb0758412mr58825745ad.15.1779492426948; Fri, 22 May 2026 16:27:06 -0700 (PDT) Reply-To: Sean Christopherson Date: Fri, 22 May 2026 16:27:00 -0700 In-Reply-To: <20260522232701.3671446-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260522232701.3671446-1-seanjc@google.com> X-Mailer: git-send-email 2.54.0.794.g4f17f83d09-goog Message-ID: <20260522232701.3671446-5-seanjc@google.com> Subject: [PATCH v4 4/5] KVM: VMX: Synthesize nested EPT violation GVA_IS_VALID/GVA_TRANSLATED bits From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Kevin Cheng Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kevin Cheng When injecting an EPT Violation into L1 in response to a fault detected while emulating an L2 GVA access, synthesize the GVA_IS_VALID and GVA_TRANSLATED bits using information provided by the walker, instead of pulling the bits from vmcs02.EXIT_QUALIFICATION. The information in vmcs02.EXIT_QUALIFICATION is valid/correct if and only if the fault being injected into L1 is the direct result of an EPT Violation VM-Exit from L2. E.g. if KVM is emulating an I/O instruction and the memory operand's translation through L1's EPT fails, using vmcs02.EXIT_QUALIFICATION is wrong as the semantics for EXIT_QUALIFICATION would be for an I/O exit, not an EPT Violation exit. Opportunistically clean up the formatting for creating the mask of bits to pull from vmcs02.EXIT_QUALIFICATION. 
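For illustration only (not part of the patch), here is a standalone C sketch of how the GVA bits end up in the exit qualification that L1 sees. The helper names are made up, and the sketch assumes GVA_IS_VALID is bit 7 and GVA_TRANSLATED is bit 8, which matches the "[7:8]" layout comment in the diff below and my reading of the SDM. It ignores the advanced VM-Exit info bits (USER/WRITABLE/NX) that the real mask also covers when L1 supports them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, see the note above. */
#define EPT_VIOLATION_ACC_READ		(1ULL << 0)
#define EPT_VIOLATION_ACC_WRITE		(1ULL << 1)
#define EPT_VIOLATION_GVA_IS_VALID	(1ULL << 7)
#define EPT_VIOLATION_GVA_TRANSLATED	(1ULL << 8)
#define MODEL_GVA_MASK \
	(EPT_VIOLATION_GVA_IS_VALID | EPT_VIOLATION_GVA_TRANSLATED)

/*
 * Hypothetical model of what the walker synthesizes: the GVA is always
 * considered valid, and TRANSLATED is set only if the faulting access
 * was the final access, i.e. not an access to a guest paging structure.
 */
static uint64_t model_walker_gva_bits(bool guest_page_access)
{
	uint64_t bits = EPT_VIOLATION_GVA_IS_VALID;

	if (!guest_page_access)
		bits |= EPT_VIOLATION_GVA_TRANSLATED;
	return bits;
}

/*
 * Hypothetical model of the merge done at injection time: take the GVA
 * bits from the VMCS if and only if the fault came from hardware,
 * otherwise use the bits synthesized by the walker.
 */
static uint64_t model_ept_exit_qual(uint64_t access_bits,
				    bool guest_page_access,
				    uint64_t vmcs_qual, bool from_hardware)
{
	uint64_t walker_qual = (access_bits & ~MODEL_GVA_MASK) |
			       model_walker_gva_bits(guest_page_access);
	uint64_t qual = walker_qual & ~MODEL_GVA_MASK;

	if (from_hardware)
		qual |= vmcs_qual & MODEL_GVA_MASK;
	else
		qual |= walker_qual & MODEL_GVA_MASK;

	return qual;
}
```

E.g. for an emulated I/O instruction, vmcs_qual holds I/O exit data and is ignored entirely, so whatever happens to sit in bits 7 and 8 can no longer leak into the EPT Violation that L1 observes.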
Signed-off-by: Kevin Cheng [sean: use plumbed in @access bits, massage changelog] Signed-off-by: Sean Christopherson --- arch/x86/kvm/mmu/paging_tmpl.h | 13 ++++++++++++- arch/x86/kvm/vmx/nested.c | 26 +++++++++++++++++++++----- 2 files changed, 33 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 66eee6914234..df3ae0c7ec2c 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -502,7 +502,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, * [2:0] - Derive from the access bits. The exit_qualification might be * out of date if it is serving an EPT misconfiguration. * [5:3] - Calculated by the page walk of the guest EPT page tables - * [7:11] - Derived from [7:11] of real exit_qualification + * [7:8] - Derived from "fault stage" access bits + * [9:11] - Derived from [9:11] of real exit_qualification * * The other bits are set to 0. */ @@ -516,6 +517,14 @@ static int FNAME(walk_addr_generic)(struct guest_walke= r *walker, else walker->fault.exit_qualification |=3D EPT_VIOLATION_ACC_READ; =20 + /* + * KVM doesn't emulate features that access GPAs directly, e.g. + * Intel Processor Trace. Assume the GVA is always valid; when + * propagating faults from hardware, KVM will discard this info + * and use the EXIT_QUALIFICATION bits from the VMCS. 
+ */ + walker->fault.exit_qualification |=3D EPT_VIOLATION_GVA_IS_VALID; + /* * Accesses to guest paging structures are either "reads" or * "read+write" accesses, so consider them the latter if write_fault @@ -523,6 +532,8 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, */ if (access & PFERR_GUEST_PAGE_MASK) walker->fault.exit_qualification |=3D EPT_VIOLATION_ACC_READ; + else + walker->fault.exit_qualification |=3D EPT_VIOLATION_GVA_TRANSLATED; =20 /* * Note, pte_access holds the raw RWX bits from the EPTE, not diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 3bb7eaa7b2a5..a78ce0080963 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -445,13 +445,29 @@ static void nested_ept_inject_page_fault(struct kvm_v= cpu *vcpu, exit_qualification =3D 0; } else { u64 mask =3D EPT_VIOLATION_GVA_IS_VALID | - EPT_VIOLATION_GVA_TRANSLATED; + EPT_VIOLATION_GVA_TRANSLATED; + if (vmx->nested.msrs.ept_caps & VMX_EPT_ADVANCED_VMEXIT_INFO_BIT) mask |=3D EPT_VIOLATION_GVA_USER | - EPT_VIOLATION_GVA_WRITABLE | - EPT_VIOLATION_GVA_NX; - exit_qualification =3D fault->exit_qualification; - exit_qualification |=3D vmx_get_exit_qual(vcpu) & mask; + EPT_VIOLATION_GVA_WRITABLE | + EPT_VIOLATION_GVA_NX; + + exit_qualification =3D fault->exit_qualification & ~mask; + + /* + * Use the EXIT_QUALIFICATION from the VMCS if and only + * if the hardware VM-Exit from L2 was an EPT Violation. + * If the fault is synthesized, then EXIT_QUALIFICATION + * is stale and/or holds entirely different data. And + * conversely, KVM _must_ rely on EXIT_QUALIFICATION if + * the fault came from hardware, because KVM only sees + * and walks the faulting GPA. 
+ */ + if (from_hardware) + exit_qualification |=3D vmx_get_exit_qual(vcpu) & mask; + else + exit_qualification |=3D fault->exit_qualification & mask; + vm_exit_reason =3D EXIT_REASON_EPT_VIOLATION; } =20 --=20 2.54.0.794.g4f17f83d09-goog From nobody Sun May 24 20:34:05 2026 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0ADCD39768C for ; Fri, 22 May 2026 23:27:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779492430; cv=none; b=QvtjeDBmZGiUmYK4VwxFRJjZMGGhPtVbHGHYyFADOhu8Q0FuykdeNcwl+EqOl4xPdK/izLtnm0Ml7CfyyMJa69fim2j3dK4XKGWUCQ/ky5Dx0iW2lVaSY2V5BKnroSyemBNHR9InkEAn/EOsGGfLfSqlsclZM5Rp2QP7GDEz4Ck= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779492430; c=relaxed/simple; bh=Yk8d51k2KEvSQa3bEY3ZmVW7akAyseGfnqCOpB1rCtM=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=rKaiAIPwIKRD97MGk4l6QgkSlSZVdx6EVjENDQ7SpNxUmWrmw6OM6n1hyYiBUjFHFoL9GF0obd6saNC3Icx15IBoTSWn7yoMM1ktfMvzkQP2WF1chokWUWIXW5jEHtYoUsWGJq1eM53kCdgvpvxSFJUKcM73mgk2774qvsulOLI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=mM4xOAn7; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="mM4xOAn7" 
Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-2ba3245a43dso83794425ad.0 for ; Fri, 22 May 2026 16:27:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1779492428; x=1780097228; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=xjZlVRSbkNxFG74UY3S/ZMtgusKOWOlshHLNu65IK0Y=; b=mM4xOAn72781LX5XpfTcvQwjB+WLhW+Rmrua9AmnrijASHt38v5vpssJ4SyzIbrl23 uwdvGQD+a2sOU93DSrDUU5k2Dc6oM2gP0knJ1KU0ojIPaa0Zp9WKpuJe/LuSzU9GlSuc yBA/13kzuijPKW7a736RsaqERc2SJlbZRrmFKR7Jmp8Gb20RpL8RVdXfie+l12iyLHoX kHoqp7LQz13z6iE9JgoaYM8YQbp3laute/GNYiU1Gweqv4iDV7wF2SKJVSQ/ca6h95bI lGpY5ksjTxzgf3LGWSQTM4yiUFZyFYrHGegyEuAEtqHZ9eqJqCZ4AwqQLounfEYvKD7W KOBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779492428; x=1780097228; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=xjZlVRSbkNxFG74UY3S/ZMtgusKOWOlshHLNu65IK0Y=; b=heAqFPw21nAJkeiBUpdg3WgsBPg90E/dYJj6i5+wSoK5Vb3LNH70cTMlO9YL97QLqh VmDyr/y0kwGD/ssC/PO7b4eupyT9Zu+Dl0KZLxp97rRtb0+t3cjiHsogwudAa+RO23RS bRGXuts2k2xMdYS8QJhx2Yw0w1EVPI6s/PBKuVtbD0wMsE2iuBBi9A7hjh6fcoV+Sy5c gQ9UMMEhjgbH++rIY3NIiLT7A6WUZtYTDUrvI8VaAKSU8PrpCf2OE+FCwvCDTwao8TyD t8lnn88oWhgbdX+MAdgB6LapA2aiCfM5Zw7bP7Q7mAVqaoWgLylwWOkYtYPD+2ogzS2z LVvw== X-Forwarded-Encrypted: i=1; AFNElJ+ZCkzuAL4/0cVHN9W1FvVFgOv5nkkH+vq0iZjc7q+Xwhi1EyrKeDDrXcjbxMAylOsp1Ms4b7K96B7MXQM=@vger.kernel.org X-Gm-Message-State: AOJu0YwuUpcgn+J9v15IpbxD3z2u2Av7Kz5+raaUs1Vk4tLBRZPgdlXx rKoueyQgQFj56GIOkh2NqT404w6w5QqABYjRByB7P8XEUEtwP84X4RsbB5CYSFuCzve8NYgfqEE ytJxSfw== X-Received: from pgee26.prod.google.com ([2002:a63:1e1a:0:b0:c79:788d:5b72]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:2f05:b0:2be:1eb1:eaf7 with SMTP id d9443c01a7336-2beb05b5f70mr63466925ad.24.1779492428028; Fri, 
22 May 2026 16:27:08 -0700 (PDT) Reply-To: Sean Christopherson Date: Fri, 22 May 2026 16:27:01 -0700 In-Reply-To: <20260522232701.3671446-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260522232701.3671446-1-seanjc@google.com> X-Mailer: git-send-email 2.54.0.794.g4f17f83d09-goog Message-ID: <20260522232701.3671446-6-seanjc@google.com> Subject: [PATCH v4 5/5] KVM: selftests: Add nested page fault injection test From: Sean Christopherson To: Sean Christopherson , Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Kevin Cheng Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Kevin Cheng Add a test that exercises nested page fault injection during L2 execution. L2 executes I/O string instructions (OUTSB/INSB) that access memory restricted in L1's nested page tables (NPT/EPT), triggering a nested page fault that L0 must inject to L1. 
The test supports both AMD SVM (NPF) and Intel VMX (EPT violation) and verifies that: - The exit reason is an NPF/EPT violation - The access type and permission bits are correct - The faulting GPA is correct Four test cases are implemented: - Unmap the final data page (final translation fault, OUTSB read) - Unmap a PT page (page walk fault, OUTSB read) - Write-protect the final data page (protection violation, INSB write) - Write-protect a PT page (protection violation on A/D update, OUTSB read) Signed-off-by: Kevin Cheng [sean: name it nested_tdp_fault_test, consolidate asserts] Signed-off-by: Sean Christopherson --- tools/testing/selftests/kvm/Makefile.kvm | 1 + .../selftests/kvm/include/x86/processor.h | 9 + .../selftests/kvm/x86/nested_tdp_fault_test.c | 313 ++++++++++++++++++ 3 files changed, 323 insertions(+) create mode 100644 tools/testing/selftests/kvm/x86/nested_tdp_fault_test.c diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selft= ests/kvm/Makefile.kvm index 82fa943b9503..2908eca1647a 100644 --- a/tools/testing/selftests/kvm/Makefile.kvm +++ b/tools/testing/selftests/kvm/Makefile.kvm @@ -97,6 +97,7 @@ TEST_GEN_PROGS_x86 +=3D x86/nested_emulation_test TEST_GEN_PROGS_x86 +=3D x86/nested_exceptions_test TEST_GEN_PROGS_x86 +=3D x86/nested_invalid_cr3_test TEST_GEN_PROGS_x86 +=3D x86/nested_set_state_test +TEST_GEN_PROGS_x86 +=3D x86/nested_tdp_fault_test TEST_GEN_PROGS_x86 +=3D x86/nested_tsc_adjust_test TEST_GEN_PROGS_x86 +=3D x86/nested_tsc_scaling_test TEST_GEN_PROGS_x86 +=3D x86/nested_vmsave_vmload_test diff --git a/tools/testing/selftests/kvm/include/x86/processor.h b/tools/te= sting/selftests/kvm/include/x86/processor.h index 851ffcd3340c..06878e7c7347 100644 --- a/tools/testing/selftests/kvm/include/x86/processor.h +++ b/tools/testing/selftests/kvm/include/x86/processor.h @@ -1573,6 +1573,15 @@ u64 *tdp_get_pte(struct kvm_vm *vm, u64 l2_gpa); #define PFERR_GUEST_PAGE_MASK BIT_ULL(PFERR_GUEST_PAGE_BIT) #define
PFERR_IMPLICIT_ACCESS BIT_ULL(PFERR_IMPLICIT_ACCESS_BIT) =20 +#define EPT_VIOLATION_ACC_READ BIT(0) +#define EPT_VIOLATION_ACC_WRITE BIT(1) +#define EPT_VIOLATION_ACC_INSTR BIT(2) +#define EPT_VIOLATION_PROT_READ BIT(3) +#define EPT_VIOLATION_PROT_WRITE BIT(4) +#define EPT_VIOLATION_PROT_EXEC BIT(5) +#define EPT_VIOLATION_GVA_IS_VALID BIT(7) +#define EPT_VIOLATION_GVA_TRANSLATED BIT(8) + bool sys_clocksource_is_based_on_tsc(void); =20 #endif /* SELFTEST_KVM_PROCESSOR_H */ diff --git a/tools/testing/selftests/kvm/x86/nested_tdp_fault_test.c b/tool= s/testing/selftests/kvm/x86/nested_tdp_fault_test.c new file mode 100644 index 000000000000..fa95568f55ff --- /dev/null +++ b/tools/testing/selftests/kvm/x86/nested_tdp_fault_test.c @@ -0,0 +1,313 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (C) 2025, Google, Inc. + */ + +#include "test_util.h" +#include "kvm_util.h" +#include "processor.h" +#include "svm_util.h" +#include "vmx.h" + +#define L2_GUEST_STACK_SIZE 64 + +enum test_type { + TEST_FINAL_PAGE_UNMAPPED, /* Final data page not present */ + TEST_PT_PAGE_UNMAPPED, /* Page table page not present */ + TEST_FINAL_PAGE_WRITE_PROTECTED, /* Final data page read-only */ + TEST_PT_PAGE_WRITE_PROTECTED, /* Page table page read-only */ +}; + +static gva_t l2_test_page; +static void (*l2_entry)(void); + +#define TEST_IO_PORT 0x80 +#define TEST1_VADDR 0x8000000ULL +#define TEST2_VADDR 0x10000000ULL +#define TEST3_VADDR 0x18000000ULL +#define TEST4_VADDR 0x20000000ULL + +/* + * L2 executes OUTS reading from l2_test_page, triggering a nested page + * fault on the read access. + */ +static void l2_guest_code_outs(void) +{ + asm volatile("outsb" ::"S"(l2_test_page), "d"(TEST_IO_PORT) : "memory"); + GUEST_FAIL("L2 should not reach here"); +} + +/* + * L2 executes INS writing to l2_test_page, triggering a nested page + * fault on the write access. 
+ */ +static void l2_guest_code_ins(void) +{ + asm volatile("insb" ::"D"(l2_test_page), "d"(TEST_IO_PORT) : "memory"); + GUEST_FAIL("L2 should not reach here"); +} + +#define GUEST_ASSERT_EXIT_QUAL(ac_eq, ex_eq) \ + __GUEST_ASSERT((ac_eq) =3D=3D (ex_eq), \ + "Wanted EXIT_QUAL '0x%lx', got '0x%lx'", ex_eq, ac_eq) + +static void l1_vmx_code(struct vmx_pages *vmx, u64 expected_fault_gpa, + u64 test_type) +{ + unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE]; + u64 exit_qual; + + GUEST_ASSERT(vmx->vmcs_gpa); + GUEST_ASSERT(prepare_for_vmx_operation(vmx)); + GUEST_ASSERT(load_vmcs(vmx)); + + prepare_vmcs(vmx, l2_entry, &l2_guest_stack[L2_GUEST_STACK_SIZE]); + + GUEST_ASSERT(!vmlaunch()); + + /* Verify we got an EPT violation exit */ + __GUEST_ASSERT(vmreadz(VM_EXIT_REASON) =3D=3D EXIT_REASON_EPT_VIOLATION, + "Expected EPT violation (0x%x), got 0x%lx", + EXIT_REASON_EPT_VIOLATION, + vmreadz(VM_EXIT_REASON)); + + __GUEST_ASSERT(vmreadz(GUEST_PHYSICAL_ADDRESS) =3D=3D expected_fault_gpa, + "Expected guest_physical_address =3D 0x%lx, got 0x%lx", + expected_fault_gpa, + vmreadz(GUEST_PHYSICAL_ADDRESS)); + + exit_qual =3D vmreadz(EXIT_QUALIFICATION); + + /* + * Note, EPT page table accesses are always read+write, e.g. so that + * the CPU can do A/D updates at-will. 
+ */ + switch (test_type) { + case TEST_FINAL_PAGE_UNMAPPED: + GUEST_ASSERT_EXIT_QUAL(exit_qual, EPT_VIOLATION_ACC_READ | + EPT_VIOLATION_GVA_IS_VALID | + EPT_VIOLATION_GVA_TRANSLATED); + break; + case TEST_PT_PAGE_UNMAPPED: + GUEST_ASSERT_EXIT_QUAL(exit_qual, EPT_VIOLATION_ACC_READ | + EPT_VIOLATION_ACC_WRITE | + EPT_VIOLATION_GVA_IS_VALID); + break; + case TEST_FINAL_PAGE_WRITE_PROTECTED: + GUEST_ASSERT_EXIT_QUAL(exit_qual, EPT_VIOLATION_ACC_WRITE | + EPT_VIOLATION_PROT_READ | + EPT_VIOLATION_PROT_EXEC | + EPT_VIOLATION_GVA_IS_VALID | + EPT_VIOLATION_GVA_TRANSLATED); + break; + case TEST_PT_PAGE_WRITE_PROTECTED: + GUEST_ASSERT_EXIT_QUAL(exit_qual, EPT_VIOLATION_ACC_READ | + EPT_VIOLATION_ACC_WRITE | + EPT_VIOLATION_PROT_READ | + EPT_VIOLATION_PROT_EXEC | + EPT_VIOLATION_GVA_IS_VALID); + break; + } + + GUEST_DONE(); +} + +#define GUEST_ASSERT_NPF_EC(ac_ec, ex_ec) \ + __GUEST_ASSERT((ac_ec) =3D=3D (ex_ec), \ + "Wanted NPF error code '0x%lx', got '0x%lx'", (u64)(ex_ec), ac_ec) + + +static void l1_svm_code(struct svm_test_data *svm, u64 expected_fault_gpa, + u64 test_type) +{ + unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE]; + struct vmcb *vmcb =3D svm->vmcb; + u64 exit_info_1; + + generic_svm_setup(svm, l2_entry, + &l2_guest_stack[L2_GUEST_STACK_SIZE]); + + run_guest(vmcb, svm->vmcb_gpa); + + /* Verify we got an NPF exit */ + __GUEST_ASSERT(vmcb->control.exit_code =3D=3D SVM_EXIT_NPF, + "Expected NPF exit (0x%x), got 0x%lx", SVM_EXIT_NPF, + vmcb->control.exit_code); + + __GUEST_ASSERT(vmcb->control.exit_info_2 =3D=3D expected_fault_gpa, + "Expected exit_info_2 =3D 0x%lx, got 0x%lx", + expected_fault_gpa, + vmcb->control.exit_info_2); + + exit_info_1 =3D vmcb->control.exit_info_1; + + /* + * Note, without GMET enabled, NPT walks are always user accesses. And + * like EPT, page table accesses are always read+write. 
+ */ + switch (test_type) { + case TEST_FINAL_PAGE_UNMAPPED: + GUEST_ASSERT_NPF_EC(exit_info_1, PFERR_USER_MASK | + PFERR_GUEST_FINAL_MASK); + break; + case TEST_PT_PAGE_UNMAPPED: + GUEST_ASSERT_NPF_EC(exit_info_1, PFERR_WRITE_MASK | + PFERR_USER_MASK | + PFERR_GUEST_PAGE_MASK); + break; + case TEST_FINAL_PAGE_WRITE_PROTECTED: + GUEST_ASSERT_NPF_EC(exit_info_1, PFERR_PRESENT_MASK | + PFERR_WRITE_MASK | + PFERR_USER_MASK | + PFERR_GUEST_FINAL_MASK); + break; + case TEST_PT_PAGE_WRITE_PROTECTED: + GUEST_ASSERT_NPF_EC(exit_info_1, PFERR_PRESENT_MASK | + PFERR_WRITE_MASK | + PFERR_USER_MASK | + PFERR_GUEST_PAGE_MASK); + break; + } + + GUEST_DONE(); +} + +static void l1_guest_code(void *data, u64 expected_fault_gpa, + u64 test_type) +{ + if (this_cpu_has(X86_FEATURE_VMX)) + l1_vmx_code(data, expected_fault_gpa, test_type); + else + l1_svm_code(data, expected_fault_gpa, test_type); +} + +/* Returns the GPA of the PT page that maps @vaddr. */ +static u64 get_pt_gpa_for_vaddr(struct kvm_vm *vm, u64 vaddr) +{ + u64 *pte; + + pte =3D vm_get_pte(vm, vaddr); + TEST_ASSERT(pte && (*pte & 0x1), "PTE not present for vaddr 0x%lx", + (unsigned long)vaddr); + + return addr_hva2gpa(vm, (void *)((u64)pte & ~0xFFFULL)); +} + +static void run_test(enum test_type type) +{ + gpa_t expected_fault_gpa; + gva_t nested_gva; + + struct kvm_vcpu *vcpu; + struct kvm_vm *vm; + struct ucall uc; + + vm =3D vm_create_with_one_vcpu(&vcpu, l1_guest_code); + vm_enable_tdp(vm); + + if (kvm_cpu_has(X86_FEATURE_VMX)) + vcpu_alloc_vmx(vm, &nested_gva); + else + vcpu_alloc_svm(vm, &nested_gva); + + switch (type) { + case TEST_FINAL_PAGE_UNMAPPED: + /* + * Unmap the final data page from NPT/EPT. The guest page + * table walk succeeds, but the final GPA->HPA translation + * fails. L2 reads from the page via OUTS. 
+ */ + l2_entry =3D l2_guest_code_outs; + l2_test_page =3D vm_alloc(vm, vm->page_size, TEST1_VADDR); + expected_fault_gpa =3D addr_gva2gpa(vm, l2_test_page); + break; + case TEST_PT_PAGE_UNMAPPED: + /* + * Unmap a page table page from NPT/EPT. The hardware page + * table walk fails when translating the PT page's GPA + * through NPT/EPT. L2 reads from the page via OUTS. + */ + l2_entry =3D l2_guest_code_outs; + l2_test_page =3D vm_alloc(vm, vm->page_size, TEST2_VADDR); + expected_fault_gpa =3D get_pt_gpa_for_vaddr(vm, l2_test_page); + break; + case TEST_FINAL_PAGE_WRITE_PROTECTED: + /* + * Write-protect the final data page in NPT/EPT. The page + * is present and readable, but not writable. L2 writes to + * the page via INS, triggering a protection violation. + */ + l2_entry =3D l2_guest_code_ins; + l2_test_page =3D vm_alloc(vm, vm->page_size, TEST3_VADDR); + expected_fault_gpa =3D addr_gva2gpa(vm, l2_test_page); + break; + case TEST_PT_PAGE_WRITE_PROTECTED: + /* + * Write-protect a page table page in NPT/EPT. The page is + * present and readable, but not writable. The guest page + * table walk needs write access to set A/D bits, so it + * triggers a protection violation on the PT page. + * L2 reads from the page via OUTS. 
+ */ + l2_entry =3D l2_guest_code_outs; + l2_test_page =3D vm_alloc(vm, vm->page_size, TEST4_VADDR); + expected_fault_gpa =3D get_pt_gpa_for_vaddr(vm, l2_test_page); + break; + } + + tdp_identity_map_default_memslots(vm); + + if (type =3D=3D TEST_FINAL_PAGE_WRITE_PROTECTED || + type =3D=3D TEST_PT_PAGE_WRITE_PROTECTED) + *tdp_get_pte(vm, expected_fault_gpa) &=3D ~PTE_WRITABLE_MASK(&vm->stage2= _mmu); + else + *tdp_get_pte(vm, expected_fault_gpa) &=3D ~(PTE_PRESENT_MASK(&vm->stage2= _mmu) | + PTE_READABLE_MASK(&vm->stage2_mmu) | + PTE_WRITABLE_MASK(&vm->stage2_mmu) | + PTE_EXECUTABLE_MASK(&vm->stage2_mmu)); + + sync_global_to_guest(vm, l2_entry); + sync_global_to_guest(vm, l2_test_page); + vcpu_args_set(vcpu, 3, nested_gva, expected_fault_gpa, (u64)type); + + /* + * For the INS-based write test, KVM emulates the instruction and + * first reads from the I/O port, which exits to userspace. + * Re-enter the guest so emulation can proceed to the memory + * write, where the nested page fault is triggered. + */ + for (;;) { + vcpu_run(vcpu); + + if (vcpu->run->exit_reason =3D=3D KVM_EXIT_IO && + vcpu->run->io.port =3D=3D TEST_IO_PORT && + vcpu->run->io.direction =3D=3D KVM_EXIT_IO_IN) { + continue; + } + break; + } + + switch (get_ucall(vcpu, &uc)) { + case UCALL_DONE: + break; + case UCALL_ABORT: + REPORT_GUEST_ASSERT(uc); + default: + TEST_FAIL("Unexpected exit reason: %d", vcpu->run->exit_reason); + } + + kvm_vm_free(vm); +} + +int main(int argc, char *argv[]) +{ + TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_VMX) || kvm_cpu_has(X86_FEATURE_SVM)= ); + TEST_REQUIRE(kvm_cpu_has_tdp()); + + run_test(TEST_FINAL_PAGE_UNMAPPED); + run_test(TEST_PT_PAGE_UNMAPPED); + run_test(TEST_FINAL_PAGE_WRITE_PROTECTED); + run_test(TEST_PT_PAGE_WRITE_PROTECTED); + + return 0; +} --=20 2.54.0.794.g4f17f83d09-goog
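As a reading aid for the VMX expectations in the selftest above, the four expected EXIT_QUALIFICATION values can be derived from three inputs. The following standalone C sketch is not part of the patch: expected_ept_exit_qual() is a hypothetical helper, and it assumes (as the test does) that a write-protected EPT entry stays readable and executable while an unmapped entry has no permission bits at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in the selftest's processor.h additions. */
#define EPT_VIOLATION_ACC_READ        (1ULL << 0)
#define EPT_VIOLATION_ACC_WRITE       (1ULL << 1)
#define EPT_VIOLATION_PROT_READ       (1ULL << 3)
#define EPT_VIOLATION_PROT_EXEC       (1ULL << 5)
#define EPT_VIOLATION_GVA_IS_VALID    (1ULL << 7)
#define EPT_VIOLATION_GVA_TRANSLATED  (1ULL << 8)

/*
 * Hypothetical helper: compute the exit qualification the test expects.
 *  - is_write:        the L2 access is a write (INS) rather than a read (OUTS)
 *  - is_pt_access:    the fault hit a guest page table page, not the data page
 *  - write_protected: the EPT entry is present but read-only (vs. unmapped)
 */
static uint64_t expected_ept_exit_qual(bool is_write, bool is_pt_access,
				       bool write_protected)
{
	uint64_t qual = EPT_VIOLATION_GVA_IS_VALID;

	if (is_pt_access)
		/* Page table accesses are treated as read+write. */
		qual |= EPT_VIOLATION_ACC_READ | EPT_VIOLATION_ACC_WRITE;
	else
		/* The GVA was fully translated; the final access faulted. */
		qual |= EPT_VIOLATION_GVA_TRANSLATED |
			(is_write ? EPT_VIOLATION_ACC_WRITE :
				    EPT_VIOLATION_ACC_READ);

	if (write_protected)
		/* Only the W permission was cleared; R and X remain set. */
		qual |= EPT_VIOLATION_PROT_READ | EPT_VIOLATION_PROT_EXEC;

	return qual;
}
```

For example, expected_ept_exit_qual(false, true, false) corresponds to TEST_PT_PAGE_UNMAPPED, and expected_ept_exit_qual(true, false, true) corresponds to TEST_FINAL_PAGE_WRITE_PROTECTED.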