From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3477143CEEF for ; Wed, 3 Jun 2026 10:58:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484303; cv=none; b=JT67Us/O3s3d1eylUdECZPwElRRYWROHUL8ZYnS7ECC7tKTds4qnpXNG5mBGi55zwM1QiREwYP26Iy3ZBgH0tZDbb9j6PxxHZWGH1qgk4xoadclNrDOVOaGDF27J2ExFPp6ypCCKLpL3blEr1HaBegbdWe17FrURMZ33uRM4yIg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484303; c=relaxed/simple; bh=6au/S327OgT7wu5gI5vMswLdmBiMZTUeYiVzWqFMeGU=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Vtl4lraRAChdcSKO6cbF0ofbm/UXEhsWmpzb0rdC46QAKI6nY4cULOiJUyrS5jLU/hFcm9XJXdPHCva49Vq8aaervhd3+u96KEzs0xHU34qnIWzykTZYPKkFZukCndgqup/gViQoAQEERxgcTTaDrmqAy8tZLUvaGveI7H+mQ+Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Eabz/jqg; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Eabz/jqg" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484300; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gGbyH9UKeJXKESFRTMlgJjDKtUie2VdU9pMcPcY/c4E=; b=Eabz/jqg3KZWRI5NFY/SWILcXQ0T7su8iK+IdZJx4D/3AqCbt0Zv7qZYQwcALzqDa6O3d9 wsOaxtZiKBXgVGYfQ6XY0YcR3EBVBKOe20RReVAsZEWaub0MrRqKcTEiSV7mkyzj1bYFs9 FhZM2webzS4t3k21jQTI/eHx4uV0abk= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-517-jqZwBOgpO8eJA7ApELw2Kw-1; Wed, 03 Jun 2026 06:58:17 -0400 X-MC-Unique: jqZwBOgpO8eJA7ApELw2Kw-1 X-Mimecast-MFC-AGG-ID: jqZwBOgpO8eJA7ApELw2Kw_1780484296 Received: from mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.95]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1E904195608C; Wed, 3 Jun 2026 10:58:16 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id ABAB4404; Wed, 3 Jun 2026 10:58:15 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 01/24] KVM: x86: remove nested_mmu from mmu_is_nested() Date: Wed, 3 Jun 2026 06:57:51 -0400 Message-ID: <20260603105814.10236-2-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.6 on 10.30.177.95 
Content-Type: text/plain; charset="utf-8"

nested_mmu is always stored into vcpu->arch.walk_mmu at the same time as guest_mmu is stored into vcpu->arch.mmu. But nested_mmu is not even a proper MMU, it is only used for page walking; plus the fact that walk_mmu has to be switched at all is just an implementation detail.

In the end what matters here is whether the guest is using nested page tables; vmx/nested.c and svm/nested.c check it to see if they are in nEPT or nNPT context respectively.

So switch to checking root_mmu vs. guest_mmu, which is a more cogent test.

Signed-off-by: Paolo Bonzini
Message-ID: <20260511150648.685374-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/x86.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 38a905fa86de..60ff064de12f 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -290,7 +290,7 @@ static inline bool x86_exception_has_error_code(unsigned int vector)
 
 static inline bool mmu_is_nested(struct kvm_vcpu *vcpu)
 {
-	return vcpu->arch.walk_mmu == &vcpu->arch.nested_mmu;
+	return vcpu->arch.mmu == &vcpu->arch.guest_mmu;
 }
 
 static inline bool is_pae(struct kvm_vcpu *vcpu)
-- 
2.52.0

From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 11B7C402427 for ; Wed, 3 Jun 2026 10:58:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484301; cv=none; b=KlOORP+uPgqLaPdbx5XOXP7G0Ip9Wbc/3XgA2V0xXR/jZY3ll9ej9s0fe0XAfbZJSHd6cDxD6q8jKAan/YW30o7vULmpX/Ptq1sLGspgsFMb9TvZhV17iiLNOPL9kDEb6OtUY86eXltjML7viOaNtqch+AGDPknUANb/0y+v2D8= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1780484301; c=relaxed/simple; bh=uGhlhaw8FKlDLXT261XQKolFsUWFnCaqY7nW8k90AiA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=SzifRkoL+WCW334TbFv2b4KC/f1NxMsV6KeePF+Yq8p+cqRqTJitUCTUwWDS3lJrA0rcClH+QuKx3ljfvsDJ8ZOkwx9gZpirjzU3t5vfGlLIDXIMKSJmZlRuGN+4/P4Qu62ueEEl7MSZ6Cq4znklwc6zj7PJtUymxu86LhCvMgw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=amUGGSWw; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="amUGGSWw" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484299; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=q2SVIGrCB6MttCzDE0j3sPFqyLMhK1UhS8dftKLkZo8=; b=amUGGSWwjJwjOgvNA3WsIvkIqh/PGzvivy3oFKdSjMRQrTNMeQ5QRazrEOdAwdYNuiRaUk /qIoFYRGejh1zLviRrP7XsdCxgg/tRSHykOv64ksU+8I1L3hN99yHP3JcoitUf6xHfHDhy NsMxR4TzqbCadXuIGtk3FQfPiA/OtyA= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-561-cnYophKxPOOrjjN_fggQMw-1; Wed, 03 Jun 2026 06:58:17 -0400 X-MC-Unique: cnYophKxPOOrjjN_fggQMw-1 X-Mimecast-MFC-AGG-ID: cnYophKxPOOrjjN_fggQMw_1780484296 Received: from 
mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.95]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B75E8195608B; Wed, 3 Jun 2026 10:58:16 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 440CD404; Wed, 3 Jun 2026 10:58:16 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Sean Christopherson Subject: [PATCH 02/24] KVM: nVMX: remove unnecessary code in prepare_vmcs02_rare Date: Wed, 3 Jun 2026 06:57:52 -0400 Message-ID: <20260603105814.10236-3-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.6 on 10.30.177.95 Content-Type: text/plain; charset="utf-8"

The early vmwrite of the PDPTRs in prepare_vmcs02_rare() is redundant, because every write it does will be performed by prepare_vmcs02() if it is actually needed. In any case where the emulator or the processor needs the PDPTRs, either is_pae_paging() is true on vmentry, or a write of CR0, CR4 or EFER will cause a vmexit to L0. The next vmentry will refresh the PDPTRs in the vmcs02 from vmcs12.

In fact, in the original version[1] of what ended up being commit c7554efc8335 ("KVM: nVMX: Copy PDPTRs to/from vmcs12 only when necessary"), the writes in what is now prepare_vmcs02_rare() were removed. When the mega-collection of optimizations was posted[2], the removal of that code got dropped as a rebase goof, so reinstate it.

[1] https://lore.kernel.org/all/20190507160640.4812-16-sean.j.christopherson@intel.com
[2] https://lore.kernel.org/all/1560445409-17363-31-git-send-email-pbonzini@redhat.com

Suggested-by: Sean Christopherson
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/vmx/nested.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4690a4d23709..1bd0839146fd 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2623,17 +2623,6 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
 	vmcs_writel(GUEST_SYSENTER_ESP, vmcs12->guest_sysenter_esp);
 	vmcs_writel(GUEST_SYSENTER_EIP, vmcs12->guest_sysenter_eip);
 
-	/*
-	 * L1 may access the L2's PDPTR, so save them to construct
-	 * vmcs12
-	 */
-	if (enable_ept) {
-		vmcs_write64(GUEST_PDPTR0, vmcs12->guest_pdptr0);
-		vmcs_write64(GUEST_PDPTR1, vmcs12->guest_pdptr1);
-		vmcs_write64(GUEST_PDPTR2, vmcs12->guest_pdptr2);
-		vmcs_write64(GUEST_PDPTR3, vmcs12->guest_pdptr3);
-	}
-
 	if (kvm_mpx_supported() && vmx->vcpu.arch.nested_run_pending &&
 	    (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
 		vmcs_write64(GUEST_BNDCFGS, vmcs12->guest_bndcfgs);
-- 
2.52.0

From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1F979426EB2 for ; Wed, 3 Jun 2026 10:58:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484305; cv=none; 
b=XkkH5Iat8z+V+KUfhobSgeAWhS0q5iqCwJi5Y6jj0las15qbYmuLrudB2jo8sBsNrKQUkx3sM9dTAi1Hsg3PwuPNtrxRQ2IuNtBG2WReU8YfSGO48s+lhTepWAkVLMK7wNX++MUjK/OH3z5sM+uOH/MxBajogH2eN4+KF18VQl4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484305; c=relaxed/simple; bh=6ZbOr0gKHxAYTpvFZnZxiMjWEtVGHjzS16uPjJSmNJw=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=mUzuMQgjmkSFspfT50CtDQ5Wz5bylT8LB7gRtPXncP0a2jcoiT3lo9dSY+B80N5jmAmgK7wPfbNUciDWwAOgT9MzsFf+V4ZZuNclCOtON5YcY1iIKkm4iwUObYmNWSkNXg0bA4yFf8sjdtr3hNBZmSPoHhT4pQcLX7AAXclaYBE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=fcE8gTXv; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="fcE8gTXv" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484301; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xEvCRcct3nMsP/Y2MLDDtJZ+e6IxGPXMrxmH4YuafcE=; b=fcE8gTXvsiBF+KuPBngiZiOI3gQMG+KxlhEtuI3MzI1GA5RgKqE89cmWxTj12ayNaFXJu2 fmfpqBXG43K2P5D6yGLdISEKibaKdc+3S6zTpabyYgaKm0LgJ/fvguiZLctyc+XhqiDQH6 4yLh/204glI9jC/Uv0uACPJUxI5BspY= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id 
us-mta-351-qaLnXK71NRizvw-bQfumxg-1; Wed, 03 Jun 2026 06:58:18 -0400 X-MC-Unique: qaLnXK71NRizvw-bQfumxg-1 X-Mimecast-MFC-AGG-ID: qaLnXK71NRizvw-bQfumxg_1780484297 Received: from mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.95]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 44DEA1800473; Wed, 3 Jun 2026 10:58:17 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id DE021404; Wed, 3 Jun 2026 10:58:16 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 03/24] KVM: nSVM: invalidate cached PDPTRs across nested NPT transitions Date: Wed, 3 Jun 2026 06:57:53 -0400 Message-ID: <20260603105814.10236-4-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.6 on 10.30.177.95 Content-Type: text/plain; charset="utf-8" When L2 runs under nested NPT and uses PAE paging, KVM's cached PDPTRs in mmu->pdptrs[] can hold stale or wrong values after nested transitions and across migration restore, because both nested_svm_load_cr3() and svm_get_nested_state_pages() only refresh PDPTRs on the !nested_npt path. The user-visible bug is on migration restore of an L2 running with nested NPT and 32-bit PAE paging, if userspace uses KVM_SET_SREGS rather than KVM_SET_SREGS2. 
In that case, load_pdptrs() leaves VCPU_EXREG_PDPTR marked as available, and kvm_pdptr_read() will use a stale translation that used L1 GPAs instead of L2 nGPAs. svm_get_nested_state_pages() runs on first KVM_RUN but skips the refresh because nested_npt_enabled() is true. The CPU itself reads L2's PDPTRs correctly from memory via L1's NPT, but KVM-side walking of guest PAE page tables uses the bogus cached values. Unlike Intel's GUEST_PDPTR0..3 fields in the VMCS, SVM has no VMCB-cached PDPTR state: the in-memory PDPTEs at the current CR3 are the only source of truth, and svm_cache_reg(VCPU_EXREG_PDPTR) simply reloads them from memory via load_pdptrs(). Clearing the avail bit (and the dirty bit, because !avail/dirty is an invalid combination) to force a reload when the PDPTRs are needed fixes the bug. Do the same for nested_svm_load_cr3()'s nested_npt branch, so that the invariant "PDPTRs need reloading" is handled similarly for both immediate and deferred loading. Note that SVM's usage of pdptrs is overall doubtful, because load_pdptrs() will return 0 without updating mmu->pdptrs in case of failures but marks the register as available before the load attempt. Probably, nSVM shouldn't be using kvm_get_pdptr() at all. 
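The avail/dirty handling described above can be pictured with a small user-space model. This is only a sketch under stated assumptions, not KVM code: every toy_* name, the struct layout and the use of plain unsigned long bitmaps are hypothetical stand-ins for vcpu->arch.regs_avail, vcpu->arch.regs_dirty and the cache_reg callback. It shows why clearing both bits makes the next read go back to the in-memory PDPTEs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register index; stands in for the PDPTR cache slot. */
enum toy_reg { TOY_REG_PDPTR = 0 };

/* Toy vCPU: two bitmaps plus the cached values and their memory source. */
struct toy_vcpu {
	unsigned long regs_avail;	/* bit set: cached value may be used */
	unsigned long regs_dirty;	/* bit set: cached value must be written back */
	uint64_t pdptrs[4];		/* cached PDPTRs */
	uint64_t mem_pdptes[4];		/* stand-in for the PDPTEs in guest memory */
	unsigned int reloads;		/* number of reloads from "memory" */
};

/* Drop the cached value: clear avail, and dirty too since !avail/dirty is invalid. */
static void toy_mark_for_reload(struct toy_vcpu *v, enum toy_reg reg)
{
	v->regs_avail &= ~(1UL << reg);
	v->regs_dirty &= ~(1UL << reg);
}

/* Reload the cache from "memory" and mark it available again. */
static void toy_cache_reg(struct toy_vcpu *v, enum toy_reg reg)
{
	for (int i = 0; i < 4; i++)
		v->pdptrs[i] = v->mem_pdptes[i];
	v->regs_avail |= 1UL << reg;
	v->reloads++;
}

/* Read one PDPTR, reloading first if the cache is not available. */
static uint64_t toy_pdptr_read(struct toy_vcpu *v, int index)
{
	assert(index >= 0 && index < 4);
	if (!(v->regs_avail & (1UL << TOY_REG_PDPTR)))
		toy_cache_reg(v, TOY_REG_PDPTR);
	return v->pdptrs[index];
}
```

In this model, a vCPU whose avail bit is still set keeps returning whatever is in pdptrs[], even if it was computed with the wrong translation; after toy_mark_for_reload() the first read repopulates the cache from mem_pdptes[] and later reads hit the cache again.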
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/kvm_cache_regs.h |  8 ++++++++
 arch/x86/kvm/svm/nested.c     | 27 ++++++++++++++++++---------
 2 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 2ae492ad6412..6bae5db5a54e 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -77,6 +77,14 @@ static inline bool kvm_register_is_dirty(struct kvm_vcpu *vcpu,
 	return test_bit(reg, vcpu->arch.regs_dirty);
 }
 
+static inline void kvm_register_mark_for_reload(struct kvm_vcpu *vcpu,
+						enum kvm_reg reg)
+{
+	kvm_assert_register_caching_allowed(vcpu);
+	__clear_bit(reg, vcpu->arch.regs_avail);
+	__clear_bit(reg, vcpu->arch.regs_dirty);
+}
+
 static inline void kvm_register_mark_available(struct kvm_vcpu *vcpu,
 					       enum kvm_reg reg)
 {
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 3d1fd1776e19..aa5a1d8ea136 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -680,9 +680,12 @@ static int nested_svm_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
 	if (CC(!kvm_vcpu_is_legal_cr3(vcpu, cr3)))
 		return -EINVAL;
 
-	if (reload_pdptrs && !nested_npt && is_pae_paging(vcpu) &&
-	    CC(!load_pdptrs(vcpu, cr3)))
-		return -EINVAL;
+	if (reload_pdptrs && is_pae_paging(vcpu)) {
+		if (nested_npt)
+			kvm_register_mark_for_reload(vcpu, VCPU_REG_PDPTR);
+		else if (CC(!load_pdptrs(vcpu, cr3)))
+			return -EINVAL;
+	}
 
 	vcpu->arch.cr3 = cr3;
 
@@ -2055,15 +2058,21 @@ static bool svm_get_nested_state_pages(struct kvm_vcpu *vcpu)
 	if (WARN_ON(!is_guest_mode(vcpu)))
 		return true;
 
-	if (!vcpu->arch.pdptrs_from_userspace &&
-	    !nested_npt_enabled(to_svm(vcpu)) && is_pae_paging(vcpu))
+	if (is_pae_paging(vcpu)) {
 		/*
-		 * Reload the guest's PDPTRs since after a migration
-		 * the guest CR3 might be restored prior to setting the nested
-		 * state which can lead to a load of wrong PDPTRs.
+		 * After migration, CR3 may have been restored before
+		 * KVM_SET_NESTED_STATE, so the PDPTR load into mmu->pdptrs[]
+		 * may have treated CR3 as an L1 GPA. For nNPT, drop the
+		 * cache so the next access reloads them with the proper
+		 * nGPA translation. For !nNPT, reload eagerly unless userspace
+		 * already supplied authoritative PDPTRs via KVM_SET_SREGS2.
 		 */
-		if (CC(!load_pdptrs(vcpu, vcpu->arch.cr3)))
+		if (nested_npt_enabled(to_svm(vcpu)))
+			kvm_register_mark_for_reload(vcpu, VCPU_REG_PDPTR);
+		else if (!vcpu->arch.pdptrs_from_userspace &&
+			 CC(!load_pdptrs(vcpu, vcpu->arch.cr3)))
 			return false;
+	}
 
 	if (!nested_svm_merge_msrpm(vcpu)) {
 		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
-- 
2.52.0

From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F0CF343901A for ; Wed, 3 Jun 2026 10:58:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484303; cv=none; b=Dx1mEuriD1824r0/a3SVKZZlvL5TJV/RaVm5tFdKH9tUFdp1IJQY41K1GPHWxSvH7fuQ8t8YNbi1TXSiNQmSqnWX8wXPf2ina8j59Keb0sDs2Vz/SpcMb/mvQNrGpL7CQfTkIgSwfrFmPjkjI0Y4pSSNzykrR6DS4ZWM0MAOkdk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484303; c=relaxed/simple; bh=b0LcpMR544N/isNg3tfPHpqnBsdyM2QGsdAxWjYuI10=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=F+379hOiO9mMBKxXXBORvOYHrIbfhFlVVVMIwJed8+JAdIxKqjWs+lMOLw+Jhyg+r8wNbSElEWNH8ymXr7lO+CoqcvlxEbf/MmA1iDzgro2myVZprjDHpsWh1TOh/lDDEviavVWubqPQiZbZN06qsf5qBXeZ0Tl++NOVV2kv3jY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; 
dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=KaCECVeE; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="KaCECVeE" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484300; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kYe3fKfRxXiwypEiyAL53b1qVAEXN1RgGxgN7hKhSQg=; b=KaCECVeE6tzYIsK7vaBWH4luM9JfivJT5xmp8WxrhlsksxZG/EL64Eh8fomd+Mmw4Z29oS Mrq8A1ajzGcV/elETgfFQZv8iXhufcygU0qyLt/y1gaQTiM0smNV8+Yn5x9hdXOUJeMOGd qsPeQC1d+1o3KFnpurpJnQ8p+mcSOxA= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-90-FRVRSdfVOPuZV1ftrMXADA-1; Wed, 03 Jun 2026 06:58:18 -0400 X-MC-Unique: FRVRSdfVOPuZV1ftrMXADA-1 X-Mimecast-MFC-AGG-ID: FRVRSdfVOPuZV1ftrMXADA_1780484297 Received: from mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.95]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id CFB95195609D; Wed, 3 Jun 2026 10:58:17 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by 
mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 6AB66404; Wed, 3 Jun 2026 10:58:17 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 04/24] KVM: x86: check that kvm_handle_invpcid is only invoked with shadow paging Date: Wed, 3 Jun 2026 06:57:54 -0400 Message-ID: <20260603105814.10236-5-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.6 on 10.30.177.95 Content-Type: text/plain; charset="utf-8" This is true for both Intel and AMD. On Intel, "enable INVPCID" is set unconditionally if supported, but the vmexit is triggered by the "INVLPG exiting" control which is disabled by enable_ept. On AMD, KVM can intercept INVPCID if NPT is enabled but only in order to inject #UD in the guest. 
Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/x86.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 48f259015ce4..6897b9f4ce7f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -14282,6 +14282,9 @@ int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva)
 		return 1;
 	}
 
+	if (WARN_ON_ONCE(tdp_enabled))
+		return 0;
+
 	pcid_enabled = kvm_is_cr4_bit_set(vcpu, X86_CR4_PCIDE);
 
 	switch (type) {
-- 
2.52.0

From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 80D0043D500 for ; Wed, 3 Jun 2026 10:58:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484306; cv=none; b=e40eaAWcal/P7JHgwJpg7rqNuW1DdQYs9YvCv14g2XPxx3MYhAGk1WDeknPUuvuG/BnipDgQrtiWScLgKzwZhQJ84Gjc0mjCiHqHIn0EBeD9RF9ccofyp5PW87+fagqv16bGDD/66vnMOL5N8QqFz7TNwlt0hGKo1/wEtIo2lek= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484306; c=relaxed/simple; bh=JcToRIn/U+xOWtjgMSWKN5U6qzlUHVk2JlxSZbWnarI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=B9kofWSmgmVuLLHPvL19/k7y0/5IQfO9J1wBYQT2hTITX6IOa6x1cZ9B76h4wFBB6lO+xtbt2TwBY8Vs9s03t9va5XxKhnMIvpLoiBSR3frcnk624Ny22MHQs86V7lgA2Z25REjZEIIg6xfALKGB8lHG9trdGbqreLB2R3Q22iE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=E/iWdtg7; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) 
header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="E/iWdtg7" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484300; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zSe6e+PH4d/1rZediX8VwcMUZwfqecPZAaG8yeu5u6I=; b=E/iWdtg7Cku/DfwNVqYQQrFmiN2PzSidWm1OVvQXE4yh3bXRj3LrdtIxbhWpu1xaM7KAI2 U1jqi4ovETFBuA+Qj6Rn55VomVwHc+MrZisiqSgkMvxRpzM5rWHUl/bBNDSaY0a++QcXRA Lp57g1gw7vS7yFuNByOxvAL5xf5Zvgc= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-413-vM95RHIKMcinr3bxJFihCA-1; Wed, 03 Jun 2026 06:58:19 -0400 X-MC-Unique: vM95RHIKMcinr3bxJFihCA-1 X-Mimecast-MFC-AGG-ID: vM95RHIKMcinr3bxJFihCA_1780484298 Received: from mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.95]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 5B89F180049F; Wed, 3 Jun 2026 10:58:18 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 012A4404; Wed, 3 Jun 2026 10:58:17 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 05/24] 
KVM: x86/mmu: move pdptrs out of the MMU Date: Wed, 3 Jun 2026 06:57:55 -0400 Message-ID: <20260603105814.10236-6-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.6 on 10.30.177.95 Content-Type: text/plain; charset="utf-8" PDPTRs are part of the CPU state. A bit unconventionally, they are reached via vcpu->arch.walk_mmu instead of being stored in vcpu->arch directly. That is nice in principle---it would allow TDP shadow paging to have its own PDPTRs---but it is not necessary, because EPT has no PDPTRs and NPT does not cache them. Since kvm_pdptr_read does not otherwise need the MMU, drop the pdptrs from the MMU altogether. There is however something to be careful about, in that PDPTRs are now not stored separately in root_mmu and nested_mmu for L1 and L2 guests. In practice this was already not an issue: - for EPT the VMCS0x has to keep them up to date; and for the purpose of emulation they are always loaded from the VMCS on vmentry/vmexit, thanks to the clearing of dirty and available register bitmaps in vmx_switch_vmcs() - for NPT, VCPU_EXREG_PDPTR is similarly cleared for nNPT, which does not cache the PDPTRs; while for non-nNPT the PDPTRs are loaded together with the load of CR3. Note that page table PDPTRs are not affected, since they are stored in pae_root. 
Signed-off-by: Paolo Bonzini
---
 arch/x86/include/asm/kvm_host.h |  5 ++---
 arch/x86/kvm/kvm_cache_regs.h   |  4 ++--
 arch/x86/kvm/svm/svm.c          |  2 +-
 arch/x86/kvm/vmx/vmx.c          | 20 ++++++++------------
 arch/x86/kvm/x86.c              |  6 +++---
 5 files changed, 16 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 53527b0550c7..c7c1c2e2a7c2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -522,10 +522,7 @@ struct kvm_mmu {
 	 * the bits spte never used.
 	 */
 	struct rsvd_bits_validate shadow_zero_check;
-
 	struct rsvd_bits_validate guest_rsvd_check;
-
-	u64 pdptrs[4]; /* pae */
 };
 
 enum pmc_type {
@@ -883,6 +880,8 @@ struct kvm_vcpu_arch {
 	 */
 	struct kvm_mmu *walk_mmu;
 
+	u64 pdptrs[4]; /* pae */
+
 	struct kvm_mmu_memory_cache mmu_pte_list_desc_cache;
 	struct kvm_mmu_memory_cache mmu_shadow_page_cache;
 	struct kvm_mmu_memory_cache mmu_shadowed_info_cache;
diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 6bae5db5a54e..2a93e8c45c1a 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -192,12 +192,12 @@ static inline u64 kvm_pdptr_read(struct kvm_vcpu *vcpu, int index)
 	if (!kvm_register_is_available(vcpu, VCPU_REG_PDPTR))
 		kvm_x86_call(cache_reg)(vcpu, VCPU_REG_PDPTR);
 
-	return vcpu->arch.walk_mmu->pdptrs[index];
+	return vcpu->arch.pdptrs[index];
 }
 
 static inline void kvm_pdptr_write(struct kvm_vcpu *vcpu, int index, u64 value)
 {
-	vcpu->arch.walk_mmu->pdptrs[index] = value;
+	vcpu->arch.pdptrs[index] = value;
 }
 
 static inline ulong kvm_read_cr0_bits(struct kvm_vcpu *vcpu, ulong mask)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b78dd8805ebb..d190a81e030f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1526,7 +1526,7 @@ static void svm_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
 	switch (reg) {
 	case VCPU_REG_PDPTR:
 		/*
-		 * When !npt_enabled, mmu->pdptrs[] is already available since
+		 * When !npt_enabled, vcpu->pdptrs[] is already available since
 		 * it is always updated per SDM when moving to CRs.
 		 */
 		if (npt_enabled)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1701db1b2e18..5b74315f7e95 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3363,30 +3363,26 @@ void vmx_flush_tlb_guest(struct kvm_vcpu *vcpu)
 
 void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu)
 {
-	struct kvm_mmu *mmu = vcpu->arch.walk_mmu;
-
 	if (!kvm_register_is_dirty(vcpu, VCPU_REG_PDPTR))
 		return;
 
 	if (is_pae_paging(vcpu)) {
-		vmcs_write64(GUEST_PDPTR0, mmu->pdptrs[0]);
-		vmcs_write64(GUEST_PDPTR1, mmu->pdptrs[1]);
-		vmcs_write64(GUEST_PDPTR2, mmu->pdptrs[2]);
-		vmcs_write64(GUEST_PDPTR3, mmu->pdptrs[3]);
+		vmcs_write64(GUEST_PDPTR0, vcpu->arch.pdptrs[0]);
+		vmcs_write64(GUEST_PDPTR1, vcpu->arch.pdptrs[1]);
+		vmcs_write64(GUEST_PDPTR2, vcpu->arch.pdptrs[2]);
+		vmcs_write64(GUEST_PDPTR3, vcpu->arch.pdptrs[3]);
 	}
 }
 
 void ept_save_pdptrs(struct kvm_vcpu *vcpu)
 {
-	struct kvm_mmu *mmu = vcpu->arch.walk_mmu;
-
 	if (WARN_ON_ONCE(!is_pae_paging(vcpu)))
 		return;
 
-	mmu->pdptrs[0] = vmcs_read64(GUEST_PDPTR0);
-	mmu->pdptrs[1] = vmcs_read64(GUEST_PDPTR1);
-	mmu->pdptrs[2] = vmcs_read64(GUEST_PDPTR2);
-	mmu->pdptrs[3] = vmcs_read64(GUEST_PDPTR3);
+	vcpu->arch.pdptrs[0] = vmcs_read64(GUEST_PDPTR0);
+	vcpu->arch.pdptrs[1] = vmcs_read64(GUEST_PDPTR1);
+	vcpu->arch.pdptrs[2] = vmcs_read64(GUEST_PDPTR2);
+	vcpu->arch.pdptrs[3] = vmcs_read64(GUEST_PDPTR3);
 
 	kvm_register_mark_available(vcpu, VCPU_REG_PDPTR);
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6897b9f4ce7f..c5e55597533b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1065,7 +1065,7 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
 	gpa_t real_gpa;
 	int i;
 	int ret;
-	u64 pdpte[ARRAY_SIZE(mmu->pdptrs)];
+	u64 pdpte[ARRAY_SIZE(vcpu->arch.pdptrs)];
 
 	/*
 	 * If the MMU is nested, CR3 holds an L2 GPA and needs to be translated
@@ -1094,10 +1094,10 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
 	 * Marking VCPU_REG_PDPTR dirty doesn't work for !tdp_enabled.
 	 * Shadow page roots need to be reconstructed instead.
 	 */
-	if (!tdp_enabled && memcmp(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs)))
+	if (!tdp_enabled && memcmp(vcpu->arch.pdptrs, pdpte, sizeof(vcpu->arch.pdptrs)))
 		kvm_mmu_free_roots(vcpu->kvm, mmu, KVM_MMU_ROOT_CURRENT);
 
-	memcpy(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs));
+	memcpy(vcpu->arch.pdptrs, pdpte, sizeof(vcpu->arch.pdptrs));
 	kvm_register_mark_dirty(vcpu, VCPU_REG_PDPTR);
 	kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
 	vcpu->arch.pdptrs_from_userspace = false;
-- 
2.52.0

From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 29D8143E9D8 for ; Wed, 3 Jun 2026 10:58:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484305; cv=none; b=JR5SwfwTjedavxpMDIe19wryGgnqH0sEU5LGSJiAJda/j4crgJv5iGuhoijQwKSDbhmHcxl9TsPrc/WuKLCx7I7jp841BfjIi/Rny7e0FjyWYcOdaKIq0mLsZIdiy1mGl4uvs8vQsMTdl9eimL4rkKOG4KlQfg99A0zG2/U/Cgs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484305; c=relaxed/simple; bh=bPYfMbvOBKU32p3ra4LQVT35dUe5aIW0jZc2InSi8GE=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=CdlLvXuic78FukEfJXh+VvmX3Pdz0Wv4JjWJY+abdVmCRDm0Nc4/Vt6SI1jLzPkSWzFoXljDoHknbmKND+M8wgW36u01g76w4uVJEc7cFBJdbrs0QwfFBhPKd/1yOrew2SSxkOW1xLq4xD0xR6v+DAm251M7adhRKiJrHWFLB1I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit 
key) header.d=redhat.com header.i=@redhat.com header.b=WLFYDhCA; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="WLFYDhCA" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484301; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tcXNE2ohVHzsWnkzBLylsxrQ8cOzy7jBak/ymstitHc=; b=WLFYDhCArdMQTPxCBt14nS5mKA5JkfPG42lVBxFunc6JQSiX11TSt3FC8GddIuNM2wsgG6 QcYrIfhEqVPeyNXO9ABNOthmxjAr/u0QzcUH2HG8Jq7YWma5pRn36XhrWUTtoowepbp/mu ZScF2Bfzgdbwa46C/0Dj/jty86aYRlo= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-146-LKsWvMulOQOMxhfATAWxhQ-1; Wed, 03 Jun 2026 06:58:19 -0400 X-MC-Unique: LKsWvMulOQOMxhfATAWxhQ-1 X-Mimecast-MFC-AGG-ID: LKsWvMulOQOMxhfATAWxhQ_1780484299 Received: from mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.95]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id E6924195606D; Wed, 3 Jun 2026 10:58:18 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by 
mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 81F39404; Wed, 3 Jun 2026 10:58:18 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 06/24] KVM: x86/hyperv: remove unnecessary mmu_is_nested() check Date: Wed, 3 Jun 2026 06:57:56 -0400 Message-ID: <20260603105814.10236-7-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.6 on 10.30.177.95 Content-Type: text/plain; charset="utf-8" Always go through kvm_translate_gpa(), which either invokes the vendor's translate_nested_gpa() callback (when walking the nested MMU) or returns hc->ingpa unchanged. This is a better way to fix the issue addressed by commit 464af6fc2b1d ("KVM: x86: check for nEPT/nNPT in slow flush hypercalls", 2026-05-03). Signed-off-by: Paolo Bonzini --- arch/x86/kvm/hyperv.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index 015c6947b462..a374fd64a76a 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -2040,10 +2040,9 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, s= truct kvm_hv_hcall *hc) * flush). Translate the address here so the memory can be uniformly * read with kvm_read_guest().
*/ - if (!hc->fast && mmu_is_nested(vcpu)) { - hc->ingpa =3D kvm_x86_ops.nested_ops->translate_nested_gpa( - vcpu, hc->ingpa, - PFERR_GUEST_FINAL_MASK, NULL, 0); + if (!hc->fast) { + hc->ingpa =3D kvm_translate_gpa(vcpu, vcpu->arch.walk_mmu, hc->ingpa, + PFERR_GUEST_FINAL_MASK, NULL, 0); if (unlikely(hc->ingpa =3D=3D INVALID_GPA)) return HV_STATUS_INVALID_HYPERCALL_INPUT; } --=20 2.52.0 From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A31504534B3 for ; Wed, 3 Jun 2026 10:58:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484309; cv=none; b=Pb9XwkPFj0cyGhBBzk/Lz97kBoSKo99+HTMqrcxUIxVCklflr9OJW4bNEH7mU8c+oEpGQUht5NJ50JFSsI1AIPjBm3aYA6/xTJyyTzavvC5/K1EWfPaphy5rVMJ/PtVrle1dl8fRobOvljRQ0NbN84teMfcqAg4TPhHqwOF9qgk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484309; c=relaxed/simple; bh=MnAnkLEPTiEeKbRC5W/Q/IaOYm9pXUfwlKtxe7d3cwo=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=jNvWwKznmDFLtHgS1DjPRIf+AA35pGdZNn9pWZxATwW7xOhIHOAgGrI9X1UNmZU4l15l2eKleOigZZw/566n09OFaiOjUTLfeeKgtvezVC4dxUoJi2K8mxVNbsCvHwUfy/YSeHfYQw8gFU3FM4sRdvm59rGQbq7OlgsdGKnmvTo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Aea7Lm/W; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Aea7Lm/W" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484303; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sQ7X8iEem6j80DsOxKfYdvT2DBlswjfsLpCR4g+WkE0=; b=Aea7Lm/WNUJ9wZAU3bsFWWj2VB0mqhj2ts87k7cIgr0MX3Rjn+5NL4bohARpU+mhToILC5 sX73JhZ40hD8dj9DK3kPEEl986lunufKdMvp5mgwcvP/uODXKrJTz4V5jvmAxOw1MxNoSu cCp1cMuSC7AxzXQJiBdgniYWXl8iNVc= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-471-ROkwBrGjNEO4p64_IEul2g-1; Wed, 03 Jun 2026 06:58:20 -0400 X-MC-Unique: ROkwBrGjNEO4p64_IEul2g-1 X-Mimecast-MFC-AGG-ID: ROkwBrGjNEO4p64_IEul2g_1780484299 Received: from mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.95]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 742791956089; Wed, 3 Jun 2026 10:58:19 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 18EEE404; Wed, 3 Jun 2026 10:58:19 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 07/24] KVM: x86/mmu: introduce struct kvm_pagewalk Date: Wed, 3 Jun 2026 06:57:57 -0400 Message-ID: 
<20260603105814.10236-8-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.6 on 10.30.177.95 Content-Type: text/plain; charset="utf-8" In preparation for separating walking and building of page tables, introduce a dummy (initially empty) struct kvm_pagewalk and pass it, instead of its containing kvm_mmu, to functions that do not build the page tables. Outermost functions retrieve the mmu via container_of(), while internal functions can pass around the struct kvm_pagewalk pointer. x86.c is still (mostly) oblivious to the existence of struct kvm_pagewalk. There are only a couple of exceptions for now, which are converted already in this patch for simplicity, but the plan is for the KVM code to use struct kvm_pagewalk whenever dealing with guest page tables. Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm_host.h | 7 +++++- arch/x86/kvm/hyperv.c | 2 +- arch/x86/kvm/mmu.h | 19 +++++++++------ arch/x86/kvm/mmu/mmu.c | 2 +- arch/x86/kvm/mmu/paging_tmpl.h | 43 +++++++++++++++++++-------------- arch/x86/kvm/x86.c | 4 +-- 6 files changed, 46 insertions(+), 31 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index c7c1c2e2a7c2..f72af337330b 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -476,10 +476,15 @@ struct kvm_page_fault; =20 /* * x86 supports 4 paging modes (5-level 64-bit, 4-level 64-bit, 3-level 32= -bit, - * and 2-level 32-bit). The kvm_mmu structure abstracts the details of the + * and 2-level 32-bit). The kvm_pagewalk structure abstracts the details = of the * current mmu mode.
*/ +struct kvm_pagewalk { +}; + struct kvm_mmu { + struct kvm_pagewalk w; + unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu); u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index); int (*page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault); diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index a374fd64a76a..a6e7d6f85409 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -2041,7 +2041,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, st= ruct kvm_hv_hcall *hc) * read with kvm_read_guest(). */ if (!hc->fast) { - hc->ingpa =3D kvm_translate_gpa(vcpu, vcpu->arch.walk_mmu, hc->ingpa, + hc->ingpa =3D kvm_translate_gpa(vcpu, &vcpu->arch.walk_mmu->w, hc->ingpa, PFERR_GUEST_FINAL_MASK, NULL, 0); if (unlikely(hc->ingpa =3D=3D INVALID_GPA)) return HV_STATUS_INVALID_HYPERCALL_INPUT; diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index ddf4e467c071..3f8ac193a1e6 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -169,21 +169,22 @@ static inline void kvm_mmu_load_pgd(struct kvm_vcpu *= vcpu) } =20 static inline void kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu, - struct kvm_mmu *mmu) + struct kvm_pagewalk *w) { /* * When EPT is enabled, KVM may passthrough CR0.WP to the guest, i.e. - * @mmu's snapshot of CR0.WP and thus all related paging metadata may + * @w's snapshot of CR0.WP and thus all related paging metadata may * be stale. Refresh CR0.WP and the metadata on-demand when checking * for permission faults. Exempt nested MMUs, i.e. MMUs for shadowing * nEPT and nNPT, as CR0.WP is ignored in both cases. Note, KVM does * need to refresh nested_mmu, a.k.a. the walker used to translate L2 * GVAs to GPAs, as that "MMU" needs to honor L2's CR0.WP. 
*/ - if (!tdp_enabled || mmu =3D=3D &vcpu->arch.guest_mmu) + if (!tdp_enabled || w =3D=3D &vcpu->arch.guest_mmu.w) return; =20 - __kvm_mmu_refresh_passthrough_bits(vcpu, mmu); + __kvm_mmu_refresh_passthrough_bits(vcpu, + container_of(w, struct kvm_mmu, w)); } =20 /* @@ -194,10 +195,12 @@ static inline void kvm_mmu_refresh_passthrough_bits(s= truct kvm_vcpu *vcpu, * Return zero if the access does not fault; return the page fault error c= ode * if the access faults. */ -static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *m= mu, +static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_pagewa= lk *w, unsigned pte_access, unsigned pte_pkey, u64 access) { + struct kvm_mmu *mmu =3D container_of(w, struct kvm_mmu, w); + /* strip nested paging fault error codes */ unsigned int pfec =3D access; unsigned long rflags =3D kvm_x86_call(get_rflags)(vcpu); @@ -220,7 +223,7 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu= , struct kvm_mmu *mmu, u32 errcode =3D PFERR_PRESENT_MASK; bool fault; =20 - kvm_mmu_refresh_passthrough_bits(vcpu, mmu); + kvm_mmu_refresh_passthrough_bits(vcpu, w); =20 fault =3D (mmu->permissions[index] >> pte_access) & 1; =20 @@ -301,12 +304,12 @@ static inline void kvm_update_page_stats(struct kvm *= kvm, int level, int count) } =20 static inline gpa_t kvm_translate_gpa(struct kvm_vcpu *vcpu, - struct kvm_mmu *mmu, + struct kvm_pagewalk *w, gpa_t gpa, u64 access, struct x86_exception *exception, u64 pte_access) { - if (mmu !=3D &vcpu->arch.nested_mmu) + if (w !=3D &vcpu->arch.nested_mmu.w) return gpa; return kvm_x86_ops.nested_ops->translate_nested_gpa(vcpu, gpa, access, exception, diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index f8aa7eda661e..42b7397a1845 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4354,7 +4354,7 @@ static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vc= pu, struct kvm_mmu *mmu, * user-mode address if CR0.PG=3D0. 
Therefore *include* ACC_USER_MASK in * the last argument to kvm_translate_gpa (which NPT does not use). */ - return kvm_translate_gpa(vcpu, mmu, vaddr, access | PFERR_GUEST_FINAL_MAS= K, + return kvm_translate_gpa(vcpu, &mmu->w, vaddr, access | PFERR_GUEST_FINAL= _MASK, exception, ACC_ALL); } =20 diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 07100bbfc270..ab1aebf2f73c 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -106,9 +106,10 @@ static gfn_t gpte_to_gfn_lvl(pt_element_t gpte, int lv= l) return (gpte & PT_LVL_ADDR_MASK(lvl)) >> PAGE_SHIFT; } =20 -static inline void FNAME(protect_clean_gpte)(struct kvm_mmu *mmu, unsigned= *access, +static inline void FNAME(protect_clean_gpte)(struct kvm_pagewalk *w, unsig= ned *access, unsigned gpte) { + struct kvm_mmu __maybe_unused *mmu =3D container_of(w, struct kvm_mmu, w); unsigned mask; =20 /* dirty bit is not supported, so no need to track it */ @@ -147,8 +148,10 @@ static bool FNAME(is_bad_mt_xwr)(struct rsvd_bits_vali= date *rsvd_check, u64 gpte #endif } =20 -static bool FNAME(is_rsvd_bits_set)(struct kvm_mmu *mmu, u64 gpte, int lev= el) +static bool FNAME(is_rsvd_bits_set)(struct kvm_pagewalk *w, u64 gpte, int = level) { + struct kvm_mmu *mmu =3D container_of(w, struct kvm_mmu, w); + return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level) || FNAME(is_bad_mt_xwr)(&mmu->guest_rsvd_check, gpte); } @@ -165,7 +168,7 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcp= u *vcpu, !(gpte & PT_GUEST_ACCESSED_MASK)) goto no_present; =20 - if (FNAME(is_rsvd_bits_set)(vcpu->arch.mmu, gpte, PG_LEVEL_4K)) + if (FNAME(is_rsvd_bits_set)(&vcpu->arch.mmu->w, gpte, PG_LEVEL_4K)) goto no_present; =20 return false; @@ -206,10 +209,11 @@ static inline unsigned FNAME(gpte_access)(u64 gpte) } =20 static int FNAME(update_accessed_dirty_bits)(struct kvm_vcpu *vcpu, - struct kvm_mmu *mmu, + struct kvm_pagewalk *w, struct guest_walker *walker, gpa_t addr, int 
write_fault) { + struct kvm_mmu __maybe_unused *mmu =3D container_of(w, struct kvm_mmu, w); unsigned level, index; pt_element_t pte, orig_pte; pt_element_t __user *ptep_user; @@ -278,9 +282,11 @@ static inline unsigned FNAME(gpte_pkeys)(struct kvm_vc= pu *vcpu, u64 gpte) return pkeys; } =20 -static inline bool FNAME(is_last_gpte)(struct kvm_mmu *mmu, +static inline bool FNAME(is_last_gpte)(struct kvm_pagewalk *w, unsigned int level, unsigned int gpte) { + struct kvm_mmu __maybe_unused *mmu =3D container_of(w, struct kvm_mmu, w); + /* * For EPT and PAE paging (both variants), bit 7 is either reserved at * all level or indicates a huge page (ignoring CR3/EPTP). In either @@ -311,9 +317,10 @@ static inline bool FNAME(is_last_gpte)(struct kvm_mmu = *mmu, * Fetch a guest pte for a guest virtual address, or for an L2's GPA. */ static int FNAME(walk_addr_generic)(struct guest_walker *walker, - struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, + struct kvm_vcpu *vcpu, struct kvm_pagewalk *w, gpa_t addr, u64 access) { + struct kvm_mmu *mmu =3D container_of(w, struct kvm_mmu, w); int ret; pt_element_t pte; pt_element_t __user *ptep_user; @@ -387,7 +394,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, walker->table_gfn[walker->level - 1] =3D table_gfn; walker->pte_gpa[walker->level - 1] =3D pte_gpa; =20 - real_gpa =3D kvm_translate_gpa(vcpu, mmu, gfn_to_gpa(table_gfn), + real_gpa =3D kvm_translate_gpa(vcpu, w, gfn_to_gpa(table_gfn), nested_access | PFERR_GUEST_PAGE_MASK, &walker->fault, 0); =20 @@ -429,7 +436,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, if (unlikely(!FNAME(is_present_gpte)(mmu, pte))) goto error; =20 - if (unlikely(FNAME(is_rsvd_bits_set)(mmu, pte, walker->level))) { + if (unlikely(FNAME(is_rsvd_bits_set)(w, pte, walker->level))) { errcode =3D PFERR_RSVD_MASK | PFERR_PRESENT_MASK; goto error; } @@ -438,14 +445,14 @@ static int FNAME(walk_addr_generic)(struct guest_walk= er *walker, =20 /* Convert to ACC_*_MASK flags 
for struct guest_walker. */ walker->pt_access[walker->level - 1] =3D FNAME(gpte_access)(pt_access ^ = walk_nx_mask); - } while (!FNAME(is_last_gpte)(mmu, walker->level, pte)); + } while (!FNAME(is_last_gpte)(w, walker->level, pte)); =20 pte_pkey =3D FNAME(gpte_pkeys)(vcpu, pte); accessed_dirty =3D have_ad ? pte_access & PT_GUEST_ACCESSED_MASK : 0; =20 /* Convert to ACC_*_MASK flags for struct guest_walker. */ walker->pte_access =3D FNAME(gpte_access)(pte_access ^ walk_nx_mask); - errcode =3D permission_fault(vcpu, mmu, walker->pte_access, pte_pkey, acc= ess); + errcode =3D permission_fault(vcpu, w, walker->pte_access, pte_pkey, acces= s); if (unlikely(errcode)) goto error; =20 @@ -457,7 +464,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, gfn +=3D pse36_gfn_delta(pte); #endif =20 - real_gpa =3D kvm_translate_gpa(vcpu, mmu, gfn_to_gpa(gfn), + real_gpa =3D kvm_translate_gpa(vcpu, w, gfn_to_gpa(gfn), access | PFERR_GUEST_FINAL_MASK, &walker->fault, walker->pte_access); if (real_gpa =3D=3D INVALID_GPA) @@ -466,7 +473,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, walker->gfn =3D real_gpa >> PAGE_SHIFT; =20 if (!write_fault) - FNAME(protect_clean_gpte)(mmu, &walker->pte_access, pte); + FNAME(protect_clean_gpte)(w, &walker->pte_access, pte); else /* * On a write fault, fold the dirty bit into accessed_dirty. 
@@ -477,7 +484,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, (PT_GUEST_DIRTY_SHIFT - PT_GUEST_ACCESSED_SHIFT); =20 if (unlikely(!accessed_dirty)) { - ret =3D FNAME(update_accessed_dirty_bits)(vcpu, mmu, walker, + ret =3D FNAME(update_accessed_dirty_bits)(vcpu, w, walker, addr, write_fault); if (unlikely(ret < 0)) goto error; @@ -539,7 +546,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, } #endif walker->fault.address =3D addr; - walker->fault.nested_page_fault =3D mmu !=3D vcpu->arch.walk_mmu; + walker->fault.nested_page_fault =3D w !=3D &vcpu->arch.walk_mmu->w; walker->fault.async_page_fault =3D false; =20 trace_kvm_mmu_walker_error(walker->fault.error_code); @@ -549,7 +556,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, static int FNAME(walk_addr)(struct guest_walker *walker, struct kvm_vcpu *vcpu, gpa_t addr, u64 access) { - return FNAME(walk_addr_generic)(walker, vcpu, vcpu->arch.mmu, addr, + return FNAME(walk_addr_generic)(walker, vcpu, &vcpu->arch.mmu->w, addr, access); } =20 @@ -565,7 +572,7 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_= mmu_page *sp, =20 gfn =3D gpte_to_gfn(gpte); pte_access =3D sp->role.access & FNAME(gpte_access)(gpte); - FNAME(protect_clean_gpte)(vcpu->arch.mmu, &pte_access, gpte); + FNAME(protect_clean_gpte)(&vcpu->arch.mmu->w, &pte_access, gpte); =20 return kvm_mmu_prefetch_sptes(vcpu, gfn, spte, 1, pte_access); } @@ -895,7 +902,7 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, s= truct kvm_mmu *mmu, WARN_ON_ONCE((addr >> 32) && mmu =3D=3D vcpu->arch.walk_mmu); #endif =20 - r =3D FNAME(walk_addr_generic)(&walker, vcpu, mmu, addr, access); + r =3D FNAME(walk_addr_generic)(&walker, vcpu, &mmu->w, addr, access); =20 if (r) { gpa =3D gfn_to_gpa(walker.gfn); @@ -945,7 +952,7 @@ static int FNAME(sync_spte)(struct kvm_vcpu *vcpu, stru= ct kvm_mmu_page *sp, int gfn =3D gpte_to_gfn(gpte); pte_access =3D sp->role.access; pte_access &=3D 
FNAME(gpte_access)(gpte); - FNAME(protect_clean_gpte)(vcpu->arch.mmu, &pte_access, gpte); + FNAME(protect_clean_gpte)(&vcpu->arch.mmu->w, &pte_access, gpte); =20 if (sync_mmio_spte(vcpu, &sp->spt[i], gfn, pte_access)) return 0; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c5e55597533b..0f44482d4be0 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1071,7 +1071,7 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long = cr3) * If the MMU is nested, CR3 holds an L2 GPA and needs to be translated * to an L1 GPA. */ - real_gpa =3D kvm_translate_gpa(vcpu, mmu, gfn_to_gpa(pdpt_gfn), + real_gpa =3D kvm_translate_gpa(vcpu, &mmu->w, gfn_to_gpa(pdpt_gfn), PFERR_USER_MASK | PFERR_WRITE_MASK | PFERR_GUEST_PAGE_MASK, NULL, 0); if (real_gpa =3D=3D INVALID_GPA) @@ -8090,7 +8090,7 @@ static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu= , unsigned long gva, * shadow page table for L2 guest. */ if (vcpu_match_mmio_gva(vcpu, gva) && (!is_paging(vcpu) || - !permission_fault(vcpu, vcpu->arch.walk_mmu, + !permission_fault(vcpu, &vcpu->arch.walk_mmu->w, vcpu->arch.mmio_access, 0, access))) { *gpa =3D vcpu->arch.mmio_gfn << PAGE_SHIFT | (gva & (PAGE_SIZE - 1)); --=20 2.52.0 From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C3B2C449EA4 for ; Wed, 3 Jun 2026 10:58:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484307; cv=none; b=C2iJlCxaYNoac/irTBK9DZktVFE4tZMB0vbwNZ/VL6s/3tF7nJqmwFYM6a2nv2irezonPcQQu7yIGAznEne0iIsr/TLj2TQAO6jO/37J9dLOF9R5RmHrCty6avSXzUKk5WYsFm/GUZDjQnXhRgaeQGnd9BOz9PB5+F4iVDOmmGU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484307; 
c=relaxed/simple; bh=Twf9KpSTB57HCqXSe/YTK1C56UNPppE2ZxdOUO6p/K4=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=qXNXTjHDJRhVg0L3TDA9y+Nua+/n+8sNGwrwZx4uIJB8XxzlBQnUUU8+DluiKc/hTUgc0j6LqwnJxs0VzTCeVyCfQRtZfinGkUzfqKr0Hy6AsG9WzIKJnHED/wImsaoP4dEh1Tmpn/wi4/fijwO7gfbvlb+iF+RTAJ7E7owb1bg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=CHG+jeoj; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="CHG+jeoj" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484304; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3aZSGXr+5jozu/vlwgHPZUK7gxxIz5AYrDMl7Bld64g=; b=CHG+jeojX4zNeOTQr6quFwpgzk6yAchM577Nz+7KeDaXXBnPL6nuKepol7/jOocz2ISQOc CjU9kv1h/98pSSe4XTeV1FnktXQ3ks9WIjZs6DfcJgh5BybR5ezhp/afLsWdPtddkHirBw ATYwNs7kZ3+1G6VnnQriZgyRvy+zcD0= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-380-aDsLQgbpP8irFWUKVQEjQA-1; Wed, 03 Jun 2026 06:58:20 -0400 X-MC-Unique: aDsLQgbpP8irFWUKVQEjQA-1 X-Mimecast-MFC-AGG-ID: aDsLQgbpP8irFWUKVQEjQA_1780484300 Received: from mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com 
(mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.95]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 01A78180049F; Wed, 3 Jun 2026 10:58:20 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 9A908404; Wed, 3 Jun 2026 10:58:19 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 08/24] KVM: x86/mmu: move get_guest_pgd to struct kvm_pagewalk Date: Wed, 3 Jun 2026 06:57:58 -0400 Message-ID: <20260603105814.10236-9-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.6 on 10.30.177.95 Content-Type: text/plain; charset="utf-8" Start moving page walking functionality out of kvm_mmu. The easiest target is the callbacks; change the kvm_mmu_get_guest_pgd() wrapper to take a struct kvm_pagewalk too, and avoid the MMU indirection whenever the caller has one. 
Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu/mmu.c | 21 ++++++++++++--------- arch/x86/kvm/mmu/paging_tmpl.h | 2 +- arch/x86/kvm/svm/nested.c | 4 +++- arch/x86/kvm/vmx/nested.c | 3 ++- 5 files changed, 19 insertions(+), 13 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index f72af337330b..81c0ae3fc3f3 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -480,12 +480,12 @@ struct kvm_page_fault; * current mmu mode. */ struct kvm_pagewalk { + unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu); }; =20 struct kvm_mmu { struct kvm_pagewalk w; =20 - unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu); u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index); int (*page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault); void (*inject_page_fault)(struct kvm_vcpu *vcpu, diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 42b7397a1845..8981e5526ba1 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -269,12 +269,12 @@ static unsigned long get_guest_cr3(struct kvm_vcpu *v= cpu) } =20 static inline unsigned long kvm_mmu_get_guest_pgd(struct kvm_vcpu *vcpu, - struct kvm_mmu *mmu) + struct kvm_pagewalk *w) { - if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) && mmu->get_guest_pgd =3D=3D = get_guest_cr3) + if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) && w->get_guest_pgd =3D=3D ge= t_guest_cr3) return kvm_read_cr3(vcpu); =20 - return mmu->get_guest_pgd(vcpu); + return w->get_guest_pgd(vcpu); } =20 static inline bool kvm_available_flush_remote_tlbs_range(void) @@ -4071,7 +4071,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vc= pu) int quadrant, i, r; hpa_t root; =20 - root_pgd =3D kvm_mmu_get_guest_pgd(vcpu, mmu); + root_pgd =3D kvm_mmu_get_guest_pgd(vcpu, &mmu->w); root_gfn =3D (root_pgd & __PT_BASE_ADDR_MASK) >> PAGE_SHIFT; =20 if (!kvm_vcpu_is_visible_gfn(vcpu, root_gfn)) { @@ -4543,7 +4543,7 @@ static bool 
kvm_arch_setup_async_pf(struct kvm_vcpu *= vcpu, if (arch.direct_map) arch.cr3 =3D (unsigned long)INVALID_GPA; else - arch.cr3 =3D kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu); + arch.cr3 =3D kvm_mmu_get_guest_pgd(vcpu, &vcpu->arch.mmu->w); =20 return kvm_setup_async_pf(vcpu, fault->addr, kvm_vcpu_gfn_to_hva(vcpu, fault->gfn), &arch); @@ -4565,7 +4565,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,= struct kvm_async_pf *work) return; =20 if (!vcpu->arch.mmu->root_role.direct && - work->arch.cr3 !=3D kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu)) + work->arch.cr3 !=3D kvm_mmu_get_guest_pgd(vcpu, &vcpu->arch.mmu->w)) return; =20 r =3D kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, work->arch.error_code, @@ -5880,10 +5880,11 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, context->root_role.word =3D root_role.word; context->page_fault =3D kvm_tdp_page_fault; context->sync_spte =3D NULL; - context->get_guest_pgd =3D get_guest_cr3; context->get_pdptr =3D kvm_pdptr_read; context->inject_page_fault =3D kvm_inject_page_fault; =20 + context->w.get_guest_pgd =3D get_guest_cr3; + if (!is_cr0_pg(context)) context->gva_to_gpa =3D nonpaging_gva_to_gpa; else if (is_cr4_pae(context)) @@ -6031,7 +6032,8 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu, =20 kvm_init_shadow_mmu(vcpu, cpu_role); =20 - context->get_guest_pgd =3D get_guest_cr3; + context->w.get_guest_pgd =3D get_guest_cr3; + context->get_pdptr =3D kvm_pdptr_read; context->inject_page_fault =3D kvm_inject_page_fault; } @@ -6045,10 +6047,11 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vc= pu, return; =20 g_context->cpu_role.as_u64 =3D new_mode.as_u64; - g_context->get_guest_pgd =3D get_guest_cr3; g_context->get_pdptr =3D kvm_pdptr_read; g_context->inject_page_fault =3D kvm_inject_page_fault; =20 + g_context->w.get_guest_pgd =3D get_guest_cr3; + /* * L2 page tables are never shadowed, so there is no need to sync * SPTEs. 
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index ab1aebf2f73c..9c3ccea6cd6b 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -342,7 +342,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, trace_kvm_mmu_pagetable_walk(addr, access); retry_walk: walker->level =3D mmu->cpu_role.base.level; - pte =3D kvm_mmu_get_guest_pgd(vcpu, mmu); + pte =3D kvm_mmu_get_guest_pgd(vcpu, w); have_ad =3D PT_HAVE_ACCESSED_DIRTY(mmu); =20 #if PTTYPE =3D=3D 64 diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index aa5a1d8ea136..9f491f45eeb6 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -97,7 +97,9 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu *= vcpu) svm->vmcb01.ptr->save.efer, svm->nested.ctl.nested_cr3, svm->nested.ctl.misc_ctl); - vcpu->arch.mmu->get_guest_pgd =3D nested_svm_get_tdp_cr3; + + vcpu->arch.mmu->w.get_guest_pgd =3D nested_svm_get_tdp_cr3; + vcpu->arch.mmu->get_pdptr =3D nested_svm_get_tdp_pdptr; vcpu->arch.mmu->inject_page_fault =3D nested_svm_inject_npf_exit; vcpu->arch.walk_mmu =3D &vcpu->arch.nested_mmu; diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 1bd0839146fd..db63ae44c988 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -494,7 +494,8 @@ static void nested_ept_init_mmu_context(struct kvm_vcpu= *vcpu) =20 vcpu->arch.mmu =3D &vcpu->arch.guest_mmu; nested_ept_new_eptp(vcpu); - vcpu->arch.mmu->get_guest_pgd =3D nested_ept_get_eptp; + vcpu->arch.mmu->w.get_guest_pgd =3D nested_ept_get_eptp; + vcpu->arch.mmu->inject_page_fault =3D nested_ept_inject_page_fault; vcpu->arch.mmu->get_pdptr =3D kvm_pdptr_read; =20 --=20 2.52.0 From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
smtp.subspace.kernel.org (Postfix) with ESMTPS id 7BDBC4534A4 for ; Wed, 3 Jun 2026 10:58:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484309; cv=none; b=d/lx7WCQiu934JzIg4AY4ycsiCpud9P8sIVlHOlVNUUZ5c1vfOQgnY9N8TO9LSeAN2YdqCfaXTJDIaOPVf58iAXuS4pSve3ke+hi7MKvML7ZrHtTxA4pXatMB4uBAymdIBNjWz0btGU6qvT3fuoldMIXZX3KpRPADRqWKdSw7Wo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484309; c=relaxed/simple; bh=4V8EqgZ1OeibvvRySEAsrj4NuRCVFKy5Z/6tHGw9yjc=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=FmcK8jaMGrFWxMHpTlGgbEAtqJJP0hR6ea8+q9LKyalOhSL7ojCHG39/knyryE8XgDsiwdgFJJCXjrMcqsUXlFNBDT3G0HR+T2m3p5gFoZzmwi+AqRHc2JtQ7nH1xktUh4IXbU4O14gW6dwcp50+0E5nJJGxJMqLLxRidSksCIE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=LlvChqwI; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="LlvChqwI" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484304; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=KR+95XRCfmLBXT1JoCfqqhJI7+H2LuOLKQpZ22bvTq0=; b=LlvChqwIyJNq8cka3bdByijNN3XcdGbuFHiyOo+dDqWYr7cWDic5IclP3px9tTRVndS9ou 
Ix0K3YxChaFkBa2Yc/Iii1KHWvSibXmyOwu3+KvJQSJEsy+IDkAYDCW9qfsRinvrt2pPly rMH4Bv7RtIBcRFsteZBQtN2virf/8R8= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-584-G-8qqNXlOp-l8bzkzf1gMA-1; Wed, 03 Jun 2026 06:58:21 -0400 X-MC-Unique: G-8qqNXlOp-l8bzkzf1gMA-1 X-Mimecast-MFC-AGG-ID: G-8qqNXlOp-l8bzkzf1gMA_1780484300 Received: from mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.95]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 82A27195608C; Wed, 3 Jun 2026 10:58:20 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-10.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 280F5404; Wed, 3 Jun 2026 10:58:20 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 09/24] KVM: x86/mmu: move gva_to_gpa to struct kvm_pagewalk Date: Wed, 3 Jun 2026 06:57:59 -0400 Message-ID: <20260603105814.10236-10-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.6 on 10.30.177.95 Content-Type: text/plain; charset="utf-8" gva_to_gpa is the main entry point into walk_mmu, which is only used for guest page table walking (as opposed to building the page tables). 
Moving gva_to_gpa to struct kvm_pagewalk is a step towards making walk_mmu a struct kvm_pagewalk. Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm_host.h | 6 +++--- arch/x86/kvm/mmu/mmu.c | 26 +++++++++++++------------- arch/x86/kvm/mmu/paging_tmpl.h | 6 +++--- arch/x86/kvm/svm/nested.c | 4 ++-- arch/x86/kvm/vmx/nested.c | 4 ++-- arch/x86/kvm/x86.c | 30 +++++++++++++++--------------- 6 files changed, 38 insertions(+), 38 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 81c0ae3fc3f3..536a7d325d89 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -481,6 +481,9 @@ struct kvm_page_fault; */ struct kvm_pagewalk { unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu); + gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w, + gpa_t gva_or_gpa, u64 access, + struct x86_exception *exception); }; =20 struct kvm_mmu { @@ -490,9 +493,6 @@ struct kvm_mmu { int (*page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault); void (*inject_page_fault)(struct kvm_vcpu *vcpu, struct x86_exception *fault); - gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, - gpa_t gva_or_gpa, u64 access, - struct x86_exception *exception); int (*sync_spte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, int i); struct kvm_mmu_root_info root; diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 8981e5526ba1..552a104e9496 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4342,7 +4342,7 @@ void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu) kvm_mmu_free_roots(vcpu->kvm, vcpu->arch.mmu, roots_to_free); } =20 -static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_mmu *m= mu, +static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vcpu, struct kvm_pagewa= lk *w, gpa_t vaddr, u64 access, struct x86_exception *exception) { @@ -4354,7 +4354,7 @@ static gpa_t nonpaging_gva_to_gpa(struct kvm_vcpu *vc= pu, struct kvm_mmu *mmu, * user-mode
address if CR0.PG=3D0. Therefore *include* ACC_USER_MASK in * the last argument to kvm_translate_gpa (which NPT does not use). */ - return kvm_translate_gpa(vcpu, &mmu->w, vaddr, access | PFERR_GUEST_FINAL= _MASK, + return kvm_translate_gpa(vcpu, w, vaddr, access | PFERR_GUEST_FINAL_MASK, exception, ACC_ALL); } =20 @@ -5119,7 +5119,7 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_tdp_mmu_map_privat= e_pfn); static void nonpaging_init_context(struct kvm_mmu *context) { context->page_fault =3D nonpaging_page_fault; - context->gva_to_gpa =3D nonpaging_gva_to_gpa; + context->w.gva_to_gpa =3D nonpaging_gva_to_gpa; context->sync_spte =3D NULL; } =20 @@ -5750,14 +5750,14 @@ static void reset_guest_paging_metadata(struct kvm_= vcpu *vcpu, static void paging64_init_context(struct kvm_mmu *context) { context->page_fault =3D paging64_page_fault; - context->gva_to_gpa =3D paging64_gva_to_gpa; + context->w.gva_to_gpa =3D paging64_gva_to_gpa; context->sync_spte =3D paging64_sync_spte; } =20 static void paging32_init_context(struct kvm_mmu *context) { context->page_fault =3D paging32_page_fault; - context->gva_to_gpa =3D paging32_gva_to_gpa; + context->w.gva_to_gpa =3D paging32_gva_to_gpa; context->sync_spte =3D paging32_sync_spte; } =20 @@ -5886,11 +5886,11 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, context->w.get_guest_pgd =3D get_guest_cr3; =20 if (!is_cr0_pg(context)) - context->gva_to_gpa =3D nonpaging_gva_to_gpa; + context->w.gva_to_gpa =3D nonpaging_gva_to_gpa; else if (is_cr4_pae(context)) - context->gva_to_gpa =3D paging64_gva_to_gpa; + context->w.gva_to_gpa =3D paging64_gva_to_gpa; else - context->gva_to_gpa =3D paging32_gva_to_gpa; + context->w.gva_to_gpa =3D paging32_gva_to_gpa; =20 reset_guest_paging_metadata(vcpu, context); reset_tdp_shadow_zero_bits_mask(context); @@ -6012,7 +6012,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, b= ool execonly, context->root_role.word =3D new_mode.base.word; =20 context->page_fault =3D ept_page_fault; - 
context->gva_to_gpa =3D ept_gva_to_gpa; + context->w.gva_to_gpa =3D ept_gva_to_gpa; context->sync_spte =3D ept_sync_spte; =20 update_permission_bitmask(context, true, true); @@ -6067,13 +6067,13 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vc= pu, * the gva_to_gpa functions between mmu and nested_mmu are swapped. */ if (!is_paging(vcpu)) - g_context->gva_to_gpa =3D nonpaging_gva_to_gpa; + g_context->w.gva_to_gpa =3D nonpaging_gva_to_gpa; else if (is_long_mode(vcpu)) - g_context->gva_to_gpa =3D paging64_gva_to_gpa; + g_context->w.gva_to_gpa =3D paging64_gva_to_gpa; else if (is_pae(vcpu)) - g_context->gva_to_gpa =3D paging64_gva_to_gpa; + g_context->w.gva_to_gpa =3D paging64_gva_to_gpa; else - g_context->gva_to_gpa =3D paging32_gva_to_gpa; + g_context->w.gva_to_gpa =3D paging32_gva_to_gpa; =20 reset_guest_paging_metadata(vcpu, g_context); } diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 9c3ccea6cd6b..6fcce1d9b787 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -889,7 +889,7 @@ static gpa_t FNAME(get_level1_sp_gpa)(struct kvm_mmu_pa= ge *sp) } =20 /* Note, @addr is a GPA when gva_to_gpa() translates an L2 GPA to an L1 GP= A. */ -static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, +static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_pagewalk = *w, gpa_t addr, u64 access, struct x86_exception *exception) { @@ -899,10 +899,10 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu,= struct kvm_mmu *mmu, =20 #ifndef CONFIG_X86_64 /* A 64-bit GVA should be impossible on 32-bit KVM. 
*/ - WARN_ON_ONCE((addr >> 32) && mmu =3D=3D vcpu->arch.walk_mmu); + WARN_ON_ONCE((addr >> 32) && w =3D=3D &vcpu->arch.walk_mmu->w); #endif =20 - r =3D FNAME(walk_addr_generic)(&walker, vcpu, &mmu->w, addr, access); + r =3D FNAME(walk_addr_generic)(&walker, vcpu, w, addr, access); =20 if (r) { gpa =3D gfn_to_gpa(walker.gfn); diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 9f491f45eeb6..d49e3ae28143 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -2096,7 +2096,7 @@ static gpa_t svm_translate_nested_gpa(struct kvm_vcpu= *vcpu, gpa_t gpa, u64 pte_access) { struct vcpu_svm *svm =3D to_svm(vcpu); - struct kvm_mmu *mmu =3D vcpu->arch.mmu; + struct kvm_pagewalk *w =3D &vcpu->arch.mmu->w; =20 BUG_ON(!mmu_is_nested(vcpu)); =20 @@ -2104,7 +2104,7 @@ static gpa_t svm_translate_nested_gpa(struct kvm_vcpu= *vcpu, gpa_t gpa, if (!(svm->nested.ctl.misc_ctl & SVM_MISC_ENABLE_GMET)) access |=3D PFERR_USER_MASK; =20 - return mmu->gva_to_gpa(vcpu, mmu, gpa, access, exception); + return w->gva_to_gpa(vcpu, w, gpa, access, exception); } =20 struct kvm_x86_nested_ops svm_nested_ops =3D { diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index db63ae44c988..7d3106c2f83c 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -7450,7 +7450,7 @@ static gpa_t vmx_translate_nested_gpa(struct kvm_vcpu= *vcpu, gpa_t gpa, struct x86_exception *exception, u64 pte_access) { - struct kvm_mmu *mmu =3D vcpu->arch.mmu; + struct kvm_pagewalk *w =3D &vcpu->arch.mmu->w; =20 BUG_ON(!mmu_is_nested(vcpu)); =20 @@ -7462,7 +7462,7 @@ static gpa_t vmx_translate_nested_gpa(struct kvm_vcpu= *vcpu, gpa_t gpa, if ((pte_access & ACC_USER_MASK) && (access & PFERR_GUEST_FINAL_MASK)) access |=3D PFERR_USER_MASK; =20 - return mmu->gva_to_gpa(vcpu, mmu, gpa, access, exception); + return w->gva_to_gpa(vcpu, w, gpa, access, exception); } =20 struct kvm_x86_nested_ops vmx_nested_ops =3D { diff --git a/arch/x86/kvm/x86.c 
b/arch/x86/kvm/x86.c index 0f44482d4be0..00566655ad05 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7851,21 +7851,21 @@ void kvm_get_segment(struct kvm_vcpu *vcpu, gpa_t kvm_mmu_gva_to_gpa_read(struct kvm_vcpu *vcpu, gva_t gva, struct x86_exception *exception) { - struct kvm_mmu *mmu =3D vcpu->arch.walk_mmu; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.walk_mmu->w; =20 u64 access =3D (kvm_x86_call(get_cpl)(vcpu) =3D=3D 3) ? PFERR_USER_MASK := 0; - return mmu->gva_to_gpa(vcpu, mmu, gva, access, exception); + return gva_walk->gva_to_gpa(vcpu, gva_walk, gva, access, exception); } EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_gva_to_gpa_read); =20 gpa_t kvm_mmu_gva_to_gpa_write(struct kvm_vcpu *vcpu, gva_t gva, struct x86_exception *exception) { - struct kvm_mmu *mmu =3D vcpu->arch.walk_mmu; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.walk_mmu->w; =20 u64 access =3D (kvm_x86_call(get_cpl)(vcpu) =3D=3D 3) ? PFERR_USER_MASK := 0; access |=3D PFERR_WRITE_MASK; - return mmu->gva_to_gpa(vcpu, mmu, gva, access, exception); + return gva_walk->gva_to_gpa(vcpu, gva_walk, gva, access, exception); } EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_gva_to_gpa_write); =20 @@ -7873,21 +7873,21 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_gva_to_gpa_w= rite); gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva, struct x86_exception *exception) { - struct kvm_mmu *mmu =3D vcpu->arch.walk_mmu; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.walk_mmu->w; =20 - return mmu->gva_to_gpa(vcpu, mmu, gva, 0, exception); + return gva_walk->gva_to_gpa(vcpu, gva_walk, gva, 0, exception); } =20 static int kvm_read_guest_virt_helper(gva_t addr, void *val, unsigned int = bytes, struct kvm_vcpu *vcpu, u64 access, struct x86_exception *exception) { - struct kvm_mmu *mmu =3D vcpu->arch.walk_mmu; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.walk_mmu->w; void *data =3D val; int r =3D X86EMUL_CONTINUE; =20 while (bytes) { - gpa_t gpa =3D mmu->gva_to_gpa(vcpu, mmu, addr, access, 
exception); + gpa_t gpa =3D gva_walk->gva_to_gpa(vcpu, gva_walk, addr, access, excepti= on); unsigned offset =3D addr & (PAGE_SIZE-1); unsigned toread =3D min(bytes, (unsigned)PAGE_SIZE - offset); int ret; @@ -7915,14 +7915,14 @@ static int kvm_fetch_guest_virt(struct x86_emulate_= ctxt *ctxt, struct x86_exception *exception) { struct kvm_vcpu *vcpu =3D emul_to_vcpu(ctxt); - struct kvm_mmu *mmu =3D vcpu->arch.walk_mmu; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.walk_mmu->w; u64 access =3D (kvm_x86_call(get_cpl)(vcpu) =3D=3D 3) ? PFERR_USER_MASK := 0; unsigned offset; int ret; =20 /* Inline kvm_read_guest_virt_helper for speed. */ - gpa_t gpa =3D mmu->gva_to_gpa(vcpu, mmu, addr, access|PFERR_FETCH_MASK, - exception); + gpa_t gpa =3D gva_walk->gva_to_gpa(vcpu, gva_walk, addr, access|PFERR_FET= CH_MASK, + exception); if (unlikely(gpa =3D=3D INVALID_GPA)) return X86EMUL_PROPAGATE_FAULT; =20 @@ -7974,12 +7974,12 @@ static int kvm_write_guest_virt_helper(gva_t addr, = void *val, unsigned int bytes struct kvm_vcpu *vcpu, u64 access, struct x86_exception *exception) { - struct kvm_mmu *mmu =3D vcpu->arch.walk_mmu; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.walk_mmu->w; void *data =3D val; int r =3D X86EMUL_CONTINUE; =20 while (bytes) { - gpa_t gpa =3D mmu->gva_to_gpa(vcpu, mmu, addr, access, exception); + gpa_t gpa =3D gva_walk->gva_to_gpa(vcpu, gva_walk, addr, access, excepti= on); unsigned offset =3D addr & (PAGE_SIZE-1); unsigned towrite =3D min(bytes, (unsigned)PAGE_SIZE - offset); int ret; @@ -8098,7 +8098,7 @@ static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu= , unsigned long gva, return 1; } =20 - *gpa =3D mmu->gva_to_gpa(vcpu, mmu, gva, access, exception); + *gpa =3D mmu->w.gva_to_gpa(vcpu, &mmu->w, gva, access, exception); =20 if (*gpa =3D=3D INVALID_GPA) return -1; @@ -14217,7 +14217,7 @@ void kvm_fixup_and_inject_pf_error(struct kvm_vcpu = *vcpu, gva_t gva, u16 error_c (PFERR_WRITE_MASK | PFERR_FETCH_MASK | PFERR_USER_MASK); =20 if (!(error_code & 
PFERR_PRESENT_MASK) || - mmu->gva_to_gpa(vcpu, mmu, gva, access, &fault) !=3D INVALID_GPA) { + mmu->w.gva_to_gpa(vcpu, &mmu->w, gva, access, &fault) !=3D INVALID_GP= A) { /* * If vcpu->arch.walk_mmu->gva_to_gpa succeeded, the page * tables probably do not match the TLB. Just proceed --=20 2.52.0 From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 63DAE4657DA for ; Wed, 3 Jun 2026 10:58:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484308; cv=none; b=GRGtcwQEBnJbxFXxLNIRi4BfdhlMsVvYVmCJyQM8mVxKlwpH4OvDNXgWFFwdIDAxhNETu1YENQfpMpFvTcCzDXKDDtUnG4oA2iFuQ84EkpyDMeTeCZ8vzkce+5QnDWPVan0Ye+CFvEBr31K4vkGBHaMPQSDOVFA69C1AZsccJ1w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484308; c=relaxed/simple; bh=OkHkEJ1voQle8LsswVw0OA/aTUhnVbDqmHHJUyaaeOE=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=MVSg+I8XOR4lAJfYuyUyC+dUp4/6uXGc2oPLmdEjSl667Z+aKM45TjV18OGU1xvh/TJ5jU6c4PhGmYXSh6cXzOMRAeg3Kbdi9EJhSWovFiUKnUc6JVasyUI31xjFKdsFgeC4yComkhbII/0HM9EwUXnfYEVmu4CqzzUGVuxnSEo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Bt+zyzoT; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com 
header.i=@redhat.com header.b="Bt+zyzoT" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484305; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DkwFQXTUALMxG3A9apP+NFNWlgd5woPtx062LLDPELc=; b=Bt+zyzoTd3kIG2lnjwiOm0qmcU+hpnX6470S7Rx5Cl1hyjvPtX9uC9OVfxNqiTgU+lePGE +BU3Nhk0kHKAUn6XoITaACekL2OZBT3eFvfXezqwFaIKew/hab0Wegi2gMyCRS8CLf1FmS 9ex75ybW5ErLbgsT6fezoQ90ZtdxTs4= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-202-RVJj3Kg7OwmAQFhhe4hWjA-1; Wed, 03 Jun 2026 06:58:22 -0400 X-MC-Unique: RVJj3Kg7OwmAQFhhe4hWjA-1 X-Mimecast-MFC-AGG-ID: RVJj3Kg7OwmAQFhhe4hWjA_1780484301 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7F3161800473; Wed, 3 Jun 2026 10:58:21 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 234E230001A1; Wed, 3 Jun 2026 10:58:21 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 10/24] KVM: x86/mmu: move get_pdptr to struct kvm_pagewalk Date: Wed, 3 Jun 2026 06:58:00 -0400 Message-ID: <20260603105814.10236-11-pbonzini@redhat.com> In-Reply-To: 
<20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 Content-Type: text/plain; charset="utf-8" Continue with yet another callback used in FNAME(walk_addr_generic), as another step towards removing container_of() from there. Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu/mmu.c | 8 ++++---- arch/x86/kvm/mmu/paging_tmpl.h | 2 +- arch/x86/kvm/svm/nested.c | 2 +- arch/x86/kvm/vmx/nested.c | 2 +- 5 files changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 536a7d325d89..81cb9c03cf88 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -481,6 +481,7 @@ struct kvm_page_fault; */ struct kvm_pagewalk { unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu); + u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index); gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w, gpa_t gva_or_gpa, u64 access, struct x86_exception *exception); @@ -489,7 +490,6 @@ struct kvm_pagewalk { struct kvm_mmu { struct kvm_pagewalk w; =20 - u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index); int (*page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault); void (*inject_page_fault)(struct kvm_vcpu *vcpu, struct x86_exception *fault); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 552a104e9496..a51705f53957 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4085,7 +4085,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vc= pu) */ if (mmu->cpu_role.base.level =3D=3D PT32E_ROOT_LEVEL) { for (i =3D 0; i < 4; ++i) { - pdptrs[i] =3D mmu->get_pdptr(vcpu, i); + pdptrs[i] =3D mmu->w.get_pdptr(vcpu, i); if (!(pdptrs[i] & PT_PRESENT_MASK)) continue; 
=20 @@ -5880,9 +5880,9 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, context->root_role.word =3D root_role.word; context->page_fault =3D kvm_tdp_page_fault; context->sync_spte =3D NULL; - context->get_pdptr =3D kvm_pdptr_read; context->inject_page_fault =3D kvm_inject_page_fault; =20 + context->w.get_pdptr =3D kvm_pdptr_read; context->w.get_guest_pgd =3D get_guest_cr3; =20 if (!is_cr0_pg(context)) @@ -6032,9 +6032,9 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu, =20 kvm_init_shadow_mmu(vcpu, cpu_role); =20 + context->w.get_pdptr =3D kvm_pdptr_read; context->w.get_guest_pgd =3D get_guest_cr3; =20 - context->get_pdptr =3D kvm_pdptr_read; context->inject_page_fault =3D kvm_inject_page_fault; } =20 @@ -6047,9 +6047,9 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, return; =20 g_context->cpu_role.as_u64 =3D new_mode.as_u64; - g_context->get_pdptr =3D kvm_pdptr_read; g_context->inject_page_fault =3D kvm_inject_page_fault; =20 + g_context->w.get_pdptr =3D kvm_pdptr_read; g_context->w.get_guest_pgd =3D get_guest_cr3; =20 /* diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 6fcce1d9b787..ef112ca1e405 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -348,7 +348,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, #if PTTYPE =3D=3D 64 walk_nx_mask =3D 1ULL << PT64_NX_SHIFT; if (walker->level =3D=3D PT32E_ROOT_LEVEL) { - pte =3D mmu->get_pdptr(vcpu, (addr >> 30) & 3); + pte =3D w->get_pdptr(vcpu, (addr >> 30) & 3); trace_kvm_mmu_paging_element(pte, walker->level); if (!FNAME(is_present_gpte)(mmu, pte)) goto error; diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index d49e3ae28143..3eb701454a56 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -99,8 +99,8 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu *= vcpu) svm->nested.ctl.misc_ctl); =20 vcpu->arch.mmu->w.get_guest_pgd =3D nested_svm_get_tdp_cr3; + 
vcpu->arch.mmu->w.get_pdptr =3D nested_svm_get_tdp_pdptr; =20 - vcpu->arch.mmu->get_pdptr =3D nested_svm_get_tdp_pdptr; vcpu->arch.mmu->inject_page_fault =3D nested_svm_inject_npf_exit; vcpu->arch.walk_mmu =3D &vcpu->arch.nested_mmu; } diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 7d3106c2f83c..4af8a25926da 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -495,9 +495,9 @@ static void nested_ept_init_mmu_context(struct kvm_vcpu= *vcpu) vcpu->arch.mmu =3D &vcpu->arch.guest_mmu; nested_ept_new_eptp(vcpu); vcpu->arch.mmu->w.get_guest_pgd =3D nested_ept_get_eptp; + vcpu->arch.mmu->w.get_pdptr =3D kvm_pdptr_read; =20 vcpu->arch.mmu->inject_page_fault =3D nested_ept_inject_page_fault; - vcpu->arch.mmu->get_pdptr =3D kvm_pdptr_read; =20 vcpu->arch.walk_mmu =3D &vcpu->arch.nested_mmu; } --=20 2.52.0 From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CF3A8466B4B for ; Wed, 3 Jun 2026 10:58:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484309; cv=none; b=r+/kIJSX3T6gc4J48IUhiPNPMksmlAiQ7Lmgkd2acxowp5wtsP9XA6PbtdDjo8IIiW57gn4z8WxpGioSshkogFIrrTfs9a+Nn5ua/nW7a6oOUra9jD7q+XANMgcS+/RoFoZqwbWGSOiOP10xc6sAKIQp48UiwPnVswXPEThVZP0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484309; c=relaxed/simple; bh=n2BTAxV0etAR5ciyhPJhNssK3/0XhBg6HgaSgtQQNNI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=mDXZ7BwcOimhTQpAZkLHVfsespjOOn4llbd6KMlnMz51UJsxUs/YoKHXTousulzC6VT+fYY/tI4zQV9+D+ZPlOPLJKwOxaGlLxKL7B8i7sgBxWf7GuTCUeYJAxhUdNua+zjhGYi9mJbwIcLyb0G19XC0f0+45nj4QfJVynFQhNA= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=DW8palNv; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="DW8palNv" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484306; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3CZHC1EjrAKZddVUrY3fvmuHSR7btoMrhPBGfuDvl00=; b=DW8palNvAsllTTDOcXCNQaM4fnIz80a+jJp/yQrLx2bEjpUwqUhbcNaaoq/WxaI4ej+otE SpBwTiH8+AIIvxSe8/lPKJNtkrBH7f6A396pgpeKaqgeskkAHsuKQQppGbzatUK4F5pBvk Cp02LzxyX1oYLnT5FK/3CfsnGaF7bQk= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-460-waTGkUzuM6CRn2a3tZxhvQ-1; Wed, 03 Jun 2026 06:58:22 -0400 X-MC-Unique: waTGkUzuM6CRn2a3tZxhvQ-1 X-Mimecast-MFC-AGG-ID: waTGkUzuM6CRn2a3tZxhvQ_1780484302 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 0D8421956094; Wed, 3 Jun 2026 
10:58:22 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id A517D30001A1; Wed, 3 Jun 2026 10:58:21 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 11/24] KVM: x86/mmu: move inject_page_fault to struct kvm_pagewalk Date: Wed, 3 Jun 2026 06:58:01 -0400 Message-ID: <20260603105814.10236-12-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 Content-Type: text/plain; charset="utf-8" Injection of page faults is also part of accesses to guest page tables. In particular, kvm_inject_emulated_page_fault calls it on walk_mmu. Move it to struct kvm_pagewalk as part of converting walk_mmu to a struct kvm_pagewalk. 
Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm_host.h | 4 ++-- arch/x86/kvm/mmu/mmu.c | 8 +++----- arch/x86/kvm/svm/nested.c | 2 +- arch/x86/kvm/vmx/nested.c | 2 +- arch/x86/kvm/x86.c | 4 ++-- 5 files changed, 9 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 81cb9c03cf88..fb468e234b37 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -482,6 +482,8 @@ struct kvm_page_fault; struct kvm_pagewalk { unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu); u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index); + void (*inject_page_fault)(struct kvm_vcpu *vcpu, + struct x86_exception *fault); gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w, gpa_t gva_or_gpa, u64 access, struct x86_exception *exception); @@ -491,8 +493,6 @@ struct kvm_mmu { struct kvm_pagewalk w; =20 int (*page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault); - void (*inject_page_fault)(struct kvm_vcpu *vcpu, - struct x86_exception *fault); int (*sync_spte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, int i); struct kvm_mmu_root_info root; diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index a51705f53957..4fbb7508e241 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5880,8 +5880,8 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, context->root_role.word =3D root_role.word; context->page_fault =3D kvm_tdp_page_fault; context->sync_spte =3D NULL; - context->inject_page_fault =3D kvm_inject_page_fault; =20 + context->w.inject_page_fault =3D kvm_inject_page_fault; context->w.get_pdptr =3D kvm_pdptr_read; context->w.get_guest_pgd =3D get_guest_cr3; =20 @@ -6032,10 +6032,9 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu, =20 kvm_init_shadow_mmu(vcpu, cpu_role); =20 + context->w.inject_page_fault =3D kvm_inject_page_fault; context->w.get_pdptr =3D kvm_pdptr_read; context->w.get_guest_pgd =3D get_guest_cr3; - - 
context->inject_page_fault =3D kvm_inject_page_fault; } =20 static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, @@ -6047,8 +6046,7 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, return; =20 g_context->cpu_role.as_u64 =3D new_mode.as_u64; - g_context->inject_page_fault =3D kvm_inject_page_fault; - + g_context->w.inject_page_fault =3D kvm_inject_page_fault; g_context->w.get_pdptr =3D kvm_pdptr_read; g_context->w.get_guest_pgd =3D get_guest_cr3; =20 diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 3eb701454a56..79ef81b878d7 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -101,7 +101,7 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu= *vcpu) vcpu->arch.mmu->w.get_guest_pgd =3D nested_svm_get_tdp_cr3; vcpu->arch.mmu->w.get_pdptr =3D nested_svm_get_tdp_pdptr; =20 - vcpu->arch.mmu->inject_page_fault =3D nested_svm_inject_npf_exit; + vcpu->arch.mmu->w.inject_page_fault =3D nested_svm_inject_npf_exit; vcpu->arch.walk_mmu =3D &vcpu->arch.nested_mmu; } =20 diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 4af8a25926da..e9e6714ccd83 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -497,7 +497,7 @@ static void nested_ept_init_mmu_context(struct kvm_vcpu= *vcpu) vcpu->arch.mmu->w.get_guest_pgd =3D nested_ept_get_eptp; vcpu->arch.mmu->w.get_pdptr =3D kvm_pdptr_read; =20 - vcpu->arch.mmu->inject_page_fault =3D nested_ept_inject_page_fault; + vcpu->arch.mmu->w.inject_page_fault =3D nested_ept_inject_page_fault; =20 vcpu->arch.walk_mmu =3D &vcpu->arch.nested_mmu; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 00566655ad05..e514096f960c 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1005,7 +1005,7 @@ void kvm_inject_emulated_page_fault(struct kvm_vcpu *= vcpu, kvm_mmu_invalidate_addr(vcpu, fault_mmu, fault->address, KVM_MMU_ROOT_CURRENT); =20 - fault_mmu->inject_page_fault(vcpu, fault); + fault_mmu->w.inject_page_fault(vcpu, fault); } 
EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_inject_emulated_page_fault); =20 @@ -14230,7 +14230,7 @@ void kvm_fixup_and_inject_pf_error(struct kvm_vcpu = *vcpu, gva_t gva, u16 error_c fault.address =3D gva; fault.async_page_fault =3D false; } - vcpu->arch.walk_mmu->inject_page_fault(vcpu, &fault); + vcpu->arch.walk_mmu->w.inject_page_fault(vcpu, &fault); } EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_fixup_and_inject_pf_error); =20 --=20 2.52.0

From nobody Mon Jun 8 07:23:59 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 12/24] KVM: x86/mmu: move CPU-related fields to struct kvm_pagewalk
Date: Wed, 3 Jun 2026 06:58:02 -0400
Message-ID: <20260603105814.10236-13-pbonzini@redhat.com>
In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com>
References: <20260603105814.10236-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

struct kvm_pagewalk's behavior depends on the CPU state and its page format. Move related fields so that walk_mmu remains self contained. Note that for now, some of the accessors still use kvm_mmu to split the churn.

Signed-off-by: Paolo Bonzini
---
arch/x86/include/asm/kvm_host.h | 4 +-- arch/x86/kvm/mmu/mmu.c | 52 ++++++++++++++++----------------- arch/x86/kvm/mmu/paging_tmpl.h | 40 ++++++++++++------------- 3 files changed, 46 insertions(+), 50 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index fb468e234b37..33c505a15015 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -487,6 +487,8 @@ struct kvm_pagewalk { gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w, gpa_t gva_or_gpa, u64 access, struct x86_exception *exception); + union kvm_cpu_role cpu_role; + struct rsvd_bits_validate guest_rsvd_check; }; =20 struct kvm_mmu { @@ -497,7 +499,6 @@ struct kvm_mmu { struct kvm_mmu_page *sp, int i); struct kvm_mmu_root_info root; hpa_t mirror_root_hpa; - union kvm_cpu_role cpu_role; union kvm_mmu_page_role root_role; =20 /* @@ -527,7 +528,6 @@ struct kvm_mmu { * the bits spte never used.
*/ struct rsvd_bits_validate shadow_zero_check; - struct rsvd_bits_validate guest_rsvd_check; }; =20 enum pmc_type { diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 4fbb7508e241..e2bfecf655d9 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -226,7 +226,7 @@ BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA); #define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name) \ static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \ { \ - return !!(mmu->cpu_role. base_or_ext . reg##_##name); \ + return !!(mmu->w.cpu_role. base_or_ext . reg##_##name); \ } BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp); BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse); @@ -239,17 +239,17 @@ BUILD_MMU_ROLE_ACCESSOR(ext, efer, lma); =20 static inline bool has_pferr_fetch(struct kvm_mmu *mmu) { - return mmu->cpu_role.ext.has_pferr_fetch; + return mmu->w.cpu_role.ext.has_pferr_fetch; } =20 static inline bool is_cr0_pg(struct kvm_mmu *mmu) { - return mmu->cpu_role.base.level > 0; + return mmu->w.cpu_role.base.level > 0; } =20 static inline bool is_cr4_pae(struct kvm_mmu *mmu) { - return !mmu->cpu_role.base.has_4_byte_gpte; + return !mmu->w.cpu_role.base.has_4_byte_gpte; } =20 static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu) @@ -2478,7 +2478,7 @@ static void shadow_walk_init_using_root(struct kvm_sh= adow_walk_iterator *iterato iterator->level =3D vcpu->arch.mmu->root_role.level; =20 if (iterator->level >=3D PT64_ROOT_4LEVEL && - vcpu->arch.mmu->cpu_role.base.level < PT64_ROOT_4LEVEL && + vcpu->arch.mmu->w.cpu_role.base.level < PT64_ROOT_4LEVEL && !vcpu->arch.mmu->root_role.direct) iterator->level =3D PT32E_ROOT_LEVEL; =20 @@ -4083,7 +4083,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vc= pu) * On SVM, reading PDPTRs might access guest memory, which might fault * and thus might sleep. Grab the PDPTRs before acquiring mmu_lock. 
*/ - if (mmu->cpu_role.base.level =3D=3D PT32E_ROOT_LEVEL) { + if (mmu->w.cpu_role.base.level =3D=3D PT32E_ROOT_LEVEL) { for (i =3D 0; i < 4; ++i) { pdptrs[i] =3D mmu->w.get_pdptr(vcpu, i); if (!(pdptrs[i] & PT_PRESENT_MASK)) @@ -4107,7 +4107,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vc= pu) * Do we shadow a long mode page table? If so we need to * write-protect the guests page table root. */ - if (mmu->cpu_role.base.level >=3D PT64_ROOT_4LEVEL) { + if (mmu->w.cpu_role.base.level >=3D PT64_ROOT_4LEVEL) { root =3D mmu_alloc_root(vcpu, root_gfn, 0, mmu->root_role.level); mmu->root.hpa =3D root; @@ -4146,7 +4146,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vc= pu) for (i =3D 0; i < 4; ++i) { WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i])); =20 - if (mmu->cpu_role.base.level =3D=3D PT32E_ROOT_LEVEL) { + if (mmu->w.cpu_role.base.level =3D=3D PT32E_ROOT_LEVEL) { if (!(pdptrs[i] & PT_PRESENT_MASK)) { mmu->pae_root[i] =3D INVALID_PAE_ROOT; continue; @@ -4160,7 +4160,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vc= pu) * directory. Othwerise each PAE page direct shadows one guest * PAE page directory so that quadrant should be 0. */ - quadrant =3D (mmu->cpu_role.base.level =3D=3D PT32_ROOT_LEVEL) ? i : 0; + quadrant =3D (mmu->w.cpu_role.base.level =3D=3D PT32_ROOT_LEVEL) ? i : 0; =20 root =3D mmu_alloc_root(vcpu, root_gfn, quadrant, PT32_ROOT_LEVEL); mmu->pae_root[i] =3D root | pm_mask; @@ -4196,7 +4196,7 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *v= cpu) * on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare. 
*/ if (mmu->root_role.direct || - mmu->cpu_role.base.level >=3D PT64_ROOT_4LEVEL || + mmu->w.cpu_role.base.level >=3D PT64_ROOT_4LEVEL || mmu->root_role.level < PT64_ROOT_4LEVEL) return 0; =20 @@ -4301,7 +4301,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu) =20 vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY); =20 - if (vcpu->arch.mmu->cpu_role.base.level >=3D PT64_ROOT_4LEVEL) { + if (vcpu->arch.mmu->w.cpu_role.base.level >=3D PT64_ROOT_4LEVEL) { hpa_t root =3D vcpu->arch.mmu->root.hpa; =20 if (!is_unsync_root(root)) @@ -5387,9 +5387,9 @@ static void __reset_rsvds_bits_mask(struct rsvd_bits_= validate *rsvd_check, static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context) { - __reset_rsvds_bits_mask(&context->guest_rsvd_check, + __reset_rsvds_bits_mask(&context->w.guest_rsvd_check, vcpu->arch.reserved_gpa_bits, - context->cpu_role.base.level, is_efer_nx(context), + context->w.cpu_role.base.level, is_efer_nx(context), guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES), is_cr4_pse(context), guest_cpuid_is_amd_compatible(vcpu)); @@ -5436,7 +5436,7 @@ static void __reset_rsvds_bits_mask_ept(struct rsvd_b= its_validate *rsvd_check, static void reset_rsvds_bits_mask_ept(struct kvm_vcpu *vcpu, struct kvm_mmu *context, bool execonly, int huge_page_level) { - __reset_rsvds_bits_mask_ept(&context->guest_rsvd_check, + __reset_rsvds_bits_mask_ept(&context->w.guest_rsvd_check, vcpu->arch.reserved_gpa_bits, execonly, huge_page_level); } @@ -5813,7 +5813,7 @@ void __kvm_mmu_refresh_passthrough_bits(struct kvm_vc= pu *vcpu, if (is_cr0_wp(mmu) =3D=3D cr0_wp) return; =20 - mmu->cpu_role.base.cr0_wp =3D cr0_wp; + mmu->w.cpu_role.base.cr0_wp =3D cr0_wp; reset_guest_paging_metadata(vcpu, mmu); } =20 @@ -5872,11 +5872,11 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context =3D &vcpu->arch.root_mmu; union kvm_mmu_page_role root_role =3D kvm_calc_tdp_mmu_root_page_role(vcp= u, cpu_role); =20 - if (cpu_role.as_u64 =3D=3D 
context->cpu_role.as_u64 && + if (cpu_role.as_u64 =3D=3D context->w.cpu_role.as_u64 && root_role.word =3D=3D context->root_role.word) return; =20 - context->cpu_role.as_u64 =3D cpu_role.as_u64; + context->w.cpu_role.as_u64 =3D cpu_role.as_u64; context->root_role.word =3D root_role.word; context->page_fault =3D kvm_tdp_page_fault; context->sync_spte =3D NULL; @@ -5900,11 +5900,11 @@ static void shadow_mmu_init_context(struct kvm_vcpu= *vcpu, struct kvm_mmu *conte union kvm_cpu_role cpu_role, union kvm_mmu_page_role root_role) { - if (cpu_role.as_u64 =3D=3D context->cpu_role.as_u64 && + if (cpu_role.as_u64 =3D=3D context->w.cpu_role.as_u64 && root_role.word =3D=3D context->root_role.word) return; =20 - context->cpu_role.as_u64 =3D cpu_role.as_u64; + context->w.cpu_role.as_u64 =3D cpu_role.as_u64; context->root_role.word =3D root_role.word; =20 if (!is_cr0_pg(context)) @@ -6006,9 +6006,9 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, b= ool execonly, kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty, execonly, level, mbec); =20 - if (new_mode.as_u64 !=3D context->cpu_role.as_u64) { + if (new_mode.as_u64 !=3D context->w.cpu_role.as_u64) { /* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. 
*/ - context->cpu_role.as_u64 =3D new_mode.as_u64; + context->w.cpu_role.as_u64 =3D new_mode.as_u64; context->root_role.word =3D new_mode.base.word; =20 context->page_fault =3D ept_page_fault; @@ -6042,10 +6042,10 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vc= pu, { struct kvm_mmu *g_context =3D &vcpu->arch.nested_mmu; =20 - if (new_mode.as_u64 =3D=3D g_context->cpu_role.as_u64) + if (new_mode.as_u64 =3D=3D g_context->w.cpu_role.as_u64) return; =20 - g_context->cpu_role.as_u64 =3D new_mode.as_u64; + g_context->w.cpu_role.as_u64 =3D new_mode.as_u64; g_context->w.inject_page_fault =3D kvm_inject_page_fault; g_context->w.get_pdptr =3D kvm_pdptr_read; g_context->w.get_guest_pgd =3D get_guest_cr3; @@ -6107,9 +6107,9 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu) vcpu->arch.root_mmu.root_role.invalid =3D 1; vcpu->arch.guest_mmu.root_role.invalid =3D 1; vcpu->arch.nested_mmu.root_role.invalid =3D 1; - vcpu->arch.root_mmu.cpu_role.ext.valid =3D 0; - vcpu->arch.guest_mmu.cpu_role.ext.valid =3D 0; - vcpu->arch.nested_mmu.cpu_role.ext.valid =3D 0; + vcpu->arch.root_mmu.w.cpu_role.ext.valid =3D 0; + vcpu->arch.guest_mmu.w.cpu_role.ext.valid =3D 0; + vcpu->arch.nested_mmu.w.cpu_role.ext.valid =3D 0; kvm_mmu_reset_context(vcpu); =20 KVM_BUG_ON(!kvm_can_set_cpuid_and_feature_msrs(vcpu), vcpu->kvm); diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index ef112ca1e405..10b1e7a08e90 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -55,7 +55,7 @@ #define PT_LEVEL_BITS 9 #define PT_GUEST_DIRTY_SHIFT 9 #define PT_GUEST_ACCESSED_SHIFT 8 - #define PT_HAVE_ACCESSED_DIRTY(mmu) (!(mmu)->cpu_role.base.ad_disabled) + #define PT_HAVE_ACCESSED_DIRTY(w) (!(w)->cpu_role.base.ad_disabled) #define PT_MAX_FULL_LEVELS PT64_ROOT_MAX_LEVEL #else #error Invalid PTTYPE value @@ -109,11 +109,10 @@ static gfn_t gpte_to_gfn_lvl(pt_element_t gpte, int l= vl) static inline void FNAME(protect_clean_gpte)(struct kvm_pagewalk *w, 
unsig= ned *access, unsigned gpte) { - struct kvm_mmu __maybe_unused *mmu =3D container_of(w, struct kvm_mmu, w); unsigned mask; =20 /* dirty bit is not supported, so no need to track it */ - if (!PT_HAVE_ACCESSED_DIRTY(mmu)) + if (!PT_HAVE_ACCESSED_DIRTY(w)) return; =20 BUILD_BUG_ON(PT_WRITABLE_MASK !=3D ACC_WRITE_MASK); @@ -125,7 +124,7 @@ static inline void FNAME(protect_clean_gpte)(struct kvm= _pagewalk *w, unsigned *a *access &=3D mask; } =20 -static inline int FNAME(is_present_gpte)(struct kvm_mmu *mmu, +static inline int FNAME(is_present_gpte)(struct kvm_pagewalk *w, unsigned long pte) { #if PTTYPE !=3D PTTYPE_EPT @@ -135,7 +134,7 @@ static inline int FNAME(is_present_gpte)(struct kvm_mmu= *mmu, * For EPT, an entry is present if any of bits 2:0 are set. * With mode-based execute control, bit 10 also indicates presence. */ - return pte & (7 | (mmu_has_mbec(mmu) ? VMX_EPT_USER_EXECUTABLE_MASK : 0)); + return pte & (7 | (w->cpu_role.base.cr4_smep ? VMX_EPT_USER_EXECUTABLE_MA= SK : 0)); #endif } =20 @@ -150,25 +149,25 @@ static bool FNAME(is_bad_mt_xwr)(struct rsvd_bits_val= idate *rsvd_check, u64 gpte =20 static bool FNAME(is_rsvd_bits_set)(struct kvm_pagewalk *w, u64 gpte, int = level) { - struct kvm_mmu *mmu =3D container_of(w, struct kvm_mmu, w); - - return __is_rsvd_bits_set(&mmu->guest_rsvd_check, gpte, level) || - FNAME(is_bad_mt_xwr)(&mmu->guest_rsvd_check, gpte); + return __is_rsvd_bits_set(&w->guest_rsvd_check, gpte, level) || + FNAME(is_bad_mt_xwr)(&w->guest_rsvd_check, gpte); } =20 static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, u64 *spte, u64 gpte) { - if (!FNAME(is_present_gpte)(vcpu->arch.mmu, gpte)) + struct kvm_pagewalk *w =3D &vcpu->arch.mmu->w; + + if (!FNAME(is_present_gpte)(w, gpte)) goto no_present; =20 /* Prefetch only accessed entries (unless A/D bits are disabled). 
*/ - if (PT_HAVE_ACCESSED_DIRTY(vcpu->arch.mmu) && + if (PT_HAVE_ACCESSED_DIRTY(w) && !(gpte & PT_GUEST_ACCESSED_MASK)) goto no_present; =20 - if (FNAME(is_rsvd_bits_set)(&vcpu->arch.mmu->w, gpte, PG_LEVEL_4K)) + if (FNAME(is_rsvd_bits_set)(w, gpte, PG_LEVEL_4K)) goto no_present; =20 return false; @@ -213,7 +212,6 @@ static int FNAME(update_accessed_dirty_bits)(struct kvm= _vcpu *vcpu, struct guest_walker *walker, gpa_t addr, int write_fault) { - struct kvm_mmu __maybe_unused *mmu =3D container_of(w, struct kvm_mmu, w); unsigned level, index; pt_element_t pte, orig_pte; pt_element_t __user *ptep_user; @@ -221,7 +219,7 @@ static int FNAME(update_accessed_dirty_bits)(struct kvm= _vcpu *vcpu, int ret; =20 /* dirty/accessed bits are not supported, so no need to update them */ - if (!PT_HAVE_ACCESSED_DIRTY(mmu)) + if (!PT_HAVE_ACCESSED_DIRTY(w)) return 0; =20 for (level =3D walker->max_level; level >=3D walker->level; --level) { @@ -285,8 +283,6 @@ static inline unsigned FNAME(gpte_pkeys)(struct kvm_vcp= u *vcpu, u64 gpte) static inline bool FNAME(is_last_gpte)(struct kvm_pagewalk *w, unsigned int level, unsigned int gpte) { - struct kvm_mmu __maybe_unused *mmu =3D container_of(w, struct kvm_mmu, w); - /* * For EPT and PAE paging (both variants), bit 7 is either reserved at * all level or indicates a huge page (ignoring CR3/EPTP). In either @@ -302,7 +298,7 @@ static inline bool FNAME(is_last_gpte)(struct kvm_pagew= alk *w, * is not reserved and does not indicate a large page at this level, * so clear PT_PAGE_SIZE_MASK in gpte if that is the case. */ - gpte &=3D level - (PT32_ROOT_LEVEL + mmu->cpu_role.ext.cr4_pse); + gpte &=3D level - (PT32_ROOT_LEVEL + w->cpu_role.ext.cr4_pse); #endif /* * PG_LEVEL_4K always terminates. 
The RHS has bit 7 set @@ -341,16 +337,16 @@ static int FNAME(walk_addr_generic)(struct guest_walk= er *walker, =20 trace_kvm_mmu_pagetable_walk(addr, access); retry_walk: - walker->level =3D mmu->cpu_role.base.level; + walker->level =3D w->cpu_role.base.level; pte =3D kvm_mmu_get_guest_pgd(vcpu, w); - have_ad =3D PT_HAVE_ACCESSED_DIRTY(mmu); + have_ad =3D PT_HAVE_ACCESSED_DIRTY(w); =20 #if PTTYPE =3D=3D 64 walk_nx_mask =3D 1ULL << PT64_NX_SHIFT; if (walker->level =3D=3D PT32E_ROOT_LEVEL) { pte =3D w->get_pdptr(vcpu, (addr >> 30) & 3); trace_kvm_mmu_paging_element(pte, walker->level); - if (!FNAME(is_present_gpte)(mmu, pte)) + if (!FNAME(is_present_gpte)(w, pte)) goto error; --walker->level; } @@ -433,7 +429,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, */ pte_access =3D pt_access & (pte ^ walk_nx_mask); =20 - if (unlikely(!FNAME(is_present_gpte)(mmu, pte))) + if (unlikely(!FNAME(is_present_gpte)(w, pte))) goto error; =20 if (unlikely(FNAME(is_rsvd_bits_set)(w, pte, walker->level))) { @@ -655,7 +651,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct k= vm_page_fault *fault, WARN_ON_ONCE(gw->gfn !=3D base_gfn); direct_access =3D gw->pte_access; =20 - top_level =3D vcpu->arch.mmu->cpu_role.base.level; + top_level =3D vcpu->arch.mmu->w.cpu_role.base.level; if (top_level =3D=3D PT32E_ROOT_LEVEL) top_level =3D PT32_ROOT_LEVEL; /* --=20 2.52.0

From nobody Mon Jun 8 07:23:59 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 13/24] KVM: x86/mmu: change CPU-role accessor fields to take struct kvm_pagewalk
Date: Wed, 3 Jun 2026 06:58:03 -0400
Message-ID: <20260603105814.10236-14-pbonzini@redhat.com>
In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com>
References: <20260603105814.10236-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

With this change, walk_addr_generic and its callees do not need to use container_of() anymore. The next step is removing it from permission_fault() and kvm_mmu_refresh_passthrough_bits().

Signed-off-by: Paolo Bonzini --- arch/x86/kvm/mmu/mmu.c | 44 +++++++++++++++++----------------- arch/x86/kvm/mmu/paging_tmpl.h | 11 ++++----- 2 files changed, 27 insertions(+), 28 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index e2bfecf655d9..2ef04d8c6f95 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -224,9 +224,9 @@ BUILD_MMU_ROLE_REGS_ACCESSOR(efer, lma, EFER_LMA); * and the vCPU may be incorrect/irrelevant. */ #define BUILD_MMU_ROLE_ACCESSOR(base_or_ext, reg, name) \ -static inline bool __maybe_unused is_##reg##_##name(struct kvm_mmu *mmu) \ +static inline bool __maybe_unused is_##reg##_##name(struct kvm_pagewalk *w= ) \ { \ - return !!(mmu->w.cpu_role. base_or_ext . reg##_##name); \ + return !!(w->cpu_role. base_or_ext . reg##_##name); \ } BUILD_MMU_ROLE_ACCESSOR(base, cr0, wp); BUILD_MMU_ROLE_ACCESSOR(ext, cr4, pse); @@ -237,19 +237,19 @@ BUILD_MMU_ROLE_ACCESSOR(ext, cr4, la57); BUILD_MMU_ROLE_ACCESSOR(base, efer, nx); BUILD_MMU_ROLE_ACCESSOR(ext, efer, lma); =20 -static inline bool has_pferr_fetch(struct kvm_mmu *mmu) +static inline bool has_pferr_fetch(struct kvm_pagewalk *w) { - return mmu->w.cpu_role.ext.has_pferr_fetch; + return w->cpu_role.ext.has_pferr_fetch; } =20 -static inline bool is_cr0_pg(struct kvm_mmu *mmu) +static inline bool is_cr0_pg(struct kvm_pagewalk *w) { - return mmu->w.cpu_role.base.level > 0; + return w->cpu_role.base.level > 0; } =20 -static inline bool is_cr4_pae(struct kvm_mmu *mmu) +static inline bool is_cr4_pae(struct kvm_pagewalk *w) { - return !mmu->w.cpu_role.base.has_4_byte_gpte; + return !w->cpu_role.base.has_4_byte_gpte; } =20 static struct kvm_mmu_role_regs vcpu_to_role_regs(struct kvm_vcpu *vcpu) @@ -5389,9 +5389,9 @@ static void reset_guest_rsvds_bits_mask(struct kvm_vc= pu *vcpu, { __reset_rsvds_bits_mask(&context->w.guest_rsvd_check, vcpu->arch.reserved_gpa_bits, - context->w.cpu_role.base.level, is_efer_nx(context), + context->w.cpu_role.base.level, 
is_efer_nx(&context->w), guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES), - is_cr4_pse(context), + is_cr4_pse(&context->w), guest_cpuid_is_amd_compatible(vcpu)); } =20 @@ -5573,10 +5573,10 @@ static void update_permission_bitmask(struct kvm_mm= u *mmu, bool tdp, bool ept) const u16 w =3D ACC_BITS_MASK(ACC_WRITE_MASK); const u16 r =3D ACC_BITS_MASK(ACC_READ_MASK); =20 - bool cr4_smep =3D is_cr4_smep(mmu); - bool cr4_smap =3D is_cr4_smap(mmu); - bool cr0_wp =3D is_cr0_wp(mmu); - bool efer_nx =3D is_efer_nx(mmu); + bool cr4_smep =3D is_cr4_smep(&mmu->w); + bool cr4_smap =3D is_cr4_smap(&mmu->w); + bool cr0_wp =3D is_cr0_wp(&mmu->w); + bool efer_nx =3D is_efer_nx(&mmu->w); =20 /* * In hardware, page fault error codes are generated (as the name @@ -5699,10 +5699,10 @@ static void update_pkru_bitmask(struct kvm_mmu *mmu) =20 mmu->pkru_mask =3D 0; =20 - if (!is_cr4_pke(mmu)) + if (!is_cr4_pke(&mmu->w)) return; =20 - wp =3D is_cr0_wp(mmu); + wp =3D is_cr0_wp(&mmu->w); =20 for (bit =3D 0; bit < ARRAY_SIZE(mmu->permissions); ++bit) { unsigned pfec, pkey_bits; @@ -5739,7 +5739,7 @@ static void update_pkru_bitmask(struct kvm_mmu *mmu) static void reset_guest_paging_metadata(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu) { - if (!is_cr0_pg(mmu)) + if (!is_cr0_pg(&mmu->w)) return; =20 reset_guest_rsvds_bits_mask(vcpu, mmu); @@ -5810,7 +5810,7 @@ void __kvm_mmu_refresh_passthrough_bits(struct kvm_vc= pu *vcpu, BUILD_BUG_ON((KVM_MMU_CR0_ROLE_BITS & KVM_POSSIBLE_CR0_GUEST_BITS) !=3D X= 86_CR0_WP); BUILD_BUG_ON((KVM_MMU_CR4_ROLE_BITS & KVM_POSSIBLE_CR4_GUEST_BITS)); =20 - if (is_cr0_wp(mmu) =3D=3D cr0_wp) + if (is_cr0_wp(&mmu->w) =3D=3D cr0_wp) return; =20 mmu->w.cpu_role.base.cr0_wp =3D cr0_wp; @@ -5885,9 +5885,9 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, context->w.get_pdptr =3D kvm_pdptr_read; context->w.get_guest_pgd =3D get_guest_cr3; =20 - if (!is_cr0_pg(context)) + if (!is_cr0_pg(&context->w)) context->w.gva_to_gpa =3D nonpaging_gva_to_gpa; - else if 
(is_cr4_pae(context)) + else if (is_cr4_pae(&context->w)) context->w.gva_to_gpa =3D paging64_gva_to_gpa; else context->w.gva_to_gpa =3D paging32_gva_to_gpa; @@ -5907,9 +5907,9 @@ static void shadow_mmu_init_context(struct kvm_vcpu *= vcpu, struct kvm_mmu *conte context->w.cpu_role.as_u64 =3D cpu_role.as_u64; context->root_role.word =3D root_role.word; =20 - if (!is_cr0_pg(context)) + if (!is_cr0_pg(&context->w)) nonpaging_init_context(context); - else if (is_cr4_pae(context)) + else if (is_cr4_pae(&context->w)) paging64_init_context(context); else paging32_init_context(context); diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 10b1e7a08e90..99a0e1c95223 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -134,7 +134,7 @@ static inline int FNAME(is_present_gpte)(struct kvm_pag= ewalk *w, * For EPT, an entry is present if any of bits 2:0 are set. * With mode-based execute control, bit 10 also indicates presence. */ - return pte & (7 | (w->cpu_role.base.cr4_smep ? VMX_EPT_USER_EXECUTABLE_MA= SK : 0)); + return pte & (7 | (is_cr4_smep(w) ? VMX_EPT_USER_EXECUTABLE_MASK : 0)); #endif } =20 @@ -316,7 +316,6 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, struct kvm_vcpu *vcpu, struct kvm_pagewalk *w, gpa_t addr, u64 access) { - struct kvm_mmu *mmu =3D container_of(w, struct kvm_mmu, w); int ret; pt_element_t pte; pt_element_t __user *ptep_user; @@ -492,7 +491,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, =20 error: errcode |=3D write_fault | user_fault; - if (fetch_fault && has_pferr_fetch(mmu)) + if (fetch_fault && has_pferr_fetch(w)) errcode |=3D PFERR_FETCH_MASK; =20 walker->fault.vector =3D PF_VECTOR; @@ -536,7 +535,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, * ACC_*_MASK flags! 
*/ walker->fault.exit_qualification |=3D EPT_VIOLATION_RWX_TO_PROT(pte_acce= ss); - if (mmu_has_mbec(mmu)) + if (is_cr4_smep(w)) walker->fault.exit_qualification |=3D EPT_VIOLATION_USER_EXEC_TO_PROT(pte_access); } @@ -840,7 +839,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, str= uct kvm_page_fault *fault * otherwise KVM will cache incorrect access information in the SPTE. */ if (fault->write && !(walker.pte_access & ACC_WRITE_MASK) && - !is_cr0_wp(vcpu->arch.mmu) && !fault->user && fault->slot) { + !is_cr0_wp(&vcpu->arch.mmu->w) && !fault->user && fault->slot) { walker.pte_access |=3D ACC_WRITE_MASK; walker.pte_access &=3D ~ACC_USER_MASK; =20 @@ -850,7 +849,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, str= uct kvm_page_fault *fault * then we should prevent the kernel from executing it * if SMEP is enabled. */ - if (is_cr4_smep(vcpu->arch.mmu)) + if (is_cr4_smep(&vcpu->arch.mmu->w)) walker.pte_access &=3D ~ACC_EXEC_MASK; } #endif --=20 2.52.0 From nobody Mon Jun 8 07:23:59 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 19F8F472777 for ; Wed, 3 Jun 2026 10:58:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484311; cv=none; b=G2nCmSmU/ylQrD4c32Emsoqf3unJNCyTDH693wMLuxP0AhCtdOO0/0xxNQ7LhyKyUDjwz+SJDWrTBzmzfvZfRUFswb/jMlTzIy6QalRLpGzbW7Tv6sjRNNidfRl/O0JYTYe5s77jPe4ulrZfBcj9uoyqlZJcP49Ott8GCzNskeo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484311; c=relaxed/simple; bh=Wji63uM+TL9uvifxhrRpmFxUGgf/iHA4Gf6A2Q24C4c=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=mst8AEEtX2x1Ax0d2T+8paxv+rMzHeV8EgsH4DGGFdei5xIMmU7/D38q8lq8YPCCCtT1ZSlNM0K3eIRfdUjuY13gPHgusortqg72xR7BVGmbM+CRKo7dbxaFRZ6lqbem13hB41TGOJgZ86bEX9FqsIiB16ybnNFs4by4lBquNY0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=VB+tkIVC; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="VB+tkIVC" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484308; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=P88ovYKVzwV1CsEFtR0JFh+vJwUNbSJHZLJ8OOLre1Q=; b=VB+tkIVCLkZkHvKjTB6dPNB1TpNSMonfIpu70R5A/siJrrJGs4xuoXq5eNTUq0D8IhXRta CzG3HnX2S4yNPmbTS0SSl7QwLxl5jz5VDssg9KPm6gzVvaWXWrGxrO5/MfwHbsFM4mGSy5 eeC+6PCmReqfUoLz+hdz09je7gedwUs= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-464-Ge7RPiNVNIuoXuBDHw0K2g-1; Wed, 03 Jun 2026 06:58:24 -0400 X-MC-Unique: Ge7RPiNVNIuoXuBDHw0K2g-1 X-Mimecast-MFC-AGG-ID: Ge7RPiNVNIuoXuBDHw0K2g_1780484303 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) 
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 14/24] KVM: x86/mmu: move remaining permission fields to struct kvm_pagewalk
Date: Wed, 3 Jun 2026 06:58:04 -0400
Message-ID: <20260603105814.10236-15-pbonzini@redhat.com>
In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com>
References: <20260603105814.10236-1-pbonzini@redhat.com>

As promised, this removes the remaining instances of
container_of(w, struct kvm_mmu, w).

Signed-off-by: Paolo Bonzini
---
 arch/x86/include/asm/kvm_host.h | 30 ++++++++--------
 arch/x86/kvm/mmu.h              | 13 +++----
 arch/x86/kvm/mmu/mmu.c          | 62 ++++++++++++++++-----------------
 3 files changed, 51 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 33c505a15015..860a929e3cd8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -489,6 +489,21 @@ struct kvm_pagewalk {
 				     struct x86_exception *exception);
 	union kvm_cpu_role cpu_role;
 	struct rsvd_bits_validate guest_rsvd_check;
+
+	/*
+	 * The pkru_mask indicates if protection key checks are needed. It
+	 * consists of 16 domains indexed by page fault error code bits [4:1],
+	 * with PFEC.RSVD replaced by ACC_USER_MASK from the page tables.
+	 * Each domain has 2 bits which are ANDed with AD and WD from PKRU.
+	 */
+	u32 pkru_mask;
+
+	/*
+	 * Bitmap; bit set = permission fault
+	 * Array index: page fault error code [4:1]
+	 * Bit index: pte permissions in ACC_* format
+	 */
+	u16 permissions[16];
 };
 
 struct kvm_mmu {
@@ -501,23 +516,8 @@ struct kvm_mmu {
 	hpa_t mirror_root_hpa;
 	union kvm_mmu_page_role root_role;
 
-	/*
-	 * The pkru_mask indicates if protection key checks are needed. It
-	 * consists of 16 domains indexed by page fault error code bits [4:1],
-	 * with PFEC.RSVD replaced by ACC_USER_MASK from the page tables.
-	 * Each domain has 2 bits which are ANDed with AD and WD from PKRU.
-	 */
-	u32 pkru_mask;
-
 	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];
 
-	/*
-	 * Bitmap; bit set = permission fault
-	 * Byte index: page fault error code [4:1]
-	 * Bit index: pte permissions in ACC_* format
-	 */
-	u16 permissions[16];
-
 	u64 *pae_root;
 	u64 *pml4_root;
 	u64 *pml5_root;
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 3f8ac193a1e6..d1b5d9b0c6ad 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -105,7 +105,7 @@ bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu);
 int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 				u64 fault_address, char *insn, int insn_len);
 void __kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu,
-					struct kvm_mmu *mmu);
+					struct kvm_pagewalk *pw);
 
 int kvm_mmu_load(struct kvm_vcpu *vcpu);
 void kvm_mmu_unload(struct kvm_vcpu *vcpu);
@@ -183,8 +183,7 @@ static inline void kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu,
 	if (!tdp_enabled || w == &vcpu->arch.guest_mmu.w)
 		return;
 
-	__kvm_mmu_refresh_passthrough_bits(vcpu,
-					   container_of(w, struct kvm_mmu, w));
+	__kvm_mmu_refresh_passthrough_bits(vcpu, w);
 }
 
 /*
@@ -199,8 +198,6 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
 				  unsigned pte_access, unsigned pte_pkey,
 				  u64 access)
 {
-	struct kvm_mmu *mmu = container_of(w, struct kvm_mmu, w);
-
 	/* strip nested paging fault error codes */
 	unsigned int pfec = access;
 	unsigned long rflags = kvm_x86_call(get_rflags)(vcpu);
@@ -225,10 +222,10 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
 
 	kvm_mmu_refresh_passthrough_bits(vcpu, w);
 
-	fault = (mmu->permissions[index] >> pte_access) & 1;
+	fault = (w->permissions[index] >> pte_access) & 1;
 
 	WARN_ON_ONCE(pfec & (PFERR_PK_MASK | PFERR_SS_MASK | PFERR_RSVD_MASK));
-	if (unlikely(mmu->pkru_mask)) {
+	if (unlikely(w->pkru_mask)) {
 		u32 pkru_bits, offset;
 
 		/*
@@ -242,7 +239,7 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
 		/* clear present bit, replace PFEC.RSVD with ACC_USER_MASK. */
 		offset = (pfec & ~1) | ((pte_access & PT_USER_MASK) ? PFERR_RSVD_MASK : 0);
 
-		pkru_bits &= mmu->pkru_mask >> offset;
+		pkru_bits &= w->pkru_mask >> offset;
 		errcode |= -pkru_bits & PFERR_PK_MASK;
 		fault |= (pkru_bits != 0);
 	}
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 2ef04d8c6f95..cc58b6157118 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5385,13 +5385,13 @@ static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
 }
 
 static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
-					struct kvm_mmu *context)
+					struct kvm_pagewalk *w)
 {
-	__reset_rsvds_bits_mask(&context->w.guest_rsvd_check,
+	__reset_rsvds_bits_mask(&w->guest_rsvd_check,
 				vcpu->arch.reserved_gpa_bits,
-				context->w.cpu_role.base.level, is_efer_nx(&context->w),
+				w->cpu_role.base.level, is_efer_nx(w),
 				guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES),
-				is_cr4_pse(&context->w),
+				is_cr4_pse(w),
 				guest_cpuid_is_amd_compatible(vcpu));
 }
 
@@ -5566,17 +5566,17 @@ reset_ept_shadow_zero_bits_mask(struct kvm_mmu *context, bool execonly)
 	 (14 & (access) ? 1 << 14 : 0) | \
 	 (15 & (access) ? 1 << 15 : 0))
 
-static void update_permission_bitmask(struct kvm_mmu *mmu, bool tdp, bool ept)
+static void update_permission_bitmask(struct kvm_pagewalk *pw, bool tdp, bool ept)
 {
 	unsigned index;
 
 	const u16 w = ACC_BITS_MASK(ACC_WRITE_MASK);
 	const u16 r = ACC_BITS_MASK(ACC_READ_MASK);
 
-	bool cr4_smep = is_cr4_smep(&mmu->w);
-	bool cr4_smap = is_cr4_smap(&mmu->w);
-	bool cr0_wp = is_cr0_wp(&mmu->w);
-	bool efer_nx = is_efer_nx(&mmu->w);
+	bool cr4_smep = is_cr4_smep(pw);
+	bool cr4_smap = is_cr4_smap(pw);
+	bool cr0_wp = is_cr0_wp(pw);
+	bool efer_nx = is_efer_nx(pw);
 
 	/*
 	 * In hardware, page fault error codes are generated (as the name
@@ -5590,7 +5590,7 @@ static void update_permission_bitmask(struct kvm_mmu *mmu, bool tdp, bool ept)
 	 * permission_fault() to indicate accesses that are *not* subject to
 	 * SMAP restrictions.
 	 */
-	for (index = 0; index < ARRAY_SIZE(mmu->permissions); ++index) {
+	for (index = 0; index < ARRAY_SIZE(pw->permissions); ++index) {
 		unsigned pfec = index << 1;
 
 		/*
@@ -5664,7 +5664,7 @@ static void update_permission_bitmask(struct kvm_mmu *mmu, bool tdp, bool ept)
 			smapf = (pfec & (PFERR_RSVD_MASK|PFERR_FETCH_MASK)) ? 0 : kf;
 		}
 
-		mmu->permissions[index] = ff | uf | wf | rf | smapf;
+		pw->permissions[index] = ff | uf | wf | rf | smapf;
 	}
 }
 
@@ -5692,19 +5692,19 @@ static void update_permission_bitmask(struct kvm_mmu *mmu, bool tdp, bool ept)
  * away both AD and WD. For all reads or if the last condition holds, WD
  * only will be masked away.
  */
-static void update_pkru_bitmask(struct kvm_mmu *mmu)
+static void update_pkru_bitmask(struct kvm_pagewalk *w)
 {
 	unsigned bit;
 	bool wp;
 
-	mmu->pkru_mask = 0;
+	w->pkru_mask = 0;
 
-	if (!is_cr4_pke(&mmu->w))
+	if (!is_cr4_pke(w))
 		return;
 
-	wp = is_cr0_wp(&mmu->w);
+	wp = is_cr0_wp(w);
 
-	for (bit = 0; bit < ARRAY_SIZE(mmu->permissions); ++bit) {
+	for (bit = 0; bit < ARRAY_SIZE(w->permissions); ++bit) {
 		unsigned pfec, pkey_bits;
 		bool check_pkey, check_write, ff, uf, wf, pte_user;
 
@@ -5732,19 +5732,19 @@ static void update_pkru_bitmask(struct kvm_mmu *mmu)
 		/* PKRU.WD stops write access. */
 		pkey_bits |= (!!check_write) << 1;
 
-		mmu->pkru_mask |= (pkey_bits & 3) << pfec;
+		w->pkru_mask |= (pkey_bits & 3) << pfec;
 	}
 }
 
 static void reset_guest_paging_metadata(struct kvm_vcpu *vcpu,
-					struct kvm_mmu *mmu)
+					struct kvm_pagewalk *w)
 {
-	if (!is_cr0_pg(&mmu->w))
+	if (!is_cr0_pg(w))
 		return;
 
-	reset_guest_rsvds_bits_mask(vcpu, mmu);
-	update_permission_bitmask(mmu, mmu == &vcpu->arch.guest_mmu, false);
-	update_pkru_bitmask(mmu);
+	reset_guest_rsvds_bits_mask(vcpu, w);
+	update_permission_bitmask(w, w == &vcpu->arch.guest_mmu.w, false);
+	update_pkru_bitmask(w);
 }
 
 static void paging64_init_context(struct kvm_mmu *context)
@@ -5803,18 +5803,18 @@ static union kvm_cpu_role kvm_calc_cpu_role(struct kvm_vcpu *vcpu,
 }
 
 void __kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu,
-					struct kvm_mmu *mmu)
+					struct kvm_pagewalk *w)
 {
 	const bool cr0_wp = kvm_is_cr0_bit_set(vcpu, X86_CR0_WP);
 
 	BUILD_BUG_ON((KVM_MMU_CR0_ROLE_BITS & KVM_POSSIBLE_CR0_GUEST_BITS) != X86_CR0_WP);
 	BUILD_BUG_ON((KVM_MMU_CR4_ROLE_BITS & KVM_POSSIBLE_CR4_GUEST_BITS));
 
-	if (is_cr0_wp(&mmu->w) == cr0_wp)
+	if (is_cr0_wp(w) == cr0_wp)
 		return;
 
-	mmu->w.cpu_role.base.cr0_wp = cr0_wp;
-	reset_guest_paging_metadata(vcpu, mmu);
+	w->cpu_role.base.cr0_wp = cr0_wp;
+	reset_guest_paging_metadata(vcpu, w);
 }
 
 static inline int kvm_mmu_get_tdp_level(struct kvm_vcpu *vcpu)
@@ -5892,7 +5892,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 	else
 		context->w.gva_to_gpa = paging32_gva_to_gpa;
 
-	reset_guest_paging_metadata(vcpu, context);
+	reset_guest_paging_metadata(vcpu, &context->w);
 	reset_tdp_shadow_zero_bits_mask(context);
 }
 
@@ -5914,7 +5914,7 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
 	else
 		paging32_init_context(context);
 
-	reset_guest_paging_metadata(vcpu, context);
+	reset_guest_paging_metadata(vcpu, &context->w);
 	reset_shadow_zero_bits_mask(vcpu, context);
 }
 
@@ -6015,8 +6015,8 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 	context->w.gva_to_gpa = ept_gva_to_gpa;
 	context->sync_spte = ept_sync_spte;
 
-	update_permission_bitmask(context, true, true);
-	context->pkru_mask = 0;
+	update_permission_bitmask(&context->w, true, true);
+	context->w.pkru_mask = 0;
 	reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
 	reset_ept_shadow_zero_bits_mask(context, execonly);
 }
@@ -6073,7 +6073,7 @@ static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu,
 	else
 		g_context->w.gva_to_gpa = paging32_gva_to_gpa;
 
-	reset_guest_paging_metadata(vcpu, g_context);
+	reset_guest_paging_metadata(vcpu, &g_context->w);
 }
 
 void kvm_init_mmu(struct kvm_vcpu *vcpu)
-- 
2.52.0

From nobody Mon Jun 8 07:23:59 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 15/24] KVM: x86/mmu: pass struct kvm_pagewalk to kvm_mmu_invalidate_addr
Date: Wed, 3 Jun 2026 06:58:05 -0400
Message-ID: <20260603105814.10236-16-pbonzini@redhat.com>
In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com>
References: <20260603105814.10236-1-pbonzini@redhat.com>

kvm_mmu_invalidate_addr only needs to know if what's being invalidated
is a GVA or GPA. This will ultimately be represented by two different
kvm_pagewalk structs, so adjust the type of the parameter. For now the
GVA case is represented by both root_mmu and nested_mmu.
Since nested_mmu never has a sync_spte callback, the function would
already return at the !mmu->sync_spte check; but really nested_mmu
should not be a kvm_mmu in the first place, and the container_of()
would be bogus. So introduce a separate check for whether the
invalidation is happening for a nested GVA; in that case nothing is
needed beyond kvm_x86_call(flush_tlb_gva).

Signed-off-by: Paolo Bonzini
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/mmu/mmu.c          | 12 ++++++++----
 arch/x86/kvm/vmx/nested.c       |  2 +-
 arch/x86/kvm/x86.c              |  2 +-
 4 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 860a929e3cd8..def338583a0f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2388,7 +2388,7 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
 		       void *insn, int insn_len);
 void kvm_mmu_print_sptes(struct kvm_vcpu *vcpu, gpa_t gpa, const char *msg);
 void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva);
-void kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+void kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
 			     u64 addr, unsigned long roots);
 void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid);
 void kvm_mmu_new_pgd(struct kvm_vcpu *vcpu, gpa_t new_pgd);
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index cc58b6157118..967c2226cba0 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6596,22 +6596,26 @@ static void __kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu
 	write_unlock(&vcpu->kvm->mmu_lock);
 }
 
-void kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
+void kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
 			     u64 addr, unsigned long roots)
 {
+	struct kvm_mmu *mmu;
 	int i;
 
 	WARN_ON_ONCE(roots & ~KVM_MMU_ROOTS_ALL);
 
 	/* It's actually a GPA for vcpu->arch.guest_mmu. */
-	if (mmu != &vcpu->arch.guest_mmu) {
+	if (w != &vcpu->arch.guest_mmu.w) {
 		/* INVLPG on a non-canonical address is a NOP according to the SDM. */
 		if (is_noncanonical_invlpg_address(addr, vcpu))
 			return;
 
 		kvm_x86_call(flush_tlb_gva)(vcpu, addr);
+		if (w == &vcpu->arch.nested_mmu.w)
+			return;
 	}
 
+	mmu = container_of(w, struct kvm_mmu, w);
 	if (!mmu->sync_spte)
 		return;
 
@@ -6637,7 +6641,7 @@ void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva)
 	 * be synced when switching to that new cr3, so nothing needs to be
 	 * done here for them.
 	 */
-	kvm_mmu_invalidate_addr(vcpu, vcpu->arch.walk_mmu, gva, KVM_MMU_ROOTS_ALL);
+	kvm_mmu_invalidate_addr(vcpu, &vcpu->arch.walk_mmu->w, gva, KVM_MMU_ROOTS_ALL);
 	++vcpu->stat.invlpg;
 }
 EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_invlpg);
@@ -6659,7 +6663,7 @@ void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t gva, unsigned long pcid)
 	}
 
 	if (roots)
-		kvm_mmu_invalidate_addr(vcpu, mmu, gva, roots);
+		kvm_mmu_invalidate_addr(vcpu, &mmu->w, gva, roots);
 	++vcpu->stat.invlpg;
 
 	/*
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index e9e6714ccd83..475fea4bf97a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -407,7 +407,7 @@ static void nested_ept_invalidate_addr(struct kvm_vcpu *vcpu, gpa_t eptp,
 			roots |= KVM_MMU_ROOT_PREVIOUS(i);
 	}
 	if (roots)
-		kvm_mmu_invalidate_addr(vcpu, vcpu->arch.mmu, addr, roots);
+		kvm_mmu_invalidate_addr(vcpu, &vcpu->arch.guest_mmu.w, addr, roots);
 }
 
 static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e514096f960c..37dbf8c78376 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1002,7 +1002,7 @@ void kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu,
 	 */
 	if ((fault->error_code & PFERR_PRESENT_MASK) &&
 	    !(fault->error_code & PFERR_RSVD_MASK))
-		kvm_mmu_invalidate_addr(vcpu, fault_mmu, fault->address,
+		kvm_mmu_invalidate_addr(vcpu, &fault_mmu->w, fault->address,
 					KVM_MMU_ROOT_CURRENT);
 
 	fault_mmu->w.inject_page_fault(vcpu, fault);
-- 
2.52.0

From nobody Mon Jun 8 07:23:59 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 16/24] KVM: x86/mmu: change walk_mmu to struct kvm_pagewalk
Date: Wed, 3 Jun 2026 06:58:06 -0400
Message-ID: <20260603105814.10236-17-pbonzini@redhat.com>
In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com>
References: <20260603105814.10236-1-pbonzini@redhat.com>

Now that walk_mmu is only accessed for its "w" member, store the
pointer to that member directly, renaming the field to gva_walk. This
also means that nested_mmu is only accessed for its "w" member.

Signed-off-by: Paolo Bonzini
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/hyperv.c           |  2 +-
 arch/x86/kvm/mmu/mmu.c          |  4 +--
 arch/x86/kvm/mmu/paging_tmpl.h  |  4 +--
 arch/x86/kvm/svm/nested.c       |  4 +--
 arch/x86/kvm/vmx/nested.c       |  4 +--
 arch/x86/kvm/x86.c              | 44 +++++++++++++++++----------------
 7 files changed, 33 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index def338583a0f..368386aac3c3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -883,7 +883,7 @@ struct kvm_vcpu_arch {
 	 * Pointer to the mmu context currently used for
 	 * gva_to_gpa translations.
 	 */
-	struct kvm_mmu *walk_mmu;
+	struct kvm_pagewalk *gva_walk;
 
 	u64 pdptrs[4]; /* pae */
 
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index a6e7d6f85409..414dc57f1de3 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2041,7 +2041,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc)
 	 * read with kvm_read_guest().
 	 */
 	if (!hc->fast) {
-		hc->ingpa = kvm_translate_gpa(vcpu, &vcpu->arch.walk_mmu->w, hc->ingpa,
+		hc->ingpa = kvm_translate_gpa(vcpu, vcpu->arch.gva_walk, hc->ingpa,
 					      PFERR_GUEST_FINAL_MASK, NULL, 0);
 		if (unlikely(hc->ingpa == INVALID_GPA))
 			return HV_STATUS_INVALID_HYPERCALL_INPUT;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 967c2226cba0..e6952409c78a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6641,7 +6641,7 @@ void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva)
 	 * be synced when switching to that new cr3, so nothing needs to be
 	 * done here for them.
 	 */
-	kvm_mmu_invalidate_addr(vcpu, &vcpu->arch.walk_mmu->w, gva, KVM_MMU_ROOTS_ALL);
+	kvm_mmu_invalidate_addr(vcpu, vcpu->arch.gva_walk, gva, KVM_MMU_ROOTS_ALL);
 	++vcpu->stat.invlpg;
 }
 EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_invlpg);
@@ -6778,7 +6778,7 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.mmu_shadow_page_cache.gfp_zero = __GFP_ZERO;
 
 	vcpu->arch.mmu = &vcpu->arch.root_mmu;
-	vcpu->arch.walk_mmu = &vcpu->arch.root_mmu;
+	vcpu->arch.gva_walk = &vcpu->arch.root_mmu.w;
 
 	ret = __kvm_mmu_create(vcpu, &vcpu->arch.guest_mmu);
 	if (ret)
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 99a0e1c95223..6b21778e8340 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -541,7 +541,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 	}
 #endif
 	walker->fault.address = addr;
-	walker->fault.nested_page_fault = w != &vcpu->arch.walk_mmu->w;
+	walker->fault.nested_page_fault = w != vcpu->arch.gva_walk;
 	walker->fault.async_page_fault = false;
 
 	trace_kvm_mmu_walker_error(walker->fault.error_code);
@@ -894,7 +894,7 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
 
 #ifndef CONFIG_X86_64
 	/* A 64-bit GVA should be impossible on 32-bit KVM. */
-	WARN_ON_ONCE((addr >> 32) && w == &vcpu->arch.walk_mmu->w);
+	WARN_ON_ONCE((addr >> 32) && w == vcpu->arch.gva_walk);
 #endif
 
 	r = FNAME(walk_addr_generic)(&walker, vcpu, w, addr, access);
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 79ef81b878d7..7d89285b0677 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -102,13 +102,13 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu *vcpu)
 	vcpu->arch.mmu->w.get_pdptr = nested_svm_get_tdp_pdptr;
 
 	vcpu->arch.mmu->w.inject_page_fault = nested_svm_inject_npf_exit;
-	vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu;
+	vcpu->arch.gva_walk = &vcpu->arch.nested_mmu.w;
 }
 
 static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.mmu = &vcpu->arch.root_mmu;
-	vcpu->arch.walk_mmu = &vcpu->arch.root_mmu;
+	vcpu->arch.gva_walk = &vcpu->arch.root_mmu.w;
 }
 
 static bool nested_vmcb_needs_vls_intercept(struct vcpu_svm *svm)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 475fea4bf97a..5a89d5dcfb9a 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -499,13 +499,13 @@ static void nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.mmu->w.inject_page_fault = nested_ept_inject_page_fault;
 
-	vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu;
+	vcpu->arch.gva_walk = &vcpu->arch.nested_mmu.w;
 }
 
 static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu)
 {
 	vcpu->arch.mmu = &vcpu->arch.root_mmu;
-	vcpu->arch.walk_mmu = &vcpu->arch.root_mmu;
+	vcpu->arch.gva_walk = &vcpu->arch.root_mmu.w;
 }
 
 static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 37dbf8c78376..147cef7b23b6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -990,11 +990,12 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
 void kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu,
 				    struct x86_exception *fault)
 {
-	struct kvm_mmu *fault_mmu;
+	struct kvm_pagewalk *fault_walk;
+
 	WARN_ON_ONCE(fault->vector != PF_VECTOR);
 
-	fault_mmu = fault->nested_page_fault ? vcpu->arch.mmu :
-					       vcpu->arch.walk_mmu;
+	fault_walk = fault->nested_page_fault ? &vcpu->arch.mmu->w :
+						vcpu->arch.gva_walk;
 
 	/*
 	 * Invalidate the TLB entry for the faulting address, if it exists,
@@ -1002,10 +1003,10 @@ void kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu,
 	 */
 	if ((fault->error_code & PFERR_PRESENT_MASK) &&
 	    !(fault->error_code & PFERR_RSVD_MASK))
-		kvm_mmu_invalidate_addr(vcpu, &fault_mmu->w, fault->address,
+		kvm_mmu_invalidate_addr(vcpu, fault_walk, fault->address,
 					KVM_MMU_ROOT_CURRENT);
 
-	fault_mmu->w.inject_page_fault(vcpu, fault);
+	fault_walk->inject_page_fault(vcpu, fault);
 }
 EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_inject_emulated_page_fault);
 
@@ -1060,7 +1061,7 @@ static inline u64 pdptr_rsvd_bits(struct kvm_vcpu *vcpu)
  */
 int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
 {
-	struct kvm_mmu *mmu = vcpu->arch.walk_mmu;
+	struct kvm_pagewalk *w = vcpu->arch.gva_walk;
 	gfn_t pdpt_gfn = cr3 >> PAGE_SHIFT;
 	gpa_t real_gpa;
 	int i;
@@ -1071,7 +1072,7 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
 	 * If the MMU is nested, CR3 holds an L2 GPA and needs to be translated
 	 * to an L1 GPA.
 	 */
-	real_gpa = kvm_translate_gpa(vcpu, &mmu->w, gfn_to_gpa(pdpt_gfn),
+	real_gpa = kvm_translate_gpa(vcpu, w, gfn_to_gpa(pdpt_gfn),
 				     PFERR_USER_MASK | PFERR_WRITE_MASK |
 				     PFERR_GUEST_PAGE_MASK, NULL, 0);
 	if (real_gpa == INVALID_GPA)
@@ -1095,7 +1096,8 @@ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3)
 	 * Shadow page roots need to be reconstructed instead.
 	 */
 	if (!tdp_enabled && memcmp(vcpu->arch.pdptrs, pdpte, sizeof(vcpu->arch.pdptrs)))
-		kvm_mmu_free_roots(vcpu->kvm, mmu, KVM_MMU_ROOT_CURRENT);
+		kvm_mmu_free_roots(vcpu->kvm, &vcpu->arch.root_mmu,
+				   KVM_MMU_ROOT_CURRENT);
 
 	memcpy(vcpu->arch.pdptrs, pdpte, sizeof(vcpu->arch.pdptrs));
 	kvm_register_mark_dirty(vcpu, VCPU_REG_PDPTR);
@@ -7851,7 +7853,7 @@ void kvm_get_segment(struct kvm_vcpu *vcpu,
 gpa_t kvm_mmu_gva_to_gpa_read(struct kvm_vcpu *vcpu, gva_t gva,
 			      struct x86_exception *exception)
 {
-	struct kvm_pagewalk *gva_walk = &vcpu->arch.walk_mmu->w;
+	struct kvm_pagewalk *gva_walk = vcpu->arch.gva_walk;
 
 	u64 access = (kvm_x86_call(get_cpl)(vcpu) == 3) ? PFERR_USER_MASK : 0;
 	return gva_walk->gva_to_gpa(vcpu, gva_walk, gva, access, exception);
@@ -7861,7 +7863,7 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_gva_to_gpa_read);
 gpa_t kvm_mmu_gva_to_gpa_write(struct kvm_vcpu *vcpu, gva_t gva,
 			       struct x86_exception *exception)
 {
-	struct kvm_pagewalk *gva_walk = &vcpu->arch.walk_mmu->w;
+	struct kvm_pagewalk *gva_walk = vcpu->arch.gva_walk;
 
 	u64 access = (kvm_x86_call(get_cpl)(vcpu) == 3) ? PFERR_USER_MASK : 0;
 	access |= PFERR_WRITE_MASK;
@@ -7873,7 +7875,7 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_gva_to_gpa_write);
 gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
 				struct x86_exception *exception)
 {
-	struct kvm_pagewalk *gva_walk = &vcpu->arch.walk_mmu->w;
+	struct kvm_pagewalk *gva_walk = vcpu->arch.gva_walk;
 
 	return gva_walk->gva_to_gpa(vcpu, gva_walk, gva, 0, exception);
 }
@@ -7882,7 +7884,7 @@ static int kvm_read_guest_virt_helper(gva_t addr, void *val, unsigned int bytes,
 				      struct kvm_vcpu *vcpu, u64 access,
 				      struct x86_exception *exception)
 {
-	struct kvm_pagewalk *gva_walk = &vcpu->arch.walk_mmu->w;
+	struct kvm_pagewalk *gva_walk = vcpu->arch.gva_walk;
 	void *data = val;
 	int r = X86EMUL_CONTINUE;
 
@@ -7915,7 +7917,7 @@ static int kvm_fetch_guest_virt(struct x86_emulate_ctxt *ctxt,
 				struct x86_exception *exception)
 {
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
-	struct kvm_pagewalk *gva_walk = &vcpu->arch.walk_mmu->w;
+	struct kvm_pagewalk *gva_walk = vcpu->arch.gva_walk;
 	u64 access = (kvm_x86_call(get_cpl)(vcpu) == 3) ? PFERR_USER_MASK : 0;
 	unsigned offset;
 	int ret;
@@ -7974,7 +7976,7 @@ static int kvm_write_guest_virt_helper(gva_t addr, void *val, unsigned int bytes
 				       struct kvm_vcpu *vcpu, u64 access,
 				       struct x86_exception *exception)
 {
-	struct kvm_pagewalk *gva_walk = &vcpu->arch.walk_mmu->w;
+	struct kvm_pagewalk *gva_walk = vcpu->arch.gva_walk;
 	void *data = val;
 	int r = X86EMUL_CONTINUE;
 
@@ -8080,7 +8082,7 @@ static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu, unsigned long gva,
 				gpa_t *gpa, struct x86_exception *exception,
 				bool write)
 {
-	struct kvm_mmu *mmu = vcpu->arch.walk_mmu;
+	struct kvm_pagewalk *gva_walk = vcpu->arch.gva_walk;
 	u64 access = ((kvm_x86_call(get_cpl)(vcpu) == 3) ? PFERR_USER_MASK : 0)
 		| (write ? PFERR_WRITE_MASK : 0);
 
@@ -8090,7 +8092,7 @@ static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu, unsigned long gva,
 	 * shadow page table for L2 guest.
 	 */
 	if (vcpu_match_mmio_gva(vcpu, gva) && (!is_paging(vcpu) ||
-	    !permission_fault(vcpu, &vcpu->arch.walk_mmu->w,
+	    !permission_fault(vcpu, gva_walk,
 			      vcpu->arch.mmio_access, 0, access))) {
 		*gpa = vcpu->arch.mmio_gfn << PAGE_SHIFT |
 					(gva & (PAGE_SIZE - 1));
@@ -8098,7 +8100,7 @@ static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu, unsigned long gva,
 		return 1;
 	}
 
-	*gpa = mmu->w.gva_to_gpa(vcpu, &mmu->w, gva, access, exception);
+	*gpa = gva_walk->gva_to_gpa(vcpu, gva_walk, gva, access, exception);
 
 	if (*gpa == INVALID_GPA)
 		return -1;
@@ -14211,15 +14213,15 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_spec_ctrl_test_value);
 
 void kvm_fixup_and_inject_pf_error(struct kvm_vcpu *vcpu, gva_t gva, u16 error_code)
 {
-	struct kvm_mmu *mmu = vcpu->arch.walk_mmu;
+	struct kvm_pagewalk *gva_walk = vcpu->arch.gva_walk;
 	struct x86_exception fault;
 	u64 access = error_code &
 		(PFERR_WRITE_MASK | PFERR_FETCH_MASK | PFERR_USER_MASK);
 
 	if (!(error_code & PFERR_PRESENT_MASK) ||
-	    mmu->w.gva_to_gpa(vcpu, &mmu->w, gva, access, &fault) != INVALID_GPA) {
+	    gva_walk->gva_to_gpa(vcpu, gva_walk, gva, access, &fault) != INVALID_GPA) {
 		/*
-		 * If vcpu->arch.walk_mmu->gva_to_gpa succeeded, the page
+		 * If gva_walk->gva_to_gpa succeeded, the page
 		 * tables probably do not match the TLB. Just proceed
 		 * with the error code that the processor gave.
*/ @@ -14230,7 +14232,7 @@ void kvm_fixup_and_inject_pf_error(struct kvm_vcpu = *vcpu, gva_t gva, u16 error_c fault.address =3D gva; fault.async_page_fault =3D false; } - vcpu->arch.walk_mmu->w.inject_page_fault(vcpu, &fault); + gva_walk->inject_page_fault(vcpu, &fault); } EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_fixup_and_inject_pf_error); =20 --=20 2.52.0
From nobody Mon Jun 8 07:23:59 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 17/24] KVM: x86/mmu: change nested_mmu.w to ngva_walk
Date: Wed, 3 Jun 2026 06:58:07 -0400
Message-ID:
<20260603105814.10236-18-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 Content-Type: text/plain; charset="utf-8" nested_mmu is now only used for its w member. Rename it, and change its type. Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm_host.h | 5 ++-- arch/x86/kvm/mmu.h | 6 ++--- arch/x86/kvm/mmu/mmu.c | 41 ++++++++++++++------------------- arch/x86/kvm/svm/nested.c | 2 +- arch/x86/kvm/vmx/nested.c | 2 +- 5 files changed, 24 insertions(+), 32 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 368386aac3c3..1bebd98ce846 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -877,11 +877,10 @@ struct kvm_vcpu_arch { * walking and not for faulting since we never handle l2 page faults on * the host. */ - struct kvm_mmu nested_mmu; + struct kvm_pagewalk ngva_walk; =20 /* - * Pointer to the mmu context currently used for - * gva_to_gpa translations. + * Pagewalk context used for gva_to_gpa translations. */ struct kvm_pagewalk *gva_walk; =20 diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index d1b5d9b0c6ad..debdaff7f710 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -177,8 +177,8 @@ static inline void kvm_mmu_refresh_passthrough_bits(str= uct kvm_vcpu *vcpu, * be stale. Refresh CR0.WP and the metadata on-demand when checking * for permission faults. Exempt nested MMUs, i.e. MMUs for shadowing * nEPT and nNPT, as CR0.WP is ignored in both cases. Note, KVM does - * need to refresh nested_mmu, a.k.a. the walker used to translate L2 - * GVAs to GPAs, as that "MMU" needs to honor L2's CR0.WP. + * need to refresh ngva_walk, a.k.a. 
the walker used to translate L2 + * GVAs to GPAs, so as to honor L2's CR0.WP. */ if (!tdp_enabled || w =3D=3D &vcpu->arch.guest_mmu.w) return; @@ -306,7 +306,7 @@ static inline gpa_t kvm_translate_gpa(struct kvm_vcpu *= vcpu, struct x86_exception *exception, u64 pte_access) { - if (w !=3D &vcpu->arch.nested_mmu.w) + if (w !=3D &vcpu->arch.ngva_walk) return gpa; return kvm_x86_ops.nested_ops->translate_nested_gpa(vcpu, gpa, access, exception, diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index e6952409c78a..386fdbc34b02 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6037,43 +6037,37 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu, context->w.get_guest_pgd =3D get_guest_cr3; } =20 -static void init_kvm_nested_mmu(struct kvm_vcpu *vcpu, +static void init_kvm_ngva_walk(struct kvm_vcpu *vcpu, union kvm_cpu_role new_mode) { - struct kvm_mmu *g_context =3D &vcpu->arch.nested_mmu; + struct kvm_pagewalk *g_context =3D &vcpu->arch.ngva_walk; =20 - if (new_mode.as_u64 =3D=3D g_context->w.cpu_role.as_u64) + if (new_mode.as_u64 =3D=3D g_context->cpu_role.as_u64) return; =20 - g_context->w.cpu_role.as_u64 =3D new_mode.as_u64; - g_context->w.inject_page_fault =3D kvm_inject_page_fault; - g_context->w.get_pdptr =3D kvm_pdptr_read; - g_context->w.get_guest_pgd =3D get_guest_cr3; - - /* - * L2 page tables are never shadowed, so there is no need to sync - * SPTEs. - */ - g_context->sync_spte =3D NULL; + g_context->cpu_role.as_u64 =3D new_mode.as_u64; + g_context->inject_page_fault =3D kvm_inject_page_fault; + g_context->get_pdptr =3D kvm_pdptr_read; + g_context->get_guest_pgd =3D get_guest_cr3; =20 /* * Note that arch.mmu->gva_to_gpa translates l2_gpa to l1_gpa using * L1's nested page tables (e.g. EPT12). 
The nested translation - * of l2_gva to l1_gpa is done by arch.nested_mmu.gva_to_gpa using + * of l2_gva to l1_gpa is done by arch.ngva_walk.gva_to_gpa using * L2's page tables as the first level of translation and L1's * nested page tables as the second level of translation. Basically - * the gva_to_gpa functions between mmu and nested_mmu are swapped. + * the gva_to_gpa functions between mmu and ngva_walk are swapped. */ if (!is_paging(vcpu)) - g_context->w.gva_to_gpa =3D nonpaging_gva_to_gpa; + g_context->gva_to_gpa =3D nonpaging_gva_to_gpa; else if (is_long_mode(vcpu)) - g_context->w.gva_to_gpa =3D paging64_gva_to_gpa; + g_context->gva_to_gpa =3D paging64_gva_to_gpa; else if (is_pae(vcpu)) - g_context->w.gva_to_gpa =3D paging64_gva_to_gpa; + g_context->gva_to_gpa =3D paging64_gva_to_gpa; else - g_context->w.gva_to_gpa =3D paging32_gva_to_gpa; + g_context->gva_to_gpa =3D paging32_gva_to_gpa; =20 - reset_guest_paging_metadata(vcpu, &g_context->w); + reset_guest_paging_metadata(vcpu, g_context); } =20 void kvm_init_mmu(struct kvm_vcpu *vcpu) @@ -6082,7 +6076,7 @@ void kvm_init_mmu(struct kvm_vcpu *vcpu) union kvm_cpu_role cpu_role =3D kvm_calc_cpu_role(vcpu, &regs); =20 if (mmu_is_nested(vcpu)) - init_kvm_nested_mmu(vcpu, cpu_role); + init_kvm_ngva_walk(vcpu, cpu_role); else if (tdp_enabled) init_kvm_tdp_mmu(vcpu, cpu_role); else @@ -6106,10 +6100,9 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu) */ vcpu->arch.root_mmu.root_role.invalid =3D 1; vcpu->arch.guest_mmu.root_role.invalid =3D 1; - vcpu->arch.nested_mmu.root_role.invalid =3D 1; vcpu->arch.root_mmu.w.cpu_role.ext.valid =3D 0; vcpu->arch.guest_mmu.w.cpu_role.ext.valid =3D 0; - vcpu->arch.nested_mmu.w.cpu_role.ext.valid =3D 0; + vcpu->arch.ngva_walk.cpu_role.ext.valid =3D 0; kvm_mmu_reset_context(vcpu); =20 KVM_BUG_ON(!kvm_can_set_cpuid_and_feature_msrs(vcpu), vcpu->kvm); @@ -6611,7 +6604,7 @@ void kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu, s= truct kvm_pagewalk *w, return; =20
kvm_x86_call(flush_tlb_gva)(vcpu, addr); - if (w =3D=3D &vcpu->arch.nested_mmu.w) + if (w =3D=3D &vcpu->arch.ngva_walk) return; } =20 diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 7d89285b0677..20469fd83e8b 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -102,7 +102,7 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu= *vcpu) vcpu->arch.mmu->w.get_pdptr =3D nested_svm_get_tdp_pdptr; =20 vcpu->arch.mmu->w.inject_page_fault =3D nested_svm_inject_npf_exit; - vcpu->arch.gva_walk =3D &vcpu->arch.nested_mmu.w; + vcpu->arch.gva_walk =3D &vcpu->arch.ngva_walk; } =20 static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 5a89d5dcfb9a..477c0e8a6e43 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -499,7 +499,7 @@ static void nested_ept_init_mmu_context(struct kvm_vcpu= *vcpu) =20 vcpu->arch.mmu->w.inject_page_fault =3D nested_ept_inject_page_fault; =20 - vcpu->arch.gva_walk =3D &vcpu->arch.nested_mmu.w; + vcpu->arch.gva_walk =3D &vcpu->arch.ngva_walk; } =20 static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu) --=20 2.52.0
From nobody Mon Jun 8 07:23:59 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 18/24] KVM: x86/mmu: make gva_walk a value
Date: Wed, 3 Jun 2026 06:58:08 -0400
Message-ID: <20260603105814.10236-19-pbonzini@redhat.com>
In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com>
References: <20260603105814.10236-1-pbonzini@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Always use the same instance of kvm_pagewalk to do GVA->GPA translations, instead of flipping the gva_walk pointer back and forth. After all the page walking does behave the same no matter if you are in guest mode or not; the difference lies in the behavior of kvm_translate_gpa and thus in vcpu->arch.mmu, not in the page walker itself.

At this point, vcpu->arch.gva_walk and vcpu->arch.root_mmu.w contain the same information (at least when KVM is not running a nested guest, i.e. when root_mmu is actually in use); compare init_kvm_page_walk() on one side with init_kvm_softmmu() + shadow_mmu_init_context() on the other. root_mmu.w is still used by shadow paging, via FNAME(walk_addr) and its callers.
vcpu->arch.guest_mmu.w instead is used for both guest emulation (kvm_translate_gpa) and shadow paging. Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm_host.h | 12 +---- arch/x86/kvm/hyperv.c | 2 +- arch/x86/kvm/mmu.h | 8 +-- arch/x86/kvm/mmu/mmu.c | 86 +++++++++++++++------------------ arch/x86/kvm/mmu/paging_tmpl.h | 4 +- arch/x86/kvm/svm/nested.c | 2 - arch/x86/kvm/vmx/nested.c | 3 -- arch/x86/kvm/x86.c | 20 ++++---- 8 files changed, 58 insertions(+), 79 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 1bebd98ce846..383bef0cf0f0 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -869,20 +869,10 @@ struct kvm_vcpu_arch { /* L1 MMU when running nested */ struct kvm_mmu guest_mmu; =20 - /* - * Paging state of an L2 guest (used for nested npt) - * - * This context will save all necessary information to walk page tables - * of an L2 guest. This context is only initialized for page table - * walking and not for faulting since we never handle l2 page faults on - * the host. - */ - struct kvm_pagewalk ngva_walk; - /* * Pagewalk context used for gva_to_gpa translations. */ - struct kvm_pagewalk *gva_walk; + struct kvm_pagewalk gva_walk; =20 u64 pdptrs[4]; /* pae */ =20 diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c index 414dc57f1de3..5ccb76010a37 100644 --- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -2041,7 +2041,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, st= ruct kvm_hv_hcall *hc) * read with kvm_read_guest(). 
*/ if (!hc->fast) { - hc->ingpa =3D kvm_translate_gpa(vcpu, vcpu->arch.gva_walk, hc->ingpa, + hc->ingpa =3D kvm_translate_gpa(vcpu, &vcpu->arch.gva_walk, hc->ingpa, PFERR_GUEST_FINAL_MASK, NULL, 0); if (unlikely(hc->ingpa =3D=3D INVALID_GPA)) return HV_STATUS_INVALID_HYPERCALL_INPUT; diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index debdaff7f710..b8dc88eb56a5 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -176,9 +176,9 @@ static inline void kvm_mmu_refresh_passthrough_bits(str= uct kvm_vcpu *vcpu, * @w's snapshot of CR0.WP and thus all related paging metadata may * be stale. Refresh CR0.WP and the metadata on-demand when checking * for permission faults. Exempt nested MMUs, i.e. MMUs for shadowing - * nEPT and nNPT, as CR0.WP is ignored in both cases. Note, KVM does - * need to refresh ngva_walk, a.k.a. the walker used to translate L2 - * GVAs to GPAs, so as to honor L2's CR0.WP. + * nEPT and nNPT, as CR0.WP is ignored in both cases. Note, KVM will + * still refresh gva_walk, so as to honor L2's CR0.WP when translating + * L2 GVAs to GPAs. 
*/ if (!tdp_enabled || w =3D=3D &vcpu->arch.guest_mmu.w) return; @@ -306,7 +306,7 @@ static inline gpa_t kvm_translate_gpa(struct kvm_vcpu *= vcpu, struct x86_exception *exception, u64 pte_access) { - if (w !=3D &vcpu->arch.ngva_walk) + if (!mmu_is_nested(vcpu) || w =3D=3D &vcpu->arch.guest_mmu.w) return gpa; return kvm_x86_ops.nested_ops->translate_nested_gpa(vcpu, gpa, access, exception, diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 386fdbc34b02..2fe4d5359006 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5943,6 +5943,27 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcp= u, shadow_mmu_init_context(vcpu, context, cpu_role, root_role); } =20 +static void init_kvm_page_walk(struct kvm_vcpu *vcpu, struct kvm_pagewalk = *w, + union kvm_cpu_role cpu_role) +{ + if (cpu_role.as_u64 =3D=3D w->cpu_role.as_u64) + return; + + w->cpu_role.as_u64 =3D cpu_role.as_u64; + w->inject_page_fault =3D kvm_inject_page_fault; + w->get_pdptr =3D kvm_pdptr_read; + w->get_guest_pgd =3D get_guest_cr3; + + if (!is_cr0_pg(w)) + w->gva_to_gpa =3D nonpaging_gva_to_gpa; + else if (is_cr4_pae(w)) + w->gva_to_gpa =3D paging64_gva_to_gpa; + else + w->gva_to_gpa =3D paging32_gva_to_gpa; + + reset_guest_paging_metadata(vcpu, w); +} + void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr4, u64 efer, gpa_t nested_cr3, u64 misc_ctl) { @@ -6037,50 +6058,19 @@ static void init_kvm_softmmu(struct kvm_vcpu *vcpu, context->w.get_guest_pgd =3D get_guest_cr3; } =20 -static void init_kvm_ngva_walk(struct kvm_vcpu *vcpu, - union kvm_cpu_role new_mode) -{ - struct kvm_pagewalk *g_context =3D &vcpu->arch.ngva_walk; - - if (new_mode.as_u64 =3D=3D g_context->cpu_role.as_u64) - return; - - g_context->cpu_role.as_u64 =3D new_mode.as_u64; - g_context->inject_page_fault =3D kvm_inject_page_fault; - g_context->get_pdptr =3D kvm_pdptr_read; - g_context->get_guest_pgd =3D get_guest_cr3; - - /* - * Note that arch.mmu->gva_to_gpa translates l2_gpa to l1_gpa 
using - * L1's nested page tables (e.g. EPT12). The nested translation - * of l2_gva to l1_gpa is done by arch.ngva_walk.gva_to_gpa using - * L2's page tables as the first level of translation and L1's - * nested page tables as the second level of translation. Basically - * the gva_to_gpa functions between mmu and ngva_walk are swapped. - */ - if (!is_paging(vcpu)) - g_context->gva_to_gpa =3D nonpaging_gva_to_gpa; - else if (is_long_mode(vcpu)) - g_context->gva_to_gpa =3D paging64_gva_to_gpa; - else if (is_pae(vcpu)) - g_context->gva_to_gpa =3D paging64_gva_to_gpa; - else - g_context->gva_to_gpa =3D paging32_gva_to_gpa; - - reset_guest_paging_metadata(vcpu, g_context); -} - void kvm_init_mmu(struct kvm_vcpu *vcpu) { struct kvm_mmu_role_regs regs =3D vcpu_to_role_regs(vcpu); union kvm_cpu_role cpu_role =3D kvm_calc_cpu_role(vcpu, &regs); =20 - if (mmu_is_nested(vcpu)) - init_kvm_ngva_walk(vcpu, cpu_role); - else if (tdp_enabled) - init_kvm_tdp_mmu(vcpu, cpu_role); - else - init_kvm_softmmu(vcpu, cpu_role); + init_kvm_page_walk(vcpu, &vcpu->arch.gva_walk, cpu_role); + + if (!mmu_is_nested(vcpu)) { + if (tdp_enabled) + init_kvm_tdp_mmu(vcpu, cpu_role); + else + init_kvm_softmmu(vcpu, cpu_role); + } } EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_init_mmu); =20 @@ -6102,7 +6092,7 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu) vcpu->arch.guest_mmu.root_role.invalid =3D 1; vcpu->arch.root_mmu.w.cpu_role.ext.valid =3D 0; vcpu->arch.guest_mmu.w.cpu_role.ext.valid =3D 0; - vcpu->arch.ngva_walk.cpu_role.ext.valid =3D 0; + vcpu->arch.gva_walk.cpu_role.ext.valid =3D 0; kvm_mmu_reset_context(vcpu); =20 KVM_BUG_ON(!kvm_can_set_cpuid_and_feature_msrs(vcpu), vcpu->kvm); @@ -6598,17 +6588,22 @@ void kvm_mmu_invalidate_addr(struct kvm_vcpu *vcpu,= struct kvm_pagewalk *w, WARN_ON_ONCE(roots & ~KVM_MMU_ROOTS_ALL); =20 /* It's actually a GPA for vcpu->arch.guest_mmu.
*/ - if (w !=3D &vcpu->arch.guest_mmu.w) { + if (w =3D=3D &vcpu->arch.gva_walk) { /* INVLPG on a non-canonical address is a NOP according to the SDM. */ if (is_noncanonical_invlpg_address(addr, vcpu)) return; =20 kvm_x86_call(flush_tlb_gva)(vcpu, addr); - if (w =3D=3D &vcpu->arch.ngva_walk) + + if (tdp_enabled) return; + + mmu =3D &vcpu->arch.root_mmu; + } else { + mmu =3D &vcpu->arch.guest_mmu; } =20 - mmu =3D container_of(w, struct kvm_mmu, w); + /* Invalidate shadow pages, whether GPA->GVA or nGPA->GPA. */ if (!mmu->sync_spte) return; =20 @@ -6634,7 +6629,7 @@ void kvm_mmu_invlpg(struct kvm_vcpu *vcpu, gva_t gva) * be synced when switching to that new cr3, so nothing needs to be * done here for them. */ - kvm_mmu_invalidate_addr(vcpu, vcpu->arch.gva_walk, gva, KVM_MMU_ROOTS_ALL= ); + kvm_mmu_invalidate_addr(vcpu, &vcpu->arch.gva_walk, gva, KVM_MMU_ROOTS_AL= L); ++vcpu->stat.invlpg; } EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_invlpg); @@ -6656,7 +6651,7 @@ void kvm_mmu_invpcid_gva(struct kvm_vcpu *vcpu, gva_t= gva, unsigned long pcid) } =20 if (roots) - kvm_mmu_invalidate_addr(vcpu, &mmu->w, gva, roots); + kvm_mmu_invalidate_addr(vcpu, &vcpu->arch.gva_walk, gva, roots); ++vcpu->stat.invlpg; =20 /* @@ -6771,7 +6766,6 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) vcpu->arch.mmu_shadow_page_cache.gfp_zero =3D __GFP_ZERO; =20 vcpu->arch.mmu =3D &vcpu->arch.root_mmu; - vcpu->arch.gva_walk =3D &vcpu->arch.root_mmu.w; =20 ret =3D __kvm_mmu_create(vcpu, &vcpu->arch.guest_mmu); if (ret) diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 6b21778e8340..b12c6b5e4a2f 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -541,7 +541,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker= *walker, } #endif walker->fault.address =3D addr; - walker->fault.nested_page_fault =3D w !=3D vcpu->arch.gva_walk; + walker->fault.nested_page_fault =3D w !=3D &vcpu->arch.gva_walk; walker->fault.async_page_fault =3D false; =20 
trace_kvm_mmu_walker_error(walker->fault.error_code); @@ -894,7 +894,7 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, s= truct kvm_pagewalk *w, =20 #ifndef CONFIG_X86_64 /* A 64-bit GVA should be impossible on 32-bit KVM. */ - WARN_ON_ONCE((addr >> 32) && w =3D=3D vcpu->arch.gva_walk); + WARN_ON_ONCE((addr >> 32) && w =3D=3D &vcpu->arch.gva_walk); #endif =20 r =3D FNAME(walk_addr_generic)(&walker, vcpu, w, addr, access); diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 20469fd83e8b..7853bd9ed6cc 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -102,13 +102,11 @@ static void nested_svm_init_mmu_context(struct kvm_vc= pu *vcpu) vcpu->arch.mmu->w.get_pdptr =3D nested_svm_get_tdp_pdptr; =20 vcpu->arch.mmu->w.inject_page_fault =3D nested_svm_inject_npf_exit; - vcpu->arch.gva_walk =3D &vcpu->arch.ngva_walk; } =20 static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu) { vcpu->arch.mmu =3D &vcpu->arch.root_mmu; - vcpu->arch.gva_walk =3D &vcpu->arch.root_mmu.w; } =20 static bool nested_vmcb_needs_vls_intercept(struct vcpu_svm *svm) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 477c0e8a6e43..449efad7ea1f 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -498,14 +498,11 @@ static void nested_ept_init_mmu_context(struct kvm_vc= pu *vcpu) vcpu->arch.mmu->w.get_pdptr =3D kvm_pdptr_read; =20 vcpu->arch.mmu->w.inject_page_fault =3D nested_ept_inject_page_fault; - - vcpu->arch.gva_walk =3D &vcpu->arch.ngva_walk; } =20 static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu) { vcpu->arch.mmu =3D &vcpu->arch.root_mmu; - vcpu->arch.gva_walk =3D &vcpu->arch.root_mmu.w; } =20 static bool nested_vmx_is_page_fault_vmexit(struct vmcs12 *vmcs12, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 147cef7b23b6..14af0f4d010e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -995,7 +995,7 @@ void kvm_inject_emulated_page_fault(struct kvm_vcpu *vc= pu, 
WARN_ON_ONCE(fault->vector !=3D PF_VECTOR); =20 fault_walk =3D fault->nested_page_fault ? &vcpu->arch.mmu->w : - vcpu->arch.gva_walk; + &vcpu->arch.gva_walk; =20 /* * Invalidate the TLB entry for the faulting address, if it exists, @@ -1061,7 +1061,7 @@ static inline u64 pdptr_rsvd_bits(struct kvm_vcpu *vc= pu) */ int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3) { - struct kvm_pagewalk *w =3D vcpu->arch.gva_walk; + struct kvm_pagewalk *w =3D &vcpu->arch.gva_walk; gfn_t pdpt_gfn =3D cr3 >> PAGE_SHIFT; gpa_t real_gpa; int i; @@ -7853,7 +7853,7 @@ void kvm_get_segment(struct kvm_vcpu *vcpu, gpa_t kvm_mmu_gva_to_gpa_read(struct kvm_vcpu *vcpu, gva_t gva, struct x86_exception *exception) { - struct kvm_pagewalk *gva_walk =3D vcpu->arch.gva_walk; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.gva_walk; =20 u64 access =3D (kvm_x86_call(get_cpl)(vcpu) =3D=3D 3) ? PFERR_USER_MASK := 0; return gva_walk->gva_to_gpa(vcpu, gva_walk, gva, access, exception); @@ -7863,7 +7863,7 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_gva_to_gpa_rea= d); gpa_t kvm_mmu_gva_to_gpa_write(struct kvm_vcpu *vcpu, gva_t gva, struct x86_exception *exception) { - struct kvm_pagewalk *gva_walk =3D vcpu->arch.gva_walk; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.gva_walk; =20 u64 access =3D (kvm_x86_call(get_cpl)(vcpu) =3D=3D 3) ? 
PFERR_USER_MASK := 0; access |=3D PFERR_WRITE_MASK; @@ -7875,7 +7875,7 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_mmu_gva_to_gpa_wri= te); gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva, struct x86_exception *exception) { - struct kvm_pagewalk *gva_walk =3D vcpu->arch.gva_walk; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.gva_walk; =20 return gva_walk->gva_to_gpa(vcpu, gva_walk, gva, 0, exception); } @@ -7884,7 +7884,7 @@ static int kvm_read_guest_virt_helper(gva_t addr, voi= d *val, unsigned int bytes, struct kvm_vcpu *vcpu, u64 access, struct x86_exception *exception) { - struct kvm_pagewalk *gva_walk =3D vcpu->arch.gva_walk; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.gva_walk; void *data =3D val; int r =3D X86EMUL_CONTINUE; =20 @@ -7917,7 +7917,7 @@ static int kvm_fetch_guest_virt(struct x86_emulate_ct= xt *ctxt, struct x86_exception *exception) { struct kvm_vcpu *vcpu =3D emul_to_vcpu(ctxt); - struct kvm_pagewalk *gva_walk =3D vcpu->arch.gva_walk; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.gva_walk; u64 access =3D (kvm_x86_call(get_cpl)(vcpu) =3D=3D 3) ? PFERR_USER_MASK := 0; unsigned offset; int ret; @@ -7976,7 +7976,7 @@ static int kvm_write_guest_virt_helper(gva_t addr, vo= id *val, unsigned int bytes struct kvm_vcpu *vcpu, u64 access, struct x86_exception *exception) { - struct kvm_pagewalk *gva_walk =3D vcpu->arch.gva_walk; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.gva_walk; void *data =3D val; int r =3D X86EMUL_CONTINUE; =20 @@ -8082,7 +8082,7 @@ static int vcpu_mmio_gva_to_gpa(struct kvm_vcpu *vcpu= , unsigned long gva, gpa_t *gpa, struct x86_exception *exception, bool write) { - struct kvm_pagewalk *gva_walk =3D vcpu->arch.gva_walk; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.gva_walk; u64 access =3D ((kvm_x86_call(get_cpl)(vcpu) =3D=3D 3) ? PFERR_USER_MASK = : 0) | (write ? 
PFERR_WRITE_MASK : 0); =20 @@ -14213,7 +14213,7 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_spec_ctrl_test_v= alue); =20 void kvm_fixup_and_inject_pf_error(struct kvm_vcpu *vcpu, gva_t gva, u16 e= rror_code) { - struct kvm_pagewalk *gva_walk =3D vcpu->arch.gva_walk; + struct kvm_pagewalk *gva_walk =3D &vcpu->arch.gva_walk; struct x86_exception fault; u64 access =3D error_code & (PFERR_WRITE_MASK | PFERR_FETCH_MASK | PFERR_USER_MASK); --=20 2.52.0
From nobody Mon Jun 8 07:23:59 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 19/24] KVM: x86/mmu: pull struct kvm_pagewalk out of
struct kvm_mmu Date: Wed, 3 Jun 2026 06:58:09 -0400 Message-ID: <20260603105814.10236-20-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 Content-Type: text/plain; charset="utf-8" Now that root_mmu.w always has the same content as gva_walk, replace it with just a pointer to gva_walk. For guest_mmu, introduce a second struct kvm_pagewalk and point to it. It is now clear that non-MMU code does cares about page walks, but it funnels (almost) all interactions with the TLB to mmu.c. It is left as an exercise to the reader to split kvm_pagewalk to its own file... Signed-off-by: Paolo Bonzini --- arch/x86/include/asm/kvm_host.h | 7 ++- arch/x86/kvm/mmu.h | 4 +- arch/x86/kvm/mmu/mmu.c | 97 +++++++++++++-------------------- arch/x86/kvm/mmu/paging_tmpl.h | 14 ++--- arch/x86/kvm/svm/nested.c | 9 ++- arch/x86/kvm/vmx/nested.c | 11 ++-- arch/x86/kvm/x86.c | 2 +- 7 files changed, 63 insertions(+), 81 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 383bef0cf0f0..ce39230eaebb 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -507,11 +507,11 @@ struct kvm_pagewalk { }; =20 struct kvm_mmu { - struct kvm_pagewalk w; - int (*page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault); int (*sync_spte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, int i); + struct kvm_pagewalk *w; + struct kvm_mmu_root_info root; hpa_t mirror_root_hpa; union kvm_mmu_page_role root_role; @@ -866,8 +866,9 @@ struct kvm_vcpu_arch { /* Non-nested MMU for L1 */ struct kvm_mmu root_mmu; =20 - /* L1 MMU when running nested */ + /* L1 TDP when running nested */ struct kvm_mmu guest_mmu; + struct kvm_pagewalk ngpa_walk; 
 
 	/*
 	 * Pagewalk context used for gva_to_gpa translations.
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index b8dc88eb56a5..58eb98585a29 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -180,7 +180,7 @@ static inline void kvm_mmu_refresh_passthrough_bits(struct kvm_vcpu *vcpu,
 	 * still refresh gva_walk, so as to honor L2's CR0.WP when translating
 	 * L2 GVAs to GPAs.
 	 */
-	if (!tdp_enabled || w == &vcpu->arch.guest_mmu.w)
+	if (!tdp_enabled || w == &vcpu->arch.ngpa_walk)
 		return;
 
 	__kvm_mmu_refresh_passthrough_bits(vcpu, w);
@@ -306,7 +306,7 @@ static inline gpa_t kvm_translate_gpa(struct kvm_vcpu *vcpu,
 				      struct x86_exception *exception,
 				      u64 pte_access)
 {
-	if (!mmu_is_nested(vcpu) || w == &vcpu->arch.guest_mmu.w)
+	if (!mmu_is_nested(vcpu) || w == &vcpu->arch.ngpa_walk)
 		return gpa;
 	return kvm_x86_ops.nested_ops->translate_nested_gpa(vcpu, gpa, access,
 							    exception,
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 2fe4d5359006..bd307e9b3fd6 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2473,12 +2473,14 @@ static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterato
 					struct kvm_vcpu *vcpu, hpa_t root,
 					u64 addr)
 {
+	struct kvm_pagewalk *w = vcpu->arch.mmu->w;
+
 	iterator->addr = addr;
 	iterator->shadow_addr = root;
 	iterator->level = vcpu->arch.mmu->root_role.level;
 
 	if (iterator->level >= PT64_ROOT_4LEVEL &&
-	    vcpu->arch.mmu->w.cpu_role.base.level < PT64_ROOT_4LEVEL &&
+	    w->cpu_role.base.level < PT64_ROOT_4LEVEL &&
 	    !vcpu->arch.mmu->root_role.direct)
 		iterator->level = PT32E_ROOT_LEVEL;
 
@@ -4066,12 +4068,13 @@ static int mmu_first_shadow_root_alloc(struct kvm *kvm)
 static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
+	struct kvm_pagewalk *w = mmu->w;
 	u64 pdptrs[4], pm_mask;
 	gfn_t root_gfn, root_pgd;
 	int quadrant, i, r;
 	hpa_t root;
 
-	root_pgd = kvm_mmu_get_guest_pgd(vcpu, &mmu->w);
+	root_pgd = kvm_mmu_get_guest_pgd(vcpu, mmu->w);
 	root_gfn = (root_pgd & __PT_BASE_ADDR_MASK) >> PAGE_SHIFT;
 
 	if (!kvm_vcpu_is_visible_gfn(vcpu, root_gfn)) {
@@ -4083,9 +4086,9 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	 * On SVM, reading PDPTRs might access guest memory, which might fault
 	 * and thus might sleep. Grab the PDPTRs before acquiring mmu_lock.
 	 */
-	if (mmu->w.cpu_role.base.level == PT32E_ROOT_LEVEL) {
+	if (w->cpu_role.base.level == PT32E_ROOT_LEVEL) {
 		for (i = 0; i < 4; ++i) {
-			pdptrs[i] = mmu->w.get_pdptr(vcpu, i);
+			pdptrs[i] = w->get_pdptr(vcpu, i);
 			if (!(pdptrs[i] & PT_PRESENT_MASK))
 				continue;
 
@@ -4107,7 +4110,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	 * Do we shadow a long mode page table? If so we need to
 	 * write-protect the guests page table root.
 	 */
-	if (mmu->w.cpu_role.base.level >= PT64_ROOT_4LEVEL) {
+	if (w->cpu_role.base.level >= PT64_ROOT_4LEVEL) {
 		root = mmu_alloc_root(vcpu, root_gfn, 0,
 				      mmu->root_role.level);
 		mmu->root.hpa = root;
@@ -4146,7 +4149,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 	for (i = 0; i < 4; ++i) {
 		WARN_ON_ONCE(IS_VALID_PAE_ROOT(mmu->pae_root[i]));
 
-		if (mmu->w.cpu_role.base.level == PT32E_ROOT_LEVEL) {
+		if (w->cpu_role.base.level == PT32E_ROOT_LEVEL) {
 			if (!(pdptrs[i] & PT_PRESENT_MASK)) {
 				mmu->pae_root[i] = INVALID_PAE_ROOT;
 				continue;
@@ -4160,7 +4163,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 		 * directory. Othwerise each PAE page direct shadows one guest
 		 * PAE page directory so that quadrant should be 0.
 		 */
-		quadrant = (mmu->w.cpu_role.base.level == PT32_ROOT_LEVEL) ? i : 0;
+		quadrant = (w->cpu_role.base.level == PT32_ROOT_LEVEL) ? i : 0;
 
 		root = mmu_alloc_root(vcpu, root_gfn, quadrant, PT32_ROOT_LEVEL);
 		mmu->pae_root[i] = root | pm_mask;
@@ -4184,6 +4187,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
+	struct kvm_pagewalk *w = mmu->w;
 	bool need_pml5 = mmu->root_role.level > PT64_ROOT_4LEVEL;
 	u64 *pml5_root = NULL;
 	u64 *pml4_root = NULL;
@@ -4196,7 +4200,7 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
 	 * on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare.
 	 */
 	if (mmu->root_role.direct ||
-	    mmu->w.cpu_role.base.level >= PT64_ROOT_4LEVEL ||
+	    w->cpu_role.base.level >= PT64_ROOT_4LEVEL ||
 	    mmu->root_role.level < PT64_ROOT_4LEVEL)
 		return 0;
 
@@ -4301,7 +4305,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 
 	vcpu_clear_mmio_info(vcpu, MMIO_GVA_ANY);
 
-	if (vcpu->arch.mmu->w.cpu_role.base.level >= PT64_ROOT_4LEVEL) {
+	if (vcpu->arch.mmu->w->cpu_role.base.level >= PT64_ROOT_4LEVEL) {
 		hpa_t root = vcpu->arch.mmu->root.hpa;
 
 		if (!is_unsync_root(root))
@@ -4543,7 +4547,7 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu,
 	if (arch.direct_map)
 		arch.cr3 = (unsigned long)INVALID_GPA;
 	else
-		arch.cr3 = kvm_mmu_get_guest_pgd(vcpu, &vcpu->arch.mmu->w);
+		arch.cr3 = kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu->w);
 
 	return kvm_setup_async_pf(vcpu, fault->addr,
 				  kvm_vcpu_gfn_to_hva(vcpu, fault->gfn), &arch);
@@ -4565,7 +4569,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
 		return;
 
 	if (!vcpu->arch.mmu->root_role.direct &&
-	      work->arch.cr3 != kvm_mmu_get_guest_pgd(vcpu, &vcpu->arch.mmu->w))
+	      work->arch.cr3 != kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu->w))
 		return;
 
 	r = kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, work->arch.error_code,
@@ -5119,7 +5123,6 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_tdp_mmu_map_private_pfn);
 static void nonpaging_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = nonpaging_page_fault;
-	context->w.gva_to_gpa = nonpaging_gva_to_gpa;
 	context->sync_spte = NULL;
 }
 
@@ -5434,9 +5437,9 @@ static void __reset_rsvds_bits_mask_ept(struct rsvd_bits_validate *rsvd_check,
 }
 
 static void reset_rsvds_bits_mask_ept(struct kvm_vcpu *vcpu,
-		struct kvm_mmu *context, bool execonly, int huge_page_level)
+		bool execonly, int huge_page_level)
 {
-	__reset_rsvds_bits_mask_ept(&context->w.guest_rsvd_check,
+	__reset_rsvds_bits_mask_ept(&vcpu->arch.ngpa_walk.guest_rsvd_check,
 				    vcpu->arch.reserved_gpa_bits, execonly,
 				    huge_page_level);
 }
@@ -5743,21 +5746,19 @@ static void reset_guest_paging_metadata(struct kvm_vcpu *vcpu,
 		return;
 
 	reset_guest_rsvds_bits_mask(vcpu, w);
-	update_permission_bitmask(w, w == &vcpu->arch.guest_mmu.w, false);
+	update_permission_bitmask(w, w == &vcpu->arch.ngpa_walk, false);
 	update_pkru_bitmask(w);
 }
 
 static void paging64_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = paging64_page_fault;
-	context->w.gva_to_gpa = paging64_gva_to_gpa;
 	context->sync_spte = paging64_sync_spte;
 }
 
 static void paging32_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = paging32_page_fault;
-	context->w.gva_to_gpa = paging32_gva_to_gpa;
 	context->sync_spte = paging32_sync_spte;
 }
 
@@ -5872,49 +5873,31 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
 	union kvm_mmu_page_role root_role = kvm_calc_tdp_mmu_root_page_role(vcpu, cpu_role);
 
-	if (cpu_role.as_u64 == context->w.cpu_role.as_u64 &&
-	    root_role.word == context->root_role.word)
+	if (root_role.word == context->root_role.word)
 		return;
 
-	context->w.cpu_role.as_u64 = cpu_role.as_u64;
 	context->root_role.word = root_role.word;
 	context->page_fault = kvm_tdp_page_fault;
 	context->sync_spte = NULL;
 
-	context->w.inject_page_fault = kvm_inject_page_fault;
-	context->w.get_pdptr = kvm_pdptr_read;
-	context->w.get_guest_pgd = get_guest_cr3;
-
-	if (!is_cr0_pg(&context->w))
-		context->w.gva_to_gpa = nonpaging_gva_to_gpa;
-	else if (is_cr4_pae(&context->w))
-		context->w.gva_to_gpa = paging64_gva_to_gpa;
-	else
-		context->w.gva_to_gpa = paging32_gva_to_gpa;
-
-	reset_guest_paging_metadata(vcpu, &context->w);
 	reset_tdp_shadow_zero_bits_mask(context);
 }
 
 static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *context,
-				    union kvm_cpu_role cpu_role,
 				    union kvm_mmu_page_role root_role)
 {
-	if (cpu_role.as_u64 == context->w.cpu_role.as_u64 &&
-	    root_role.word == context->root_role.word)
+	if (root_role.word == context->root_role.word)
 		return;
 
-	context->w.cpu_role.as_u64 = cpu_role.as_u64;
 	context->root_role.word = root_role.word;
 
-	if (!is_cr0_pg(&context->w))
+	if (!is_cr0_pg(context->w))
 		nonpaging_init_context(context);
-	else if (is_cr4_pae(&context->w))
+	else if (is_cr4_pae(context->w))
 		paging64_init_context(context);
 	else
 		paging32_init_context(context);
 
-	reset_guest_paging_metadata(vcpu, &context->w);
 	reset_shadow_zero_bits_mask(vcpu, context);
 }
 
@@ -5940,7 +5923,7 @@ static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
 	 */
 	root_role.efer_nx = true;
 
-	shadow_mmu_init_context(vcpu, context, cpu_role, root_role);
+	shadow_mmu_init_context(vcpu, context, root_role);
 }
 
 static void init_kvm_page_walk(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
@@ -5980,13 +5963,15 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr4,
 	WARN_ON_ONCE(cpu_role.base.direct || !cpu_role.base.guest_mode);
 	cpu_role.base.cr4_smep = (misc_ctl & SVM_MISC_ENABLE_GMET) != 0;
 
+	init_kvm_page_walk(vcpu, &vcpu->arch.ngpa_walk, cpu_role);
+
 	root_role = cpu_role.base;
 	root_role.level = kvm_mmu_get_tdp_level(vcpu);
 	if (root_role.level == PT64_ROOT_5LEVEL &&
 	    cpu_role.base.level == PT64_ROOT_4LEVEL)
 		root_role.passthrough = 1;
 
-	shadow_mmu_init_context(vcpu, context, cpu_role, root_role);
+	shadow_mmu_init_context(vcpu, context, root_role);
 	kvm_mmu_new_pgd(vcpu, nested_cr3);
 }
 EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_init_shadow_npt_mmu);
@@ -6027,18 +6012,20 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 		kvm_calc_shadow_ept_root_page_role(vcpu, accessed_dirty,
 						   execonly, level, mbec);
 
-	if (new_mode.as_u64 != context->w.cpu_role.as_u64) {
+	struct kvm_pagewalk *ngpa_walk = &vcpu->arch.ngpa_walk;
+
+	if (new_mode.as_u64 != ngpa_walk->cpu_role.as_u64) {
 		/* EPT, and thus nested EPT, does not consume CR0, CR4, nor EFER. */
-		context->w.cpu_role.as_u64 = new_mode.as_u64;
+		ngpa_walk->cpu_role.as_u64 = new_mode.as_u64;
 		context->root_role.word = new_mode.base.word;
 
 		context->page_fault = ept_page_fault;
-		context->w.gva_to_gpa = ept_gva_to_gpa;
+		ngpa_walk->gva_to_gpa = ept_gva_to_gpa;
 		context->sync_spte = ept_sync_spte;
 
-		update_permission_bitmask(&context->w, true, true);
-		context->w.pkru_mask = 0;
-		reset_rsvds_bits_mask_ept(vcpu, context, execonly, huge_page_level);
+		update_permission_bitmask(ngpa_walk, true, true);
+		ngpa_walk->pkru_mask = 0;
+		reset_rsvds_bits_mask_ept(vcpu, execonly, huge_page_level);
 		reset_ept_shadow_zero_bits_mask(context, execonly);
 	}
 
@@ -6049,13 +6036,7 @@ EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_init_shadow_ept_mmu);
 static void init_kvm_softmmu(struct kvm_vcpu *vcpu,
 			     union kvm_cpu_role cpu_role)
 {
-	struct kvm_mmu *context = &vcpu->arch.root_mmu;
-
 	kvm_init_shadow_mmu(vcpu, cpu_role);
-
-	context->w.inject_page_fault = kvm_inject_page_fault;
-	context->w.get_pdptr = kvm_pdptr_read;
-	context->w.get_guest_pgd = get_guest_cr3;
 }
 
 void kvm_init_mmu(struct kvm_vcpu *vcpu)
@@ -6090,8 +6071,7 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	 */
 	vcpu->arch.root_mmu.root_role.invalid = 1;
 	vcpu->arch.guest_mmu.root_role.invalid = 1;
-	vcpu->arch.root_mmu.w.cpu_role.ext.valid = 0;
-	vcpu->arch.guest_mmu.w.cpu_role.ext.valid = 0;
+	vcpu->arch.ngpa_walk.cpu_role.ext.valid = 0;
 	vcpu->arch.gva_walk.cpu_role.ext.valid = 0;
 	kvm_mmu_reset_context(vcpu);
 
@@ -6696,11 +6676,12 @@ static void free_mmu_pages(struct kvm_mmu *mmu)
 	free_page((unsigned long)mmu->pml5_root);
 }
 
-static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
+static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, struct kvm_pagewalk *w)
 {
 	struct page *page;
 	int i;
 
+	mmu->w = w;
 	mmu->root.hpa = INVALID_PAGE;
 	mmu->root.pgd = 0;
 	mmu->mirror_root_hpa = INVALID_PAGE;
@@ -6767,11 +6748,11 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.mmu = &vcpu->arch.root_mmu;
 
-	ret = __kvm_mmu_create(vcpu, &vcpu->arch.guest_mmu);
+	ret = __kvm_mmu_create(vcpu, &vcpu->arch.guest_mmu, &vcpu->arch.ngpa_walk);
 	if (ret)
 		return ret;
 
-	ret = __kvm_mmu_create(vcpu, &vcpu->arch.root_mmu);
+	ret = __kvm_mmu_create(vcpu, &vcpu->arch.root_mmu, &vcpu->arch.gva_walk);
 	if (ret)
 		goto fail_allocate_root;
 
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index b12c6b5e4a2f..088b86d228c3 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -157,7 +157,7 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
 				  struct kvm_mmu_page *sp, u64 *spte,
 				  u64 gpte)
 {
-	struct kvm_pagewalk *w = &vcpu->arch.mmu->w;
+	struct kvm_pagewalk *w = vcpu->arch.mmu->w;
 
 	if (!FNAME(is_present_gpte)(w, gpte))
 		goto no_present;
@@ -551,7 +551,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 static int FNAME(walk_addr)(struct guest_walker *walker,
 			    struct kvm_vcpu *vcpu, gpa_t addr, u64 access)
 {
-	return FNAME(walk_addr_generic)(walker, vcpu, &vcpu->arch.mmu->w, addr,
+	return FNAME(walk_addr_generic)(walker, vcpu, vcpu->arch.mmu->w, addr,
 					access);
 }
 
@@ -567,7 +567,7 @@ FNAME(prefetch_gpte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 
 	gfn = gpte_to_gfn(gpte);
 	pte_access = sp->role.access & FNAME(gpte_access)(gpte);
-	FNAME(protect_clean_gpte)(&vcpu->arch.mmu->w, &pte_access, gpte);
+	FNAME(protect_clean_gpte)(vcpu->arch.mmu->w, &pte_access, gpte);
 
 	return kvm_mmu_prefetch_sptes(vcpu, gfn, spte, 1, pte_access);
 }
@@ -650,7 +650,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 	WARN_ON_ONCE(gw->gfn != base_gfn);
 	direct_access = gw->pte_access;
 
-	top_level = vcpu->arch.mmu->w.cpu_role.base.level;
+	top_level = vcpu->arch.mmu->w->cpu_role.base.level;
 	if (top_level == PT32E_ROOT_LEVEL)
 		top_level = PT32_ROOT_LEVEL;
 	/*
@@ -839,7 +839,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	 * otherwise KVM will cache incorrect access information in the SPTE.
 	 */
 	if (fault->write && !(walker.pte_access & ACC_WRITE_MASK) &&
-	    !is_cr0_wp(&vcpu->arch.mmu->w) && !fault->user && fault->slot) {
+	    !is_cr0_wp(vcpu->arch.mmu->w) && !fault->user && fault->slot) {
 		walker.pte_access |= ACC_WRITE_MASK;
 		walker.pte_access &= ~ACC_USER_MASK;
 
@@ -849,7 +849,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 		 * then we should prevent the kernel from executing it
 		 * if SMEP is enabled.
 		 */
-		if (is_cr4_smep(&vcpu->arch.mmu->w))
+		if (is_cr4_smep(vcpu->arch.mmu->w))
 			walker.pte_access &= ~ACC_EXEC_MASK;
 	}
 #endif
@@ -947,7 +947,7 @@ static int FNAME(sync_spte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, int
 	gfn = gpte_to_gfn(gpte);
 	pte_access = sp->role.access;
 	pte_access &= FNAME(gpte_access)(gpte);
-	FNAME(protect_clean_gpte)(&vcpu->arch.mmu->w, &pte_access, gpte);
+	FNAME(protect_clean_gpte)(vcpu->arch.mmu->w, &pte_access, gpte);
 
 	if (sync_mmio_spte(vcpu, &sp->spt[i], gfn, pte_access))
 		return 0;
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 7853bd9ed6cc..e93d2e9a9aa4 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -98,10 +98,9 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu *vcpu)
 				svm->nested.ctl.nested_cr3,
 				svm->nested.ctl.misc_ctl);
 
-	vcpu->arch.mmu->w.get_guest_pgd = nested_svm_get_tdp_cr3;
-	vcpu->arch.mmu->w.get_pdptr = nested_svm_get_tdp_pdptr;
-
-	vcpu->arch.mmu->w.inject_page_fault = nested_svm_inject_npf_exit;
+	vcpu->arch.ngpa_walk.get_guest_pgd = nested_svm_get_tdp_cr3;
+	vcpu->arch.ngpa_walk.get_pdptr = nested_svm_get_tdp_pdptr;
+	vcpu->arch.ngpa_walk.inject_page_fault = nested_svm_inject_npf_exit;
 }
 
 static void nested_svm_uninit_mmu_context(struct kvm_vcpu *vcpu)
@@ -2094,7 +2093,7 @@ static gpa_t svm_translate_nested_gpa(struct kvm_vcpu *vcpu, gpa_t gpa,
 				      u64 pte_access)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	struct kvm_pagewalk *w = &vcpu->arch.mmu->w;
+	struct kvm_pagewalk *w = &vcpu->arch.ngpa_walk;
 
 	BUG_ON(!mmu_is_nested(vcpu));
 
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 449efad7ea1f..974116fff635 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -407,7 +407,7 @@ static void nested_ept_invalidate_addr(struct kvm_vcpu *vcpu, gpa_t eptp,
 			roots |= KVM_MMU_ROOT_PREVIOUS(i);
 	}
 	if (roots)
-		kvm_mmu_invalidate_addr(vcpu, &vcpu->arch.guest_mmu.w, addr, roots);
+		kvm_mmu_invalidate_addr(vcpu, &vcpu->arch.ngpa_walk, addr, roots);
 }
 
 static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
@@ -494,10 +494,10 @@ static void nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.mmu = &vcpu->arch.guest_mmu;
 	nested_ept_new_eptp(vcpu);
-	vcpu->arch.mmu->w.get_guest_pgd = nested_ept_get_eptp;
-	vcpu->arch.mmu->w.get_pdptr = kvm_pdptr_read;
+	vcpu->arch.ngpa_walk.get_guest_pgd = nested_ept_get_eptp;
+	vcpu->arch.ngpa_walk.get_pdptr = kvm_pdptr_read;
 
-	vcpu->arch.mmu->w.inject_page_fault = nested_ept_inject_page_fault;
+	vcpu->arch.ngpa_walk.inject_page_fault = nested_ept_inject_page_fault;
 }
 
 static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu)
@@ -7442,12 +7442,13 @@ __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *))
 	return 0;
 }
 
+
 static gpa_t vmx_translate_nested_gpa(struct kvm_vcpu *vcpu, gpa_t gpa,
 				      u64 access,
 				      struct x86_exception *exception,
 				      u64 pte_access)
 {
-	struct kvm_pagewalk *w = &vcpu->arch.mmu->w;
+	struct kvm_pagewalk *w = &vcpu->arch.ngpa_walk;
 
 	BUG_ON(!mmu_is_nested(vcpu));
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 14af0f4d010e..35094997e70a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -994,7 +994,7 @@ void kvm_inject_emulated_page_fault(struct kvm_vcpu *vcpu,
 
 	WARN_ON_ONCE(fault->vector != PF_VECTOR);
 
-	fault_walk = fault->nested_page_fault ? &vcpu->arch.mmu->w :
+	fault_walk = fault->nested_page_fault ? &vcpu->arch.ngpa_walk :
 					       &vcpu->arch.gva_walk;
 
 	/*
-- 
2.52.0

From nobody Mon Jun 8 07:23:59 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 20/24] KVM: x86/mmu: cleanup functions that initialize shadow MMU
Date: Wed, 3 Jun 2026 06:58:10 -0400
Message-ID: <20260603105814.10236-21-pbonzini@redhat.com>
In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com>
References: <20260603105814.10236-1-pbonzini@redhat.com>

Now that the GVA->GPA page walker is initialized independently,
init_kvm_softmmu() does nothing more than call kvm_init_shadow_mmu(),
so eliminate it from the call chain. At the same time, rename
kvm_init_shadow_mmu() to init_kvm_shadow_mmu() for consistency with
init_kvm_tdp_mmu().

Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/mmu.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index bd307e9b3fd6..e444536768ba 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5901,7 +5901,7 @@ static void shadow_mmu_init_context(struct kvm_vcpu *vcpu, struct kvm_mmu *conte
 	reset_shadow_zero_bits_mask(vcpu, context);
 }
 
-static void kvm_init_shadow_mmu(struct kvm_vcpu *vcpu,
+static void init_kvm_shadow_mmu(struct kvm_vcpu *vcpu,
 				union kvm_cpu_role cpu_role)
 {
 	struct kvm_mmu *context = &vcpu->arch.root_mmu;
@@ -6033,12 +6033,6 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 }
 EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_init_shadow_ept_mmu);
 
-static void init_kvm_softmmu(struct kvm_vcpu *vcpu,
-			     union kvm_cpu_role cpu_role)
-{
-	kvm_init_shadow_mmu(vcpu, cpu_role);
-}
-
 void kvm_init_mmu(struct kvm_vcpu *vcpu)
 {
 	struct kvm_mmu_role_regs regs = vcpu_to_role_regs(vcpu);
@@ -6050,7 +6044,7 @@ void kvm_init_mmu(struct kvm_vcpu *vcpu)
 		if (tdp_enabled)
 			init_kvm_tdp_mmu(vcpu, cpu_role);
 		else
-			init_kvm_softmmu(vcpu, cpu_role);
+			init_kvm_shadow_mmu(vcpu, cpu_role);
 	}
 }
 EXPORT_SYMBOL_FOR_KVM_INTERNAL(kvm_init_mmu);
-- 
2.52.0

From nobody Mon Jun 8 07:23:59 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH 21/24] KVM: x86/mmu: pull page format to a new struct
Date: Wed, 3 Jun 2026 06:58:11 -0400
Message-ID: <20260603105814.10236-22-pbonzini@redhat.com>
In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com>
References: <20260603105814.10236-1-pbonzini@redhat.com>

KVM performs reserved-bit checks on both guest and host page tables,
though the latter are only for consistency. Create a new struct
for this common code as well as for all data that is extracted
from the CPU role.

Signed-off-by: Paolo Bonzini
---
 arch/x86/include/asm/kvm_host.h | 23 ++++++++++++++---------
 arch/x86/kvm/mmu.h              |  7 ++++---
 arch/x86/kvm/mmu/mmu.c          | 16 ++++++++--------
 arch/x86/kvm/mmu/paging_tmpl.h  | 10 +++++-----
 4 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ce39230eaebb..08fb47f2b7fc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -479,15 +479,7 @@ struct kvm_page_fault;
  * and 2-level 32-bit). The kvm_pagewalk structure abstracts the details of the
  * current mmu mode.
  */
-struct kvm_pagewalk {
-	unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu);
-	u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index);
-	void (*inject_page_fault)(struct kvm_vcpu *vcpu,
-				  struct x86_exception *fault);
-	gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
-			    gpa_t gva_or_gpa, u64 access,
-			    struct x86_exception *exception);
-	union kvm_cpu_role cpu_role;
+struct kvm_page_format {
 	struct rsvd_bits_validate guest_rsvd_check;
 
 	/*
@@ -506,6 +498,19 @@ struct kvm_pagewalk {
 	u16 permissions[16];
 };
 
+struct kvm_pagewalk {
+	unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu);
+	u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index);
+	void (*inject_page_fault)(struct kvm_vcpu *vcpu,
+				  struct x86_exception *fault);
+	gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
+			    gpa_t gva_or_gpa, u64 access,
+			    struct x86_exception *exception);
+
+	union kvm_cpu_role cpu_role;
+	struct kvm_page_format fmt;
+};
+
 struct kvm_mmu {
 	int (*page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
 	int (*sync_spte)(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 58eb98585a29..f604726d5b29 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -217,15 +217,16 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
 	u64 implicit_access = access &
PFERR_IMPLICIT_ACCESS;
 	bool not_smap = ((rflags & X86_EFLAGS_AC) | implicit_access) == X86_EFLAGS_AC;
 	int index = (pfec | (not_smap ? PFERR_RSVD_MASK : 0)) >> 1;
+	struct kvm_page_format *fmt = &w->fmt;
 	u32 errcode = PFERR_PRESENT_MASK;
 	bool fault;
 
 	kvm_mmu_refresh_passthrough_bits(vcpu, w);
 
-	fault = (w->permissions[index] >> pte_access) & 1;
+	fault = (fmt->permissions[index] >> pte_access) & 1;
 
 	WARN_ON_ONCE(pfec & (PFERR_PK_MASK | PFERR_SS_MASK | PFERR_RSVD_MASK));
-	if (unlikely(w->pkru_mask)) {
+	if (unlikely(fmt->pkru_mask)) {
 		u32 pkru_bits, offset;
 
 		/*
@@ -239,7 +240,7 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_pagewalk *w,
 		/* clear present bit, replace PFEC.RSVD with ACC_USER_MASK. */
 		offset = (pfec & ~1) | ((pte_access & PT_USER_MASK) ? PFERR_RSVD_MASK : 0);
 
-		pkru_bits &= w->pkru_mask >> offset;
+		pkru_bits &= fmt->pkru_mask >> offset;
 		errcode |= -pkru_bits & PFERR_PK_MASK;
 		fault |= (pkru_bits != 0);
 	}
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e444536768ba..420bd70fb54a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5390,7 +5390,7 @@ static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
 static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 					struct kvm_pagewalk *w)
 {
-	__reset_rsvds_bits_mask(&w->guest_rsvd_check,
+	__reset_rsvds_bits_mask(&w->fmt.guest_rsvd_check,
 				vcpu->arch.reserved_gpa_bits,
 				w->cpu_role.base.level, is_efer_nx(w),
 				guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES),
@@ -5439,7 +5439,7 @@ static void __reset_rsvds_bits_mask_ept(struct rsvd_bits_validate *rsvd_check,
 static void reset_rsvds_bits_mask_ept(struct kvm_vcpu *vcpu,
 		bool execonly, int huge_page_level)
 {
-	__reset_rsvds_bits_mask_ept(&vcpu->arch.ngpa_walk.guest_rsvd_check,
+	__reset_rsvds_bits_mask_ept(&vcpu->arch.ngpa_walk.fmt.guest_rsvd_check,
 				    vcpu->arch.reserved_gpa_bits, execonly,
 				    huge_page_level);
 }
@@ -5593,7 +5593,7 @@
static void update_permission_bitmask(struct kvm_pagewalk *pw, bool tdp, bool ep
 	 * permission_fault() to indicate accesses that are *not* subject to
 	 * SMAP restrictions.
 	 */
-	for (index = 0; index < ARRAY_SIZE(pw->permissions); ++index) {
+	for (index = 0; index < ARRAY_SIZE(pw->fmt.permissions); ++index) {
 		unsigned pfec = index << 1;
 
 		/*
@@ -5667,7 +5667,7 @@ static void update_permission_bitmask(struct kvm_pagewalk *pw, bool tdp, bool ep
 				smapf = (pfec & (PFERR_RSVD_MASK|PFERR_FETCH_MASK)) ? 0 : kf;
 		}
 
-		pw->permissions[index] = ff | uf | wf | rf | smapf;
+		pw->fmt.permissions[index] = ff | uf | wf | rf | smapf;
 	}
 }
 
@@ -5700,14 +5700,14 @@ static void update_pkru_bitmask(struct kvm_pagewalk *w)
 	unsigned bit;
 	bool wp;
 
-	w->pkru_mask = 0;
+	w->fmt.pkru_mask = 0;
 
 	if (!is_cr4_pke(w))
 		return;
 
 	wp = is_cr0_wp(w);
 
-	for (bit = 0; bit < ARRAY_SIZE(w->permissions); ++bit) {
+	for (bit = 0; bit < ARRAY_SIZE(w->fmt.permissions); ++bit) {
 		unsigned pfec, pkey_bits;
 		bool check_pkey, check_write, ff, uf, wf, pte_user;
 
@@ -5735,7 +5735,7 @@ static void update_pkru_bitmask(struct kvm_pagewalk *w)
 		/* PKRU.WD stops write access.
*/
 		pkey_bits |= (!!check_write) << 1;
 
-		w->pkru_mask |= (pkey_bits & 3) << pfec;
+		w->fmt.pkru_mask |= (pkey_bits & 3) << pfec;
 	}
 }
 
@@ -6024,7 +6024,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 	context->sync_spte = ept_sync_spte;
 
 	update_permission_bitmask(ngpa_walk, true, true);
-	ngpa_walk->pkru_mask = 0;
+	ngpa_walk->fmt.pkru_mask = 0;
 	reset_rsvds_bits_mask_ept(vcpu, execonly, huge_page_level);
 	reset_ept_shadow_zero_bits_mask(context, execonly);
 }
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index 088b86d228c3..fe12e9d17b0e 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -147,10 +147,10 @@ static bool FNAME(is_bad_mt_xwr)(struct rsvd_bits_validate *rsvd_check, u64 gpte
 #endif
 }
 
-static bool FNAME(is_rsvd_bits_set)(struct kvm_pagewalk *w, u64 gpte, int level)
+static bool FNAME(is_rsvd_bits_set)(struct kvm_page_format *fmt, u64 gpte, int level)
 {
-	return __is_rsvd_bits_set(&w->guest_rsvd_check, gpte, level) ||
-	       FNAME(is_bad_mt_xwr)(&w->guest_rsvd_check, gpte);
+	return __is_rsvd_bits_set(&fmt->guest_rsvd_check, gpte, level) ||
+	       FNAME(is_bad_mt_xwr)(&fmt->guest_rsvd_check, gpte);
 }
 
 static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
@@ -167,7 +167,7 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
 	    !(gpte & PT_GUEST_ACCESSED_MASK))
 		goto no_present;
 
-	if (FNAME(is_rsvd_bits_set)(w, gpte, PG_LEVEL_4K))
+	if (FNAME(is_rsvd_bits_set)(&w->fmt, gpte, PG_LEVEL_4K))
 		goto no_present;
 
 	return false;
@@ -431,7 +431,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
 		if (unlikely(!FNAME(is_present_gpte)(w, pte)))
 			goto error;
 
-		if (unlikely(FNAME(is_rsvd_bits_set)(w, pte, walker->level))) {
+		if (unlikely(FNAME(is_rsvd_bits_set)(&w->fmt, pte, walker->level))) {
 			errcode = PFERR_RSVD_MASK | PFERR_PRESENT_MASK;
 			goto error;
 		}
-- 
2.52.0

From nobody Mon Jun 8 07:23:59 2026
Received:
from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 85FDA47884E for ; Wed, 3 Jun 2026 10:58:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484317; cv=none; b=bWcOc6alpNjn0aL8udkSqqNok09SLk0bYVKMqfV7E2uRfhjnB330m8d/1T0gvPHVeG7e442DZ3MPdA8W8Zpi+iGmTtmacw/9aNZLaxUE9Q4y0OHCQ+tHAvLA7sazxffBjC83qFrquAcJhQoZlKN38m0MnGIFDJebrDdInowD5W0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484317; c=relaxed/simple; bh=A9sREkDluf2c1wkFWS2AA0Sn2AJNtLHjZZI6WxZfROA=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=FLpq+XtmHM4VRnJb+GhnsaXiIG7Kja03c5q/lcDGM5oIFnnzwCw//tkibRs2HaQSukgBUU44YSvGnlMSn0zBWBouNxTaZtsN7JhdEMCPyiulxCl13hk0M8+sZd9HkxNNlo2s1pbMCnCWNoPHNTLi5CaA5DVU7ejjL/BMcZxY9iE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=TiKg07zM; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="TiKg07zM" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484310; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=0VBDBOAA68Z2RLXsNaERQBzOZuXSc467HItqKkZUXZo=; b=TiKg07zMyIzqLRryfBIUi6SVdQGNZnBoh08x1PvvkPlDyiG+CBmZ5DDMV9YFdYnry10TdW RTUEgcOgXAgPIuv0Rmy89RhTvdjDreXY7CkWyDKCorpM8V4BAlcTUgSYiZFS6by5mA9Bs2 nd46IJYN6vf46SwTQfdEfYH8IN7ZmVY= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-627-2MeMkBXOMzO7ewrmPewCPw-1; Wed, 03 Jun 2026 06:58:29 -0400 X-MC-Unique: 2MeMkBXOMzO7ewrmPewCPw-1 X-Mimecast-MFC-AGG-ID: 2MeMkBXOMzO7ewrmPewCPw_1780484308 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7BB3A18004D4; Wed, 3 Jun 2026 10:58:28 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 20098195E483; Wed, 3 Jun 2026 10:58:28 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 22/24] KVM: x86/mmu: merge struct rsvd_bits_validate into struct kvm_page_format Date: Wed, 3 Jun 2026 06:58:12 -0400 Message-ID: <20260603105814.10236-23-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12 Content-Type: text/plain; 
charset="utf-8"

Remove one level of indirection, and prepare for using the permission bitmask
machinery for shadow pages as well.

Signed-off-by: Paolo Bonzini
---
 arch/x86/include/asm/kvm_host.h |  38 +++++------
 arch/x86/kvm/mmu/mmu.c          | 116 ++++++++++++++++----------------
 arch/x86/kvm/mmu/paging_tmpl.h  |   8 +--
 arch/x86/kvm/mmu/spte.c         |   4 +-
 arch/x86/kvm/mmu/spte.h         |  18 ++---
 arch/x86/kvm/vmx/vmx.c          |   2 +-
 6 files changed, 91 insertions(+), 95 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 08fb47f2b7fc..7c6ac551a2d9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -450,9 +450,24 @@ struct kvm_pio_request {
 
 #define PT64_ROOT_MAX_LEVEL 5
 
-struct rsvd_bits_validate {
+struct kvm_page_format {
 	u64 rsvd_bits_mask[2][PT64_ROOT_MAX_LEVEL];
 	u64 bad_mt_xwr;
+
+	/*
+	 * The pkru_mask indicates if protection key checks are needed. It
+	 * consists of 16 domains indexed by page fault error code bits [4:1],
+	 * with PFEC.RSVD replaced by ACC_USER_MASK from the page tables.
+	 * Each domain has 2 bits which are ANDed with AD and WD from PKRU.
+	 */
+	u32 pkru_mask;
+
+	/*
+	 * Bitmap; bit set = permission fault
+	 * Array index: page fault error code [4:1]
+	 * Bit index: pte permissions in ACC_* format
+	 */
+	u16 permissions[16];
 };
 
 struct kvm_mmu_root_info {
@@ -479,25 +494,6 @@ struct kvm_page_fault;
  * and 2-level 32-bit). The kvm_pagewalk structure abstracts the details of the
  * current mmu mode.
  */
-struct kvm_page_format {
-	struct rsvd_bits_validate guest_rsvd_check;
-
-	/*
-	 * The pkru_mask indicates if protection key checks are needed. It
-	 * consists of 16 domains indexed by page fault error code bits [4:1],
-	 * with PFEC.RSVD replaced by ACC_USER_MASK from the page tables.
-	 * Each domain has 2 bits which are ANDed with AD and WD from PKRU.
-	 */
-	u32 pkru_mask;
-
-	/*
-	 * Bitmap; bit set = permission fault
-	 * Array index: page fault error code [4:1]
-	 * Bit index: pte permissions in ACC_* format
-	 */
-	u16 permissions[16];
-};
-
 struct kvm_pagewalk {
 	unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu);
 	u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index);
@@ -532,7 +528,7 @@ struct kvm_mmu {
 	 * bits include not only hardware reserved bits but also
 	 * the bits spte never used.
 	 */
-	struct rsvd_bits_validate shadow_zero_check;
+	struct kvm_page_format fmt;
 };
 
 enum pmc_type {
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 420bd70fb54a..29755afe5b46 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4422,7 +4422,7 @@ static int get_sptes_lockless(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
 static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 {
 	u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
-	struct rsvd_bits_validate *rsvd_check;
+	struct kvm_page_format *rsvd_check;
 	int root, leaf, level;
 	bool reserved = false;
 
@@ -4443,7 +4443,7 @@ static bool get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep)
 	if (!is_shadow_present_pte(sptes[leaf]))
 		leaf++;
 
-	rsvd_check = &vcpu->arch.mmu->shadow_zero_check;
+	rsvd_check = &vcpu->arch.mmu->fmt;
 
 	for (level = root; level >= leaf; level--)
 		reserved |= is_rsvd_spte(rsvd_check, sptes[level], level);
@@ -5298,7 +5298,7 @@ static bool sync_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
 #include "paging_tmpl.h"
 #undef PTTYPE
 
-static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
+static void __reset_rsvds_bits_mask(struct kvm_page_format *fmt,
 				    u64 pa_bits_rsvd, int level, bool nx,
 				    bool gbpages, bool pse, bool amd)
 {
@@ -5306,7 +5306,7 @@ static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
 	u64 nonleaf_bit8_rsvd = 0;
 	u64 high_bits_rsvd;
 
-	rsvd_check->bad_mt_xwr = 0;
+	fmt->bad_mt_xwr = 0;
 
 	if (!gbpages)
 		gbpages_bit_rsvd
= rsvd_bits(7, 7);
@@ -5330,59 +5330,59 @@ static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
 	switch (level) {
 	case PT32_ROOT_LEVEL:
 		/* no rsvd bits for 2 level 4K page table entries */
-		rsvd_check->rsvd_bits_mask[0][1] = 0;
-		rsvd_check->rsvd_bits_mask[0][0] = 0;
-		rsvd_check->rsvd_bits_mask[1][0] =
-			rsvd_check->rsvd_bits_mask[0][0];
+		fmt->rsvd_bits_mask[0][1] = 0;
+		fmt->rsvd_bits_mask[0][0] = 0;
+		fmt->rsvd_bits_mask[1][0] =
+			fmt->rsvd_bits_mask[0][0];
 
 		if (!pse) {
-			rsvd_check->rsvd_bits_mask[1][1] = 0;
+			fmt->rsvd_bits_mask[1][1] = 0;
 			break;
 		}
 
 		if (is_cpuid_PSE36())
 			/* 36bits PSE 4MB page */
-			rsvd_check->rsvd_bits_mask[1][1] = rsvd_bits(17, 21);
+			fmt->rsvd_bits_mask[1][1] = rsvd_bits(17, 21);
 		else
 			/* 32 bits PSE 4MB page */
-			rsvd_check->rsvd_bits_mask[1][1] = rsvd_bits(13, 21);
+			fmt->rsvd_bits_mask[1][1] = rsvd_bits(13, 21);
 		break;
 	case PT32E_ROOT_LEVEL:
-		rsvd_check->rsvd_bits_mask[0][2] = rsvd_bits(63, 63) |
+		fmt->rsvd_bits_mask[0][2] = rsvd_bits(63, 63) |
 						   high_bits_rsvd |
 						   rsvd_bits(5, 8) |
 						   rsvd_bits(1, 2);	/* PDPTE */
-		rsvd_check->rsvd_bits_mask[0][1] = high_bits_rsvd;	/* PDE */
-		rsvd_check->rsvd_bits_mask[0][0] = high_bits_rsvd;	/* PTE */
-		rsvd_check->rsvd_bits_mask[1][1] = high_bits_rsvd |
+		fmt->rsvd_bits_mask[0][1] = high_bits_rsvd;	/* PDE */
+		fmt->rsvd_bits_mask[0][0] = high_bits_rsvd;	/* PTE */
+		fmt->rsvd_bits_mask[1][1] = high_bits_rsvd |
 						   rsvd_bits(13, 20);	/* large page */
-		rsvd_check->rsvd_bits_mask[1][0] =
-			rsvd_check->rsvd_bits_mask[0][0];
+		fmt->rsvd_bits_mask[1][0] =
+			fmt->rsvd_bits_mask[0][0];
 		break;
 	case PT64_ROOT_5LEVEL:
-		rsvd_check->rsvd_bits_mask[0][4] = high_bits_rsvd |
+		fmt->rsvd_bits_mask[0][4] = high_bits_rsvd |
 						   nonleaf_bit8_rsvd |
 						   rsvd_bits(7, 7);
-		rsvd_check->rsvd_bits_mask[1][4] =
-			rsvd_check->rsvd_bits_mask[0][4];
+		fmt->rsvd_bits_mask[1][4] =
+			fmt->rsvd_bits_mask[0][4];
 		fallthrough;
 	case PT64_ROOT_4LEVEL:
-
		rsvd_check->rsvd_bits_mask[0][3] = high_bits_rsvd |
+		fmt->rsvd_bits_mask[0][3] = high_bits_rsvd |
 						   nonleaf_bit8_rsvd |
 						   rsvd_bits(7, 7);
-		rsvd_check->rsvd_bits_mask[0][2] = high_bits_rsvd |
+		fmt->rsvd_bits_mask[0][2] = high_bits_rsvd |
 						   gbpages_bit_rsvd;
-		rsvd_check->rsvd_bits_mask[0][1] = high_bits_rsvd;
-		rsvd_check->rsvd_bits_mask[0][0] = high_bits_rsvd;
-		rsvd_check->rsvd_bits_mask[1][3] =
-			rsvd_check->rsvd_bits_mask[0][3];
-		rsvd_check->rsvd_bits_mask[1][2] = high_bits_rsvd |
+		fmt->rsvd_bits_mask[0][1] = high_bits_rsvd;
+		fmt->rsvd_bits_mask[0][0] = high_bits_rsvd;
+		fmt->rsvd_bits_mask[1][3] =
+			fmt->rsvd_bits_mask[0][3];
+		fmt->rsvd_bits_mask[1][2] = high_bits_rsvd |
 						   gbpages_bit_rsvd |
 						   rsvd_bits(13, 29);
-		rsvd_check->rsvd_bits_mask[1][1] = high_bits_rsvd |
+		fmt->rsvd_bits_mask[1][1] = high_bits_rsvd |
 						   rsvd_bits(13, 20); /* large page */
-		rsvd_check->rsvd_bits_mask[1][0] =
-			rsvd_check->rsvd_bits_mask[0][0];
+		fmt->rsvd_bits_mask[1][0] =
+			fmt->rsvd_bits_mask[0][0];
 		break;
 	}
 }
@@ -5390,7 +5390,7 @@ static void __reset_rsvds_bits_mask(struct rsvd_bits_validate *rsvd_check,
 static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 					struct kvm_pagewalk *w)
 {
-	__reset_rsvds_bits_mask(&w->fmt.guest_rsvd_check,
+	__reset_rsvds_bits_mask(&w->fmt,
 				vcpu->arch.reserved_gpa_bits,
 				w->cpu_role.base.level, is_efer_nx(w),
 				guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES),
@@ -5398,7 +5398,7 @@ static void reset_guest_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 				guest_cpuid_is_amd_compatible(vcpu));
 }
 
-static void __reset_rsvds_bits_mask_ept(struct rsvd_bits_validate *rsvd_check,
+static void __reset_rsvds_bits_mask_ept(struct kvm_page_format *fmt,
 					u64 pa_bits_rsvd, bool execonly,
 					int huge_page_level)
 {
@@ -5411,18 +5411,18 @@ static void __reset_rsvds_bits_mask_ept(struct rsvd_bits_validate *rsvd_check,
 	if (huge_page_level < PG_LEVEL_2M)
 		large_2m_rsvd = rsvd_bits(7, 7);
 
-	rsvd_check->rsvd_bits_mask[0][4] = high_bits_rsvd |
rsvd_bits(3, 7);
-	rsvd_check->rsvd_bits_mask[0][3] = high_bits_rsvd | rsvd_bits(3, 7);
-	rsvd_check->rsvd_bits_mask[0][2] = high_bits_rsvd | rsvd_bits(3, 6) | large_1g_rsvd;
-	rsvd_check->rsvd_bits_mask[0][1] = high_bits_rsvd | rsvd_bits(3, 6) | large_2m_rsvd;
-	rsvd_check->rsvd_bits_mask[0][0] = high_bits_rsvd;
+	fmt->rsvd_bits_mask[0][4] = high_bits_rsvd | rsvd_bits(3, 7);
+	fmt->rsvd_bits_mask[0][3] = high_bits_rsvd | rsvd_bits(3, 7);
+	fmt->rsvd_bits_mask[0][2] = high_bits_rsvd | rsvd_bits(3, 6) | large_1g_rsvd;
+	fmt->rsvd_bits_mask[0][1] = high_bits_rsvd | rsvd_bits(3, 6) | large_2m_rsvd;
+	fmt->rsvd_bits_mask[0][0] = high_bits_rsvd;
 
 	/* large page */
-	rsvd_check->rsvd_bits_mask[1][4] = rsvd_check->rsvd_bits_mask[0][4];
-	rsvd_check->rsvd_bits_mask[1][3] = rsvd_check->rsvd_bits_mask[0][3];
-	rsvd_check->rsvd_bits_mask[1][2] = high_bits_rsvd | rsvd_bits(12, 29) | large_1g_rsvd;
-	rsvd_check->rsvd_bits_mask[1][1] = high_bits_rsvd | rsvd_bits(12, 20) | large_2m_rsvd;
-	rsvd_check->rsvd_bits_mask[1][0] = rsvd_check->rsvd_bits_mask[0][0];
+	fmt->rsvd_bits_mask[1][4] = fmt->rsvd_bits_mask[0][4];
+	fmt->rsvd_bits_mask[1][3] = fmt->rsvd_bits_mask[0][3];
+	fmt->rsvd_bits_mask[1][2] = high_bits_rsvd | rsvd_bits(12, 29) | large_1g_rsvd;
+	fmt->rsvd_bits_mask[1][1] = high_bits_rsvd | rsvd_bits(12, 20) | large_2m_rsvd;
+	fmt->rsvd_bits_mask[1][0] = fmt->rsvd_bits_mask[0][0];
 
 	bad_mt_xwr = 0xFFull << (2 * 8);	/* bits 3..5 must not be 2 */
 	bad_mt_xwr |= 0xFFull << (3 * 8);	/* bits 3..5 must not be 3 */
@@ -5433,13 +5433,13 @@ static void __reset_rsvds_bits_mask_ept(struct rsvd_bits_validate *rsvd_check,
 		/* bits 0..2 must not be 100 unless VMX capabilities allow it */
 		bad_mt_xwr |= REPEAT_BYTE(1ull << 4);
 	}
-	rsvd_check->bad_mt_xwr = bad_mt_xwr;
+	fmt->bad_mt_xwr = bad_mt_xwr;
 }
 
 static void reset_rsvds_bits_mask_ept(struct kvm_vcpu *vcpu,
 		bool execonly, int huge_page_level)
 {
-
	__reset_rsvds_bits_mask_ept(&vcpu->arch.ngpa_walk.fmt.guest_rsvd_check,
+	__reset_rsvds_bits_mask_ept(&vcpu->arch.ngpa_walk.fmt,
 				    vcpu->arch.reserved_gpa_bits, execonly,
 				    huge_page_level);
 }
@@ -5461,13 +5461,13 @@ static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
 	bool is_amd = true;
 	/* KVM doesn't use 2-level page tables for the shadow MMU. */
 	bool is_pse = false;
-	struct rsvd_bits_validate *shadow_zero_check;
+	struct kvm_page_format *fmt;
 	int i;
 
 	WARN_ON_ONCE(context->root_role.level < PT32E_ROOT_LEVEL);
 
-	shadow_zero_check = &context->shadow_zero_check;
-	__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
+	fmt = &context->fmt;
+	__reset_rsvds_bits_mask(fmt, reserved_hpa_bits(),
 				context->root_role.level,
 				context->root_role.efer_nx,
 				guest_cpu_cap_has(vcpu, X86_FEATURE_GBPAGES),
@@ -5483,10 +5483,10 @@ static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
 		 * Bits in shadow_me_mask but not in shadow_me_value are
 		 * not allowed to be set.
 		 */
-		shadow_zero_check->rsvd_bits_mask[0][i] |= shadow_me_mask;
-		shadow_zero_check->rsvd_bits_mask[1][i] |= shadow_me_mask;
-		shadow_zero_check->rsvd_bits_mask[0][i] &= ~shadow_me_value;
-		shadow_zero_check->rsvd_bits_mask[1][i] &= ~shadow_me_value;
+		fmt->rsvd_bits_mask[0][i] |= shadow_me_mask;
+		fmt->rsvd_bits_mask[1][i] |= shadow_me_mask;
+		fmt->rsvd_bits_mask[0][i] &= ~shadow_me_value;
+		fmt->rsvd_bits_mask[1][i] &= ~shadow_me_value;
 	}
 
 }
@@ -5503,18 +5503,18 @@ static inline bool boot_cpu_is_amd(void)
  */
 static void reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
 {
-	struct rsvd_bits_validate *shadow_zero_check;
+	struct kvm_page_format *fmt;
 	int i;
 
-	shadow_zero_check = &context->shadow_zero_check;
+	fmt = &context->fmt;
 
 	if (boot_cpu_is_amd())
-		__reset_rsvds_bits_mask(shadow_zero_check, reserved_hpa_bits(),
+		__reset_rsvds_bits_mask(fmt, reserved_hpa_bits(),
 					context->root_role.level, true,
 					boot_cpu_has(X86_FEATURE_GBPAGES),
 					false, true);
 	else
-		__reset_rsvds_bits_mask_ept(shadow_zero_check,
+		__reset_rsvds_bits_mask_ept(fmt,
 					    reserved_hpa_bits(), false,
 					    max_huge_page_level);
 
@@ -5522,8 +5522,8 @@ static void reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
 		return;
 
 	for (i = context->root_role.level; --i >= 0;) {
-		shadow_zero_check->rsvd_bits_mask[0][i] &= ~shadow_me_mask;
-		shadow_zero_check->rsvd_bits_mask[1][i] &= ~shadow_me_mask;
+		fmt->rsvd_bits_mask[0][i] &= ~shadow_me_mask;
+		fmt->rsvd_bits_mask[1][i] &= ~shadow_me_mask;
 	}
 }
 
@@ -5534,7 +5534,7 @@ static void reset_tdp_shadow_zero_bits_mask(struct kvm_mmu *context)
 static void
 reset_ept_shadow_zero_bits_mask(struct kvm_mmu *context, bool execonly)
 {
-	__reset_rsvds_bits_mask_ept(&context->shadow_zero_check,
+	__reset_rsvds_bits_mask_ept(&context->fmt,
 				    reserved_hpa_bits(), execonly,
 				    max_huge_page_level);
 }
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index fe12e9d17b0e..625fe35a1911 100644
---
a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -138,19 +138,19 @@ static inline int FNAME(is_present_gpte)(struct kvm_pagewalk *w,
 #endif
 }
 
-static bool FNAME(is_bad_mt_xwr)(struct rsvd_bits_validate *rsvd_check, u64 gpte)
+static bool FNAME(is_bad_mt_xwr)(struct kvm_page_format *fmt, u64 gpte)
 {
 #if PTTYPE != PTTYPE_EPT
 	return false;
 #else
-	return __is_bad_mt_xwr(rsvd_check, gpte);
+	return __is_bad_mt_xwr(fmt, gpte);
 #endif
 }
 
 static bool FNAME(is_rsvd_bits_set)(struct kvm_page_format *fmt, u64 gpte, int level)
 {
-	return __is_rsvd_bits_set(&fmt->guest_rsvd_check, gpte, level) ||
-	       FNAME(is_bad_mt_xwr)(&fmt->guest_rsvd_check, gpte);
+	return __is_rsvd_bits_set(fmt, gpte, level) ||
+	       FNAME(is_bad_mt_xwr)(fmt, gpte);
 }
 
 static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index d2f5f7dd8fe1..bdf72a98c19c 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -280,9 +280,9 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	if (prefetch && !synchronizing)
 		spte = mark_spte_for_access_track(spte);
 
-	WARN_ONCE(is_rsvd_spte(&vcpu->arch.mmu->shadow_zero_check, spte, level),
+	WARN_ONCE(is_rsvd_spte(&vcpu->arch.mmu->fmt, spte, level),
 		  "spte = 0x%llx, level = %d, rsvd bits = 0x%llx", spte, level,
-		  get_rsvd_bits(&vcpu->arch.mmu->shadow_zero_check, spte, level));
+		  get_rsvd_bits(&vcpu->arch.mmu->fmt, spte, level));
 
 	/*
 	 * Mark the memslot dirty *after* modifying it for access tracking.
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 13eea94dd212..918533e61b98 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -378,33 +378,33 @@ static inline bool is_accessed_spte(u64 spte)
 	return spte & shadow_accessed_mask;
 }
 
-static inline u64 get_rsvd_bits(struct rsvd_bits_validate *rsvd_check, u64 pte,
+static inline u64 get_rsvd_bits(struct kvm_page_format *fmt, u64 pte,
 				int level)
 {
 	int bit7 = (pte >> 7) & 1;
 
-	return rsvd_check->rsvd_bits_mask[bit7][level-1];
+	return fmt->rsvd_bits_mask[bit7][level-1];
 }
 
-static inline bool __is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check,
+static inline bool __is_rsvd_bits_set(struct kvm_page_format *fmt,
 				      u64 pte, int level)
 {
-	return pte & get_rsvd_bits(rsvd_check, pte, level);
+	return pte & get_rsvd_bits(fmt, pte, level);
 }
 
-static inline bool __is_bad_mt_xwr(struct rsvd_bits_validate *rsvd_check,
+static inline bool __is_bad_mt_xwr(struct kvm_page_format *fmt,
 				   u64 pte)
 {
 	if (pte & VMX_EPT_USER_EXECUTABLE_MASK)
 		pte |= VMX_EPT_EXECUTABLE_MASK;
-	return rsvd_check->bad_mt_xwr & BIT_ULL(pte & 0x3f);
+	return fmt->bad_mt_xwr & BIT_ULL(pte & 0x3f);
 }
 
-static __always_inline bool is_rsvd_spte(struct rsvd_bits_validate *rsvd_check,
+static __always_inline bool is_rsvd_spte(struct kvm_page_format *fmt,
 					 u64 spte, int level)
 {
-	return __is_bad_mt_xwr(rsvd_check, spte) ||
-	       __is_rsvd_bits_set(rsvd_check, spte, level);
+	return __is_bad_mt_xwr(fmt, spte) ||
+	       __is_rsvd_bits_set(fmt, spte, level);
 }
 
 /*
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5b74315f7e95..6565072760f1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8703,7 +8703,7 @@ __init int vmx_hardware_setup(void)
 
 	/*
 	 * Setup shadow_me_value/shadow_me_mask to include MKTME KeyID
-	 * bits to shadow_zero_check.
+	 * bits into the MMU's struct kvm_page_format.
*/ vmx_setup_me_spte_mask(); =20 --=20 2.52.0 From nobody Mon Jun 8 07:24:00 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 03F4F47AF66 for ; Wed, 3 Jun 2026 10:58:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484316; cv=none; b=VIiPhMgO+auUhXKvAgRZE2udMcwD/nI0UQRd99oX/fmXXytzJEOYuEzEiFKV+x2MLPKEy6mtMxfGYf5r0N3kC7YZxx84AVhtKe/z1OdvT0+9oA1ZSYpq/9Tl9fL1VzOnQg2yH1gwx5mJA9jSj5aCiG8YOPGHVYjrszSnHWjcmiM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484316; c=relaxed/simple; bh=xH2nNrhMUHhsXmnI+ANftM8SoHXriFpHyMO2OnFmc14=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=U35VoLs4TkCpZAO+J5GhFiNpJUq19eRcL82vfai/BF0daKIMY8Ybv+dZUr8Vz06wLF0Dp+ld7lRZNn0bvOeDJ3xqaX7k6vUOqT9jq/9RRUMSFsg2eLsWpvPl7ZG7TAiHqaliYEOBWnlF9d+reOuJ72xZOiBOOVmfsmiTdtY6xuc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=UzutYNLc; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="UzutYNLc" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484313; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=B3ORianq9bhRIrNfSAPaZPqyjkRgJzW7hnwbQDPyMqc=; b=UzutYNLcf/DeWTkmccWy42SOfDfGWpGCjzLpX/n0Q32/1R/yuDD+5O99u3KcMfMWqmRO8j 5pfkG48qYK1wTzb3YbxCDA6qTLCOy0EzW3+3sOlR0XwEMexkKNTsty8sDIHUKf5TrdOpbt eLHXJ0/XEYeeKQbwEHqFS2fUHEJlGVo= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-150-Y0mLsgGVN9Kip6cVhi7jzg-1; Wed, 03 Jun 2026 06:58:29 -0400 X-MC-Unique: Y0mLsgGVN9Kip6cVhi7jzg-1 X-Mimecast-MFC-AGG-ID: Y0mLsgGVN9Kip6cVhi7jzg_1780484309 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 084DB19560AB; Wed, 3 Jun 2026 10:58:29 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id A18D5195E486; Wed, 3 Jun 2026 10:58:28 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 23/24] KVM: x86/mmu: parameterize update_permission_bitmask() Date: Wed, 3 Jun 2026 06:58:13 -0400 Message-ID: <20260603105814.10236-24-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 
Content-Transfer-Encoding: quoted-printable
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12
Content-Type: text/plain; charset="utf-8"

Make it possible to apply the computation loop to both guest and shadow
PTE formats; the latter do not have an extended role, so pass the four
parameters to the function one by one.

Signed-off-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/mmu.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 29755afe5b46..386e7e05d205 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5569,18 +5569,15 @@ reset_ept_shadow_zero_bits_mask(struct kvm_mmu *context, bool execonly)
 	 (14 & (access) ? 1 << 14 : 0) | \
 	 (15 & (access) ? 1 << 15 : 0))
 
-static void update_permission_bitmask(struct kvm_pagewalk *pw, bool tdp, bool ept)
+static void __update_permission_bitmask(struct kvm_page_format *fmt, bool tdp,
+					bool ept, bool cr4_smep, bool cr4_smap,
+					bool cr0_wp, bool efer_nx)
 {
 	unsigned index;
 
 	const u16 w = ACC_BITS_MASK(ACC_WRITE_MASK);
 	const u16 r = ACC_BITS_MASK(ACC_READ_MASK);
 
-	bool cr4_smep = is_cr4_smep(pw);
-	bool cr4_smap = is_cr4_smap(pw);
-	bool cr0_wp = is_cr0_wp(pw);
-	bool efer_nx = is_efer_nx(pw);
-
 	/*
 	 * In hardware, page fault error codes are generated (as the name
 	 * suggests) on any kind of page fault. permission_fault() and
@@ -5593,7 +5590,7 @@ static void update_permission_bitmask(struct kvm_pagewalk *pw, bool tdp, bool ep
 	 * permission_fault() to indicate accesses that are *not* subject to
 	 * SMAP restrictions.
 	 */
-	for (index = 0; index < ARRAY_SIZE(pw->fmt.permissions); ++index) {
+	for (index = 0; index < ARRAY_SIZE(fmt->permissions); ++index) {
 		unsigned pfec = index << 1;
 
 		/*
@@ -5667,10 +5664,17 @@ static void update_permission_bitmask(struct kvm_pagewalk *pw, bool tdp, bool ep
 				smapf = (pfec & (PFERR_RSVD_MASK|PFERR_FETCH_MASK)) ?
0 : kf; } =20 - pw->fmt.permissions[index] =3D ff | uf | wf | rf | smapf; + fmt->permissions[index] =3D ff | uf | wf | rf | smapf; } } =20 +static void update_permission_bitmask(struct kvm_pagewalk *w, bool tdp, bo= ol ept) +{ + __update_permission_bitmask(&w->fmt, tdp, ept, + is_cr4_smep(w), is_cr4_smap(w), + is_cr0_wp(w), is_efer_nx(w)); +} + /* * PKU is an additional mechanism by which the paging controls access to * user-mode addresses based on the value in the PKRU register. Protection --=20 2.52.0 From nobody Mon Jun 8 07:24:00 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 668CA47B41D for ; Wed, 3 Jun 2026 10:58:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484317; cv=none; b=pMl7wvwZiV3V7k0IFiSbZmrztsDbqVJy/BRG60YnbZZwjeH91RIWyClUEorQL8vJ/FAjbre08QCgW27d5/g+xz/Z1ccFOHz8kSp9v4MruiQ8NDQWtzbLZXucX3fP4kjW6a30+aJN+bb+LIfkP8js9O8J7UGcI1o2koovo49Bprc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780484317; c=relaxed/simple; bh=CJTORDSn50XWymmnv272DYgIETZOcyeCoWphf1eJiZc=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=WvRG2A1u+PYkA3FldwIG24p1an57lfTyrdKhMsvjXXgLvyq491czwcImqzc/KtXjJ8iGcSHGluPgK3E0N4fPNLEdnoFRIUL+Yq9To0bXOjYlFGMXM1hN26eV0P/o/NFiJ2cd55yC9PZldiTvSo/cU7MjbEd1gZuaorvs8wgriPU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=XDimpxLO; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) 
header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="XDimpxLO" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1780484313; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QciZuvVq1+xIHTd6laqKvnuzdV7NLu0dpRHde794czk=; b=XDimpxLO05S4uV2W8O67e224j8kPzPfxXy5WhkuQrV4VxjfWooxbbiujs7sBZ/bVByUjrF N3e6Oxk+KRJTSWW65o6qq7q4QGP6fI/x1lRSdBYjI8NajkwtSb00Xci36h7g8BVpun+aZ2 xDrVvQvu0JJiOtu8XBLhOlIrOuzrBNc= Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-693-wCsCmnCENfCkCBmbpNHGag-1; Wed, 03 Jun 2026 06:58:30 -0400 X-MC-Unique: wCsCmnCENfCkCBmbpNHGag-1 X-Mimecast-MFC-AGG-ID: wCsCmnCENfCkCBmbpNHGag_1780484309 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 8AEC01800366; Wed, 3 Jun 2026 10:58:29 +0000 (UTC) Received: from virtlab1023.lab.eng.rdu2.redhat.lab.eng.rdu2.redhat.com (virtlab1023.lab.eng.rdu2.redhat.com [10.8.1.187]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 2F143195E487; Wed, 3 Jun 2026 10:58:29 +0000 (UTC) From: Paolo Bonzini To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH 
24/24] KVM: x86/mmu: use kvm_page_format to test SPTEs Date: Wed, 3 Jun 2026 06:58:14 -0400 Message-ID: <20260603105814.10236-25-pbonzini@redhat.com> In-Reply-To: <20260603105814.10236-1-pbonzini@redhat.com> References: <20260603105814.10236-1-pbonzini@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12 Content-Type: text/plain; charset="utf-8" is_access_allowed(), and is_executable_pte() within it, are effectively a special version of permission_fault() that only supports a subset of roles. In particular it does not allow SMEP, SMAP and PKE. Replace its implementation with a modified version of permission_fault(); the new version will support SMEP (and hence AMD GMET) for free as soon as update_spte_permission_bitmask() stops hardcoding cr4_smep =3D=3D false. This prepares for a possible future where TDP entries could have XS!=3DXU, for example as part of implementing Hyper-V VSM natively inside KVM. 
Signed-off-by: Paolo Bonzini --- arch/x86/kvm/mmu/mmu.c | 18 ++++++++++++--- arch/x86/kvm/mmu/spte.h | 46 +++++++++++++++++++++----------------- arch/x86/kvm/mmu/tdp_mmu.c | 3 ++- 3 files changed, 42 insertions(+), 25 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 386e7e05d205..a4df38356988 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3670,6 +3670,7 @@ static u64 *fast_pf_get_last_sptep(struct kvm_vcpu *v= cpu, gpa_t gpa, u64 *spte) */ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *f= ault) { + struct kvm_mmu *mmu; struct kvm_mmu_page *sp; int ret =3D RET_PF_INVALID; u64 spte; @@ -3679,6 +3680,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, str= uct kvm_page_fault *fault) if (!page_fault_can_be_fast(vcpu->kvm, fault)) return ret; =20 + mmu =3D vcpu->arch.mmu; walk_shadow_page_lockless_begin(vcpu); =20 do { @@ -3714,7 +3716,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, str= uct kvm_page_fault *fault) * Need not check the access of upper level table entries since * they are always ACC_ALL. */ - if (is_access_allowed(fault, spte)) { + if (!spte_permission_fault(mmu, spte, fault)) { ret =3D RET_PF_SPURIOUS; break; } @@ -3737,7 +3739,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, str= uct kvm_page_fault *fault) * that were write-protected for dirty-logging or access * tracking are handled here. Don't bother checking if the * SPTE is writable to prioritize running with A/D bits enabled. - * The is_access_allowed() check above handles the common case + * The spte_permission_fault() check above handles the common case * of the fault being spurious, and the SPTE is known to be * shadow-present, i.e. except for access tracking restoration * making the new SPTE writable, the check is wasteful. 
@@ -3762,7 +3764,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, str= uct kvm_page_fault *fault) =20 /* Verify that the fault can be handled in the fast path */ if (new_spte =3D=3D spte || - !is_access_allowed(fault, new_spte)) + spte_permission_fault(mmu, new_spte, fault)) break; =20 /* @@ -5675,6 +5677,12 @@ static void update_permission_bitmask(struct kvm_pag= ewalk *w, bool tdp, bool ept is_cr0_wp(w), is_efer_nx(w)); } =20 +static void update_spte_permission_bitmask(struct kvm_mmu *mmu, bool tdp, = bool ept) +{ + __update_permission_bitmask(&mmu->fmt, tdp, ept, + mmu->root_role.cr4_smep, false, true, true); +} + /* * PKU is an additional mechanism by which the paging controls access to * user-mode addresses based on the value in the PKRU register. Protection @@ -5884,6 +5892,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu, context->page_fault =3D kvm_tdp_page_fault; context->sync_spte =3D NULL; =20 + update_spte_permission_bitmask(context, true, shadow_xs_mask); reset_tdp_shadow_zero_bits_mask(context); } =20 @@ -5902,6 +5911,7 @@ static void shadow_mmu_init_context(struct kvm_vcpu *= vcpu, struct kvm_mmu *conte else paging32_init_context(context); =20 + update_spte_permission_bitmask(context, context =3D=3D &vcpu->arch.guest_= mmu, false); reset_shadow_zero_bits_mask(vcpu, context); } =20 @@ -6030,6 +6040,8 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, b= ool execonly, update_permission_bitmask(ngpa_walk, true, true); ngpa_walk->fmt.pkru_mask =3D 0; reset_rsvds_bits_mask_ept(vcpu, execonly, huge_page_level); + + update_spte_permission_bitmask(context, true, true); reset_ept_shadow_zero_bits_mask(context, execonly); } =20 diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h index 918533e61b98..9bddfa0e02b9 100644 --- a/arch/x86/kvm/mmu/spte.h +++ b/arch/x86/kvm/mmu/spte.h @@ -357,17 +357,6 @@ static inline bool is_last_spte(u64 pte, int level) return (level =3D=3D PG_LEVEL_4K) || is_large_pte(pte); } =20 -static inline bool 
is_executable_pte(u64 spte) -{ - /* - * For now, return true if either the XS or XU bit is set - * This function is only used for fast_page_fault, - * which never processes shadow EPT, and regular page - * tables always have XS=3D=3DXU. - */ - return (spte & (shadow_xs_mask | shadow_xu_mask | shadow_nx_mask)) !=3D s= hadow_nx_mask; -} - static inline kvm_pfn_t spte_to_pfn(u64 pte) { return (pte & SPTE_BASE_ADDR_MASK) >> PAGE_SHIFT; @@ -496,20 +485,35 @@ static inline bool is_mmu_writable_spte(u64 spte) } =20 /* - * Returns true if the access indicated by @fault is allowed by the existi= ng - * SPTE protections. Note, the caller is responsible for checking that the - * SPTE is a shadow-present, leaf SPTE (either before or after). + * Returns true if the access indicated by @fault is forbidden by the exis= ting + * SPTE protections. */ -static inline bool is_access_allowed(struct kvm_page_fault *fault, u64 spt= e) +static inline bool spte_permission_fault(struct kvm_mmu *mmu, u64 spte, + struct kvm_page_fault *fault) { - if (fault->exec) - return is_executable_pte(spte); + unsigned int pfec =3D fault->error_code; + int index =3D pfec >> 1; + int pte_access; =20 - if (fault->write) - return is_writable_pte(spte); + if (!is_shadow_present_pte(spte)) + return true; =20 - /* Fault was on Read access */ - return spte & PT_PRESENT_MASK; + BUILD_BUG_ON(PT_PRESENT_MASK !=3D ACC_READ_MASK); + BUILD_BUG_ON(PT_WRITABLE_MASK !=3D ACC_WRITE_MASK); + BUILD_BUG_ON(VMX_EPT_READABLE_MASK !=3D ACC_READ_MASK); + BUILD_BUG_ON(VMX_EPT_WRITABLE_MASK !=3D ACC_WRITE_MASK); + + /* strip nested paging fault error codes */ + pte_access =3D spte & (PT_PRESENT_MASK | PT_WRITABLE_MASK); + if (shadow_nx_mask) { + pte_access |=3D spte & shadow_user_mask ? ACC_USER_MASK : 0; + pte_access |=3D spte & shadow_nx_mask ? 0 : ACC_EXEC_MASK; + } else { + pte_access |=3D spte & shadow_xs_mask ? ACC_EXEC_MASK : 0; + pte_access |=3D spte & shadow_xu_mask ? 
ACC_USER_EXEC_MASK : 0; + } + + return (mmu->fmt.permissions[index] >> pte_access) & 1; } =20 /* diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 5a2f8ce9a32b..839a8e416510 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1169,6 +1169,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm= _vcpu *vcpu, struct kvm_page_fault *fault, struct tdp_iter *iter) { + struct kvm_mmu *mmu =3D vcpu->arch.mmu; struct kvm_mmu_page *sp =3D sptep_to_sp(rcu_dereference(iter->sptep)); u64 new_spte; int ret =3D RET_PF_FIXED; @@ -1178,7 +1179,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm= _vcpu *vcpu, return RET_PF_RETRY; =20 if (is_shadow_present_pte(iter->old_spte) && - (fault->prefetch || is_access_allowed(fault, iter->old_spte)) && + (fault->prefetch || !spte_permission_fault(mmu, iter->old_spte, fault= )) && is_last_spte(iter->old_spte, iter->level)) { WARN_ON_ONCE(fault->pfn !=3D spte_to_pfn(iter->old_spte)); return RET_PF_SPURIOUS; --=20 2.52.0
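
For readers following the series, the lookup at the end of spte_permission_fault() can be illustrated outside the kernel. The sketch below is not the code from these patches: every TOY_* name and the toy_* helpers are hypothetical, the access bit values are chosen for the example only, and the fault rules deliberately ignore SMEP, SMAP, CR0.WP and PKU. It shows only the table shape: one 16-bit mask per error code (indexed by pfec >> 1, which drops the present bit), with bit N set when the access combination N would fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical access bits for this sketch only (not the kernel's values). */
#define TOY_ACC_READ	(1u << 0)
#define TOY_ACC_WRITE	(1u << 1)
#define TOY_ACC_USER	(1u << 2)
#define TOY_ACC_EXEC	(1u << 3)

/* x86 page fault error code bits used by the sketch. */
#define TOY_PFERR_WRITE	(1u << 1)
#define TOY_PFERR_USER	(1u << 2)
#define TOY_PFERR_FETCH	(1u << 4)

#define TOY_NR_PFEC	16	/* error code bits 1..4, present bit dropped */
#define TOY_NR_ACCESS	16	/* all combinations of the four access bits */

static uint16_t toy_permissions[TOY_NR_PFEC];

/* Precompute, for each error code, the set of access combinations that fault. */
static void toy_build_permissions(void)
{
	for (unsigned index = 0; index < TOY_NR_PFEC; index++) {
		unsigned pfec = index << 1;
		uint16_t faults = 0;

		for (unsigned acc = 0; acc < TOY_NR_ACCESS; acc++) {
			/* Simplified rule: an unreadable entry always faults. */
			bool fault = !(acc & TOY_ACC_READ);

			if ((pfec & TOY_PFERR_WRITE) && !(acc & TOY_ACC_WRITE))
				fault = true;
			if ((pfec & TOY_PFERR_USER) && !(acc & TOY_ACC_USER))
				fault = true;
			if ((pfec & TOY_PFERR_FETCH) && !(acc & TOY_ACC_EXEC))
				fault = true;

			if (fault)
				faults |= (uint16_t)(1u << acc);
		}
		toy_permissions[index] = faults;
	}
}

/* Returns true if an access described by @pfec faults on rights @acc. */
static bool toy_permission_fault(unsigned pfec, unsigned acc)
{
	assert((pfec >> 1) < TOY_NR_PFEC && acc < TOY_NR_ACCESS);
	return (toy_permissions[pfec >> 1] >> acc) & 1;
}
```

A caller would run toy_build_permissions() once and then test individual faults with toy_permission_fault(). The design point is that all the conditional logic sits in the build step, so the per-fault check reduces to one array load, one shift and one mask.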