From: Lai Jiangshan
To: linux-kernel@vger.kernel.org
Cc: Paolo Bonzini, Sean Christopherson, Lai Jiangshan, Thomas Gleixner,
    Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org,
    "H. Peter Anvin", kvm@vger.kernel.org
Subject: [PATCH 5/7] kvm: x86/mmu: Move the code out of FNAME(sync_page)'s loop body into mmu.c
Date: Thu, 5 Jan 2023 17:58:46 +0800
Message-Id: <20230105095848.6061-6-jiangshanlai@gmail.com>
X-Mailer: git-send-email 2.19.1.6.gb485710b
In-Reply-To: <20230105095848.6061-1-jiangshanlai@gmail.com>
References: <20230105095848.6061-1-jiangshanlai@gmail.com>

From: Lai Jiangshan

Rename mmu->sync_page to mmu->sync_spte and move the code out of
FNAME(sync_page)'s loop body into mmu.c.

Also initialize mmu->sync_spte as NULL for direct paging.

No functional change intended.

Signed-off-by: Lai Jiangshan
---
 arch/x86/include/asm/kvm_host.h |   4 +-
 arch/x86/kvm/mmu/mmu.c          |  70 ++++++++++++---
 arch/x86/kvm/mmu/paging_tmpl.h  | 147 +++++++++++---------------------
 3 files changed, 110 insertions(+), 111 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index dbea616bccce..69b7967cd743 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -441,8 +441,8 @@ struct kvm_mmu {
 	gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 			    gpa_t gva_or_gpa, u64 access,
 			    struct x86_exception *exception);
-	int (*sync_page)(struct kvm_vcpu *vcpu,
-			 struct kvm_mmu_page *sp);
+	int (*sync_spte)(struct kvm_vcpu *vcpu,
+			 struct kvm_mmu_page *sp, int i);
 	void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa);
 	struct kvm_mmu_root_info root;
 	union kvm_cpu_role cpu_role;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index ffef9fe0c853..f39bee1542d8 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1779,12 +1779,6 @@ static void mark_unsync(u64 *spte)
 	kvm_mmu_mark_parents_unsync(sp);
 }
 
-static int nonpaging_sync_page(struct kvm_vcpu *vcpu,
-			       struct kvm_mmu_page *sp)
-{
-	return -1;
-}
-
 #define KVM_PAGE_ARRAY_NR 16
 
 struct kvm_mmu_pages {
@@ -1904,10 +1898,62 @@ static bool sp_has_gptes(struct kvm_mmu_page *sp)
 	  &(_kvm)->arch.mmu_page_hash[kvm_page_table_hashfn(_gfn)])	\
 		if ((_sp)->gfn != (_gfn) || !sp_has_gptes(_sp)) {} else
 
+static int __kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+{
+	union kvm_mmu_page_role root_role = vcpu->arch.mmu->root_role;
+	bool flush = false;
+	int i;
+
+	/*
+	 * Ignore various flags when verifying that it's safe to sync a shadow
+	 * page using the current MMU context.
+	 *
+	 * - level: not part of the overall MMU role and will never match as the MMU's
+	 *          level tracks the root level
+	 * - access: updated based on the new guest PTE
+	 * - quadrant: not part of the overall MMU role (similar to level)
+	 */
+	const union kvm_mmu_page_role sync_role_ign = {
+		.level = 0xf,
+		.access = 0x7,
+		.quadrant = 0x3,
+		.passthrough = 0x1,
+	};
+
+	/*
+	 * Direct pages can never be unsync, and KVM should never attempt to
+	 * sync a shadow page for a different MMU context, e.g. if the role
+	 * differs then the memslot lookup (SMM vs. non-SMM) will be bogus, the
+	 * reserved bits checks will be wrong, etc...
+	 */
+	if (WARN_ON_ONCE(sp->role.direct ||
+			 (sp->role.word ^ root_role.word) & ~sync_role_ign.word))
+		return -1;
+
+	for (i = 0; i < SPTE_ENT_PER_PAGE; i++) {
+		int ret = vcpu->arch.mmu->sync_spte(vcpu, sp, i);
+
+		if (ret < -1)
+			return -1;
+		flush |= ret;
+	}
+
+	/*
+	 * Note, any flush is purely for KVM's correctness, e.g. when dropping
+	 * an existing SPTE or clearing W/A/D bits to ensure an mmu_notifier
+	 * unmap or dirty logging event doesn't fail to flush. The guest is
+	 * responsible for flushing the TLB to ensure any changes in protection
+	 * bits are recognized, i.e. until the guest flushes or page faults on
+	 * a relevant address, KVM is architecturally allowed to let vCPUs use
+	 * cached translations with the old protection bits.
+	 */
+	return flush;
+}
+
 static int kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 			 struct list_head *invalid_list)
 {
-	int ret = vcpu->arch.mmu->sync_page(vcpu, sp);
+	int ret = __kvm_sync_page(vcpu, sp);
 
 	if (ret < 0)
 		kvm_mmu_prepare_zap_page(vcpu->kvm, sp, invalid_list);
@@ -4458,7 +4504,7 @@ static void nonpaging_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = nonpaging_page_fault;
 	context->gva_to_gpa = nonpaging_gva_to_gpa;
-	context->sync_page = nonpaging_sync_page;
+	context->sync_spte = NULL;
 	context->invlpg = NULL;
 }
 
@@ -5047,7 +5093,7 @@ static void paging64_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = paging64_page_fault;
 	context->gva_to_gpa = paging64_gva_to_gpa;
-	context->sync_page = paging64_sync_page;
+	context->sync_spte = paging64_sync_spte;
 	context->invlpg = paging64_invlpg;
 }
 
@@ -5055,7 +5101,7 @@ static void paging32_init_context(struct kvm_mmu *context)
 {
 	context->page_fault = paging32_page_fault;
 	context->gva_to_gpa = paging32_gva_to_gpa;
-	context->sync_page = paging32_sync_page;
+	context->sync_spte = paging32_sync_spte;
 	context->invlpg = paging32_invlpg;
 }
 
@@ -5144,7 +5190,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu,
 	context->cpu_role.as_u64 = cpu_role.as_u64;
 	context->root_role.word = root_role.word;
 	context->page_fault = kvm_tdp_page_fault;
-	context->sync_page = nonpaging_sync_page;
+	context->sync_spte = NULL;
 	context->invlpg = NULL;
 	context->get_guest_pgd = get_cr3;
 	context->get_pdptr = kvm_pdptr_read;
@@ -5276,7 +5322,7 @@ void kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, bool execonly,
 
 	context->page_fault = ept_page_fault;
 	context->gva_to_gpa = ept_gva_to_gpa;
-	context->sync_page = ept_sync_page;
+	context->sync_spte = ept_sync_spte;
 	context->invlpg = ept_invlpg;
 
 	update_permission_bitmask(context, true);
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index ab0b031d4825..3bc13b9b61d1 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -942,120 +942,73 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
  * can't change unless all sptes pointing to it are nuked first.
  *
  * Returns
- * < 0: the sp should be zapped
+ * < 0: failed to sync
  *   0: the sp is synced and no tlb flushing is required
  * > 0: the sp is synced and tlb flushing is required
  */
-static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
+static int FNAME(sync_spte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, int i)
 {
-	union kvm_mmu_page_role root_role = vcpu->arch.mmu->root_role;
-	int i;
 	bool host_writable;
 	gpa_t first_pte_gpa;
-	bool flush = false;
-
-	/*
-	 * Ignore various flags when verifying that it's safe to sync a shadow
-	 * page using the current MMU context.
-	 *
-	 * - level: not part of the overall MMU role and will never match as the MMU's
-	 *          level tracks the root level
-	 * - access: updated based on the new guest PTE
-	 * - quadrant: not part of the overall MMU role (similar to level)
-	 */
-	const union kvm_mmu_page_role sync_role_ign = {
-		.level = 0xf,
-		.access = 0x7,
-		.quadrant = 0x3,
-		.passthrough = 0x1,
-	};
+	u64 *sptep, spte;
+	struct kvm_memory_slot *slot;
+	unsigned old_pte_access, pte_access;
+	pt_element_t gpte;
+	gpa_t pte_gpa;
+	gfn_t gfn;
 
-	/*
-	 * Direct pages can never be unsync, and KVM should never attempt to
-	 * sync a shadow page for a different MMU context, e.g. if the role
-	 * differs then the memslot lookup (SMM vs. non-SMM) will be bogus, the
-	 * reserved bits checks will be wrong, etc...
-	 */
-	if (WARN_ON_ONCE(sp->role.direct ||
-			 (sp->role.word ^ root_role.word) & ~sync_role_ign.word))
-		return -1;
+	if (!sp->spt[i])
+		return 0;
 
 	first_pte_gpa = FNAME(get_level1_sp_gpa)(sp);
+	pte_gpa = first_pte_gpa + i * sizeof(pt_element_t);
 
-	for (i = 0; i < SPTE_ENT_PER_PAGE; i++) {
-		u64 *sptep, spte;
-		struct kvm_memory_slot *slot;
-		unsigned old_pte_access, pte_access;
-		pt_element_t gpte;
-		gpa_t pte_gpa;
-		gfn_t gfn;
-
-		if (!sp->spt[i])
-			continue;
-
-		pte_gpa = first_pte_gpa + i * sizeof(pt_element_t);
-
-		if (kvm_vcpu_read_guest_atomic(vcpu, pte_gpa, &gpte,
-					       sizeof(pt_element_t)))
-			return -1;
-
-		if (FNAME(prefetch_invalid_gpte)(vcpu, sp, &sp->spt[i], gpte)) {
-			flush = true;
-			continue;
-		}
-
-		gfn = gpte_to_gfn(gpte);
-		pte_access = sp->role.access;
-		pte_access &= FNAME(gpte_access)(gpte);
-		FNAME(protect_clean_gpte)(vcpu->arch.mmu, &pte_access, gpte);
-
-		if (sync_mmio_spte(vcpu, &sp->spt[i], gfn, pte_access))
-			continue;
+	if (kvm_vcpu_read_guest_atomic(vcpu, pte_gpa, &gpte,
+				       sizeof(pt_element_t)))
+		return -1;
 
-		/*
-		 * Drop the SPTE if the new protections would result in a RWX=0
-		 * SPTE or if the gfn is changing. The RWX=0 case only affects
-		 * EPT with execute-only support, i.e. EPT without an effective
-		 * "present" bit, as all other paging modes will create a
-		 * read-only SPTE if pte_access is zero.
-		 */
-		if ((!pte_access && !shadow_present_mask) ||
-		    gfn != kvm_mmu_page_get_gfn(sp, i)) {
-			drop_spte(vcpu->kvm, &sp->spt[i]);
-			flush = true;
-			continue;
-		}
-		/*
-		 * Do nothing if the permissions are unchanged.
-		 */
-		old_pte_access = kvm_mmu_page_get_access(sp, i);
-		if (old_pte_access == pte_access)
-			continue;
+	if (FNAME(prefetch_invalid_gpte)(vcpu, sp, &sp->spt[i], gpte))
+		return 1;
 
-		/* Update the shadowed access bits in case they changed. */
-		kvm_mmu_page_set_access(sp, i, pte_access);
+	gfn = gpte_to_gfn(gpte);
+	pte_access = sp->role.access;
+	pte_access &= FNAME(gpte_access)(gpte);
+	FNAME(protect_clean_gpte)(vcpu->arch.mmu, &pte_access, gpte);
 
-		sptep = &sp->spt[i];
-		spte = *sptep;
-		host_writable = spte & shadow_host_writable_mask;
-		slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
-		make_spte(vcpu, sp, slot, pte_access, gfn,
-			  spte_to_pfn(spte), spte, true, false,
-			  host_writable, &spte);
+	if (sync_mmio_spte(vcpu, &sp->spt[i], gfn, pte_access))
+		return 0;
 
-		flush |= mmu_spte_update(sptep, spte);
+	/*
+	 * Drop the SPTE if the new protections would result in a RWX=0
+	 * SPTE or if the gfn is changing. The RWX=0 case only affects
+	 * EPT with execute-only support, i.e. EPT without an effective
+	 * "present" bit, as all other paging modes will create a
+	 * read-only SPTE if pte_access is zero.
+	 */
+	if ((!pte_access && !shadow_present_mask) ||
+	    gfn != kvm_mmu_page_get_gfn(sp, i)) {
+		drop_spte(vcpu->kvm, &sp->spt[i]);
+		return 1;
 	}
-
 	/*
-	 * Note, any flush is purely for KVM's correctness, e.g. when dropping
-	 * an existing SPTE or clearing W/A/D bits to ensure an mmu_notifier
-	 * unmap or dirty logging event doesn't fail to flush. The guest is
-	 * responsible for flushing the TLB to ensure any changes in protection
-	 * bits are recognized, i.e. until the guest flushes or page faults on
-	 * a relevant address, KVM is architecturally allowed to let vCPUs use
-	 * cached translations with the old protection bits.
+	 * Do nothing if the permissions are unchanged.
 	 */
-	return flush;
+	old_pte_access = kvm_mmu_page_get_access(sp, i);
+	if (old_pte_access == pte_access)
+		return 0;
+
+	/* Update the shadowed access bits in case they changed. */
+	kvm_mmu_page_set_access(sp, i, pte_access);
+
+	sptep = &sp->spt[i];
+	spte = *sptep;
+	host_writable = spte & shadow_host_writable_mask;
+	slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
+	make_spte(vcpu, sp, slot, pte_access, gfn,
		  spte_to_pfn(spte), spte, true, false,
+		  host_writable, &spte);
+
+	return mmu_spte_update(sptep, spte);
 }
 
 #undef pt_element_t
-- 
2.19.1.6.gb485710b
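
[Editorial illustration, not part of the patch] For readers who want the shape of the refactoring without the full KVM context, below is a minimal standalone sketch of the pattern the series applies here: common code owns the loop over all entries and the flush accumulation, each paging mode supplies only a per-entry callback, and a NULL callback marks contexts (direct paging) that never need syncing. All names (toy_mmu, toy_page, toy_sync_spte, ENT_PER_PAGE) are hypothetical stand-ins, and the toy uses a plain "ret < 0 means failure" convention rather than the patch's exact return-code handling.

/*
 * Standalone sketch (not KVM code) of "hoist the loop, keep a per-entry
 * callback".  Compile with: cc -Wall sketch.c
 */
#include <stdbool.h>
#include <stdio.h>

#define ENT_PER_PAGE 8	/* stand-in for SPTE_ENT_PER_PAGE (512 in KVM) */

struct toy_page {
	unsigned long spt[ENT_PER_PAGE];
};

struct toy_mmu {
	/* < 0: failed to sync, 0: synced, > 0: synced and flush needed */
	int (*sync_spte)(struct toy_page *sp, int i);
};

/* Per-entry hook: what used to be one iteration of the per-page loop. */
static int toy_sync_spte(struct toy_page *sp, int i)
{
	if (!sp->spt[i])
		return 0;		/* nothing shadowed at this index */
	sp->spt[i] &= ~1UL;		/* pretend to drop a stale permission bit */
	return 1;			/* caller must flush */
}

/* Common code: the loop that used to live inside each sync_page(). */
static int toy_sync_page(struct toy_mmu *mmu, struct toy_page *sp)
{
	bool flush = false;
	int i;

	if (!mmu->sync_spte)		/* direct paging: nothing to sync */
		return 0;

	for (i = 0; i < ENT_PER_PAGE; i++) {
		int ret = mmu->sync_spte(sp, i);

		if (ret < 0)
			return -1;
		flush |= ret;
	}
	return flush;
}

int main(void)
{
	struct toy_mmu mmu = { .sync_spte = toy_sync_spte };
	struct toy_page sp = { .spt = { 3, 0, 5 } };

	printf("flush needed: %d\n", toy_sync_page(&mmu, &sp));
	return 0;
}

The payoff of the split, as the diffstat above reflects, is that the role sanity check and the flush bookkeeping live once in mmu.c, FNAME(sync_spte) shrinks to straight-line per-entry work, and direct-paging contexts simply leave the hook NULL instead of installing a stub that returns -1.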