From nobody Wed Apr 15 08:31:28 2026
Subject: [PATCH v2 1/6] KVM: x86/mmu: Tag disallowed NX huge pages even if they're not tracked
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed, Mingwei Zhang, Ben Gardon
Date: Sat, 23 Jul 2022 01:23:20 +0000
Message-Id: <20220723012325.1715714-2-seanjc@google.com>
In-Reply-To: <20220723012325.1715714-1-seanjc@google.com>
References: <20220723012325.1715714-1-seanjc@google.com>

Tag shadow pages that cannot be replaced with an NX huge page even if
zapping the page would not allow KVM to create a huge page, e.g. because
something else prevents creating a huge page.  This will allow a future
patch to more precisely apply the mitigation by checking if an existing
shadow page can be replaced by an NX huge page.  Currently, KVM assumes
that any existing shadow page encountered cannot be replaced by an NX
huge page (if the mitigation is enabled), which prevents KVM from
replacing no-longer-necessary shadow pages with huge pages, e.g. after
disabling dirty logging, zapping from the mmu_notifier due to page
migration, etc...

Failure to tag shadow pages appropriately could theoretically lead to
false negatives, e.g. if a fetch fault requests a small page and thus
isn't tracked, and a read/write fault later requests a huge page, KVM
will not reject the huge page as it should.

To avoid yet another flag, initialize the list_head and use list_empty()
to determine whether or not a page is on the list of NX huge pages that
should be recovered.

Opportunistically rename most of the variables/functions involved to
provide consistency, e.g. lpage vs huge page and NX huge vs huge NX, and
clarity, e.g. to make it obvious the flag applies only to the NX huge
page mitigation, not to any condition that prevents creating a huge
page.

Fixes: 5bcaf3e1715f ("KVM: x86/mmu: Account NX huge page disallowed iff huge page was requested")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  6 +--
 arch/x86/kvm/mmu/mmu.c          | 75 ++++++++++++++++++++++-----------
 arch/x86/kvm/mmu/mmu_internal.h | 22 ++++++++--
 arch/x86/kvm/mmu/paging_tmpl.h  |  6 +--
 arch/x86/kvm/mmu/tdp_mmu.c      |  8 ++--
 5 files changed, 79 insertions(+), 38 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e8281d64a431..246b69262b93 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1143,7 +1143,7 @@ struct kvm_arch {
 	struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
 	struct list_head active_mmu_pages;
 	struct list_head zapped_obsolete_pages;
-	struct list_head lpage_disallowed_mmu_pages;
+	struct list_head possible_nx_huge_pages;
 	struct kvm_page_track_notifier_node mmu_sp_tracker;
 	struct kvm_page_track_notifier_head track_notifier_head;
 	/*
@@ -1304,8 +1304,8 @@ struct kvm_arch {
 	 *  - tdp_mmu_roots (above)
 	 *  - tdp_mmu_pages (above)
 	 *  - the link field of struct kvm_mmu_pages used by the TDP MMU
-	 *  - lpage_disallowed_mmu_pages
-	 *  - the lpage_disallowed_link field of struct kvm_mmu_pages used
+	 *  - possible_nx_huge_pages;
+	 *  - the possible_nx_huge_page_link field of struct kvm_mmu_pages used
 	 *    by the TDP MMU
 	 * It is acceptable, but not necessary, to acquire this lock when
 	 * the thread holds the MMU lock in write mode.
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8e477333a263..1112e3a4cf3e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -802,15 +802,43 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
 }
 
-void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+static void untrack_possible_nx_huge_page(struct kvm *kvm,
+					  struct kvm_mmu_page *sp)
 {
-	if (sp->lpage_disallowed)
+	if (list_empty(&sp->possible_nx_huge_page_link))
+		return;
+
+	--kvm->stat.nx_lpage_splits;
+	list_del_init(&sp->possible_nx_huge_page_link);
+}
+
+void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+{
+	sp->nx_huge_page_disallowed = false;
+
+	untrack_possible_nx_huge_page(kvm, sp);
+}
+
+static void track_possible_nx_huge_page(struct kvm *kvm,
+					struct kvm_mmu_page *sp)
+{
+	if (!list_empty(&sp->possible_nx_huge_page_link))
 		return;
 
 	++kvm->stat.nx_lpage_splits;
-	list_add_tail(&sp->lpage_disallowed_link,
-		      &kvm->arch.lpage_disallowed_mmu_pages);
-	sp->lpage_disallowed = true;
+	list_add_tail(&sp->possible_nx_huge_page_link,
+		      &kvm->arch.possible_nx_huge_pages);
+}
+
+void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+			  bool nx_huge_page_possible)
+{
+	sp->nx_huge_page_disallowed = true;
+
+	if (!nx_huge_page_possible)
+		untrack_possible_nx_huge_page(kvm, sp);
+	else
+		track_possible_nx_huge_page(kvm, sp);
 }
 
 static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
@@ -830,13 +858,6 @@ static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 	kvm_mmu_gfn_allow_lpage(slot, gfn);
 }
 
-void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
-{
-	--kvm->stat.nx_lpage_splits;
-	sp->lpage_disallowed = false;
-	list_del(&sp->lpage_disallowed_link);
-}
-
 static struct kvm_memory_slot *
 gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu, gfn_t gfn,
 			    bool no_dirty_log)
@@ -2115,6 +2136,8 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm *kvm,
 
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 
+	INIT_LIST_HEAD(&sp->possible_nx_huge_page_link);
+
 	/*
 	 * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages()
 	 * depends on valid pages being added to the head of the list.  See
@@ -2472,8 +2495,8 @@ static bool __kvm_mmu_prepare_zap_page(struct kvm *kvm,
 		zapped_root = !is_obsolete_sp(kvm, sp);
 	}
 
-	if (sp->lpage_disallowed)
-		unaccount_huge_nx_page(kvm, sp);
+	if (sp->nx_huge_page_disallowed)
+		unaccount_nx_huge_page(kvm, sp);
 
 	sp->role.invalid = 1;
 
@@ -3112,9 +3135,9 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			continue;
 
 		link_shadow_page(vcpu, it.sptep, sp);
-		if (fault->is_tdp && fault->huge_page_disallowed &&
-		    fault->req_level >= it.level)
-			account_huge_nx_page(vcpu->kvm, sp);
+		if (fault->is_tdp && fault->huge_page_disallowed)
+			account_nx_huge_page(vcpu->kvm, sp,
+					     fault->req_level >= it.level);
 	}
 
 	if (WARN_ON_ONCE(it.level != fault->goal_level))
@@ -5970,7 +5993,7 @@ int kvm_mmu_init_vm(struct kvm *kvm)
 
 	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
 	INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
-	INIT_LIST_HEAD(&kvm->arch.lpage_disallowed_mmu_pages);
+	INIT_LIST_HEAD(&kvm->arch.possible_nx_huge_pages);
 	spin_lock_init(&kvm->arch.mmu_unsync_pages_lock);
 
 	r = kvm_mmu_init_tdp_mmu(kvm);
@@ -6845,23 +6868,25 @@ static void kvm_recover_nx_lpages(struct kvm *kvm)
 	ratio = READ_ONCE(nx_huge_pages_recovery_ratio);
 	to_zap = ratio ? DIV_ROUND_UP(nx_lpage_splits, ratio) : 0;
 	for ( ; to_zap; --to_zap) {
-		if (list_empty(&kvm->arch.lpage_disallowed_mmu_pages))
+		if (list_empty(&kvm->arch.possible_nx_huge_pages))
 			break;
 
 		/*
 		 * We use a separate list instead of just using active_mmu_pages
-		 * because the number of lpage_disallowed pages is expected to
-		 * be relatively small compared to the total.
+		 * because the number of shadow pages that can be replaced with
+		 * an NX huge page is expected to be relatively small compared
+		 * to the total number of shadow pages.  And because the TDP MMU
+		 * doesn't use active_mmu_pages.
 		 */
-		sp = list_first_entry(&kvm->arch.lpage_disallowed_mmu_pages,
+		sp = list_first_entry(&kvm->arch.possible_nx_huge_pages,
 				      struct kvm_mmu_page,
-				      lpage_disallowed_link);
-		WARN_ON_ONCE(!sp->lpage_disallowed);
+				      possible_nx_huge_page_link);
+		WARN_ON_ONCE(!sp->nx_huge_page_disallowed);
 		if (is_tdp_mmu_page(sp)) {
 			flush |= kvm_tdp_mmu_zap_sp(kvm, sp);
 		} else {
 			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
-			WARN_ON_ONCE(sp->lpage_disallowed);
+			WARN_ON_ONCE(sp->nx_huge_page_disallowed);
 		}
 
 		if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) {
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 582def531d4d..ff4ca54b9dda 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -57,7 +57,13 @@ struct kvm_mmu_page {
 	bool tdp_mmu_page;
 	bool unsync;
 	u8 mmu_valid_gen;
-	bool lpage_disallowed; /* Can't be replaced by an equiv large page */
+
+	/*
+	 * The shadow page can't be replaced by an equivalent huge page
+	 * because it is being used to map an executable page in the guest
+	 * and the NX huge page mitigation is enabled.
+	 */
+	bool nx_huge_page_disallowed;
 
 	/*
 	 * The following two entries are used to key the shadow page in the
@@ -100,7 +106,14 @@ struct kvm_mmu_page {
 		};
 	};
 
-	struct list_head lpage_disallowed_link;
+	/*
+	 * Use to track shadow pages that, if zapped, would allow KVM to create
+	 * an NX huge page.  A shadow page will have nx_huge_page_disallowed
+	 * set but not be on the list if a huge page is disallowed for other
+	 * reasons, e.g. because KVM is shadowing a PTE at the same gfn, the
+	 * memslot isn't properly aligned, etc...
+	 */
+	struct list_head possible_nx_huge_page_link;
 #ifdef CONFIG_X86_32
 	/*
 	 * Used out of the mmu-lock to avoid reading spte values while an
@@ -315,7 +328,8 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 
 void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 
-void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
-void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+			  bool nx_huge_page_possible);
+void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index f5958071220c..259c0f019f09 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -713,9 +713,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 			continue;
 
 		link_shadow_page(vcpu, it.sptep, sp);
-		if (fault->huge_page_disallowed &&
-		    fault->req_level >= it.level)
-			account_huge_nx_page(vcpu->kvm, sp);
+		if (fault->huge_page_disallowed)
+			account_nx_huge_page(vcpu->kvm, sp,
+					     fault->req_level >= it.level);
 	}
 
 	if (WARN_ON_ONCE(it.level != fault->goal_level))
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 40ccb5fba870..a30983947fee 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -284,6 +284,8 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu)
 static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep,
 			    gfn_t gfn, union kvm_mmu_page_role role)
 {
+	INIT_LIST_HEAD(&sp->possible_nx_huge_page_link);
+
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 
 	sp->role = role;
@@ -390,8 +392,8 @@ static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp,
 		lockdep_assert_held_write(&kvm->mmu_lock);
 
 	list_del(&sp->link);
-	if (sp->lpage_disallowed)
-		unaccount_huge_nx_page(kvm, sp);
+	if (sp->nx_huge_page_disallowed)
+		unaccount_nx_huge_page(kvm, sp);
 
 	if (shared)
 		spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
@@ -1134,7 +1136,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
 	list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
 	if (account_nx)
-		account_huge_nx_page(kvm, sp);
+		account_nx_huge_page(kvm, sp, true);
 	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 
 	return 0;
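
A note for readers on the list_empty() trick the commit message leans
on: a list node that is always initialized can use the emptiness of its
own link as the "am I on the list?" test, so no separate boolean is
needed for list membership.  A minimal, self-contained userspace sketch
of the idiom follows; all names are hypothetical and the list helpers
are re-implemented here, this is not KVM code:

	#include <stdbool.h>
	#include <stdio.h>

	struct list_head { struct list_head *next, *prev; };

	static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }

	static void list_add_tail(struct list_head *n, struct list_head *h)
	{
		n->prev = h->prev; n->next = h;
		h->prev->next = n; h->prev = n;
	}

	static void list_del_init(struct list_head *n)
	{
		n->prev->next = n->next; n->next->prev = n->prev;
		INIT_LIST_HEAD(n);   /* re-init so list_empty() works again */
	}

	static bool list_empty(const struct list_head *h) { return h->next == h; }

	struct page {
		struct list_head link;   /* doubles as the "tracked?" flag */
	};

	static struct list_head tracked = { &tracked, &tracked };

	static void track(struct page *p)
	{
		if (!list_empty(&p->link))   /* already on the list */
			return;
		list_add_tail(&p->link, &tracked);
	}

	static void untrack(struct page *p)
	{
		if (list_empty(&p->link))    /* not on the list */
			return;
		list_del_init(&p->link);
	}

	int main(void)
	{
		struct page p;

		INIT_LIST_HEAD(&p.link);  /* init at allocation, as the patch does */
		printf("tracked? %d\n", !list_empty(&p.link));  /* 0 */
		track(&p);
		printf("tracked? %d\n", !list_empty(&p.link));  /* 1 */
		untrack(&p);
		printf("tracked? %d\n", !list_empty(&p.link));  /* 0 */
		return 0;
	}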
-- 
2.37.1.359.gd136c6c3e2-goog

From nobody Wed Apr 15 08:31:28 2026
Subject: [PATCH v2 2/6] KVM: x86/mmu: Properly account NX huge page workaround for nonpaging MMUs
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed, Mingwei Zhang, Ben Gardon
Date: Sat, 23 Jul 2022 01:23:21 +0000
Message-Id: <20220723012325.1715714-3-seanjc@google.com>
In-Reply-To: <20220723012325.1715714-1-seanjc@google.com>
References: <20220723012325.1715714-1-seanjc@google.com>

Account and track NX huge pages for nonpaging MMUs so that a future
enhancement to precisely check if a shadow page cannot be replaced by an
NX huge page doesn't get false positives.  Without correct tracking, KVM
can get stuck in a loop if an instruction is fetching and writing data
on the same huge page, e.g. KVM installs a small executable page on the
fetch fault, replaces it with an NX huge page on the write fault, and
faults again on the fetch.

Alternatively, and perhaps ideally, KVM would simply not enforce the
workaround for nonpaging MMUs.  The guest has no page tables to abuse
and KVM is guaranteed to switch to a different MMU on CR0.PG being
toggled so there's no security or performance concerns.  However,
getting make_spte() to play nice now and in the future is unnecessarily
complex.  In the current code base, make_spte() can enforce the
mitigation if TDP is enabled or the MMU is indirect, but make_spte() may
not always have a vCPU/MMU to work with, e.g. if KVM were to support
in-line huge page promotion when disabling dirty logging.

Without a vCPU/MMU, KVM could either pass in the correct information
and/or derive it from the shadow page, but the former is ugly and the
latter subtly non-trivial due to the possibility of direct shadow pages
in indirect MMUs.  Given that using shadow paging with an unpaged guest
is far from top priority _and_ has been subjected to the workaround
since its inception, keep it simple and just fix the accounting glitch.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Matlack
---
 arch/x86/kvm/mmu/mmu.c          |  2 +-
 arch/x86/kvm/mmu/mmu_internal.h |  8 ++++++++
 arch/x86/kvm/mmu/spte.c         | 11 +++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1112e3a4cf3e..493cdf1c29ff 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3135,7 +3135,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			continue;
 
 		link_shadow_page(vcpu, it.sptep, sp);
-		if (fault->is_tdp && fault->huge_page_disallowed)
+		if (fault->huge_page_disallowed)
 			account_nx_huge_page(vcpu->kvm, sp,
 					     fault->req_level >= it.level);
 	}
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index ff4ca54b9dda..83644a0167ab 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -201,6 +201,14 @@ struct kvm_page_fault {
 
 	/* Derived from mmu and global state.  */
 	const bool is_tdp;
+
+	/*
+	 * Note, enforcing the NX huge page mitigation for nonpaging MMUs
+	 * (shadow paging, CR0.PG=0 in the guest) is completely unnecessary.
+	 * The guest doesn't have any page tables to abuse and is guaranteed
+	 * to switch to a different MMU when CR0.PG is toggled on (may not
+	 * always be guaranteed when KVM is using TDP).  See also make_spte().
+	 */
 	const bool nx_huge_page_workaround_enabled;
 
 	/*
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 7314d27d57a4..9f3e5af088a5 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -147,6 +147,17 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	if (!prefetch)
 		spte |= spte_shadow_accessed_mask(spte);
 
+	/*
+	 * For simplicity, enforce the NX huge page mitigation even if not
+	 * strictly necessary.  KVM could ignore the mitigation if paging is
+	 * disabled in the guest, but KVM would then have to ensure a new MMU
+	 * is loaded (or all shadow pages zapped) when CR0.PG is toggled on,
+	 * and that's a net negative for performance when TDP is enabled.  KVM
+	 * could ignore the mitigation if TDP is disabled and CR0.PG=0, as KVM
+	 * will always switch to a new MMU if paging is enabled in the guest,
+	 * but that adds complexity just to optimize a mode that is anything
+	 * but performance critical.
+	 */
 	if (level > PG_LEVEL_4K && (pte_access & ACC_EXEC_MASK) &&
 	    is_nx_huge_page_enabled(vcpu->kvm)) {
 		pte_access &= ~ACC_EXEC_MASK;
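
For readers, the make_spte() hunk above boils down to one enforcement
point: if the mapping would be huge and executable while the mitigation
is on, strip the exec permission rather than refuse the mapping.  A
stripped-down sketch of that decision, using illustrative constants and
names rather than the real KVM definitions:

	#include <stdbool.h>
	#include <stdint.h>

	#define PG_LEVEL_4K   1           /* illustrative level numbering */
	#define ACC_EXEC_MASK (1u << 0)   /* illustrative permission bit */

	/* Stand-in for is_nx_huge_page_enabled(kvm). */
	static bool nx_huge_page_mitigation = true;

	/*
	 * Mirrors the hunk above: a would-be huge (level > 4K) executable
	 * mapping loses exec unconditionally, even for a nonpaging guest,
	 * so a later instruction fetch faults and is mapped with a small
	 * executable page instead.
	 */
	static uint32_t make_spte_access(int level, uint32_t pte_access)
	{
		if (level > PG_LEVEL_4K && (pte_access & ACC_EXEC_MASK) &&
		    nx_huge_page_mitigation)
			pte_access &= ~ACC_EXEC_MASK;

		return pte_access;
	}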
-- 
2.37.1.359.gd136c6c3e2-goog

From nobody Wed Apr 15 08:31:28 2026
Subject: [PATCH v2 3/6] KVM: x86/mmu: Set disallowed_nx_huge_page in TDP MMU before setting SPTE
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed, Mingwei Zhang, Ben Gardon
Date: Sat, 23 Jul 2022 01:23:22 +0000
Message-Id: <20220723012325.1715714-4-seanjc@google.com>
In-Reply-To: <20220723012325.1715714-1-seanjc@google.com>
References: <20220723012325.1715714-1-seanjc@google.com>

Set nx_huge_page_disallowed in TDP MMU shadow pages before making the SP
visible to other readers, i.e. before setting its SPTE.  This will allow
KVM to query the flag when determining if a shadow page can be replaced
by an NX huge page without violating the rules of the mitigation.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Matlack
---
 arch/x86/kvm/mmu/mmu.c          | 12 +++++-------
 arch/x86/kvm/mmu/mmu_internal.h |  5 ++---
 arch/x86/kvm/mmu/tdp_mmu.c      | 30 +++++++++++++++++-------------
 3 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 493cdf1c29ff..e9252e7cd5a2 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -802,8 +802,7 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
 		kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
 }
 
-static void untrack_possible_nx_huge_page(struct kvm *kvm,
-					  struct kvm_mmu_page *sp)
+void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	if (list_empty(&sp->possible_nx_huge_page_link))
 		return;
@@ -812,15 +811,14 @@ static void untrack_possible_nx_huge_page(struct kvm *kvm,
 	list_del_init(&sp->possible_nx_huge_page_link);
 }
 
-void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+static void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	sp->nx_huge_page_disallowed = false;
 
 	untrack_possible_nx_huge_page(kvm, sp);
 }
 
-static void track_possible_nx_huge_page(struct kvm *kvm,
-					struct kvm_mmu_page *sp)
+void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	if (!list_empty(&sp->possible_nx_huge_page_link))
 		return;
@@ -830,8 +828,8 @@ static void track_possible_nx_huge_page(struct kvm *kvm,
 		      &kvm->arch.possible_nx_huge_pages);
 }
 
-void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-			  bool nx_huge_page_possible)
+static void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
+				 bool nx_huge_page_possible)
 {
 	sp->nx_huge_page_disallowed = true;
 
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 83644a0167ab..2a887d08b722 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -336,8 +336,7 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 
 void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc);
 
-void account_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp,
-			  bool nx_huge_page_possible);
-void unaccount_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
+void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp);
 
 #endif /* __KVM_X86_MMU_INTERNAL_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index a30983947fee..626c40ec2af9 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -392,8 +392,10 @@ static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp,
 		lockdep_assert_held_write(&kvm->mmu_lock);
 
 	list_del(&sp->link);
-	if (sp->nx_huge_page_disallowed)
-		unaccount_nx_huge_page(kvm, sp);
+	if (sp->nx_huge_page_disallowed) {
+		sp->nx_huge_page_disallowed = false;
+		untrack_possible_nx_huge_page(kvm, sp);
+	}
 
 	if (shared)
 		spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
@@ -1111,16 +1113,13 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
  * @kvm: kvm instance
  * @iter: a tdp_iter instance currently on the SPTE that should be set
  * @sp: The new TDP page table to install.
- * @account_nx: True if this page table is being installed to split a
- *              non-executable huge page.
  * @shared: This operation is running under the MMU lock in read mode.
 *
 * Returns: 0 if the new page table was installed. Non-0 if the page table
 *          could not be installed (e.g. the atomic compare-exchange failed).
 */
 static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
-			   struct kvm_mmu_page *sp, bool account_nx,
-			   bool shared)
+			   struct kvm_mmu_page *sp, bool shared)
 {
 	u64 spte = make_nonleaf_spte(sp->spt, !kvm_ad_enabled());
 	int ret = 0;
@@ -1135,8 +1134,6 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 
 	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
 	list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
-	if (account_nx)
-		account_nx_huge_page(kvm, sp, true);
 	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
 
 	return 0;
@@ -1149,6 +1146,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
+	struct kvm *kvm = vcpu->kvm;
 	struct tdp_iter iter;
 	struct kvm_mmu_page *sp;
 	int ret;
@@ -1185,9 +1183,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		}
 
 		if (!is_shadow_present_pte(iter.old_spte)) {
-			bool account_nx = fault->huge_page_disallowed &&
-					  fault->req_level >= iter.level;
-
 			/*
 			 * If SPTE has been frozen by another thread, just
 			 * give up and retry, avoiding unnecessary page table
@@ -1199,10 +1194,19 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			sp = tdp_mmu_alloc_sp(vcpu);
 			tdp_mmu_init_child_sp(sp, &iter);
 
-			if (tdp_mmu_link_sp(vcpu->kvm, &iter, sp, account_nx, true)) {
+			sp->nx_huge_page_disallowed = fault->huge_page_disallowed;
+
+			if (tdp_mmu_link_sp(kvm, &iter, sp, true)) {
 				tdp_mmu_free_sp(sp);
 				break;
 			}
+
+			if (fault->huge_page_disallowed &&
+			    fault->req_level >= iter.level) {
+				spin_lock(&kvm->arch.tdp_mmu_pages_lock);
+				track_possible_nx_huge_page(kvm, sp);
+				spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
+			}
 		}
 	}
 
@@ -1490,7 +1494,7 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter,
 	 * correctness standpoint since the translation will be the same either
 	 * way.
 	 */
-	ret = tdp_mmu_link_sp(kvm, iter, sp, false, shared);
+	ret = tdp_mmu_link_sp(kvm, iter, sp, shared);
 	if (ret)
 		goto out;
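
A reader's note on the ordering this patch establishes: the flag is
written before the shadow page becomes reachable, so any reader that
finds the page also finds the flag.  A C11-atomics analogue of that
init-before-publish pattern, with hypothetical names (KVM itself
publishes via an SPTE write and, in patch 6, explicit barriers):

	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stddef.h>

	struct shadow_page {
		bool nx_huge_page_disallowed;
	};

	/* The "SPTE": a pointer that makes the page visible to others. */
	static _Atomic(struct shadow_page *) visible_sp;

	/* Writer (fault path): fill in the flag, then publish with
	 * release semantics so readers that see the page see the flag. */
	static void publish_sp(struct shadow_page *sp, bool disallowed)
	{
		sp->nx_huge_page_disallowed = disallowed;
		atomic_store_explicit(&visible_sp, sp, memory_order_release);
	}

	/* Reader (zap/recovery path): the acquire load pairs with the
	 * release store, so the flag read cannot observe a stale value. */
	static bool sp_is_nx_disallowed(void)
	{
		struct shadow_page *sp =
			atomic_load_explicit(&visible_sp, memory_order_acquire);

		return sp && sp->nx_huge_page_disallowed;
	}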
-- 
2.37.1.359.gd136c6c3e2-goog

From nobody Wed Apr 15 08:31:28 2026
Subject: [PATCH v2 4/6] KVM: x86/mmu: Track the number of TDP MMU pages, but not the actual pages
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed, Mingwei Zhang, Ben Gardon
Date: Sat, 23 Jul 2022 01:23:23 +0000
Message-Id: <20220723012325.1715714-5-seanjc@google.com>
In-Reply-To: <20220723012325.1715714-1-seanjc@google.com>
References: <20220723012325.1715714-1-seanjc@google.com>

Track the number of TDP MMU "shadow" pages instead of tracking the pages
themselves.  With the NX huge page list manipulation moved out of the
common linking flow, eliminating the list-based tracking means the happy
path of adding a shadow page doesn't need to acquire a spinlock and can
instead inc/dec an atomic.

Keep the tracking as the WARN during TDP MMU teardown on leaked shadow
pages is very, very useful for detecting KVM bugs.

Tracking the number of pages will also make it trivial to expose the
counter to userspace as a stat in the future, which may or may not be
desirable.

Note, the TDP MMU needs to use a separate counter (and stat if that ever
comes to be) from the existing n_used_mmu_pages.  The TDP MMU doesn't
bother supporting the shrinker nor does it honor KVM_SET_NR_MMU_PAGES
(because the TDP MMU consumes so few pages relative to shadow paging),
and including TDP MMU pages in that counter would break both the
shrinker and shadow MMUs, e.g. if a VM is using nested TDP.

Cc: Yosry Ahmed
Reviewed-by: Mingwei Zhang
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Matlack
---
 arch/x86/include/asm/kvm_host.h | 11 +++--------
 arch/x86/kvm/mmu/tdp_mmu.c      | 19 +++++++++----------
 2 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 246b69262b93..5c269b2556d6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1271,6 +1271,9 @@ struct kvm_arch {
 	 */
 	bool tdp_mmu_enabled;
 
+	/* The number of TDP MMU pages across all roots. */
+	atomic64_t tdp_mmu_pages;
+
 	/*
 	 * List of struct kvm_mmu_pages being used as roots.
 	 * All struct kvm_mmu_pages in the list should have
@@ -1291,18 +1294,10 @@ struct kvm_arch {
 	 */
 	struct list_head tdp_mmu_roots;
 
-	/*
-	 * List of struct kvmp_mmu_pages not being used as roots.
-	 * All struct kvm_mmu_pages in the list should have
-	 * tdp_mmu_page set and a tdp_mmu_root_count of 0.
-	 */
-	struct list_head tdp_mmu_pages;
-
 	/*
 	 * Protects accesses to the following fields when the MMU lock
 	 * is held in read mode:
 	 *  - tdp_mmu_roots (above)
-	 *  - tdp_mmu_pages (above)
 	 *  - the link field of struct kvm_mmu_pages used by the TDP MMU
 	 *  - possible_nx_huge_pages;
 	 *  - the possible_nx_huge_page_link field of struct kvm_mmu_pages used
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 626c40ec2af9..fea22dc481a0 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -29,7 +29,6 @@ int kvm_mmu_init_tdp_mmu(struct kvm *kvm)
 	kvm->arch.tdp_mmu_enabled = true;
 	INIT_LIST_HEAD(&kvm->arch.tdp_mmu_roots);
 	spin_lock_init(&kvm->arch.tdp_mmu_pages_lock);
-	INIT_LIST_HEAD(&kvm->arch.tdp_mmu_pages);
 	kvm->arch.tdp_mmu_zap_wq = wq;
 	return 1;
 }
@@ -54,7 +53,7 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
 	/* Also waits for any queued work items.  */
 	destroy_workqueue(kvm->arch.tdp_mmu_zap_wq);
 
-	WARN_ON(!list_empty(&kvm->arch.tdp_mmu_pages));
+	WARN_ON(atomic64_read(&kvm->arch.tdp_mmu_pages));
 	WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));
 
 	/*
@@ -386,16 +385,18 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn,
 static void tdp_mmu_unlink_sp(struct kvm *kvm, struct kvm_mmu_page *sp,
 			      bool shared)
 {
+	atomic64_dec(&kvm->arch.tdp_mmu_pages);
+
+	if (!sp->nx_huge_page_disallowed)
+		return;
+
 	if (shared)
 		spin_lock(&kvm->arch.tdp_mmu_pages_lock);
 	else
 		lockdep_assert_held_write(&kvm->mmu_lock);
 
-	list_del(&sp->link);
-	if (sp->nx_huge_page_disallowed) {
-		sp->nx_huge_page_disallowed = false;
-		untrack_possible_nx_huge_page(kvm, sp);
-	}
+	sp->nx_huge_page_disallowed = false;
+	untrack_possible_nx_huge_page(kvm, sp);
 
 	if (shared)
 		spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
@@ -1132,9 +1133,7 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter,
 		tdp_mmu_set_spte(kvm, iter, spte);
 	}
 
-	spin_lock(&kvm->arch.tdp_mmu_pages_lock);
-	list_add(&sp->link, &kvm->arch.tdp_mmu_pages);
-	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
+	atomic64_inc(&kvm->arch.tdp_mmu_pages);
 
 	return 0;
 }
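
A reader's note: the structural change above is "replace a
lock-protected list with a lock-free counter, keep only the teardown
leak check".  A tiny C11 sketch of that shape, with hypothetical names:

	#include <assert.h>
	#include <stdatomic.h>

	/* Counter standing in for kvm->arch.tdp_mmu_pages. */
	static atomic_long tdp_mmu_pages;

	/* Happy path: no spinlock, no list manipulation, just inc/dec. */
	static void link_sp(void)
	{
		atomic_fetch_add(&tdp_mmu_pages, 1);
	}

	static void unlink_sp(void)
	{
		atomic_fetch_sub(&tdp_mmu_pages, 1);
	}

	/* Teardown keeps the leak check the list previously provided,
	 * analogous to the WARN_ON() in kvm_mmu_uninit_tdp_mmu(). */
	static void teardown(void)
	{
		assert(atomic_load(&tdp_mmu_pages) == 0);
	}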
-- 
2.37.1.359.gd136c6c3e2-goog

From nobody Wed Apr 15 08:31:28 2026
Subject: [PATCH v2 5/6] KVM: x86/mmu: Add helper to convert SPTE value to its shadow page
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed, Mingwei Zhang, Ben Gardon
Date: Sat, 23 Jul 2022 01:23:24 +0000
Message-Id: <20220723012325.1715714-6-seanjc@google.com>
In-Reply-To: <20220723012325.1715714-1-seanjc@google.com>
References: <20220723012325.1715714-1-seanjc@google.com>

Add a helper to convert a SPTE to its shadow page to deduplicate a
variety of flows and hopefully avoid future bugs, e.g. if KVM attempts
to get the shadow page for a SPTE without dropping high bits.

Opportunistically add a comment in mmu_free_root_page() documenting why
it treats the root HPA as a SPTE.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c          | 17 ++++++++++-------
 arch/x86/kvm/mmu/mmu_internal.h | 12 ------------
 arch/x86/kvm/mmu/spte.h         | 17 +++++++++++++++++
 arch/x86/kvm/mmu/tdp_mmu.h      |  2 ++
 4 files changed, 29 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e9252e7cd5a2..ed3cfb31853b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -1798,7 +1798,7 @@ static int __mmu_unsync_walk(struct kvm_mmu_page *sp,
 			continue;
 		}
 
-		child = to_shadow_page(ent & SPTE_BASE_ADDR_MASK);
+		child = spte_to_sp(ent);
 
 		if (child->unsync_children) {
 			if (mmu_pages_add(pvec, child, i))
@@ -2357,7 +2357,7 @@ static void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		 * so we should update the spte at this point to get
 		 * a new sp with the correct access.
 		 */
-		child = to_shadow_page(*sptep & SPTE_BASE_ADDR_MASK);
+		child = spte_to_sp(*sptep);
 		if (child->role.access == direct_access)
 			return;
 
@@ -2378,7 +2378,7 @@ static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp,
 			if (is_last_spte(pte, sp->role.level)) {
 				drop_spte(kvm, spte);
 			} else {
-				child = to_shadow_page(pte & SPTE_BASE_ADDR_MASK);
+				child = spte_to_sp(pte);
 				drop_parent_pte(child, spte);
 
 				/*
@@ -2817,7 +2817,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot,
 			struct kvm_mmu_page *child;
 			u64 pte = *sptep;
 
-			child = to_shadow_page(pte & SPTE_BASE_ADDR_MASK);
+			child = spte_to_sp(pte);
 			drop_parent_pte(child, sptep);
 			flush = true;
 		} else if (pfn != spte_to_pfn(*sptep)) {
@@ -3429,7 +3429,11 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
 	if (!VALID_PAGE(*root_hpa))
 		return;
 
-	sp = to_shadow_page(*root_hpa & SPTE_BASE_ADDR_MASK);
+	/*
+	 * The "root" may be a special root, e.g. a PAE entry, treat it as a
+	 * SPTE to ensure any non-PA bits are dropped.
+	 */
+	sp = spte_to_sp(*root_hpa);
 	if (WARN_ON(!sp))
 		return;
 
@@ -3914,8 +3918,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 			hpa_t root = vcpu->arch.mmu->pae_root[i];
 
 			if (IS_VALID_PAE_ROOT(root)) {
-				root &= SPTE_BASE_ADDR_MASK;
-				sp = to_shadow_page(root);
+				sp = spte_to_sp(root);
 				mmu_sync_children(vcpu, sp, true);
 			}
 		}
diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index 2a887d08b722..04457b5ec968 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -133,18 +133,6 @@ struct kvm_mmu_page {
 
 extern struct kmem_cache *mmu_page_header_cache;
 
-static inline struct kvm_mmu_page *to_shadow_page(hpa_t shadow_page)
-{
-	struct page *page = pfn_to_page(shadow_page >> PAGE_SHIFT);
-
-	return (struct kvm_mmu_page *)page_private(page);
-}
-
-static inline struct kvm_mmu_page *sptep_to_sp(u64 *sptep)
-{
-	return to_shadow_page(__pa(sptep));
-}
-
 static inline int kvm_mmu_role_as_id(union kvm_mmu_page_role role)
 {
 	return role.smm ? 1 : 0;
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index cabe3fbb4f39..a240b7eca54f 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -207,6 +207,23 @@ static inline int spte_index(u64 *sptep)
  */
 extern u64 __read_mostly shadow_nonpresent_or_rsvd_lower_gfn_mask;
 
+static inline struct kvm_mmu_page *to_shadow_page(hpa_t shadow_page)
+{
+	struct page *page = pfn_to_page((shadow_page) >> PAGE_SHIFT);
+
+	return (struct kvm_mmu_page *)page_private(page);
+}
+
+static inline struct kvm_mmu_page *spte_to_sp(u64 spte)
+{
+	return to_shadow_page(spte & SPTE_BASE_ADDR_MASK);
+}
+
+static inline struct kvm_mmu_page *sptep_to_sp(u64 *sptep)
+{
+	return to_shadow_page(__pa(sptep));
+}
+
 static inline bool is_mmio_spte(u64 spte)
 {
 	return (spte & shadow_mmio_mask) == shadow_mmio_value &&
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index c163f7cc23ca..d3714200b932 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -5,6 +5,8 @@
 
 #include <linux/kvm_host.h>
 
+#include "spte.h"
+
 hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu);
 
 __must_check static inline bool kvm_tdp_mmu_get_root(struct kvm_mmu_page *root)
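
A reader's note on why the helper matters: a SPTE carries permission
and metadata bits alongside the physical address, so converting it to
its shadow page must mask those bits first; centralizing the mask makes
it impossible for a caller to forget.  A toy, self-contained sketch
(the mask value is illustrative and the pfn lookup is stubbed out; not
the real KVM definitions):

	#include <stdint.h>
	#include <stdio.h>

	#define PAGE_SHIFT          12
	/* Illustrative; KVM's real mask also accounts for reserved bits. */
	#define SPTE_BASE_ADDR_MASK 0x000ffffffffff000ull

	struct kvm_mmu_page { uint64_t pfn; };

	/* Toy stand-in for pfn_to_page()/page_private(): one static page. */
	static struct kvm_mmu_page the_page;

	static struct kvm_mmu_page *to_shadow_page(uint64_t hpa)
	{
		the_page.pfn = hpa >> PAGE_SHIFT;
		return &the_page;
	}

	/* The helper from the patch: the mask is applied in one place. */
	static struct kvm_mmu_page *spte_to_sp(uint64_t spte)
	{
		return to_shadow_page(spte & SPTE_BASE_ADDR_MASK);
	}

	int main(void)
	{
		/* High metadata bits (63:52) and low permission bits drop out. */
		uint64_t spte = 0x8000000012345007ull;

		printf("pfn=0x%llx\n",
		       (unsigned long long)spte_to_sp(spte)->pfn);  /* 0x12345 */
		return 0;
	}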
-- 
2.37.1.359.gd136c6c3e2-goog

From nobody Wed Apr 15 08:31:28 2026
Subject: [PATCH v2 6/6] KVM: x86/mmu: explicitly check nx_hugepage in disallowed_hugepage_adjust()
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed, Mingwei Zhang, Ben Gardon
Date: Sat, 23 Jul 2022 01:23:25 +0000
Message-Id: <20220723012325.1715714-7-seanjc@google.com>
In-Reply-To: <20220723012325.1715714-1-seanjc@google.com>
References: <20220723012325.1715714-1-seanjc@google.com>

From: Mingwei Zhang

Explicitly check if a NX huge page is disallowed when determining if a
page fault needs to be forced to use a smaller sized page.  KVM
incorrectly assumes that the NX huge page mitigation is the only
scenario where KVM will create a shadow page instead of a huge page.
Any scenario that causes KVM to zap leaf SPTEs may result in having a SP
that can be made huge without violating the NX huge page mitigation.
E.g. disabling of dirty logging, zapping from the mmu_notifier due to
page migration, guest MTRR changes that affect the viability of a huge
page, etc...

Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Reviewed-by: Ben Gardon
Signed-off-by: Mingwei Zhang
[sean: add barrier comments, use spte_to_sp()]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Matlack
---
 arch/x86/kvm/mmu/mmu.c     | 17 +++++++++++++++--
 arch/x86/kvm/mmu/tdp_mmu.c |  6 ++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index ed3cfb31853b..97980528bf4a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3092,6 +3092,19 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 	    cur_level == fault->goal_level &&
 	    is_shadow_present_pte(spte) &&
 	    !is_large_pte(spte)) {
+		u64 page_mask;
+
+		/*
+		 * Ensure nx_huge_page_disallowed is read after checking for a
+		 * present shadow page.  A different vCPU may be concurrently
+		 * installing the shadow page if mmu_lock is held for read.
+		 * Pairs with the smp_wmb() in kvm_tdp_mmu_map().
+		 */
+		smp_rmb();
+
+		if (!spte_to_sp(spte)->nx_huge_page_disallowed)
+			return;
+
 		/*
 		 * A small SPTE exists for this pfn, but FNAME(fetch)
 		 * and __direct_map would like to create a large PTE
@@ -3099,8 +3112,8 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
 		 * patching back for them into pfn the next 9 bits of
 		 * the address.
 		 */
-		u64 page_mask = KVM_PAGES_PER_HPAGE(cur_level) -
-				KVM_PAGES_PER_HPAGE(cur_level - 1);
+		page_mask = KVM_PAGES_PER_HPAGE(cur_level) -
+			    KVM_PAGES_PER_HPAGE(cur_level - 1);
 		fault->pfn |= fault->gfn & page_mask;
 		fault->goal_level--;
 	}
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index fea22dc481a0..313092d4931a 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1194,6 +1194,12 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 			tdp_mmu_init_child_sp(sp, &iter);
 
 			sp->nx_huge_page_disallowed = fault->huge_page_disallowed;
+			/*
+			 * Ensure nx_huge_page_disallowed is visible before the
+			 * SP is marked present, as mmu_lock is held for read.
+			 * Pairs with the smp_rmb() in disallowed_hugepage_adjust().
+			 */
+			smp_wmb();
 
 			if (tdp_mmu_link_sp(kvm, &iter, sp, true)) {
 				tdp_mmu_free_sp(sp);
-- 
2.37.1.359.gd136c6c3e2-goog
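
A closing reader's note on the smp_wmb()/smp_rmb() pairing in the final
patch: the write side orders "flag, then SPTE present", and the read
side orders "SPTE present, then flag", so a reader that observes a
present SPTE is guaranteed to observe the flag value written before it.
A C11-fence analogue of the pairing, with hypothetical names (fences
play the roles of smp_wmb()/smp_rmb()):

	#include <stdatomic.h>
	#include <stdbool.h>

	/* Plain field, analogous to sp->nx_huge_page_disallowed. */
	static bool nx_disallowed;
	/* Atomic flag standing in for "the SPTE is present". */
	static _Atomic bool spte_present;

	/* kvm_tdp_mmu_map() analogue: flag first, write fence, then SPTE. */
	static void map_path(bool disallowed)
	{
		nx_disallowed = disallowed;
		atomic_thread_fence(memory_order_release);   /* smp_wmb() */
		atomic_store_explicit(&spte_present, true, memory_order_relaxed);
	}

	/* disallowed_hugepage_adjust() analogue: only after observing a
	 * present SPTE, and only past the read fence, is the flag read. */
	static bool adjust_path(void)
	{
		if (!atomic_load_explicit(&spte_present, memory_order_relaxed))
			return false;
		atomic_thread_fence(memory_order_acquire);   /* smp_rmb() */
		return nx_disallowed;
	}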