From nobody Thu Sep 18 23:35:51 2025
Date: Mon, 6 Mar 2023 14:41:10 -0800
In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com>
References: <20230306224127.1689967-1-vipinsh@google.com>
Message-ID: <20230306224127.1689967-2-vipinsh@google.com>
Subject: [Patch v4 01/18] KVM: x86/mmu: Change KVM mmu shrinker to no-op
From: Vipin Sharma
To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com
Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma

Remove page zapping logic from the shrinker.
Keep shrinker infrastructure in place, it will be reused in future commits to free KVM page caches. mmu_shrink_scan() is very disruptive to VMs. It picks the first VM in the vm_list, zaps the oldest page which is most likely an upper level SPTEs and most like to be reused. Prior to TDP MMU, this is even more disruptive in nested VMs case, considering L1 SPTEs will be the oldest even though most of the entries are for L2 SPTEs. As discussed in https://lore.kernel.org/lkml/Y45dldZnI6OIf+a5@google.com/ shrinker logic has not be very useful in actually keeping VMs performant and reducing memory usage. Suggested-by: Sean Christopherson Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 87 +++--------------------------------------- 1 file changed, 5 insertions(+), 82 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index c8ebe542c565..0d07767f7922 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -166,7 +166,6 @@ struct kvm_shadow_walk_iterator { =20 static struct kmem_cache *pte_list_desc_cache; struct kmem_cache *mmu_page_header_cache; -static struct percpu_counter kvm_total_used_mmu_pages; =20 static void mmu_spte_set(u64 *sptep, u64 spte); =20 @@ -1704,27 +1703,15 @@ static int is_empty_shadow_page(u64 *spt) } #endif =20 -/* - * This value is the sum of all of the kvm instances's - * kvm->arch.n_used_mmu_pages values. We need a global, - * aggregate version in order to make the slab shrinker - * faster - */ -static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, long nr) -{ - kvm->arch.n_used_mmu_pages +=3D nr; - percpu_counter_add(&kvm_total_used_mmu_pages, nr); -} - static void kvm_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) { - kvm_mod_used_mmu_pages(kvm, +1); + kvm->arch.n_used_mmu_pages++; kvm_account_pgtable_pages((void *)sp->spt, +1); } =20 static void kvm_unaccount_mmu_page(struct kvm *kvm, struct kvm_mmu_page *s= p) { - kvm_mod_used_mmu_pages(kvm, -1); + kvm->arch.n_used_mmu_pages--; kvm_account_pgtable_pages((void *)sp->spt, -1); } =20 @@ -6072,11 +6059,6 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm) kvm_tdp_mmu_zap_invalidated_roots(kvm); } =20 -static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm) -{ - return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages)); -} - static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, struct kvm_page_track_notifier_node *node) @@ -6666,66 +6648,13 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm,= u64 gen) static unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) { - struct kvm *kvm; - int nr_to_scan =3D sc->nr_to_scan; - unsigned long freed =3D 0; - - mutex_lock(&kvm_lock); - - list_for_each_entry(kvm, &vm_list, vm_list) { - int idx; - LIST_HEAD(invalid_list); - - /* - * Never scan more than sc->nr_to_scan VM instances. - * Will not hit this condition practically since we do not try - * to shrink more than one VM and it is very unlikely to see - * !n_used_mmu_pages so many times. - */ - if (!nr_to_scan--) - break; - /* - * n_used_mmu_pages is accessed without holding kvm->mmu_lock - * here. We may skip a VM instance errorneosly, but we do not - * want to shrink a VM that only started to populate its MMU - * anyway. 
- */ - if (!kvm->arch.n_used_mmu_pages && - !kvm_has_zapped_obsolete_pages(kvm)) - continue; - - idx =3D srcu_read_lock(&kvm->srcu); - write_lock(&kvm->mmu_lock); - - if (kvm_has_zapped_obsolete_pages(kvm)) { - kvm_mmu_commit_zap_page(kvm, - &kvm->arch.zapped_obsolete_pages); - goto unlock; - } - - freed =3D kvm_mmu_zap_oldest_mmu_pages(kvm, sc->nr_to_scan); - -unlock: - write_unlock(&kvm->mmu_lock); - srcu_read_unlock(&kvm->srcu, idx); - - /* - * unfair on small ones - * per-vm shrinkers cry out - * sadness comes quickly - */ - list_move_tail(&kvm->vm_list, &vm_list); - break; - } - - mutex_unlock(&kvm_lock); - return freed; + return SHRINK_STOP; } =20 static unsigned long mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc) { - return percpu_counter_read_positive(&kvm_total_used_mmu_pages); + return SHRINK_EMPTY; } =20 static struct shrinker mmu_shrinker =3D { @@ -6840,17 +6769,12 @@ int kvm_mmu_vendor_module_init(void) if (!mmu_page_header_cache) goto out; =20 - if (percpu_counter_init(&kvm_total_used_mmu_pages, 0, GFP_KERNEL)) - goto out; - ret =3D register_shrinker(&mmu_shrinker, "x86-mmu"); if (ret) - goto out_shrinker; + goto out; =20 return 0; =20 -out_shrinker: - percpu_counter_destroy(&kvm_total_used_mmu_pages); out: mmu_destroy_caches(); return ret; @@ -6867,7 +6791,6 @@ void kvm_mmu_destroy(struct kvm_vcpu *vcpu) void kvm_mmu_vendor_module_exit(void) { mmu_destroy_caches(); - percpu_counter_destroy(&kvm_total_used_mmu_pages); unregister_shrinker(&mmu_shrinker); } =20 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BD0D9C64EC4 for ; Mon, 6 Mar 2023 22:41:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230157AbjCFWlt (ORCPT ); Mon, 6 Mar 2023 17:41:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43064 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230113AbjCFWln (ORCPT ); Mon, 6 Mar 2023 17:41:43 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9952E75854 for ; Mon, 6 Mar 2023 14:41:38 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id lm13-20020a170903298d00b0019a8c8a13dfso6633161plb.16 for ; Mon, 06 Mar 2023 14:41:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142498; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=40gADokIJp7LXp0jd/R1DBU/pOdMA934Lm+cHTZ9qZM=; b=T+lSmTcVs1GzNKHcrL6yDPRcrmA/On3+n7qFsig/y3to2l4Di94gZCGr/bCF/QIFCL cLBeX1YxouACoIAPgs5QY9rBE2OkiwAkgbEVHRpOGc3BwbxvIbCvo7tmUGlTyZWsYjaM rcOL2MkUhC/hl7Y55rrGwp+7FG30goIjn4DoVJHcpFT4JcRvPQBwQZiwyp74cNGlB9/G dLwyS0IwCuOH0hP26IbUG5UrvLBfOyWMRTXW1uwn01yE277FOUncIfgzp+cXW4MWpJr2 gy6GNgQW2rkbETRch+20PWhJaRjHq2yMu6JjDYYIH/Bwi7mxegFKsQmmksJVz6W7VuaU nxQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142498; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=40gADokIJp7LXp0jd/R1DBU/pOdMA934Lm+cHTZ9qZM=; b=fnCEO0cZuDnth3OmsgcjwUPNqupKmiSHV2DcIPE5m+pYbzHuBeKPlkhWVsLvv8g5y3 
OL851FL+GdVLBmsuPcqleRiUkPMybqdJ5vltn2zOg8CUmT4XtIUq/7WRrRD2Ui5EDrE+ IIZpvxcG6GRT0hA/qAFJCq9+HSNI0ozjktR4XZfZFOXSUZ7dgccrqh/vhL6TUH6ZFHW5 8xkEOBMm4WIyV8augY7pYp6zhQMZkyiqf7Gnl0+Pv4rEX1VzJaTG/uOIzSegzYwd+Pnm jeKPXvQBY71mv6IOWPQnXgxiKVA/g1S/ZHoXWRYq/bPUkwpjkMmRXX/EFXt7tgyKhoi/ BXpg== X-Gm-Message-State: AO0yUKU+NslgiXzgrZyzr9Dm1jokUYYK/eTWX9a9YHxCDhVuUy+878U7 jIgX8zmjk+IHp1o9XrzoMq2Ggz0heBDo X-Google-Smtp-Source: AK7set9sR45L+LDHCgEH2QOACc5WTVEZlCHYGVdCn/w8i1k+AVwEVcDMNn8+7HF/fl2p/1eVU/27oQXHjqdY X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:90a:d58e:b0:233:fbe0:5ccf with SMTP id v14-20020a17090ad58e00b00233fbe05ccfmr4320085pju.1.1678142498162; Mon, 06 Mar 2023 14:41:38 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:11 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-3-vipinsh@google.com> Subject: [Patch v4 02/18] KVM: x86/mmu: Remove zapped_obsolete_pages from struct kvm_arch{} From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Remove zapped_obsolete_pages from struct kvm_arch{} and use local list in kvm_zap_obsolete_pages(). zapped_obsolete_pages list was used in struct kvm_arch{} to provide pages for KVM MMU shrinker. Since, KVM MMU shrinker is no-op now, this is not needed. Signed-off-by: Vipin Sharma Reviewed-by: David Matlack --- arch/x86/include/asm/kvm_host.h | 1 - arch/x86/kvm/mmu/mmu.c | 8 ++++---- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 808c292ad3f4..ebbe692acf3f 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1252,7 +1252,6 @@ struct kvm_arch { u8 mmu_valid_gen; struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES]; struct list_head active_mmu_pages; - struct list_head zapped_obsolete_pages; /* * A list of kvm_mmu_page structs that, if zapped, could possibly be * replaced by an NX huge page. A shadow page is on this list if its diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 0d07767f7922..3a452989f5cd 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5947,6 +5947,7 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm) { struct kvm_mmu_page *sp, *node; int nr_zapped, batch =3D 0; + LIST_HEAD(invalid_list); bool unstable; =20 restart: @@ -5979,8 +5980,8 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm) goto restart; } =20 - unstable =3D __kvm_mmu_prepare_zap_page(kvm, sp, - &kvm->arch.zapped_obsolete_pages, &nr_zapped); + unstable =3D __kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list, + &nr_zapped); batch +=3D nr_zapped; =20 if (unstable) @@ -5996,7 +5997,7 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm) * kvm_mmu_load()), and the reload in the caller ensure no vCPUs are * running with an obsolete MMU. 
*/ - kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages); + kvm_mmu_commit_zap_page(kvm, &invalid_list); } =20 /* @@ -6072,7 +6073,6 @@ int kvm_mmu_init_vm(struct kvm *kvm) int r; =20 INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); - INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages); INIT_LIST_HEAD(&kvm->arch.possible_nx_huge_pages); spin_lock_init(&kvm->arch.mmu_unsync_pages_lock); =20 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00A9CC64EC4 for ; Mon, 6 Mar 2023 22:41:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230168AbjCFWl4 (ORCPT ); Mon, 6 Mar 2023 17:41:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43154 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230123AbjCFWlp (ORCPT ); Mon, 6 Mar 2023 17:41:45 -0500 Received: from mail-pf1-x44a.google.com (mail-pf1-x44a.google.com [IPv6:2607:f8b0:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 435637433C for ; Mon, 6 Mar 2023 14:41:40 -0800 (PST) Received: by mail-pf1-x44a.google.com with SMTP id cr10-20020a056a000f0a00b005cfec6c2354so6126904pfb.9 for ; Mon, 06 Mar 2023 14:41:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142500; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=WhzuUwTW42vajQZ5iu4XIp1wWTqvuDE03zxjZV6U7io=; b=MLSRqlBt2QQUmtXc6Hxs6fzWOeClb00AzMS611TeDdDOkU8F9hkTUx46kAoCwz8HAZ 4hDVRQ76rq4hZRHrtxHTV+ooD39DrsfZ4qVkiyRP8OhJdCIE45hhuUtgroZ5EHGvSe3G NY5C3KvmUtPfJj26iS9P9hC3noFbpj92H1ch0flmqE5YHyE/tdUEhZP2guvnyClr0FPV k+XryHmhhGFhy3SWX9+nGAEftqjmccwsytClASnxRXMAuny+G3p3pyRveClOvC30Em+t iBXw9q/hWch7RVcP15vHkSxMkuOmgnKfV8D2J3cQNROQYK6xrMoxokJpRyBHDY9U7ruC cUgw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142500; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=WhzuUwTW42vajQZ5iu4XIp1wWTqvuDE03zxjZV6U7io=; b=qazB+/n2ISK4oMjoGO8b6HxEG7Sf6+BZK0AcPyIm6e20f9flhiZQBKaVtw+0HVnm+1 TQ+TP56xgtpwRqSOiNybFHrFYX7kNQGsSvwTXJxChu7LZYmQ/zOA2ZVC5LVPyfoB3fnk 8k1mNo69bEXtSvz5CpStKZzhqAaKPVv9C5OtIRWZG7S02kdtPRfRXKblQTHMpBlKvEwP Wcc7Mmz74FbpGkMMUo/9CQ/9LnqNM+JF3nXPBQXi1EEUMILe4NtNdBllypDTpbOXR1ip ImUHDoQP0Fl19M5lK6VSic1tf1cDsJAHDwauWJgkwCWPd02FYh6fmdYz7q7VA4ZzUGEH PwHA== X-Gm-Message-State: AO0yUKU1lPsIJ0OG9/DtXWJnldP8aOs2BXkpAdQ2h5cFgHisT3fyQiLI O25/EJ0fRDPR0zrw5QC3yu0nETrRpkSj X-Google-Smtp-Source: AK7set/qH3bVtHjFPWVL+83cwT5iW0k2bU+aH8x4GGlERwtIAW+l8ahUpxAr+DBodXZ292gTaRbeLii3XhR+ X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:902:f809:b0:19a:80b9:78ce with SMTP id ix9-20020a170902f80900b0019a80b978cemr5182747plb.0.1678142499775; Mon, 06 Mar 2023 14:41:39 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:12 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-4-vipinsh@google.com> Subject: [Patch v4 03/18] KVM: x86/mmu: Track count of pages in KVM MMU page caches 
globally From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Create a global counter for total number of pages available in MMU page caches across all VMs. Add mmu_shadow_page_cache pages to this counter. This accounting will be used in future commits to shrink MMU caches via KVM MMU shrinker. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_host.h | 5 ++ arch/x86/kvm/mmu/mmu.c | 90 ++++++++++++++++++++++++++++----- arch/x86/kvm/mmu/mmu_internal.h | 2 + arch/x86/kvm/mmu/paging_tmpl.h | 25 +++++---- arch/x86/kvm/mmu/tdp_mmu.c | 3 +- 5 files changed, 100 insertions(+), 25 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index ebbe692acf3f..4322c7020d5d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -791,6 +791,11 @@ struct kvm_vcpu_arch { struct kvm_mmu_memory_cache mmu_shadowed_info_cache; struct kvm_mmu_memory_cache mmu_page_header_cache; =20 + /* + * Protect allocation and release of pages from mmu_shadow_page_cache. + */ + struct mutex mmu_shadow_page_cache_lock; + /* * QEMU userspace and the guest each have their own FPU state. * In vcpu_run, we switch between the user and guest FPU contexts. diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 3a452989f5cd..13f41b7ac280 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -167,6 +167,11 @@ struct kvm_shadow_walk_iterator { static struct kmem_cache *pte_list_desc_cache; struct kmem_cache *mmu_page_header_cache; =20 +/* + * Global count of unused pages in MMU page caches across all VMs. + */ +static struct percpu_counter kvm_total_unused_cached_pages; + static void mmu_spte_set(u64 *sptep, u64 spte); =20 struct kvm_mmu_role_regs { @@ -667,6 +672,34 @@ static void walk_shadow_page_lockless_end(struct kvm_v= cpu *vcpu) } } =20 +/** + * Caller should hold mutex lock corresponding to cache, if available. + */ +static int mmu_topup_sp_memory_cache(struct kvm_mmu_memory_cache *cache, + int min) +{ + int orig_nobjs, r; + + orig_nobjs =3D cache->nobjs; + r =3D kvm_mmu_topup_memory_cache(cache, min); + if (orig_nobjs !=3D cache->nobjs) + percpu_counter_add(&kvm_total_unused_cached_pages, + (cache->nobjs - orig_nobjs)); + + return r; +} + +/** + * Caller should hold mutex lock corresponding to kvm_mmu_memory_cache, if + * available. 
+ */ +static void mmu_free_sp_memory_cache(struct kvm_mmu_memory_cache *cache) +{ + if (cache->nobjs) + percpu_counter_sub(&kvm_total_unused_cached_pages, cache->nobjs); + kvm_mmu_free_memory_cache(cache); +} + static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indir= ect) { int r; @@ -676,10 +709,11 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *v= cpu, bool maybe_indirect) 1 + PT64_ROOT_MAX_LEVEL + PTE_PREFETCH_NUM); if (r) return r; - r =3D kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadow_page_cache, - PT64_ROOT_MAX_LEVEL); + + r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache, PT64_R= OOT_MAX_LEVEL); if (r) return r; + if (maybe_indirect) { r =3D kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadowed_info_cache, PT64_ROOT_MAX_LEVEL); @@ -693,7 +727,9 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcp= u, bool maybe_indirect) static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) { kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache); - kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache); + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); + mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache); + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_info_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); } @@ -2148,6 +2184,7 @@ struct shadow_page_caches { struct kvm_mmu_memory_cache *page_header_cache; struct kvm_mmu_memory_cache *shadow_page_cache; struct kvm_mmu_memory_cache *shadowed_info_cache; + bool count_shadow_page_allocation; }; =20 static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm *kvm, @@ -2159,7 +2196,8 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page= (struct kvm *kvm, struct kvm_mmu_page *sp; =20 sp =3D kvm_mmu_memory_cache_alloc(caches->page_header_cache); - sp->spt =3D kvm_mmu_memory_cache_alloc(caches->shadow_page_cache); + sp->spt =3D mmu_sp_memory_cache_alloc(caches->shadow_page_cache, + caches->count_shadow_page_allocation); if (!role.direct) sp->shadowed_translation =3D kvm_mmu_memory_cache_alloc(caches->shadowed= _info_cache); =20 @@ -2216,6 +2254,7 @@ static struct kvm_mmu_page *kvm_mmu_get_shadow_page(s= truct kvm_vcpu *vcpu, .page_header_cache =3D &vcpu->arch.mmu_page_header_cache, .shadow_page_cache =3D &vcpu->arch.mmu_shadow_page_cache, .shadowed_info_cache =3D &vcpu->arch.mmu_shadowed_info_cache, + .count_shadow_page_allocation =3D true, }; =20 return __kvm_mmu_get_shadow_page(vcpu->kvm, vcpu, &caches, gfn, role); @@ -4314,29 +4353,32 @@ static int direct_page_fault(struct kvm_vcpu *vcpu,= struct kvm_page_fault *fault if (r !=3D RET_PF_INVALID) return r; =20 + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); r =3D mmu_topup_memory_caches(vcpu, false); if (r) - return r; + goto out_page_cache_unlock; =20 r =3D kvm_faultin_pfn(vcpu, fault, ACC_ALL); if (r !=3D RET_PF_CONTINUE) - return r; + goto out_page_cache_unlock; =20 r =3D RET_PF_RETRY; write_lock(&vcpu->kvm->mmu_lock); =20 if (is_page_fault_stale(vcpu, fault)) - goto out_unlock; + goto out_mmu_unlock; =20 r =3D make_mmu_pages_available(vcpu); if (r) - goto out_unlock; + goto out_mmu_unlock; =20 r =3D direct_map(vcpu, fault); =20 -out_unlock: +out_mmu_unlock: write_unlock(&vcpu->kvm->mmu_lock); kvm_release_pfn_clean(fault->pfn); +out_page_cache_unlock: + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); return r; } =20 @@ -4396,25 +4438,28 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *= vcpu, if (r !=3D RET_PF_INVALID) 
return r; =20 + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); r =3D mmu_topup_memory_caches(vcpu, false); if (r) - return r; + goto out_page_cache_unlock; =20 r =3D kvm_faultin_pfn(vcpu, fault, ACC_ALL); if (r !=3D RET_PF_CONTINUE) - return r; + goto out_page_cache_unlock; =20 r =3D RET_PF_RETRY; read_lock(&vcpu->kvm->mmu_lock); =20 if (is_page_fault_stale(vcpu, fault)) - goto out_unlock; + goto out_mmu_unlock; =20 r =3D kvm_tdp_mmu_map(vcpu, fault); =20 -out_unlock: +out_mmu_unlock: read_unlock(&vcpu->kvm->mmu_lock); kvm_release_pfn_clean(fault->pfn); +out_page_cache_unlock: + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); return r; } #endif @@ -5394,6 +5439,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu) { int r; =20 + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); r =3D mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->root_role.direct); if (r) goto out; @@ -5420,6 +5466,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu) */ static_call(kvm_x86_flush_tlb_current)(vcpu); out: + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); return r; } =20 @@ -5924,6 +5971,7 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) vcpu->arch.mmu_page_header_cache.gfp_zero =3D __GFP_ZERO; =20 vcpu->arch.mmu_shadow_page_cache.gfp_zero =3D __GFP_ZERO; + mutex_init(&vcpu->arch.mmu_shadow_page_cache_lock); =20 vcpu->arch.mmu =3D &vcpu->arch.root_mmu; vcpu->arch.walk_mmu =3D &vcpu->arch.root_mmu; @@ -6769,12 +6817,17 @@ int kvm_mmu_vendor_module_init(void) if (!mmu_page_header_cache) goto out; =20 + if (percpu_counter_init(&kvm_total_unused_cached_pages, 0, GFP_KERNEL)) + goto out; + ret =3D register_shrinker(&mmu_shrinker, "x86-mmu"); if (ret) - goto out; + goto out_shrinker; =20 return 0; =20 +out_shrinker: + percpu_counter_destroy(&kvm_total_unused_cached_pages); out: mmu_destroy_caches(); return ret; @@ -6792,6 +6845,7 @@ void kvm_mmu_vendor_module_exit(void) { mmu_destroy_caches(); unregister_shrinker(&mmu_shrinker); + percpu_counter_destroy(&kvm_total_unused_cached_pages); } =20 /* @@ -6994,3 +7048,11 @@ void kvm_mmu_pre_destroy_vm(struct kvm *kvm) if (kvm->arch.nx_huge_page_recovery_thread) kthread_stop(kvm->arch.nx_huge_page_recovery_thread); } + +void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *shadow_page_c= ache, + bool count_allocation) +{ + if (count_allocation && shadow_page_cache->nobjs) + percpu_counter_dec(&kvm_total_unused_cached_pages); + return kvm_mmu_memory_cache_alloc(shadow_page_cache); +} diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_interna= l.h index cc58631e2336..798cfbf0a36b 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -338,5 +338,7 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cach= e *mc); =20 void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp); void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *s= p); +void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *cache, + bool count_allocation); =20 #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 57f0b75c80f9..1dea9be6849d 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -821,9 +821,10 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, st= ruct kvm_page_fault *fault return RET_PF_EMULATE; } =20 + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); r =3D mmu_topup_memory_caches(vcpu, true); if (r) - return r; + goto out_page_cache_unlock; =20 vcpu->arch.write_fault_to_shadow_pgtable =3D false; 
=20 @@ -837,7 +838,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, str= uct kvm_page_fault *fault =20 r =3D kvm_faultin_pfn(vcpu, fault, walker.pte_access); if (r !=3D RET_PF_CONTINUE) - return r; + goto out_page_cache_unlock; =20 /* * Do not change pte_access if the pfn is a mmio page, otherwise @@ -862,16 +863,18 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, s= truct kvm_page_fault *fault write_lock(&vcpu->kvm->mmu_lock); =20 if (is_page_fault_stale(vcpu, fault)) - goto out_unlock; + goto out_mmu_unlock; =20 r =3D make_mmu_pages_available(vcpu); if (r) - goto out_unlock; + goto out_mmu_unlock; r =3D FNAME(fetch)(vcpu, fault, &walker); =20 -out_unlock: +out_mmu_unlock: write_unlock(&vcpu->kvm->mmu_lock); kvm_release_pfn_clean(fault->pfn); +out_page_cache_unlock: + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); return r; } =20 @@ -897,17 +900,18 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_= t gva, hpa_t root_hpa) =20 vcpu_clear_mmio_info(vcpu, gva); =20 + if (!VALID_PAGE(root_hpa)) { + WARN_ON(1); + return; + } + + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); /* * No need to check return value here, rmap_can_add() can * help us to skip pte prefetch later. */ mmu_topup_memory_caches(vcpu, true); =20 - if (!VALID_PAGE(root_hpa)) { - WARN_ON(1); - return; - } - write_lock(&vcpu->kvm->mmu_lock); for_each_shadow_entry_using_root(vcpu, root_hpa, gva, iterator) { level =3D iterator.level; @@ -943,6 +947,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t = gva, hpa_t root_hpa) break; } write_unlock(&vcpu->kvm->mmu_lock); + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); } =20 /* Note, @addr is a GPA when gva_to_gpa() translates an L2 GPA to an L1 GP= A. */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 7c25dbf32ecc..fa6eb1e9101e 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -265,7 +265,8 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm= _vcpu *vcpu) struct kvm_mmu_page *sp; =20 sp =3D kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); - sp->spt =3D kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); + sp->spt =3D mmu_sp_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache, + true); =20 return sp; } --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BFBBC64EC4 for ; Mon, 6 Mar 2023 22:41:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229687AbjCFWlv (ORCPT ); Mon, 6 Mar 2023 17:41:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43156 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230119AbjCFWlp (ORCPT ); Mon, 6 Mar 2023 17:41:45 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3262472B2A for ; Mon, 6 Mar 2023 14:41:42 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id pl10-20020a17090b268a00b00239ed042afcso7068130pjb.4 for ; Mon, 06 Mar 2023 14:41:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142501; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; 
Date: Mon, 6 Mar 2023 14:41:13 -0800
In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com>
References: <20230306224127.1689967-1-vipinsh@google.com>
Message-ID: <20230306224127.1689967-5-vipinsh@google.com>
Subject: [Patch v4 04/18] KVM: x86/mmu: Shrink shadow page caches via MMU shrinker
From: Vipin Sharma
To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com
Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma

Shrink the shadow page caches via the MMU shrinker, based on kvm_total_unused_cached_pages. Traverse each vCPU of every VM, empty its caches, and exit the shrinker once a sufficient number of pages has been freed. Also, move each processed VM to the end of vm_list so that other VMs are tortured first next time.
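To make the scan policy above concrete, here is a minimal, standalone userspace sketch of the same round-robin idea. Every name in it (mock_vm, mock_vcpu, try_empty(), scan(), cursor) is invented for illustration only; the real implementation is the kernel diff below, which walks struct kvm instances with kvm_for_each_vcpu(), takes mmu_shadow_page_cache_lock with mutex_trylock(), and rotates VMs with list_move_tail().

#include <stdio.h>
#include <stdbool.h>

#define NR_VMS   3
#define NR_VCPUS 2

struct mock_vcpu {
        int cache_pages;    /* stands in for mmu_shadow_page_cache.nobjs */
        bool cache_busy;    /* stands in for a held mmu_shadow_page_cache_lock */
};

struct mock_vm {
        struct mock_vcpu vcpu[NR_VCPUS];
};

/* Empty one vCPU cache if nobody is using it; return pages freed. */
static int try_empty(struct mock_vcpu *v)
{
        int freed = 0;

        if (!v->cache_busy) {          /* mutex_trylock() analogue */
                freed = v->cache_pages;
                v->cache_pages = 0;    /* kvm_mmu_empty_memory_cache() analogue */
        }
        return freed;
}

/*
 * Walk VMs starting at *cursor and free cached pages until nr_to_scan is
 * reached.  Advancing the cursor past each processed VM mimics
 * list_move_tail(): the next invocation starts with a different VM, so a
 * single VM is not shrunk over and over.
 */
static int scan(struct mock_vm *vms, int *cursor, int nr_to_scan)
{
        int start = *cursor, freed = 0;

        for (int n = 0; n < NR_VMS; n++) {
                struct mock_vm *vm = &vms[(start + n) % NR_VMS];

                *cursor = (start + n + 1) % NR_VMS;
                for (int i = 0; i < NR_VCPUS; i++) {
                        freed += try_empty(&vm->vcpu[i]);
                        if (freed >= nr_to_scan)
                                return freed;
                }
        }
        return freed;
}

int main(void)
{
        struct mock_vm vms[NR_VMS] = {
                { .vcpu = { { 5, false }, { 3, false } } },
                { .vcpu = { { 4, true  }, { 2, false } } },  /* one cache in use */
                { .vcpu = { { 6, false }, { 1, false } } },
        };
        int cursor = 0;

        printf("first scan freed %d pages\n", scan(vms, &cursor, 8));
        printf("second scan freed %d pages\n", scan(vms, &cursor, 8));
        return 0;
}

The trylock-and-skip behavior means a vCPU that is actively faulting never blocks the shrinker; its cache is simply left alone and revisited on a later scan.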
Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 55 +++++++++++++++++++++++++++++++++++----- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 6 ++++- 3 files changed, 54 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 13f41b7ac280..df8dcb7e5de7 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6693,16 +6693,57 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm,= u64 gen) } } =20 -static unsigned long -mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) -{ - return SHRINK_STOP; +static unsigned long mmu_shrink_scan(struct shrinker *shrink, + struct shrink_control *sc) +{ + struct kvm *kvm, *next_kvm, *first_kvm =3D NULL; + struct kvm_mmu_memory_cache *cache; + unsigned long i, freed =3D 0; + struct mutex *cache_lock; + struct kvm_vcpu *vcpu; + + mutex_lock(&kvm_lock); + list_for_each_entry_safe(kvm, next_kvm, &vm_list, vm_list) { + if (first_kvm =3D=3D kvm) + break; + + if (!first_kvm) + first_kvm =3D kvm; + + list_move_tail(&kvm->vm_list, &vm_list); + + kvm_for_each_vcpu(i, vcpu, kvm) { + cache =3D &vcpu->arch.mmu_shadow_page_cache; + cache_lock =3D &vcpu->arch.mmu_shadow_page_cache_lock; + if (mutex_trylock(cache_lock)) { + if (cache->nobjs) { + freed +=3D cache->nobjs; + kvm_mmu_empty_memory_cache(cache); + } + mutex_unlock(cache_lock); + if (freed >=3D sc->nr_to_scan) + goto out; + } + } + } +out: + mutex_unlock(&kvm_lock); + if (freed) { + percpu_counter_sub(&kvm_total_unused_cached_pages, freed); + return freed; + } else { + return SHRINK_STOP; + } } =20 -static unsigned long -mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc) +static unsigned long mmu_shrink_count(struct shrinker *shrink, + struct shrink_control *sc) { - return SHRINK_EMPTY; + s64 count =3D percpu_counter_sum(&kvm_total_unused_cached_pages); + + WARN_ON(count < 0); + return count <=3D 0 ? 
SHRINK_EMPTY : count; + } =20 static struct shrinker mmu_shrinker =3D { diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 8ada23756b0e..5cfa42c130e0 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1361,6 +1361,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm); int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min); int __kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int capa= city, int min); int kvm_mmu_memory_cache_nr_free_objects(struct kvm_mmu_memory_cache *mc); +void kvm_mmu_empty_memory_cache(struct kvm_mmu_memory_cache *mc); void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc); void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc); #endif diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index d255964ec331..536d8ab6e61f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -430,7 +430,7 @@ int kvm_mmu_memory_cache_nr_free_objects(struct kvm_mmu= _memory_cache *mc) return mc->nobjs; } =20 -void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc) +void kvm_mmu_empty_memory_cache(struct kvm_mmu_memory_cache *mc) { while (mc->nobjs) { if (mc->kmem_cache) @@ -438,7 +438,11 @@ void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_c= ache *mc) else free_page((unsigned long)mc->objects[--mc->nobjs]); } +} =20 +void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc) +{ + kvm_mmu_empty_memory_cache(mc); kvfree(mc->objects); =20 mc->objects =3D NULL; --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05A20C6FD1B for ; Mon, 6 Mar 2023 22:42:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230184AbjCFWmA (ORCPT ); Mon, 6 Mar 2023 17:42:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43278 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230105AbjCFWlr (ORCPT ); Mon, 6 Mar 2023 17:41:47 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DD3B7430C for ; Mon, 6 Mar 2023 14:41:43 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id cl18-20020a17090af69200b0023470d96ae6so127343pjb.1 for ; Mon, 06 Mar 2023 14:41:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142503; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ntsWJChvbXfy8t9EA+eQUOa7LNpcvZXYoCa+kOc7Xxo=; b=QE3wGxcpDkwXukGXapdK3z5AcvymfM9fTDSInE4aMcqqqhApqI13rGs3g8WKQUaPS/ zZ6kNb4R7uz/pdmbRfaqJcYVYuRq+bQgvKsN86OU0tlFIWLibxnJpuqg0yKTzMCyuR1X pL28Sm386yDPzPua4c49qPfANPTKuZn9aWIUurbKMbZucDPpluT8Syfvw708TGu5WMO0 hjDTHUxu7yWc5s4/oRuSPQU74v9qCJL0SWCVXIoKKH6qhy3+hgUFKF39UdJnzPCFwNtA PsAmluDoVGVvbeYF4h5LW3dAFwVLGK0yDNSdG3miH//Ijl6QzEc9pgGGk9eEMrDtuikC CFZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142503; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ntsWJChvbXfy8t9EA+eQUOa7LNpcvZXYoCa+kOc7Xxo=; b=l2jsHc6NQk6UuCnfsVi6I6SMMWzQ4ePzF/94et/4PQ7xzoXnVoJrb4Y1uV1QEte514 
nRA/LYVeBF2f/VGkffi5vu/NwsmYsbH1/QvOnbBNojdHpRsUUDRZBX7qoRI2OYvDdxhm XoLg8/iXdCIsoe6coYANrZjEAwkPgPesM4zOxjZ3Pp6pMlgkSMhpomcC61zSDQuXhxUw /YuWzurcCvMTVH/j4Z/lz0xlP619A0fymciEWWs+AzSZa+j36m6jLfJpf19mL0wTx9X8 tN9C7Aqjb2iCJzgqwBgxMSXxqrD1bGl/15Cz3ScfrdJl896r8Nq1DOg1ngHmN4p7Mmsn zlfg== X-Gm-Message-State: AO0yUKWEgzJm28cby3KOrQI/o/cVOC7kqWD9aFmxb3VaDVoqAUj2UoLM fPt1UkvrkpwUirRCz/diquVXBFi5Vot1 X-Google-Smtp-Source: AK7set+ZI2aBqjIrxBqymLM25KRuDPXvrPAuiXEE5P98Cs7Ib72ELWHQ9A1FvjqxxCOH6L09ow62WHf25Ci6 X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a63:7e11:0:b0:503:913f:77b9 with SMTP id z17-20020a637e11000000b00503913f77b9mr4352737pgc.6.1678142503145; Mon, 06 Mar 2023 14:41:43 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:14 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-6-vipinsh@google.com> Subject: [Patch v4 05/18] KVM: x86/mmu: Add split_shadow_page_cache pages to global count of MMU cache pages From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add pages in split_shadow_page_cache to the global counter kvm_total_unused_cached_pages. These pages will be freed by MMU shrinker in future commit. Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index df8dcb7e5de7..0ebb8a2eaf47 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6149,7 +6149,9 @@ static void mmu_free_vm_memory_caches(struct kvm *kvm) { kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache); kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache); - kvm_mmu_free_memory_cache(&kvm->arch.split_shadow_page_cache); + mutex_lock(&kvm->slots_lock); + mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache); + mutex_unlock(&kvm->slots_lock); } =20 void kvm_mmu_uninit_vm(struct kvm *kvm) @@ -6303,7 +6305,7 @@ static int topup_split_caches(struct kvm *kvm) if (r) return r; =20 - return kvm_mmu_topup_memory_cache(&kvm->arch.split_shadow_page_cache, 1); + return mmu_topup_sp_memory_cache(&kvm->arch.split_shadow_page_cache, 1); } =20 static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u= 64 *huge_sptep) @@ -6328,6 +6330,7 @@ static struct kvm_mmu_page *shadow_mmu_get_sp_for_spl= it(struct kvm *kvm, u64 *hu /* Direct SPs do not require a shadowed_info_cache. */ caches.page_header_cache =3D &kvm->arch.split_page_header_cache; caches.shadow_page_cache =3D &kvm->arch.split_shadow_page_cache; + caches.count_shadow_page_allocation =3D true; =20 /* Safe to pass NULL for vCPU since requesting a direct SP. 
*/ return __kvm_mmu_get_shadow_page(kvm, NULL, &caches, gfn, role); --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F234C61DA4 for ; Mon, 6 Mar 2023 22:42:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230196AbjCFWmD (ORCPT ); Mon, 6 Mar 2023 17:42:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43312 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230137AbjCFWlr (ORCPT ); Mon, 6 Mar 2023 17:41:47 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 99A647C9D9 for ; Mon, 6 Mar 2023 14:41:45 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id l24-20020a25b318000000b007eba3f8e3baso11882909ybj.4 for ; Mon, 06 Mar 2023 14:41:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142505; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=cqnpRtg+h63StLNiA50bWnsxI9nJr8KaP+9nXax4nBE=; b=ACVU2Ak73DEsJyCD1CYnIUNV7DZqssWv57lrpw9xJcDfooRtDa2tY3LBqbvNcDU3KP hhYlgY79kRm0fgbdafmrzxHRgKfPAEwLXCwahNQwzRNxhmAyfBYAPK5few9foF3yaGPd /lcFH9jY5YfuOoMwi8e+ez0gG0qGiMaP8D2CYlANXgrHPRHw1P5HlftE5B8zD84EaxRI q7zgM/XeA3Fe4WbCdFWT+96EPjGGWl6cIhWMfg76bGxYmkk0ltneRvBruzuQ5PqN2XDq 54VqqvM9F19QX4nQqAn3F5PtGzHLbFGu+QsPM4zutD5JATaZxYhRaGPqEj/cObP2REkH 4SGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142505; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=cqnpRtg+h63StLNiA50bWnsxI9nJr8KaP+9nXax4nBE=; b=L1+Vm9EkEPJ2k58ZFdJevX5mCSmOMtNuhE3qMRehEMvi6jU1m9EvzL+QJsztjEr01G 8i74028Ek7CGj08iXP609dFO3bZmg9+qMnde4FhTO99uYA9MCrtFraKcRoerBniSghPn C61lMatsUlloNZtzc2qlKbiPj/61f/i2RzNdJkBfrBP7UleHVLp/UCaex21IevGEgkLx MK7KkoSwSLQrLTq/6ej+hBkqXZUshdqoNjiqP5yJXq7SKkVgfmk4PpWdLjskEN3HOPYu Tn8vuo9OPx7exu32nsOAXVED0LGdfhVlOOPZ9Fm5YYHatvuiuAUGUur38JzTv+zFBzlJ LYiA== X-Gm-Message-State: AO0yUKVnGlIGp3nTjacJOkBuaiP2gHXjJOT4MxckjPbG4VNA9WN6QF+I 4XABbGpN6j/Z3f1ldIq4I/b9cXrNMVs3 X-Google-Smtp-Source: AK7set8kXl/my/i/lLk02KsOkXov6TpYpA7DKlG4AzS8XL9UZfw/8EVyvgNRviAycRks9Qvkwea5EqBRmcfc X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a05:6902:10e:b0:98e:6280:74ca with SMTP id o14-20020a056902010e00b0098e628074camr5314707ybh.1.1678142504847; Mon, 06 Mar 2023 14:41:44 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:15 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-7-vipinsh@google.com> Subject: [Patch v4 06/18] KVM: x86/mmu: Shrink split_shadow_page_cache via MMU shrinker From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 
quoted-printable Content-Type: text/plain; charset="utf-8" Use MMU shrinker to free unused pages in split_shadow_page_cache. Refactor the code and make common function to try emptying the page cache. Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 34 +++++++++++++++++++++------------- 1 file changed, 21 insertions(+), 13 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 0ebb8a2eaf47..73a0ac9c11ce 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6696,13 +6696,24 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm,= u64 gen) } } =20 +static int mmu_memory_cache_try_empty(struct kvm_mmu_memory_cache *cache, + struct mutex *cache_lock) +{ + int freed =3D 0; + + if (mutex_trylock(cache_lock)) { + freed =3D cache->nobjs; + kvm_mmu_empty_memory_cache(cache); + mutex_unlock(cache_lock); + } + return freed; +} + static unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) { struct kvm *kvm, *next_kvm, *first_kvm =3D NULL; - struct kvm_mmu_memory_cache *cache; unsigned long i, freed =3D 0; - struct mutex *cache_lock; struct kvm_vcpu *vcpu; =20 mutex_lock(&kvm_lock); @@ -6716,18 +6727,15 @@ static unsigned long mmu_shrink_scan(struct shrinke= r *shrink, list_move_tail(&kvm->vm_list, &vm_list); =20 kvm_for_each_vcpu(i, vcpu, kvm) { - cache =3D &vcpu->arch.mmu_shadow_page_cache; - cache_lock =3D &vcpu->arch.mmu_shadow_page_cache_lock; - if (mutex_trylock(cache_lock)) { - if (cache->nobjs) { - freed +=3D cache->nobjs; - kvm_mmu_empty_memory_cache(cache); - } - mutex_unlock(cache_lock); - if (freed >=3D sc->nr_to_scan) - goto out; - } + freed +=3D mmu_memory_cache_try_empty(&vcpu->arch.mmu_shadow_page_cache, + &vcpu->arch.mmu_shadow_page_cache_lock); + if (freed >=3D sc->nr_to_scan) + goto out; } + freed +=3D mmu_memory_cache_try_empty(&kvm->arch.split_shadow_page_cache, + &kvm->slots_lock); + if (freed >=3D sc->nr_to_scan) + goto out; } out: mutex_unlock(&kvm_lock); --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 199A8C64EC4 for ; Mon, 6 Mar 2023 22:42:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229973AbjCFWmG (ORCPT ); Mon, 6 Mar 2023 17:42:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43366 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230155AbjCFWlt (ORCPT ); Mon, 6 Mar 2023 17:41:49 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3688575854 for ; Mon, 6 Mar 2023 14:41:46 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id m9-20020a17090a7f8900b0023769205928so7063204pjl.6 for ; Mon, 06 Mar 2023 14:41:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142506; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=fud6El3rNU/znA5Y7aANLknllha9pomZRWuqilbnIcQ=; b=RHrpxvtkoJL3G7M+x0b7vy1LMCEVHhhInmhGl/pZvP+imrCBzb7gpuU9I4dY0Mk4Md E3mJ9rZE0xkYlu56wN7H7+Wv0o0LrN/Lsh+B4Iz7k7L6oemm3rolHrxMC9Xmod2/w4TW 9ZrbsWYDscNpA+wbxAxnZ74ClfSjyspMqWYI5BhgxKLQLACztlfI3fgQ+2/ZBHEfqFYC 
Date: Mon, 6 Mar 2023 14:41:16 -0800
In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com>
References: <20230306224127.1689967-1-vipinsh@google.com>
Message-ID: <20230306224127.1689967-8-vipinsh@google.com>
Subject: [Patch v4 07/18] KVM: x86/mmu: Unconditionally count allocations from MMU page caches
From: Vipin Sharma
To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com
Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma

Remove count_shadow_page_allocation from struct shadow_page_caches{} and remove the count_allocation boolean check from mmu_sp_memory_cache_alloc(). Both split_shadow_page_cache and mmu_shadow_page_cache are now counted in the global count of unused cache pages, so the count_shadow_page_allocation boolean is obsolete and can be removed.
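This builds on the accounting invariant established in the earlier patches: a top-up adds exactly the number of newly cached pages to kvm_total_unused_cached_pages, an allocation from a counted cache subtracts one, and freeing a cache subtracts whatever remains. A tiny standalone sketch of that invariant follows; the types and helpers in it (struct cache, topup(), alloc_one(), free_cache()) are invented for this example and only mirror the roles of mmu_topup_sp_memory_cache(), mmu_sp_memory_cache_alloc() and mmu_free_sp_memory_cache() in the series.

#include <assert.h>
#include <stdio.h>

static long total_unused;               /* stands in for the percpu counter */

struct cache { int nobjs; };

static void topup(struct cache *c, int min)
{
        if (c->nobjs < min) {
                total_unused += min - c->nobjs; /* count only newly added pages */
                c->nobjs = min;
        }
}

static void alloc_one(struct cache *c)
{
        if (c->nobjs) {
                c->nobjs--;
                total_unused--;                 /* page is no longer "unused" */
        }
}

static void free_cache(struct cache *c)
{
        total_unused -= c->nobjs;
        c->nobjs = 0;
}

int main(void)
{
        struct cache shadow = { 0 }, split = { 0 };

        topup(&shadow, 5);
        topup(&split, 1);
        alloc_one(&shadow);                     /* page consumed by a shadow page */
        assert(total_unused == shadow.nobjs + split.nobjs);
        free_cache(&shadow);
        free_cache(&split);
        assert(total_unused == 0);
        printf("invariant holds, total_unused=%ld\n", total_unused);
        return 0;
}

Because every counted cache follows these three rules, the global counter never needs a per-cache opt-in flag, which is what makes the boolean removable.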
Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 11 +++-------- arch/x86/kvm/mmu/mmu_internal.h | 3 +-- arch/x86/kvm/mmu/tdp_mmu.c | 3 +-- 3 files changed, 5 insertions(+), 12 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 73a0ac9c11ce..0a0962d8108b 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2184,7 +2184,6 @@ struct shadow_page_caches { struct kvm_mmu_memory_cache *page_header_cache; struct kvm_mmu_memory_cache *shadow_page_cache; struct kvm_mmu_memory_cache *shadowed_info_cache; - bool count_shadow_page_allocation; }; =20 static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm *kvm, @@ -2196,8 +2195,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page= (struct kvm *kvm, struct kvm_mmu_page *sp; =20 sp =3D kvm_mmu_memory_cache_alloc(caches->page_header_cache); - sp->spt =3D mmu_sp_memory_cache_alloc(caches->shadow_page_cache, - caches->count_shadow_page_allocation); + sp->spt =3D mmu_sp_memory_cache_alloc(caches->shadow_page_cache); if (!role.direct) sp->shadowed_translation =3D kvm_mmu_memory_cache_alloc(caches->shadowed= _info_cache); =20 @@ -2254,7 +2252,6 @@ static struct kvm_mmu_page *kvm_mmu_get_shadow_page(s= truct kvm_vcpu *vcpu, .page_header_cache =3D &vcpu->arch.mmu_page_header_cache, .shadow_page_cache =3D &vcpu->arch.mmu_shadow_page_cache, .shadowed_info_cache =3D &vcpu->arch.mmu_shadowed_info_cache, - .count_shadow_page_allocation =3D true, }; =20 return __kvm_mmu_get_shadow_page(vcpu->kvm, vcpu, &caches, gfn, role); @@ -6330,7 +6327,6 @@ static struct kvm_mmu_page *shadow_mmu_get_sp_for_spl= it(struct kvm *kvm, u64 *hu /* Direct SPs do not require a shadowed_info_cache. */ caches.page_header_cache =3D &kvm->arch.split_page_header_cache; caches.shadow_page_cache =3D &kvm->arch.split_shadow_page_cache; - caches.count_shadow_page_allocation =3D true; =20 /* Safe to pass NULL for vCPU since requesting a direct SP. 
*/ return __kvm_mmu_get_shadow_page(kvm, NULL, &caches, gfn, role); @@ -7101,10 +7097,9 @@ void kvm_mmu_pre_destroy_vm(struct kvm *kvm) kthread_stop(kvm->arch.nx_huge_page_recovery_thread); } =20 -void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *shadow_page_c= ache, - bool count_allocation) +void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *shadow_page_c= ache) { - if (count_allocation && shadow_page_cache->nobjs) + if (shadow_page_cache->nobjs) percpu_counter_dec(&kvm_total_unused_cached_pages); return kvm_mmu_memory_cache_alloc(shadow_page_cache); } diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_interna= l.h index 798cfbf0a36b..a607314348e3 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -338,7 +338,6 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cach= e *mc); =20 void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp); void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *s= p); -void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *cache, - bool count_allocation); +void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *cache); =20 #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index fa6eb1e9101e..d1e85012a008 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -265,8 +265,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm= _vcpu *vcpu) struct kvm_mmu_page *sp; =20 sp =3D kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); - sp->spt =3D mmu_sp_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache, - true); + sp->spt =3D mmu_sp_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); =20 return sp; } --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A44AAC61DA4 for ; Mon, 6 Mar 2023 22:42:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229680AbjCFWmQ (ORCPT ); Mon, 6 Mar 2023 17:42:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43248 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230181AbjCFWl7 (ORCPT ); Mon, 6 Mar 2023 17:41:59 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E268178C83 for ; Mon, 6 Mar 2023 14:41:48 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id q9-20020a17090a9f4900b00237d026fc55so7076833pjv.3 for ; Mon, 06 Mar 2023 14:41:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142508; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=J8x91/t3dwIfLtqbgCFfhhKLJcllwDKNOmC2E+E2xTE=; b=DYLWpU7tZGbEMTBqF+pAbMWErmZ3leTSXhkNG3iAzMHf9CwPyOH2PG3+hRNKU4dSKk 54VgFhxiQNZ2uL6txzOkCI5k2T2HfkNMs9t/FSAn1IM70FurZUjAiHApnSAFDrGvpjmu lZU9KdqbtDhD2zRkcXX02KbU2rJ1G9vBnthgRufITUMn5EUm3D+/IwUhQMHP/nnc8Qcn /NndLhKuagf9LNHdfQEKWVLnHi++jOuhONrDlH57z+qYjyWcgeyYpVyJxrF3Ufh2XbrM y8ctN7dvwTMZwI2MmJP/RnlK+N7+mSAm7TVDbta0+EPTwyEGj4tMO815hzJooI28zziE +v+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142508; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=J8x91/t3dwIfLtqbgCFfhhKLJcllwDKNOmC2E+E2xTE=; b=WrMj3Q20Yfmm7nvWrvNjIgVzBeEpvBNVg8R0jP7VDMpXLvhnKPk/yi1kSJdci3zPyL SGMaYbjy/BtCYZAy73ENTFsg+yn1kaLVeNEe9QdxcSD4C0R8869mMmYC7ciHkCagKZfl PKDpW+2+175G3sKkE7onnRr9SijwcOqmx8T/JoJiwwc6N9GhgbiZJbmgPrw+5q87sm38 tzbE7L7GxwnstTh7pWgfH2KEAgHrDZEhFezL4YDmEATrbBp8p5VyY4w+Rzvzoj/QNKXb wLL9/E49LuAqXGkEXpnLZSh3E7u837/ysqWTPbV5YfQOTLhtKxKaHsl7XAlZ2LPLqFsM TJpg== X-Gm-Message-State: AO0yUKVw9MHxcLUh6w9ZeBzoS8TZvXzUiKoA972OXSsVsnn2VX5lBPs6 9zsNj+M9q59r29X4k7fqtGzh1FA9cjuv X-Google-Smtp-Source: AK7set+KL1AOCKwf3Kb7nrXUmS3jKvF934DJ+eKPK4nuTqt1uFT5Dh+EO/YoXMmWOFg0NqDRQkatdQds3AQZ X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:90a:7064:b0:230:3b84:9169 with SMTP id f91-20020a17090a706400b002303b849169mr4600337pjk.2.1678142508419; Mon, 06 Mar 2023 14:41:48 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:17 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-9-vipinsh@google.com> Subject: [Patch v4 08/18] KVM: x86/mmu: Track unused mmu_shadowed_info_cache pages count via global counter From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add unused pages in mmu_shadowed_info_cache to global MMU unused page cache counter i.e. kvm_total_unused_cached_pages. These pages will be freed by MMU shrinker in future commit. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/mmu/mmu.c | 8 ++++---- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 4322c7020d5d..185719dbeb81 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -792,7 +792,8 @@ struct kvm_vcpu_arch { struct kvm_mmu_memory_cache mmu_page_header_cache; =20 /* - * Protect allocation and release of pages from mmu_shadow_page_cache. + * Protect allocation and release of pages from mmu_shadow_page_cache + * and mmu_shadowed_info_cache. 
*/ struct mutex mmu_shadow_page_cache_lock; =20 diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 0a0962d8108b..b7ca31b5699c 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -715,8 +715,8 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcp= u, bool maybe_indirect) return r; =20 if (maybe_indirect) { - r =3D kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadowed_info_cache, - PT64_ROOT_MAX_LEVEL); + r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadowed_info_cache, + PT64_ROOT_MAX_LEVEL); if (r) return r; } @@ -729,8 +729,8 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcp= u) kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache); mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache); + mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadowed_info_cache); mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); - kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_info_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); } =20 @@ -2197,7 +2197,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page= (struct kvm *kvm, sp =3D kvm_mmu_memory_cache_alloc(caches->page_header_cache); sp->spt =3D mmu_sp_memory_cache_alloc(caches->shadow_page_cache); if (!role.direct) - sp->shadowed_translation =3D kvm_mmu_memory_cache_alloc(caches->shadowed= _info_cache); + sp->shadowed_translation =3D mmu_sp_memory_cache_alloc(caches->shadowed_= info_cache); =20 set_page_private(virt_to_page(sp->spt), (unsigned long)sp); =20 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95DA7C61DA4 for ; Mon, 6 Mar 2023 22:42:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229542AbjCFWmT (ORCPT ); Mon, 6 Mar 2023 17:42:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43802 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229780AbjCFWmF (ORCPT ); Mon, 6 Mar 2023 17:42:05 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 979B180E26 for ; Mon, 6 Mar 2023 14:41:50 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id s8-20020a170902b18800b0019c92f56a8aso6710666plr.22 for ; Mon, 06 Mar 2023 14:41:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142510; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ld4MGsQjtnTx0Dcfr9NtPlok1/1/gYNLFbhbEuUoYrA=; b=krwov+RRGwrmv2jjs5BE/jAirVZA4W+6fdZ3k7jqm0NcyoDyVN0Iy2f7RNdtZV6pL0 taP6SnYbTHXrwcU/w37ORxzzFBrYm+IM3kOf6FMHbIrJEsfgizuWkBijapf8wpi7Ccj8 IJtbaNqH913nKSrQP43U/ZTB0esbAJr1VnGS6Y0n27248m976Z3s9LHepZESK0jcrxYz ya5ruaw6MSO5mNR77y9pEo9E4FVyvmRZyHx/KGRJYnaLsDce4p0oh8njFTrb3lPdYVLi kpoqjKWJSECDlZ/hBKu0b2pGTcIO/OJSdAv2hJC3/3EEP2PCwYDWa9v1UDkMlUQFcY8k y6Ig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142510; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ld4MGsQjtnTx0Dcfr9NtPlok1/1/gYNLFbhbEuUoYrA=; 
b=Vkt544vK0Jeo1XOgYpk9WUqvZNvS8Uj2OcwLwRBEtZFdVn4IRUvjjKCDi9yJwTohDF 7fv1pKFX4Kdx2IyBzpSZ1r94fc+UrpJy7FszfYk6KilKqYhv8RJWS6gu/K9u10f54AVs +G0dMjboZh6y2kH1a2kMDMyZ6XK8M3AhYSitZHvMjyYgPGbCuH7YeELohjXU2PpDvUEt DfHfHnfyM4cPO2/K4U4Qybb98eKHneqi1PDL7onRY9ds6psfIj3+KvaUPAA6hr9BURgj xhIKK7lgY5ZzkKiQPXUCHJvr57rCvpH766TRlpXP4D8wPD/zNa/6dbaSxaaj96czdHIh ZphQ== X-Gm-Message-State: AO0yUKVvBxzzco+ZsAbZ+CDaeOw7ENJfAhaya1iG6xK1D228xIo+ZFfp N7TB9GMSHhzLOjugMsz2RC5VNG0UKbJu X-Google-Smtp-Source: AK7set+nYVXnMVTVXx8L/0tlevRKDekxzEjuxmVVtiFTH9jPAVLUzGzbP9E9MA8O/R60f9kn2yP7smPPz7Ml X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:903:428b:b0:19a:8751:4dfc with SMTP id ju11-20020a170903428b00b0019a87514dfcmr4898541plb.1.1678142510172; Mon, 06 Mar 2023 14:41:50 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:18 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-10-vipinsh@google.com> Subject: [Patch v4 09/18] KVM: x86/mmu: Shrink mmu_shadowed_info_cache via MMU shrinker From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Shrink shadow page cache via MMU shrinker based on kvm_total_unused_cached_pages. Tested by running dirty_log_perf_test while dropping cache via "echo 2 > /proc/sys/vm/drop_caches" at 1 second interval. Global always return to 0. Shrinker also gets invoked to remove pages in cache. 
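For reference, a minimal standalone sketch of the cache-dropping loop used in the test above (not part of this series; it simply mirrors running "echo 2 > /proc/sys/vm/drop_caches" once per second and needs root, while dirty_log_perf_test runs separately):

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	for (;;) {
		int fd = open("/proc/sys/vm/drop_caches", O_WRONLY);

		if (fd < 0)
			return 1;
		/* Writing "2" reclaims slab objects, which invokes registered shrinkers. */
		write(fd, "2", 1);
		close(fd);
		sleep(1);
	}
}
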
Above test were run with three configurations: - EPT=3DN - EPT=3DY, TDP_MMU=3DN - EPT=3DY, TDP_MMU=3DY Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index b7ca31b5699c..a4bf2e433030 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6725,6 +6725,8 @@ static unsigned long mmu_shrink_scan(struct shrinker = *shrink, kvm_for_each_vcpu(i, vcpu, kvm) { freed +=3D mmu_memory_cache_try_empty(&vcpu->arch.mmu_shadow_page_cache, &vcpu->arch.mmu_shadow_page_cache_lock); + freed +=3D mmu_memory_cache_try_empty(&vcpu->arch.mmu_shadowed_info_cac= he, + &vcpu->arch.mmu_shadow_page_cache_lock); if (freed >=3D sc->nr_to_scan) goto out; } --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28343C61DA4 for ; Mon, 6 Mar 2023 22:42:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230190AbjCFWm3 (ORCPT ); Mon, 6 Mar 2023 17:42:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44390 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230219AbjCFWmN (ORCPT ); Mon, 6 Mar 2023 17:42:13 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 900FC82ABF for ; Mon, 6 Mar 2023 14:41:52 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id ju20-20020a170903429400b0019ea5ea044aso4926233plb.21 for ; Mon, 06 Mar 2023 14:41:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142512; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=tPKVdqaDisFdGAfkqGOe8iBVNEZA/ukGWtJHDgVP+Ak=; b=klnvvsrGwCWY0xNtfxRp0OMIN4OSqxCgEdWWZinxsMr8EM4F+vjp47fD1GRBoRu8IE wZMKE+LvsZGr6cs1oHgGJoZFWnvG5xdvsLDp/0NKn/b1AR4vIl5EA8B1J15BkOjB1jLL o6IYaDl/UuNu0obuHxfVfilPm66n989izaFjrKf/m3gcPb3gyNx/SdEC1/akRr4FlvNw JuLuP9ZlyvrcmVrmDT/kaTrfx4pXk1OVn2iy0llzINsLsQHYQYo8jLQt/JYxWJLe3FIy CpN+mlqM8PTTTe5Qk0msOFVot0e31lagshAQzRtr+WP+tHSnhAXhsRtLnEbDZ87L0tfA TufA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142512; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=tPKVdqaDisFdGAfkqGOe8iBVNEZA/ukGWtJHDgVP+Ak=; b=mmJtF6aydOsmMJTttQIJMSvUxwFeO4/b+y1q8pyYmuVNFHILwuFI4IZQfJS4d6ZnBk dgg9pzd7OSytmdW0yqkU8SYf7Uw/KmsByQFVmEz9aclW/HnHN9/kDvKA/uO4Inglgdh1 cfwGeTxki19KItaqJZBwfXmLVD/BZK7WFGiW2w7F2oyDo70I5JV2n9ZMYB8NU5IuEdJP og6Dl8K6CLKYERKcNovszAnjZ+Bts7tzu9AZ1hbsNp9oZQHJeeFYxHlTIXYvRPx5HGq9 rCAo3PlmX756OS84JVBuKZW+OigDLO6ruzeXrkISv+Zm2Jb7HHt5f9J/LqIVmkFSss6k z8Xw== X-Gm-Message-State: AO0yUKVrn3a4Ez/JO8A5HPABCUuMyU6JgZEE1pJidrg9wJ1wjMlytyRa H5CCPACWaIAvmm+s+Zui1JO4ovXuuuPM X-Google-Smtp-Source: AK7set9/b67iH4rOLvz1VS8akjODPn7aZygSqISR8KUp+OH3XeVyxRNJMmsPzTjQ8S9WaZK/QW0Hqkh4C5W5 X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a62:8782:0:b0:60f:b143:8e09 with SMTP id i124-20020a628782000000b0060fb1438e09mr5432145pfe.1.1678142511907; Mon, 06 Mar 2023 14:41:51 -0800 (PST) Date: Mon, 6 
Mar 2023 14:41:19 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-11-vipinsh@google.com> Subject: [Patch v4 10/18] KVM: x86/mmu: Add per VM NUMA aware page table capability From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add KVM_CAP_NUMA_AWARE_PAGE_TABLE capability. This capability enables a VM to allocate its page tables, specifically lower level page tables, on the NUMA node of the underlying leaf physical page pointed to by the page table entry. This patch only adds the option; future patches will use the boolean numa_aware_page_table to allocate page tables on the appropriate NUMA node. For now this capability is x86 only; it can be extended to other architectures in the future if needed. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_host.h | 6 ++++++ arch/x86/kvm/x86.c | 10 ++++++++++ include/uapi/linux/kvm.h | 1 + 3 files changed, 17 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 4322c7020d5d..64de083cd6b9 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1467,6 +1467,12 @@ struct kvm_arch { */ #define SPLIT_DESC_CACHE_MIN_NR_OBJECTS (SPTE_ENT_PER_PAGE + 1) struct kvm_mmu_memory_cache split_desc_cache; + + /* + * If true then allocate page tables near to underlying physical page + * NUMA node.
+ */ + bool numa_aware_page_table; }; =20 struct kvm_vm_stat { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index f706621c35b8..71728abd7f92 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4425,6 +4425,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, lon= g ext) case KVM_CAP_VAPIC: case KVM_CAP_ENABLE_CAP: case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES: + case KVM_CAP_NUMA_AWARE_PAGE_TABLE: r =3D 1; break; case KVM_CAP_EXIT_HYPERCALL: @@ -6391,6 +6392,15 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, } mutex_unlock(&kvm->lock); break; + case KVM_CAP_NUMA_AWARE_PAGE_TABLE: + r =3D -EINVAL; + mutex_lock(&kvm->lock); + if (!kvm->created_vcpus) { + kvm->arch.numa_aware_page_table =3D true; + r =3D 0; + } + mutex_unlock(&kvm->lock); + break; default: r =3D -EINVAL; break; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index d77aef872a0a..5f367a93762a 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1184,6 +1184,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224 #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225 #define KVM_CAP_PMU_EVENT_MASKED_EVENTS 226 +#define KVM_CAP_NUMA_AWARE_PAGE_TABLE 227 =20 #ifdef KVM_CAP_IRQ_ROUTING =20 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4FF1CC6FD1A for ; Mon, 6 Mar 2023 22:42:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230257AbjCFWmi (ORCPT ); Mon, 6 Mar 2023 17:42:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44492 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230253AbjCFWmQ (ORCPT ); Mon, 6 Mar 2023 17:42:16 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D62D82AA4 for ; Mon, 6 Mar 2023 14:41:54 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id q24-20020a17090a2e1800b00237c37964d4so7060816pjd.8 for ; Mon, 06 Mar 2023 14:41:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142513; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=r1N1Z8zVIXipYiULB/HP9W6AmAVLLJyV6vBxJYQxV+M=; b=FfApUJ58/AuBXxawVuRox/02qb6FsGRX+7YTDtxPU77jaZJBO4CnYJ1tmTBRvQoTf4 1WxdM0Of9MHAMx7Iss3kdWr8ZALDta22kOBWVqiTaxdJ74ntgdGRN2qB0TvXnC7I/0DJ DKAOzvDiaXDO5KEGXKY8UECBsHAYEqJ3tyk786Zz1nA9hk88HjSUnQTCvIGVNTjyVyy+ 5Y3kjVTQAHALJm6tEh1ATDL4UpbYfcP1+OeksxLByOk0BGJfcJ+oj1jxbAPaJfx7D6EB 3j0pXjmg8Bmy4YsD37thgFNRRjZd3v7whTdsYWmtsVHeTVu+dL5SruQ2kAGNnnPXe40q Vveg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142513; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=r1N1Z8zVIXipYiULB/HP9W6AmAVLLJyV6vBxJYQxV+M=; b=R9EHvJVqo3OPUjzE8Khxwz3DASUnVIxU1Q4D3HxPge6B9ttBQ0OPBiN5gXza7MOzik J6eFmxNPoBThI/g4v+6czX3AsRyETE6QYYYS0sGJdyIOWS5+mKniQuY5dbhmSKCBiJJV dfdgAWVrLHbDvyvJEq3Ui0fS4r462dXt/zOEzjGxSmdTN2puyPeCLOftNftpCCnKy2rg JGwmcg+OBsSJB4PRsb37zDsfxZY0TdAOwXCJ5M7OqmDd3+dMP/bc9RbC93EEznwyeISZ kQ1ehzWNt+0o3P8wJcsuuDS2KaNI/bCFuyeHsSVcLIEjshoB1fWyxcBMkEpocDsd4Lst 18Yg== 
X-Gm-Message-State: AO0yUKX3gFZKvnDC5PhFPi8dgrw2PQY+oLwybEcpfFngmRQN0wErcs2I 5d+6R+KdrZk4M21TG3hnUHmWH/XYggBw X-Google-Smtp-Source: AK7set8snFnyvhJ+XXtlE8UT1yhT6chY+2StfCzR+MgJQQLYlG/aXCTvrRYV3Hglo9Y68BVO2tl7+OkABeGg X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a62:8245:0:b0:5a8:b093:ff67 with SMTP id w66-20020a628245000000b005a8b093ff67mr5451354pfd.4.1678142513657; Mon, 06 Mar 2023 14:41:53 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:20 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-12-vipinsh@google.com> Subject: [Patch v4 11/18] KVM: x86/mmu: Add documentation of NUMA aware page table capability From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add documentation for KVM_CAP_NUMA_AWARE_PAGE_TABLE capability and explain why it is needed. Signed-off-by: Vipin Sharma --- Documentation/virt/kvm/api.rst | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 62de0768d6aa..7e3a1299ca8e 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7669,6 +7669,35 @@ This capability is aimed to mitigate the threat that= malicious VMs can cause CPU stuck (due to event windows don't open up) and make the CPU unavailable to host or other VMs. =20 +7.34 KVM_CAP_NUMA_AWARE_PAGE_TABLE +------------------------------ + +:Architectures: x86 +:Target: VM +:Returns: 0 on success, -EINVAL if vCPUs are already created. + +This capability allows userspace to enable NUMA aware page tables allocati= ons. +NUMA aware page tables are disabled by default. Once enabled, prior to vCPU +creation, any page table allocated during the life of a VM will be allocat= ed +preferably from the NUMA node of the leaf page. + +Without this capability, default feature is to use current thread mempolic= y and +allocate page table based on that. + +This capability is useful to improve page accesses by a guest. For example= , an +initialization thread which access lots of remote memory and ends up creat= ing +page tables on local NUMA node, or some service thread allocates memory on +remote NUMA nodes and later worker/background threads accessing that memory +will end up accessing remote NUMA node page tables. So, a multi NUMA node +guest, can with high confidence access local memory faster instead of going +through remote page tables first. + +This capability is also helpful for host to reduce live migration impact w= hen +splitting huge pages during dirty log operations. If the thread splitting = huge +page is on remote NUMA node it will create page tables on remote node. Eve= n if +guest is careful in making sure that it only access local memory they will= end +up accessing remote page tables. + 8. Other capabilities. 
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =20 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54A4FC61DA4 for ; Mon, 6 Mar 2023 22:42:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229958AbjCFWmu (ORCPT ); Mon, 6 Mar 2023 17:42:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44630 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230212AbjCFWmU (ORCPT ); Mon, 6 Mar 2023 17:42:20 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2321D8616F for ; Mon, 6 Mar 2023 14:41:57 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id gf1-20020a17090ac7c100b002369bf87b7aso2981902pjb.8 for ; Mon, 06 Mar 2023 14:41:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142515; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=IyMdgcKokNAQM99PayI8bLjHWwV0ForFIy4aNy3o9lk=; b=KgBV2fZBDR75hRmr5WoQIaFK5nxiY1MuIyQqq/Lz0E/QkDMmYsjZW3MRa34kuMgQMb WFHu0aMlW0ZFuhTdf+BmvoqZ42p2tY1SE+KT3f1mhiQULkycdP5rOo+nnu4LzHCaooJ7 c6KkY1/mkucfLXMoqhCsBuMbnzaO5qx/JDFT0cJHALyKeLvsY4NyqQuiVdy85EjXLsLj VqtvdmAWINjXm8+4BEkwg0o7gzzCctUf18vvRdgNMpLF7UkstTX9iKrOdH3J4OBw8Q5n YV4u2mNlr9CEFSbUMSh3uexUnEJUUQJOO7ivnf1EBJMSHq80MICp48/sQvCz3yCvWo84 R6tw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142515; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=IyMdgcKokNAQM99PayI8bLjHWwV0ForFIy4aNy3o9lk=; b=Gi1hyW6g20ninlaU5eCMqjj0rJbRVVMPI27UTuW6yPFRkOaqgn7Yl6NMokkXW4Yxzb KUaRRPpI3j5lg9pNCVX8SB9iIqdEY9rCViMveQEU+YY5p8Pk/z7VJMUwfjTVvQOH+Omf HsNvT3DK4ILbeYpSQ8EyUZlfKbgVbINrnIw0dMtM5DjUMyRO/G86bQqGSStq9LlQVdiJ KiaXjEJ5IBYtGyowm1JwtH53wq6tyYQYNhTySoVEwMr/78dV1d9lXOk7wJfAabhyYN2d 6WK3QlYzLodnEhe191/hz0G7kLqCjH773xDgGJ+21TMHt0MwIxgpcQU3LvQ9uG2QAnzT 2Otg== X-Gm-Message-State: AO0yUKVHP0R0AO+XEqduf9m8E+I5yluCeFsweWxXPU2UKOCdaSmBK/tY UTtmkQtJwj1In9pWiYaqTpiT62LCUvVs X-Google-Smtp-Source: AK7set9+eOiqb/vToH7alNWaohc4OhzJSaVaQWex5JU8jZbtLK/hP7rjgkLPv4lw0xP+Po/bflIraSs++1b1 X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:90a:5993:b0:233:b520:1544 with SMTP id l19-20020a17090a599300b00233b5201544mr6781101pji.0.1678142515516; Mon, 06 Mar 2023 14:41:55 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:21 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-13-vipinsh@google.com> Subject: [Patch v4 12/18] KVM: x86/mmu: Allocate NUMA aware page tables on TDP huge page splits From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 
quoted-printable Content-Type: text/plain; charset="utf-8" When splitting a huge page, try to allocate new lower level page tables on the same NUMA node of the huge page. Only do NUMA aware page splits if KVM has enabled NUMA aware page table for the VM else fall back to the default method of using current thread mempolicy. When huge pages are split for dirty log, new page tables are created based on the current thread mempolicy, which by default will be the NUMA node of the pCPU executing the thread. If thread enabling dirty log is on a remote NUMA node than the huge page NUMA node then it will create all page tables mapping 4KiB pages of that huge page on the remote node. This reduces performances of the vCPUs which are NUMA bound and are only accessing local NUMA memory as they will access remote NUMA node page tables to access their local NUMA node memory. Tested this feature on synthetic read-write-heavy workload in a 416 vCPU VM on a 8 NUMA node host. This workload creates multiple threads, partitions data in equal sizes and assigns them to each thread. Each thread iterates over its own data in strides, reads and writes value in its partitions. While executing, this workload continuously outputs combined rates at which it is performing operations. When dirty log is enabled in WRPROT mode, workload's performance: - Without NUMA aware page table drops by ~75% - With NUMA aware page table drops by ~20% Raw data from one example run: 1. Without NUMA aware page table Before dirty log: ~2750000 accesses/sec After dirty log: ~700000 accesses/sec 2. With NUMA aware page table Before dirty log: ~2750000 accesses/sec After dirty log: ~2250000 accesses/sec NUMA aware page table improved performance by more than 200% Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu_internal.h | 15 +++++++++++++++ arch/x86/kvm/mmu/tdp_mmu.c | 9 +++++---- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 16 ++++++++++++++++ 4 files changed, 37 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_interna= l.h index a607314348e3..b9d0e09ae974 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -340,4 +340,19 @@ void track_possible_nx_huge_page(struct kvm *kvm, stru= ct kvm_mmu_page *sp); void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *s= p); void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *cache); =20 +static inline int kvm_pfn_to_page_table_nid(struct kvm *kvm, kvm_pfn_t pfn) +{ + struct page *page; + + if (!kvm->arch.numa_aware_page_table) + return NUMA_NO_NODE; + + page =3D kvm_pfn_to_refcounted_page(pfn); + + if (page) + return page_to_nid(page); + else + return numa_mem_id(); +} + #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index d1e85012a008..61fd9c177694 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1412,7 +1412,7 @@ bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm, return spte_set; } =20 -static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp) +static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp, int ni= d) { struct kvm_mmu_page *sp; =20 @@ -1422,7 +1422,7 @@ static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_sp= lit(gfp_t gfp) if (!sp) return NULL; =20 - sp->spt =3D (void *)__get_free_page(gfp); + sp->spt =3D kvm_mmu_get_free_page(gfp, nid); if (!sp->spt) { kmem_cache_free(mmu_page_header_cache, sp); return NULL; @@ -1435,6 +1435,7 @@ static struct kvm_mmu_page 
*tdp_mmu_alloc_sp_for_spli= t(struct kvm *kvm, struct tdp_iter *iter, bool shared) { + int nid =3D kvm_pfn_to_page_table_nid(kvm, spte_to_pfn(iter->old_spte)); struct kvm_mmu_page *sp; =20 /* @@ -1446,7 +1447,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_spli= t(struct kvm *kvm, * If this allocation fails we drop the lock and retry with reclaim * allowed. */ - sp =3D __tdp_mmu_alloc_sp_for_split(GFP_NOWAIT | __GFP_ACCOUNT); + sp =3D __tdp_mmu_alloc_sp_for_split(GFP_NOWAIT | __GFP_ACCOUNT, nid); if (sp) return sp; =20 @@ -1458,7 +1459,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_spli= t(struct kvm *kvm, write_unlock(&kvm->mmu_lock); =20 iter->yielded =3D true; - sp =3D __tdp_mmu_alloc_sp_for_split(GFP_KERNEL_ACCOUNT); + sp =3D __tdp_mmu_alloc_sp_for_split(GFP_KERNEL_ACCOUNT, nid); =20 if (shared) read_lock(&kvm->mmu_lock); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 5cfa42c130e0..31586a65e346 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1358,6 +1358,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu, bool yie= ld_to_kernel_mode); void kvm_flush_remote_tlbs(struct kvm *kvm); =20 #ifdef KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE +void *kvm_mmu_get_free_page(gfp_t gfp, int nid); int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min); int __kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int capa= city, int min); int kvm_mmu_memory_cache_nr_free_objects(struct kvm_mmu_memory_cache *mc); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 536d8ab6e61f..47006d209309 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -377,6 +377,22 @@ static void kvm_flush_shadow_all(struct kvm *kvm) } =20 #ifdef KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE + +void *kvm_mmu_get_free_page(gfp_t gfp, int nid) +{ +#ifdef CONFIG_NUMA + struct page *page; + + if (nid !=3D NUMA_NO_NODE) { + page =3D alloc_pages_node(nid, gfp, 0); + if (!page) + return (void *)0; + return page_address(page); + } +#endif /* CONFIG_NUMA */ + return (void *)__get_free_page(gfp); +} + static inline void *mmu_memory_cache_alloc_obj(struct kvm_mmu_memory_cache= *mc, gfp_t gfp_flags) { --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ADB6DC64EC4 for ; Mon, 6 Mar 2023 22:42:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230297AbjCFWm4 (ORCPT ); Mon, 6 Mar 2023 17:42:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43280 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230295AbjCFWm1 (ORCPT ); Mon, 6 Mar 2023 17:42:27 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 570FA7C9EA for ; Mon, 6 Mar 2023 14:41:59 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 29-20020a63125d000000b005039a1e2a17so2429357pgs.8 for ; Mon, 06 Mar 2023 14:41:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142517; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Wxg15xdE/bH7bhI4zmFRH7O/lZGJ/2R6ueAvOgk8JGw=; b=kd5ycUoFKU+axxo+AJDPdY5gK23AltHJWQ9PllQmiKCr/9wWHIlr51VYjyPnjMbIB3 
FXWYI9fGg+ZzUIWcjccLPYyucKzNdEUyOya/oP/cbH+iyIrK32foylhJTbkbD2qPJisu ivHY/JoeYd3ThHTDf00nDXN/B1QztwsfcfAXwSkxReVyCx4teZGfdwRKj/hYU5XG+JpV 1QmhVrO7NFFOo4qDizBBQ+fliilDtPjyQU/4k7Wmo2jdcwZOAHfCoXo6IuwN71WBBOlE /1BSTLosLDYihhcNdVXIY8IwL/iqX+Nb4Xyc2xmcy1/jK8dBxK4LC/l4EflCSu/zh1/3 Sc1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142517; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Wxg15xdE/bH7bhI4zmFRH7O/lZGJ/2R6ueAvOgk8JGw=; b=LWkjo9L2dm4SW/poIm/weIxB4Md2WreZXWdDsgHWIj+Dfn3FP7412AqS5em+CKmVrV JbmbMnVZKmuTVlblictWOVrqTd0HUuK4HHSTFzsVdX5E+Tcv1M2iJqsG/glwV1vEKkLY cp3qRZyeRwl/vFtalahBWhjxeKbOV3Qb7riAxhpAdEj1DbhLITEb/5ikw40vakwCquh4 Bp1hhH6P+vZ1U6RYR3AmN3+Rz5nWljxBEyL6vsBuuG0ZoSkFn7lwkgjgxdj3mNKKiuip 4tLGaLnDvYo5af6Bb49EuuOj0tZGRRiFG+RhFDzm9q2diNqB64rJnAnO0CzJHYprQ9uI QAYQ== X-Gm-Message-State: AO0yUKWCfQ28zWj8OC4aFUfk6C9wDd6/ps3WwEiNYFDqQHBjF4GpUk5y ZSnBgZCvvy2H+pxUdi437BcicsrwyraY X-Google-Smtp-Source: AK7set8zyLRGjCJT715lEA2X1BTzblUJpfiGrVMGO6a2tbRTxhr7dXOJw/2D/EY+UNBggv7NLNT+hSuz8LTV X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a05:6a00:2253:b0:603:51de:c0dd with SMTP id i19-20020a056a00225300b0060351dec0ddmr5311084pfu.6.1678142517261; Mon, 06 Mar 2023 14:41:57 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:22 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-14-vipinsh@google.com> Subject: [Patch v4 13/18] KVM: mmu: Add common initialization logic for struct kvm_mmu_memory_cache{} From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add macros and function to make common logic for struct kvm_mmu_memory_cache{} declaration and initialization. Any user which wants different values in struct kvm_mmu_memory_cache{} will overwrite the default values explicitly after the initialization. 
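A minimal sketch of the two forms added here (the override lines mirror the x86 hunk below; callers that need non-default values simply overwrite the relevant fields after initialization):

	/* Standalone cache declared with the common defaults. */
	KVM_MMU_MEMORY_CACHE(cache);

	/* Embedded cache: initialize to the defaults, then override as needed. */
	INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_pte_list_desc_cache);
	vcpu->arch.mmu_pte_list_desc_cache.kmem_cache = pte_list_desc_cache;
	vcpu->arch.mmu_pte_list_desc_cache.gfp_zero = __GFP_ZERO;
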
Suggested-by: David Matlack Signed-off-by: Vipin Sharma --- arch/arm64/kvm/arm.c | 1 + arch/arm64/kvm/mmu.c | 3 ++- arch/riscv/kvm/mmu.c | 9 +++++---- arch/riscv/kvm/vcpu.c | 1 + arch/x86/kvm/mmu/mmu.c | 8 ++++++++ include/linux/kvm_types.h | 10 ++++++++++ 6 files changed, 27 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 3bd732eaf087..2b3d88e4ace8 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -330,6 +330,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) vcpu->arch.target =3D -1; bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES); =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache); vcpu->arch.mmu_page_cache.gfp_zero =3D __GFP_ZERO; =20 /* diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7113587222ff..8a56f071ca66 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -895,7 +895,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t = guest_ipa, { phys_addr_t addr; int ret =3D 0; - struct kvm_mmu_memory_cache cache =3D { .gfp_zero =3D __GFP_ZERO }; + KVM_MMU_MEMORY_CACHE(cache); struct kvm_pgtable *pgt =3D kvm->arch.mmu.pgt; enum kvm_pgtable_prot prot =3D KVM_PGTABLE_PROT_DEVICE | KVM_PGTABLE_PROT_R | @@ -904,6 +904,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t = guest_ipa, if (is_protected_kvm_enabled()) return -EPERM; =20 + cache.gfp_zero =3D __GFP_ZERO; size +=3D offset_in_page(guest_ipa); guest_ipa &=3D PAGE_MASK; =20 diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c index 78211aed36fa..bdd8c17958dd 100644 --- a/arch/riscv/kvm/mmu.c +++ b/arch/riscv/kvm/mmu.c @@ -351,10 +351,11 @@ int kvm_riscv_gstage_ioremap(struct kvm *kvm, gpa_t g= pa, int ret =3D 0; unsigned long pfn; phys_addr_t addr, end; - struct kvm_mmu_memory_cache pcache =3D { - .gfp_custom =3D (in_atomic) ? 
GFP_ATOMIC | __GFP_ACCOUNT : 0, - .gfp_zero =3D __GFP_ZERO, - }; + KVM_MMU_MEMORY_CACHE(pcache); + + pcache.gfp_zero =3D __GFP_ZERO; + if (in_atomic) + pcache.gfp_custom =3D GFP_ATOMIC | __GFP_ACCOUNT; =20 end =3D (gpa + size + PAGE_SIZE - 1) & PAGE_MASK; pfn =3D __phys_to_pfn(hpa); diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index 7d010b0be54e..bc743e9122d1 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -163,6 +163,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) =20 /* Mark this VCPU never ran */ vcpu->arch.ran_atleast_once =3D false; + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache); vcpu->arch.mmu_page_cache.gfp_zero =3D __GFP_ZERO; bitmap_zero(vcpu->arch.isa, RISCV_ISA_EXT_MAX); =20 diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index a4bf2e433030..b706087ef74e 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5961,15 +5961,20 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) { int ret; =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_pte_list_desc_cache); vcpu->arch.mmu_pte_list_desc_cache.kmem_cache =3D pte_list_desc_cache; vcpu->arch.mmu_pte_list_desc_cache.gfp_zero =3D __GFP_ZERO; =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_header_cache); vcpu->arch.mmu_page_header_cache.kmem_cache =3D mmu_page_header_cache; vcpu->arch.mmu_page_header_cache.gfp_zero =3D __GFP_ZERO; =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadow_page_cache); vcpu->arch.mmu_shadow_page_cache.gfp_zero =3D __GFP_ZERO; mutex_init(&vcpu->arch.mmu_shadow_page_cache_lock); =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadowed_info_cache); + vcpu->arch.mmu =3D &vcpu->arch.root_mmu; vcpu->arch.walk_mmu =3D &vcpu->arch.root_mmu; =20 @@ -6131,11 +6136,14 @@ int kvm_mmu_init_vm(struct kvm *kvm) node->track_flush_slot =3D kvm_mmu_invalidate_zap_pages_in_memslot; kvm_page_track_register_notifier(kvm, node); =20 + INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_page_header_cache); kvm->arch.split_page_header_cache.kmem_cache =3D mmu_page_header_cache; kvm->arch.split_page_header_cache.gfp_zero =3D __GFP_ZERO; =20 + INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_shadow_page_cache); kvm->arch.split_shadow_page_cache.gfp_zero =3D __GFP_ZERO; =20 + INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_desc_cache); kvm->arch.split_desc_cache.kmem_cache =3D pte_list_desc_cache; kvm->arch.split_desc_cache.gfp_zero =3D __GFP_ZERO; =20 diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index 2728d49bbdf6..192516eeccac 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -98,6 +98,16 @@ struct kvm_mmu_memory_cache { int capacity; void **objects; }; + +#define KVM_MMU_MEMORY_CACHE_INIT() { } + +#define KVM_MMU_MEMORY_CACHE(_name) \ + struct kvm_mmu_memory_cache _name =3D KVM_MMU_MEMORY_CACHE_INIT() + +static inline void INIT_KVM_MMU_MEMORY_CACHE(struct kvm_mmu_memory_cache *= cache) +{ + *cache =3D (struct kvm_mmu_memory_cache)KVM_MMU_MEMORY_CACHE_INIT(); +} #endif =20 #define HALT_POLL_HIST_COUNT 32 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49284C64EC4 for ; Mon, 6 Mar 2023 22:43:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229868AbjCFWnG (ORCPT ); Mon, 6 Mar 2023 17:43:06 -0500 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:44506 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230300AbjCFWmh (ORCPT ); Mon, 6 Mar 2023 17:42:37 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 262B586175 for ; Mon, 6 Mar 2023 14:42:06 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id t185-20020a635fc2000000b00502e332493fso2489940pgb.12 for ; Mon, 06 Mar 2023 14:42:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142519; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=9itEUO5Up+fI8iZk0iLC4qpDns9vSkVb4A516EJM+I0=; b=at6TQkK22xH96lnLyAJmubroPcHciUudqTHcykVv1J4vGrW/W+Wm443aAB9k+s4VL6 QhJ271wtkDKYq+yN0qOuZEsZhFa7kpf1EYRE9YM+4gPWg5IHwL4hklbYs1zbUgsk7Bbk 3TcmaPFIIgEvnaZO3k4WnSCc9ZFBZvKPoYHHSFBvgDLaqrbu2ks/AokOLJOKlI1j/4HE AfdPl8TZkPDz4VBBwPbmvfYoWUHigzQM0524TwkBQBlm8NzIMPuKiJEruvOwxb7pvzzS Iuja/F2k+J1CiXNukSTAtlXOklCvIeUEOyDDUU+xou9hV6faTKP4unCqrQIVi3pzdUsT qK3A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142519; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=9itEUO5Up+fI8iZk0iLC4qpDns9vSkVb4A516EJM+I0=; b=3pZhpn4VQ+2aq2icSz24gn4lSH2SW+QaMUR6VMDRH0EMh0rdM93/pT9TBs0rjxyKcO PFO6Gdr8RKrBWy7/VFce2zYW8F6tC+LufUDiVIA8KQctdB3nudnyP5+dRQ/sfEqRxQ5w xH1RyAYL6zthMFfpoxqKlXdwi4c4OIkyOTGQBjXPiJ9Sqmr7C4fRbsJpM2WqiDVHVFUV rnbfM5+Z3GGtFpiMQwMpnAC1aeyO1/zMwfGRnO5ZCgvX29GQMp7XqPaiVsU5C487kSyM HaXSHaAgtpcASAtL42jj8MFiBufSLamuSeYqseEE40gQukrTL0pwVYsYV9WMCTZbWYiP ETmQ== X-Gm-Message-State: AO0yUKUvlC0kaORWXw0Z3EWceXu+0CfqmNqHxdLk9LZB2iiL1CJBj5Lw A2Bah6PBgV8r5rO6x+l4KTW6OJNV1qBd X-Google-Smtp-Source: AK7set9LZnkcT+4mc88Pdxib2KGSekMwjKPwN1F6NV9DbxpH+EWfu7Z5xTPOZLDmNsyKAGya/UsFECxhTM3s X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:902:9a03:b0:19a:afc4:2300 with SMTP id v3-20020a1709029a0300b0019aafc42300mr5089548plp.6.1678142518978; Mon, 06 Mar 2023 14:41:58 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:23 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-15-vipinsh@google.com> Subject: [Patch v4 14/18] KVM: mmu: Initialize kvm_mmu_memory_cache.gfp_zero to __GFP_ZERO by default From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Set __GFP_ZERO to gfp_zero in default initizliation of struct kvm_mmu_memory_cache{} All of the users of default initialization code of struct kvm_mmu_memory_cache{} explicitly sets gfp_zero to __GFP_ZERO. This can be moved to common initialization logic. 
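The resulting pattern, sketched: callers rely on the default, and a cache that must not zero its objects clears the flag explicitly after initialization (MIPS does exactly this in a later patch of this series):

	INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache);
	/* Opt out of the __GFP_ZERO default. */
	vcpu->arch.mmu_page_cache.gfp_zero = 0;
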
Signed-off-by: Vipin Sharma --- arch/arm64/kvm/arm.c | 1 - arch/arm64/kvm/mmu.c | 1 - arch/riscv/kvm/mmu.c | 1 - arch/riscv/kvm/vcpu.c | 1 - arch/x86/kvm/mmu/mmu.c | 6 ------ include/linux/kvm_types.h | 4 +++- 6 files changed, 3 insertions(+), 11 deletions(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 2b3d88e4ace8..b4243978d962 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -331,7 +331,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES); =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache); - vcpu->arch.mmu_page_cache.gfp_zero =3D __GFP_ZERO; =20 /* * Default value for the FP state, will be overloaded at load diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 8a56f071ca66..133eba96c41f 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -904,7 +904,6 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t = guest_ipa, if (is_protected_kvm_enabled()) return -EPERM; =20 - cache.gfp_zero =3D __GFP_ZERO; size +=3D offset_in_page(guest_ipa); guest_ipa &=3D PAGE_MASK; =20 diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c index bdd8c17958dd..62550fd91c70 100644 --- a/arch/riscv/kvm/mmu.c +++ b/arch/riscv/kvm/mmu.c @@ -353,7 +353,6 @@ int kvm_riscv_gstage_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t addr, end; KVM_MMU_MEMORY_CACHE(pcache); =20 - pcache.gfp_zero =3D __GFP_ZERO; if (in_atomic) pcache.gfp_custom =3D GFP_ATOMIC | __GFP_ACCOUNT; =20 diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index bc743e9122d1..f5a96ed1e426 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -164,7 +164,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) /* Mark this VCPU never ran */ vcpu->arch.ran_atleast_once =3D false; INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache); - vcpu->arch.mmu_page_cache.gfp_zero =3D __GFP_ZERO; bitmap_zero(vcpu->arch.isa, RISCV_ISA_EXT_MAX); =20 /* Setup ISA features available to VCPU */ diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index b706087ef74e..d96afc849ee8 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5963,14 +5963,11 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_pte_list_desc_cache); vcpu->arch.mmu_pte_list_desc_cache.kmem_cache =3D pte_list_desc_cache; - vcpu->arch.mmu_pte_list_desc_cache.gfp_zero =3D __GFP_ZERO; =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_header_cache); vcpu->arch.mmu_page_header_cache.kmem_cache =3D mmu_page_header_cache; - vcpu->arch.mmu_page_header_cache.gfp_zero =3D __GFP_ZERO; =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadow_page_cache); - vcpu->arch.mmu_shadow_page_cache.gfp_zero =3D __GFP_ZERO; mutex_init(&vcpu->arch.mmu_shadow_page_cache_lock); =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadowed_info_cache); @@ -6138,14 +6135,11 @@ int kvm_mmu_init_vm(struct kvm *kvm) =20 INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_page_header_cache); kvm->arch.split_page_header_cache.kmem_cache =3D mmu_page_header_cache; - kvm->arch.split_page_header_cache.gfp_zero =3D __GFP_ZERO; =20 INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_shadow_page_cache); - kvm->arch.split_shadow_page_cache.gfp_zero =3D __GFP_ZERO; =20 INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_desc_cache); kvm->arch.split_desc_cache.kmem_cache =3D pte_list_desc_cache; - kvm->arch.split_desc_cache.gfp_zero =3D __GFP_ZERO; =20 return 0; } diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index 192516eeccac..5da7953532ce 100644 --- 
a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -99,7 +99,9 @@ struct kvm_mmu_memory_cache { void **objects; }; =20 -#define KVM_MMU_MEMORY_CACHE_INIT() { } +#define KVM_MMU_MEMORY_CACHE_INIT() { \ + .gfp_zero =3D __GFP_ZERO, \ +} =20 #define KVM_MMU_MEMORY_CACHE(_name) \ struct kvm_mmu_memory_cache _name =3D KVM_MMU_MEMORY_CACHE_INIT() --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8CA82C61DA4 for ; Mon, 6 Mar 2023 22:43:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230176AbjCFWnP (ORCPT ); Mon, 6 Mar 2023 17:43:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44364 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229975AbjCFWmq (ORCPT ); Mon, 6 Mar 2023 17:42:46 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5146F79B3C for ; Mon, 6 Mar 2023 14:42:13 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id iw4-20020a170903044400b0019ccafc1fbeso6574394plb.3 for ; Mon, 06 Mar 2023 14:42:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142520; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=QF0YVo49bCSm6bcyFWH7OqAKf22akzrbGA/pGOKX2s4=; b=j70Rx6ona9KIAnQ4HWSz5x9GD68BX0x1weHZP2/r+WzR2h1/SSUiCMoGT0/LMn8Krl hDwgLLwODSiQraOXzkU5foUjHwl43UOCrK1AKg2uMA8DJsB9kMN/FpRAawq/Jm/VX/cJ LD8cxRl72IW9wCnhEe3Tet/54wvO6hOSZMHRw53rLydqkVnq/i9RrbKZqXUkjhisFLgo MwQy/8x+BSK/8fCADVgj6wpzAz8cfMGDyyHWx55HQAg+EUJ4PFbAHSh2HUhxjhsnfQjL g901dK4CRZerMpvmU1+ZNo/lW5rSjLY+I7K/GMlMMlDpjMd2e9OzGaB7LM7xYGdRwzrT ePNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142520; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=QF0YVo49bCSm6bcyFWH7OqAKf22akzrbGA/pGOKX2s4=; b=IE2/H5Gf5/gIqu2n6ic3oU8hRi7uUMfAxJRl3j8yXGsWf+SZKOd5jLGMYoPKZGz8zI /l+uzn7XBlVJXIUrVc7SOUnT9hwIXmIKEOlNI6IP2+32c8Rg3S92OBXPQUNhMiZTOxkV 3bQ8z19e+vfa4qYfJxyAIqn3RerCbkW+mQKZ7rlrgdyMqPq18qe8tamVz4gszfVtNcTO P+iMsBrCSnrjqlGHlggxlw1kIETcHaXI+qEb/AhJM1g4lSTaXkKCO7itLzZSVUSJgWFQ 2gL6Xw0jkFnZuGglTtsXK2tiFxB7g+FpnQGbymkate6QTjL01coR+I5R33GbYDxejN5b wyhw== X-Gm-Message-State: AO0yUKVNrckwI8kExR9QyqNbc+vEdrcspPq81oqir3pMZ79FadWd+7Ac JAZ9LWcxGkKDW6jK7icvptxg7VxpinEy X-Google-Smtp-Source: AK7set9V/EfuOapXLc/xwlEXZ52sVooC/583kWWlWB81qUshsgV2Atlufd/VpVQMEq1cD94C2YRFlFJFN9P5 X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a63:3747:0:b0:503:2d50:5bf1 with SMTP id g7-20020a633747000000b005032d505bf1mr4047163pgn.7.1678142520704; Mon, 06 Mar 2023 14:42:00 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:24 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-16-vipinsh@google.com> Subject: [Patch v4 15/18] KVM: mmu: Add NUMA node support in struct kvm_mmu_memory_cache{} From: Vipin Sharma To: 
seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add a NUMA node id variable in struct kvm_mmu_memory_cache{}. This variable denotes the preferred NUMA node from which memory will be allocated under this memory cache. Set this variable to NUMA_NO_NODE if there is no preferred node. MIPS doesn't do any sort of initialization of struct kvm_mmu_memory_cache{}. Keep MIPS behavior unchanged by setting gfp_zero back to 0, since INIT_KVM_MMU_MEMORY_CACHE() would otherwise initialize it to __GFP_ZERO. "node" cannot be left as 0, as 0 is a valid NUMA node value. Signed-off-by: Vipin Sharma --- arch/mips/kvm/mips.c | 3 +++ include/linux/kvm_types.h | 3 +++ 2 files changed, 6 insertions(+) diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c index 36c8991b5d39..5ec5ce919918 100644 --- a/arch/mips/kvm/mips.c +++ b/arch/mips/kvm/mips.c @@ -294,6 +294,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) HRTIMER_MODE_REL); vcpu->arch.comparecount_timer.function =3D kvm_mips_comparecount_wakeup; =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache); + vcpu->arch.mmu_page_cache.gfp_zero =3D 0; + /* * Allocate space for host mode exception handlers that handle * guest mode exits diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index 5da7953532ce..b2a405c8e629 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -97,10 +97,13 @@ struct kvm_mmu_memory_cache { struct kmem_cache *kmem_cache; int capacity; void **objects; + /* Preferred NUMA node of memory allocation.
*/ + int node; }; =20 #define KVM_MMU_MEMORY_CACHE_INIT() { \ .gfp_zero =3D __GFP_ZERO, \ + .node =3D NUMA_NO_NODE, \ } =20 #define KVM_MMU_MEMORY_CACHE(_name) \ --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3963C6FD1B for ; Mon, 6 Mar 2023 22:43:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230287AbjCFWnS (ORCPT ); Mon, 6 Mar 2023 17:43:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44398 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229932AbjCFWmt (ORCPT ); Mon, 6 Mar 2023 17:42:49 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DF18286DDA for ; Mon, 6 Mar 2023 14:42:17 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id m9-20020a17090a7f8900b0023769205928so7063446pjl.6 for ; Mon, 06 Mar 2023 14:42:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142522; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=6vNzM0GO5MbtnDqLqku67MPZbUcarzFcKpTc3q9nP3k=; b=gazM+TYEEbP14W+rWRFTwfUrHyN3KdQf76sB5eH96SwCJUfaR6rpxh8xN+WanYYVzF QBZLrpT4BSTNPefyyG4FPpaD7f79PmPGnD6/az01XAnLI2irX1Ygk7x4FjdbcR47ltI8 3S4ScXjzy7/oo/VV+42a0ZFlbEIXdD0vA+eR8nbC1IkTPxJkfrj4/7xDVwjG32OXY1lb sOD+UhczN6cARoP51qOo89jeac+Oj5ykbaYZMttd8P4ADc9KG2JizR3WkTNQtYjmDzAn RiX7UM/yTffI3ZfCP0ekk4ug+KVybUSSaucVFneuKSZsWPQGeOC3eaia9hWC5MzitqYz IvhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142522; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=6vNzM0GO5MbtnDqLqku67MPZbUcarzFcKpTc3q9nP3k=; b=FFumZAtTs2mXlU94fNlTStX+u2TSmg04k/KmILdRAQAFRELbnUhAD5TqhTwYQXsANg oSbWEHAGq3MsehZ7hDQIOa3CuxNA6Blu1uxX8nK3aJbFiHGaQOPOgO8zkmr0L3s0yLgd yGD0K5d1T8B0tuVXMXEi+6dK4jidugx71/FEbvUriavPIY7GRrST+id2iOTuQ/eNU2lu sl3LmzRYfkpFc3VQwN93VWLuzXyYPveTXZpsUtgSNTW3cSICyFXfqoYO97oP6fcxFuuo tYZ6rejyMKuiMPxGI4Eh5lR4L363/ZfDeUATVJvvTYBOyvZ55R/apL0f+/vn2i42/EwH rx5w== X-Gm-Message-State: AO0yUKU8uZn8MBsGF9mRz/FyGn/SBRPK3LidgKzEemm0VrPcYYrRhNRp 7Y72o0QEBnhLP2TwSHfrbYhTg6dApBOw X-Google-Smtp-Source: AK7set/eezTIEbASuhG8LhM6PyFQFMMwDXgH1qzjZn8+eIggy+CHhD62jLS147rj2/81GVxpa6dZKyGYL5T3 X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:90a:c217:b0:234:b8cb:5133 with SMTP id e23-20020a17090ac21700b00234b8cb5133mr4591584pjt.7.1678142522518; Mon, 06 Mar 2023 14:42:02 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:25 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-17-vipinsh@google.com> Subject: [Patch v4 16/18] KVM: x86/mmu: Allocate numa aware page tables during page fault From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: 
bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Allocate page tables on the preferred NUMA node via memory cache during page faults. If memory cache doesn't have a preferred NUMA node (node value is set to NUMA_NO_NODE) then fallback to the default logic where pages are selected based on thread's mempolicy. Also, free NUMA aware page caches, mmu_shadow_page_cache, when memory shrinker is invoked. Allocate root pages based on the current thread's NUMA node as there is no way to know which will be the ideal NUMA node in long run. This commit allocate page tables to be on the same NUMA node as the physical page pointed by them, even if a vCPU causing page fault is on a different NUMA node. If memory is not available on the requested NUMA node then the other nearest NUMA node is selected by default. NUMA aware page tables can be beneficial in cases where a thread touches lot of far memory initially and then divide work among multiple threads. VMs generally take advantage of NUMA architecture for faster memory access by moving threads to the NUMA node of the memory they are accessing. This change will help them in accessing pages faster. Downside of this change is that an experimental workload can be created where a guest threads are always accessing remote memory and not the one local to them. This will cause performance to degrade compared to VMs where numa aware page tables are not enabled. Ideally, these VMs when using non-uniform memory access machine should generally be taking advantage of NUMA architecture to improve their performance in the first place. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu/mmu.c | 63 ++++++++++++++++++++++++--------- arch/x86/kvm/mmu/mmu_internal.h | 24 ++++++++++++- arch/x86/kvm/mmu/paging_tmpl.h | 4 +-- arch/x86/kvm/mmu/tdp_mmu.c | 14 +++++--- include/linux/kvm_types.h | 6 ++++ virt/kvm/kvm_main.c | 2 +- 7 files changed, 88 insertions(+), 27 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 64de083cd6b9..77d3aa368e5e 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -787,7 +787,7 @@ struct kvm_vcpu_arch { struct kvm_mmu *walk_mmu; =20 struct kvm_mmu_memory_cache mmu_pte_list_desc_cache; - struct kvm_mmu_memory_cache mmu_shadow_page_cache; + struct kvm_mmu_memory_cache mmu_shadow_page_cache[MAX_NUMNODES]; struct kvm_mmu_memory_cache mmu_shadowed_info_cache; struct kvm_mmu_memory_cache mmu_page_header_cache; =20 diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index d96afc849ee8..86f0d74d35ed 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -702,7 +702,7 @@ static void mmu_free_sp_memory_cache(struct kvm_mmu_mem= ory_cache *cache) =20 static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indir= ect) { - int r; + int r, nid =3D KVM_MMU_DEFAULT_CACHE_INDEX; =20 /* 1 rmap, 1 parent PTE per level, and the prefetched rmaps. 
*/ r =3D kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache, @@ -710,7 +710,16 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vc= pu, bool maybe_indirect) if (r) return r; =20 - r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache, PT64_R= OOT_MAX_LEVEL); + if (kvm_numa_aware_page_table_enabled(vcpu->kvm)) { + for_each_online_node(nid) { + r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache[nid], + PT64_ROOT_MAX_LEVEL); + } + } else { + r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache[nid], + PT64_ROOT_MAX_LEVEL); + } + if (r) return r; =20 @@ -726,9 +735,12 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vc= pu, bool maybe_indirect) =20 static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) { + int nid; + kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache); mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); - mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache); + for_each_node(nid) + mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache[nid]); mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadowed_info_cache); mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); @@ -2245,12 +2257,12 @@ static struct kvm_mmu_page *__kvm_mmu_get_shadow_pa= ge(struct kvm *kvm, } =20 static struct kvm_mmu_page *kvm_mmu_get_shadow_page(struct kvm_vcpu *vcpu, - gfn_t gfn, + gfn_t gfn, int nid, union kvm_mmu_page_role role) { struct shadow_page_caches caches =3D { .page_header_cache =3D &vcpu->arch.mmu_page_header_cache, - .shadow_page_cache =3D &vcpu->arch.mmu_shadow_page_cache, + .shadow_page_cache =3D &vcpu->arch.mmu_shadow_page_cache[nid], .shadowed_info_cache =3D &vcpu->arch.mmu_shadowed_info_cache, }; =20 @@ -2305,15 +2317,18 @@ static union kvm_mmu_page_role kvm_mmu_child_role(u= 64 *sptep, bool direct, =20 static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn, - bool direct, unsigned int access) + bool direct, unsigned int access, + kvm_pfn_t pfn) { union kvm_mmu_page_role role; + int nid; =20 if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep)) return ERR_PTR(-EEXIST); =20 role =3D kvm_mmu_child_role(sptep, direct, access); - return kvm_mmu_get_shadow_page(vcpu, gfn, role); + nid =3D kvm_pfn_to_mmu_cache_nid(vcpu->kvm, pfn); + return kvm_mmu_get_shadow_page(vcpu, gfn, nid, role); } =20 static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *i= terator, @@ -3205,7 +3220,8 @@ static int direct_map(struct kvm_vcpu *vcpu, struct k= vm_page_fault *fault) if (it.level =3D=3D fault->goal_level) break; =20 - sp =3D kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, true, ACC_ALL); + sp =3D kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, true, + ACC_ALL, fault->pfn); if (sp =3D=3D ERR_PTR(-EEXIST)) continue; =20 @@ -3625,6 +3641,7 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gf= n_t gfn, int quadrant, { union kvm_mmu_page_role role =3D vcpu->arch.mmu->root_role; struct kvm_mmu_page *sp; + int nid; =20 role.level =3D level; role.quadrant =3D quadrant; @@ -3632,7 +3649,8 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gf= n_t gfn, int quadrant, WARN_ON_ONCE(quadrant && !role.has_4_byte_gpte); WARN_ON_ONCE(role.direct && role.has_4_byte_gpte); =20 - sp =3D kvm_mmu_get_shadow_page(vcpu, gfn, role); + nid =3D kvm_mmu_root_page_cache_nid(vcpu->kvm); + sp =3D kvm_mmu_get_shadow_page(vcpu, gfn, nid, role); ++sp->root_count; =20 return __pa(sp->spt); @@ -5959,7 +5977,7 @@ static int 
__kvm_mmu_create(struct kvm_vcpu *vcpu, st= ruct kvm_mmu *mmu) =20 int kvm_mmu_create(struct kvm_vcpu *vcpu) { - int ret; + int ret, nid; =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_pte_list_desc_cache); vcpu->arch.mmu_pte_list_desc_cache.kmem_cache =3D pte_list_desc_cache; @@ -5967,7 +5985,12 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_header_cache); vcpu->arch.mmu_page_header_cache.kmem_cache =3D mmu_page_header_cache; =20 - INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadow_page_cache); + for_each_node(nid) { + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadow_page_cache[nid]); + if (kvm_numa_aware_page_table_enabled(vcpu->kvm)) + vcpu->arch.mmu_shadow_page_cache[nid].node =3D nid; + } + mutex_init(&vcpu->arch.mmu_shadow_page_cache_lock); =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadowed_info_cache); @@ -6695,13 +6718,17 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm,= u64 gen) } =20 static int mmu_memory_cache_try_empty(struct kvm_mmu_memory_cache *cache, - struct mutex *cache_lock) + int cache_count, struct mutex *cache_lock) { - int freed =3D 0; + int freed =3D 0, nid; =20 if (mutex_trylock(cache_lock)) { - freed =3D cache->nobjs; - kvm_mmu_empty_memory_cache(cache); + for (nid =3D 0; nid < cache_count; nid++) { + if (!cache[nid].nobjs) + continue; + freed +=3D cache[nid].nobjs; + kvm_mmu_empty_memory_cache(&cache[nid]); + } mutex_unlock(cache_lock); } return freed; @@ -6725,15 +6752,17 @@ static unsigned long mmu_shrink_scan(struct shrinke= r *shrink, list_move_tail(&kvm->vm_list, &vm_list); =20 kvm_for_each_vcpu(i, vcpu, kvm) { - freed +=3D mmu_memory_cache_try_empty(&vcpu->arch.mmu_shadow_page_cache, + freed +=3D mmu_memory_cache_try_empty(vcpu->arch.mmu_shadow_page_cache, + MAX_NUMNODES, &vcpu->arch.mmu_shadow_page_cache_lock); freed +=3D mmu_memory_cache_try_empty(&vcpu->arch.mmu_shadowed_info_cac= he, + 1, &vcpu->arch.mmu_shadow_page_cache_lock); if (freed >=3D sc->nr_to_scan) goto out; } freed +=3D mmu_memory_cache_try_empty(&kvm->arch.split_shadow_page_cache, - &kvm->slots_lock); + 1, &kvm->slots_lock); if (freed >=3D sc->nr_to_scan) goto out; } diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_interna= l.h index b9d0e09ae974..652fd0c2bcba 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -340,11 +340,16 @@ void track_possible_nx_huge_page(struct kvm *kvm, str= uct kvm_mmu_page *sp); void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *s= p); void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *cache); =20 +static inline bool kvm_numa_aware_page_table_enabled(struct kvm *kvm) +{ + return kvm->arch.numa_aware_page_table; +} + static inline int kvm_pfn_to_page_table_nid(struct kvm *kvm, kvm_pfn_t pfn) { struct page *page; =20 - if (!kvm->arch.numa_aware_page_table) + if (!kvm_numa_aware_page_table_enabled(kvm)) return NUMA_NO_NODE; =20 page =3D kvm_pfn_to_refcounted_page(pfn); @@ -355,4 +360,21 @@ static inline int kvm_pfn_to_page_table_nid(struct kvm= *kvm, kvm_pfn_t pfn) return numa_mem_id(); } =20 +static inline int kvm_pfn_to_mmu_cache_nid(struct kvm *kvm, kvm_pfn_t pfn) +{ + int index =3D kvm_pfn_to_page_table_nid(kvm, pfn); + + if (index =3D=3D NUMA_NO_NODE) + return KVM_MMU_DEFAULT_CACHE_INDEX; + + return index; +} + +static inline int kvm_mmu_root_page_cache_nid(struct kvm *kvm) +{ + if (kvm_numa_aware_page_table_enabled(kvm)) + return numa_mem_id(); + + return KVM_MMU_DEFAULT_CACHE_INDEX; +} #endif /* __KVM_X86_MMU_INTERNAL_H */ 
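[Editorial illustration, not part of the patch.] The helpers added to mmu_internal.h reduce the cache choice to a small decision: a VM without NUMA aware page tables always uses index KVM_MMU_DEFAULT_CACHE_INDEX, while a NUMA aware VM indexes mmu_shadow_page_cache[] by the node of the faulting pfn for child page tables and by the current thread's node for roots. The stand-alone C sketch below models that selection outside the kernel; pfn_node and cur_node are placeholder inputs standing in for kvm_pfn_to_page_table_nid() and numa_mem_id(), not real kernel calls.

#include <stdbool.h>
#include <stdio.h>

#define NUMA_NO_NODE                (-1)
#define KVM_MMU_DEFAULT_CACHE_INDEX 0

/* Cache index for a child page table: follow the pfn's node when known. */
static int child_cache_index(bool numa_aware, int pfn_node, int cur_node)
{
	if (!numa_aware)
		return KVM_MMU_DEFAULT_CACHE_INDEX;
	return pfn_node == NUMA_NO_NODE ? cur_node : pfn_node;
}

/* Cache index for a root page: follow the allocating thread's node. */
static int root_cache_index(bool numa_aware, int cur_node)
{
	return numa_aware ? cur_node : KVM_MMU_DEFAULT_CACHE_INDEX;
}

int main(void)
{
	printf("disabled            -> %d\n", child_cache_index(false, 3, 1));            /* 0 */
	printf("enabled, pfn node 3 -> %d\n", child_cache_index(true, 3, 1));             /* 3 */
	printf("enabled, unknown    -> %d\n", child_cache_index(true, NUMA_NO_NODE, 1));  /* 1 */
	printf("root, enabled       -> %d\n", root_cache_index(true, 1));                 /* 1 */
	return 0;
}
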
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 1dea9be6849d..9db8b3df434d 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -652,7 +652,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct k= vm_page_fault *fault, table_gfn =3D gw->table_gfn[it.level - 2]; access =3D gw->pt_access[it.level - 2]; sp =3D kvm_mmu_get_child_sp(vcpu, it.sptep, table_gfn, - false, access); + false, access, fault->pfn); =20 if (sp !=3D ERR_PTR(-EEXIST)) { /* @@ -706,7 +706,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct k= vm_page_fault *fault, validate_direct_spte(vcpu, it.sptep, direct_access); =20 sp =3D kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, - true, direct_access); + true, direct_access, fault->pfn); if (sp =3D=3D ERR_PTR(-EEXIST)) continue; =20 diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 61fd9c177694..63113a66f560 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -260,12 +260,12 @@ static struct kvm_mmu_page *tdp_mmu_next_root(struct = kvm *kvm, kvm_mmu_page_as_id(_root) !=3D _as_id) { \ } else =20 -static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu) +static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu, int ni= d) { struct kvm_mmu_page *sp; =20 sp =3D kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); - sp->spt =3D mmu_sp_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); + sp->spt =3D mmu_sp_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache[n= id]); =20 return sp; } @@ -304,6 +304,7 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vc= pu) union kvm_mmu_page_role role =3D vcpu->arch.mmu->root_role; struct kvm *kvm =3D vcpu->kvm; struct kvm_mmu_page *root; + int nid; =20 lockdep_assert_held_write(&kvm->mmu_lock); =20 @@ -317,7 +318,8 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vc= pu) goto out; } =20 - root =3D tdp_mmu_alloc_sp(vcpu); + nid =3D kvm_mmu_root_page_cache_nid(vcpu->kvm); + root =3D tdp_mmu_alloc_sp(vcpu, nid); tdp_mmu_init_sp(root, NULL, 0, role); =20 refcount_set(&root->tdp_mmu_root_count, 1); @@ -1149,12 +1151,14 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct k= vm_page_fault *fault) struct kvm *kvm =3D vcpu->kvm; struct tdp_iter iter; struct kvm_mmu_page *sp; - int ret =3D RET_PF_RETRY; + int ret =3D RET_PF_RETRY, nid; =20 kvm_mmu_hugepage_adjust(vcpu, fault); =20 trace_kvm_mmu_spte_requested(fault); =20 + nid =3D kvm_pfn_to_mmu_cache_nid(kvm, fault->pfn); + rcu_read_lock(); =20 tdp_mmu_for_each_pte(iter, mmu, fault->gfn, fault->gfn + 1) { @@ -1182,7 +1186,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm= _page_fault *fault) * The SPTE is either non-present or points to a huge page that * needs to be split. */ - sp =3D tdp_mmu_alloc_sp(vcpu); + sp =3D tdp_mmu_alloc_sp(vcpu, nid); tdp_mmu_init_child_sp(sp, &iter); =20 sp->nx_huge_page_disallowed =3D fault->huge_page_disallowed; diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index b2a405c8e629..13032da2ddfc 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -113,6 +113,12 @@ static inline void INIT_KVM_MMU_MEMORY_CACHE(struct kv= m_mmu_memory_cache *cache) { *cache =3D (struct kvm_mmu_memory_cache)KVM_MMU_MEMORY_CACHE_INIT(); } + +/* + * When NUMA aware page table option is disabled for a VM then use cache a= t the + * below index in the array of NUMA caches. 
+ */ +#define KVM_MMU_DEFAULT_CACHE_INDEX 0 #endif =20 #define HALT_POLL_HIST_COUNT 32 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 47006d209309..25a549705c8e 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -401,7 +401,7 @@ static inline void *mmu_memory_cache_alloc_obj(struct k= vm_mmu_memory_cache *mc, if (mc->kmem_cache) return kmem_cache_alloc(mc->kmem_cache, gfp_flags); else - return (void *)__get_free_page(gfp_flags); + return kvm_mmu_get_free_page(gfp_flags, mc->node); } =20 int __kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int capa= city, int min) --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C19E4C64EC4 for ; Mon, 6 Mar 2023 22:43:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229886AbjCFWnU (ORCPT ); Mon, 6 Mar 2023 17:43:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44414 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230281AbjCFWmu (ORCPT ); Mon, 6 Mar 2023 17:42:50 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 963127D548 for ; Mon, 6 Mar 2023 14:42:18 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-536c02ed619so116972997b3.8 for ; Mon, 06 Mar 2023 14:42:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142524; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=VEQuTMy3BKS8pVuX5PwbgXlx8W41CrsAU35jTKJjWKM=; b=PQs6IbkXsJ9lld1VVn0ou53Tj+mD5mN/JMwKSFBqww8CU+k++n9JdjO9+5RKeOjY3j z1/gwkem0OA83eaGuvDnX+LX/N5AufzHntpp7rjRkWPcijI5hk20sgV7bwLAddkmSCj3 ac5RV0oP8KcSnkDPK9DDwAJWAsCw+Reb0+qRKsPT0kFJhdhm6SUGV//tcG3x0gh1lnqt EghVyHg1PbXKS8G87VUngcQN+Db7y8HpZLnEKYV83J/P4Uzk7ml77lO9R/1zfUJoGsbY GHY2XVCWssNUmg+Ccd1UHPOAasmBZKc7W5I6uIRMz8lPIACX4uuk7GRHJvvYhb8cYleT eHKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142524; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=VEQuTMy3BKS8pVuX5PwbgXlx8W41CrsAU35jTKJjWKM=; b=BAFHJ3gvlGCwU1wlqqEeq2t5ors2glLYcsbDzRUWydcm73w9+YHxgr5ee1qGuNdL5g jEq3YE7KvA+Nt2lfhLnRyPCV05TGDi2tHkzD/NSVS7IC+qfxifI04VNgI04vGvI33RD9 r8kArN4gQjWVBIJTGy8Ove1QprsCYsI3KvZaBpwRS10dKF5k2vji9SAmjSPOffXKauXF E8OVAKBBnCJSW0IUk22B97u4YnVNCa1Xf940DoO7YXCWV3Eq2LciOLMmFbRFy460UEbk DYrcsgnmhiE3VxFxj4kxBCJm+RuCS73iUDPbBLXtEY2hoj3LNStlo5K7jyu1i2K1ZTrJ Ea9Q== X-Gm-Message-State: AO0yUKU6hh+jAGpX5+YQwtqm7pXpgVPrf9aDUYBwsS66sX1jUdHYj9Fo 1wSkJWpEG0W6w0KbeUtG7XxgxQys3FNQ X-Google-Smtp-Source: AK7set8EdZCOmDm00QFWHsxEQuQ2fSZe9k2hobygiUXFHmNiri0mCqYuv/hf7wRRFK4ZJJIrTgckG/2/OuZ/ X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a05:6902:c3:b0:9f1:6c48:f95f with SMTP id i3-20020a05690200c300b009f16c48f95fmr5863124ybs.5.1678142524427; Mon, 06 Mar 2023 14:42:04 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:26 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: 
<20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-18-vipinsh@google.com> Subject: [Patch v4 17/18] KVM: x86/mmu: Allocate shadow mmu page table on huge page split on the same NUMA node From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When splitting a huge page and NUMA aware page split option is enabled, try to allocate new lower level page tables on the same NUMA node of the huge page. If NUMA aware page split is disabled then fallback to default policy of using current thread's mempolicy. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu/mmu.c | 42 ++++++++++++++++++++------------- arch/x86/kvm/x86.c | 8 ++++++- 3 files changed, 33 insertions(+), 19 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 77d3aa368e5e..041302d6132c 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1453,7 +1453,7 @@ struct kvm_arch { * * Protected by kvm->slots_lock. */ - struct kvm_mmu_memory_cache split_shadow_page_cache; + struct kvm_mmu_memory_cache split_shadow_page_cache[MAX_NUMNODES]; struct kvm_mmu_memory_cache split_page_header_cache; =20 /* diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 86f0d74d35ed..6d44a4e08328 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6140,7 +6140,7 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(s= truct kvm *kvm, int kvm_mmu_init_vm(struct kvm *kvm) { struct kvm_page_track_notifier_node *node =3D &kvm->arch.mmu_sp_tracker; - int r; + int r, nid; =20 INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); INIT_LIST_HEAD(&kvm->arch.possible_nx_huge_pages); @@ -6159,7 +6159,9 @@ int kvm_mmu_init_vm(struct kvm *kvm) INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_page_header_cache); kvm->arch.split_page_header_cache.kmem_cache =3D mmu_page_header_cache; =20 - INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_shadow_page_cache); + for_each_node(nid) + INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_shadow_page_cache[nid]); + =20 INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_desc_cache); kvm->arch.split_desc_cache.kmem_cache =3D pte_list_desc_cache; @@ -6169,10 +6171,13 @@ int kvm_mmu_init_vm(struct kvm *kvm) =20 static void mmu_free_vm_memory_caches(struct kvm *kvm) { + int nid; + kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache); kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache); mutex_lock(&kvm->slots_lock); - mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache); + for_each_node(nid) + mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache[nid]); mutex_unlock(&kvm->slots_lock); } =20 @@ -6282,7 +6287,7 @@ static inline bool need_topup(struct kvm_mmu_memory_c= ache *cache, int min) return kvm_mmu_memory_cache_nr_free_objects(cache) < min; } =20 -static bool need_topup_split_caches_or_resched(struct kvm *kvm) +static bool need_topup_split_caches_or_resched(struct kvm *kvm, int nid) { if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) return true; @@ -6294,10 +6299,10 @@ static bool need_topup_split_caches_or_resched(stru= ct kvm *kvm) */ return need_topup(&kvm->arch.split_desc_cache, 
SPLIT_DESC_CACHE_MIN_NR_OB= JECTS) || need_topup(&kvm->arch.split_page_header_cache, 1) || - need_topup(&kvm->arch.split_shadow_page_cache, 1); + need_topup(&kvm->arch.split_shadow_page_cache[nid], 1); } =20 -static int topup_split_caches(struct kvm *kvm) +static int topup_split_caches(struct kvm *kvm, int nid) { /* * Allocating rmap list entries when splitting huge pages for nested @@ -6327,10 +6332,11 @@ static int topup_split_caches(struct kvm *kvm) if (r) return r; =20 - return mmu_topup_sp_memory_cache(&kvm->arch.split_shadow_page_cache, 1); + return mmu_topup_sp_memory_cache(&kvm->arch.split_shadow_page_cache[nid],= 1); } =20 -static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u= 64 *huge_sptep) +static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u= 64 *huge_sptep, + int nid) { struct kvm_mmu_page *huge_sp =3D sptep_to_sp(huge_sptep); struct shadow_page_caches caches =3D {}; @@ -6351,7 +6357,7 @@ static struct kvm_mmu_page *shadow_mmu_get_sp_for_spl= it(struct kvm *kvm, u64 *hu =20 /* Direct SPs do not require a shadowed_info_cache. */ caches.page_header_cache =3D &kvm->arch.split_page_header_cache; - caches.shadow_page_cache =3D &kvm->arch.split_shadow_page_cache; + caches.shadow_page_cache =3D &kvm->arch.split_shadow_page_cache[nid]; =20 /* Safe to pass NULL for vCPU since requesting a direct SP. */ return __kvm_mmu_get_shadow_page(kvm, NULL, &caches, gfn, role); @@ -6359,7 +6365,7 @@ static struct kvm_mmu_page *shadow_mmu_get_sp_for_spl= it(struct kvm *kvm, u64 *hu =20 static void shadow_mmu_split_huge_page(struct kvm *kvm, const struct kvm_memory_slot *slot, - u64 *huge_sptep) + u64 *huge_sptep, int nid) =20 { struct kvm_mmu_memory_cache *cache =3D &kvm->arch.split_desc_cache; @@ -6370,7 +6376,7 @@ static void shadow_mmu_split_huge_page(struct kvm *kv= m, gfn_t gfn; int index; =20 - sp =3D shadow_mmu_get_sp_for_split(kvm, huge_sptep); + sp =3D shadow_mmu_get_sp_for_split(kvm, huge_sptep, nid); =20 for (index =3D 0; index < SPTE_ENT_PER_PAGE; index++) { sptep =3D &sp->spt[index]; @@ -6408,7 +6414,7 @@ static int shadow_mmu_try_split_huge_page(struct kvm = *kvm, u64 *huge_sptep) { struct kvm_mmu_page *huge_sp =3D sptep_to_sp(huge_sptep); - int level, r =3D 0; + int level, r =3D 0, nid; gfn_t gfn; u64 spte; =20 @@ -6422,7 +6428,9 @@ static int shadow_mmu_try_split_huge_page(struct kvm = *kvm, goto out; } =20 - if (need_topup_split_caches_or_resched(kvm)) { + nid =3D kvm_pfn_to_mmu_cache_nid(kvm, spte_to_pfn(spte)); + + if (need_topup_split_caches_or_resched(kvm, nid)) { write_unlock(&kvm->mmu_lock); cond_resched(); /* @@ -6430,12 +6438,12 @@ static int shadow_mmu_try_split_huge_page(struct kv= m *kvm, * rmap iterator should be restarted because the MMU lock was * dropped. 
*/ - r =3D topup_split_caches(kvm) ?: -EAGAIN; + r =3D topup_split_caches(kvm, nid) ?: -EAGAIN; write_lock(&kvm->mmu_lock); goto out; } =20 - shadow_mmu_split_huge_page(kvm, slot, huge_sptep); + shadow_mmu_split_huge_page(kvm, slot, huge_sptep, nid); =20 out: trace_kvm_mmu_split_huge_page(gfn, spte, level, r); @@ -6761,8 +6769,8 @@ static unsigned long mmu_shrink_scan(struct shrinker = *shrink, if (freed >=3D sc->nr_to_scan) goto out; } - freed +=3D mmu_memory_cache_try_empty(&kvm->arch.split_shadow_page_cache, - 1, &kvm->slots_lock); + freed +=3D mmu_memory_cache_try_empty(kvm->arch.split_shadow_page_cache, + MAX_NUMNODES, &kvm->slots_lock); if (freed >=3D sc->nr_to_scan) goto out; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 71728abd7f92..d8ea39b248cd 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6176,7 +6176,7 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm= _irq_level *irq_event, int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap) { - int r; + int r, nid; =20 if (cap->flags) return -EINVAL; @@ -6397,6 +6397,12 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, mutex_lock(&kvm->lock); if (!kvm->created_vcpus) { kvm->arch.numa_aware_page_table =3D true; + + mutex_lock(&kvm->slots_lock); + for_each_node(nid) { + kvm->arch.split_shadow_page_cache[nid].node =3D nid; + } + mutex_unlock(&kvm->slots_lock); r =3D 0; } mutex_unlock(&kvm->lock); --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C5A18C6FD1A for ; Mon, 6 Mar 2023 22:43:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229878AbjCFWnX (ORCPT ); Mon, 6 Mar 2023 17:43:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44150 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230301AbjCFWm4 (ORCPT ); Mon, 6 Mar 2023 17:42:56 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 377B17C9E3 for ; Mon, 6 Mar 2023 14:42:24 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id s15-20020a170902ea0f00b0019d0c7a83dfso6628243plg.14 for ; Mon, 06 Mar 2023 14:42:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142526; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=OeL8Ykd+iCS6JS8scOAtZXT6SgE3Ouj+/URgOQ2sX3s=; b=hk8vRpbEWBqSY4vpUw9bpWUeCVvQ/YeUcq66u5udQO+XKtmkRe5XynuiaC71UhBsmb MJX5G46cSivSg2VyHO2BLfr3zi+MPvsQQPrlUg0fklhseJRi492oFo6Omn0JeoMCiIDA ehcsBIttFvLE0M6Z0VE3ZVD8e2ATxMcgmx5gk5b88p/1DDvplGFnZ99N1VQ7grjJ2hX1 m4Rs2BUrBU3Yjazwt4qvtCMRJMwPuvSnaYwonWVuEmnwDyA2W/r/85gkfWZjutvxQf7J yo2LwCs1SLZHtKalYgNMvcaZgVakuyiUlG+IUprqGugBQTirUZbsBqM8O4BeJRmWkQO4 hLDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142526; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=OeL8Ykd+iCS6JS8scOAtZXT6SgE3Ouj+/URgOQ2sX3s=; b=Pa106kt+KaH/ZjiqjsDWRMcLhSdmw/b7XSM4lV8HiEOzJx0gwpa7Fh+OqEbn8URpEu bK8x6Ve2VFPH16a2lOVtGlvrm/nme1dtkjZrjwcg6oE8xNMfofUPIsmOQtQgN+vQDPMF 
WRs727nJ2wkrs8Qa3DbXniAosN19ODLO1/5iMK+4hoZ0dLBl8XHvYKSPwT2mC3ZjGZbj RC5LOL3pO+v0yNy4XM46jE2txsZdou5OWahHTkSkEynJZ+xe5zbEPmxsLUI2881j2EkA 3ooZyDguEabwLHNxtfl2oKJsNktIk4HaYgfSAdrgzf/YygbcxypdQgH35c9UBnxpAX5S 623g== X-Gm-Message-State: AO0yUKXlv2jdpACDmxalxerY8d0mRmtb1Jjc2tbgD4SuB3QuvdARnEge 5wsG9Ah3Zwz1jANwbo3bIDnPlGHurJGa X-Google-Smtp-Source: AK7set/bN7t34vpFP3b5g8As+ZIP6EwS/0kB74bTqu6kV7sDF35fTifkklq2UmD/u6GAqDmSpVbet0qAjEWc X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:90b:504:b0:233:df5f:4778 with SMTP id r4-20020a17090b050400b00233df5f4778mr4600717pjz.6.1678142526350; Mon, 06 Mar 2023 14:42:06 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:27 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-19-vipinsh@google.com> Subject: [Patch v4 18/18] KVM: x86/mmu: Reduce default mmu memory cache size From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Reduce KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE to PT64_ROOT_MAX_LEVEL - 1. Opportunistically, use this reduced value for topping up caches. There was no specific reason to set this value to 40. With addition of multi NUMA node caches, it is good to save space and make these cachees lean. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_types.h | 6 +++++- arch/x86/kvm/mmu/mmu.c | 8 ++++---- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_types.h b/arch/x86/include/asm/kvm_ty= pes.h index 08f1b57d3b62..80aff231b708 100644 --- a/arch/x86/include/asm/kvm_types.h +++ b/arch/x86/include/asm/kvm_types.h @@ -2,6 +2,10 @@ #ifndef _ASM_X86_KVM_TYPES_H #define _ASM_X86_KVM_TYPES_H =20 -#define KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE 40 +/* + * For each fault only PT64_ROOT_MAX_LEVEL - 1 pages are needed. Root + * page is allocated in a separate flow. 
+ */ +#define KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE (PT64_ROOT_MAX_LEVEL - 1) =20 #endif /* _ASM_X86_KVM_TYPES_H */ diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 6d44a4e08328..5463ce6e52fa 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -713,11 +713,11 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *v= cpu, bool maybe_indirect) if (kvm_numa_aware_page_table_enabled(vcpu->kvm)) { for_each_online_node(nid) { r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache[nid], - PT64_ROOT_MAX_LEVEL); + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE); } } else { r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache[nid], - PT64_ROOT_MAX_LEVEL); + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE); } =20 if (r) @@ -725,12 +725,12 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *v= cpu, bool maybe_indirect) =20 if (maybe_indirect) { r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadowed_info_cache, - PT64_ROOT_MAX_LEVEL); + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE); if (r) return r; } return kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_page_header_cache, - PT64_ROOT_MAX_LEVEL); + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE); } =20 static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) --=20 2.40.0.rc0.216.gc4246ad0f0-goog
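[Editorial illustration, not part of the patch.] A rough worst-case estimate shows why the smaller per-cache cap matters once the caches are replicated per NUMA node: each cached object is a 4 KiB page, and every vCPU now carries one mmu_shadow_page_cache per node. The sketch below does that arithmetic under assumed example values (PT64_ROOT_MAX_LEVEL = 5, 8 nodes, 64 vCPUs); it is a back-of-envelope illustration, not measured data, and it counts only the per-vCPU caches, not the per-VM split caches that are also node-indexed after this series.

#include <stdio.h>

#define PAGE_KIB            4
#define PT64_ROOT_MAX_LEVEL 5                         /* assumed value      */
#define OLD_OBJS            40                        /* previous cap       */
#define NEW_OBJS            (PT64_ROOT_MAX_LEVEL - 1) /* cap after this series */

int main(void)
{
	long vcpus = 64, nodes = 8;  /* example VM shape, not from the patch */
	long old_kib = vcpus * nodes * OLD_OBJS * PAGE_KIB;
	long new_kib = vcpus * nodes * NEW_OBJS * PAGE_KIB;

	/* 64 vCPUs * 8 nodes: ~80 MiB worst case before, ~8 MiB after. */
	printf("old cap: %ld KiB of cached shadow pages\n", old_kib);
	printf("new cap: %ld KiB of cached shadow pages\n", new_kib);
	return 0;
}
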