From nobody Thu Sep 18 23:35:51 2025
Date: Mon, 6 Mar 2023 14:41:10 -0800
In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com>
References: <20230306224127.1689967-1-vipinsh@google.com>
Message-ID: <20230306224127.1689967-2-vipinsh@google.com>
Subject: [Patch v4 01/18] KVM: x86/mmu: Change KVM mmu shrinker to no-op
From: Vipin Sharma
To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com
Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma

Remove page zapping logic from the shrinker.
Keep shrinker infrastructure in place, it will be reused in future commits to free KVM page caches. mmu_shrink_scan() is very disruptive to VMs. It picks the first VM in the vm_list, zaps the oldest page which is most likely an upper level SPTEs and most like to be reused. Prior to TDP MMU, this is even more disruptive in nested VMs case, considering L1 SPTEs will be the oldest even though most of the entries are for L2 SPTEs. As discussed in https://lore.kernel.org/lkml/Y45dldZnI6OIf+a5@google.com/ shrinker logic has not be very useful in actually keeping VMs performant and reducing memory usage. Suggested-by: Sean Christopherson Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 87 +++--------------------------------------- 1 file changed, 5 insertions(+), 82 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index c8ebe542c565..0d07767f7922 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -166,7 +166,6 @@ struct kvm_shadow_walk_iterator { =20 static struct kmem_cache *pte_list_desc_cache; struct kmem_cache *mmu_page_header_cache; -static struct percpu_counter kvm_total_used_mmu_pages; =20 static void mmu_spte_set(u64 *sptep, u64 spte); =20 @@ -1704,27 +1703,15 @@ static int is_empty_shadow_page(u64 *spt) } #endif =20 -/* - * This value is the sum of all of the kvm instances's - * kvm->arch.n_used_mmu_pages values. We need a global, - * aggregate version in order to make the slab shrinker - * faster - */ -static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, long nr) -{ - kvm->arch.n_used_mmu_pages +=3D nr; - percpu_counter_add(&kvm_total_used_mmu_pages, nr); -} - static void kvm_account_mmu_page(struct kvm *kvm, struct kvm_mmu_page *sp) { - kvm_mod_used_mmu_pages(kvm, +1); + kvm->arch.n_used_mmu_pages++; kvm_account_pgtable_pages((void *)sp->spt, +1); } =20 static void kvm_unaccount_mmu_page(struct kvm *kvm, struct kvm_mmu_page *s= p) { - kvm_mod_used_mmu_pages(kvm, -1); + kvm->arch.n_used_mmu_pages--; kvm_account_pgtable_pages((void *)sp->spt, -1); } =20 @@ -6072,11 +6059,6 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm) kvm_tdp_mmu_zap_invalidated_roots(kvm); } =20 -static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm) -{ - return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages)); -} - static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, struct kvm_page_track_notifier_node *node) @@ -6666,66 +6648,13 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm,= u64 gen) static unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) { - struct kvm *kvm; - int nr_to_scan =3D sc->nr_to_scan; - unsigned long freed =3D 0; - - mutex_lock(&kvm_lock); - - list_for_each_entry(kvm, &vm_list, vm_list) { - int idx; - LIST_HEAD(invalid_list); - - /* - * Never scan more than sc->nr_to_scan VM instances. - * Will not hit this condition practically since we do not try - * to shrink more than one VM and it is very unlikely to see - * !n_used_mmu_pages so many times. - */ - if (!nr_to_scan--) - break; - /* - * n_used_mmu_pages is accessed without holding kvm->mmu_lock - * here. We may skip a VM instance errorneosly, but we do not - * want to shrink a VM that only started to populate its MMU - * anyway. 
- */ - if (!kvm->arch.n_used_mmu_pages && - !kvm_has_zapped_obsolete_pages(kvm)) - continue; - - idx =3D srcu_read_lock(&kvm->srcu); - write_lock(&kvm->mmu_lock); - - if (kvm_has_zapped_obsolete_pages(kvm)) { - kvm_mmu_commit_zap_page(kvm, - &kvm->arch.zapped_obsolete_pages); - goto unlock; - } - - freed =3D kvm_mmu_zap_oldest_mmu_pages(kvm, sc->nr_to_scan); - -unlock: - write_unlock(&kvm->mmu_lock); - srcu_read_unlock(&kvm->srcu, idx); - - /* - * unfair on small ones - * per-vm shrinkers cry out - * sadness comes quickly - */ - list_move_tail(&kvm->vm_list, &vm_list); - break; - } - - mutex_unlock(&kvm_lock); - return freed; + return SHRINK_STOP; } =20 static unsigned long mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc) { - return percpu_counter_read_positive(&kvm_total_used_mmu_pages); + return SHRINK_EMPTY; } =20 static struct shrinker mmu_shrinker =3D { @@ -6840,17 +6769,12 @@ int kvm_mmu_vendor_module_init(void) if (!mmu_page_header_cache) goto out; =20 - if (percpu_counter_init(&kvm_total_used_mmu_pages, 0, GFP_KERNEL)) - goto out; - ret =3D register_shrinker(&mmu_shrinker, "x86-mmu"); if (ret) - goto out_shrinker; + goto out; =20 return 0; =20 -out_shrinker: - percpu_counter_destroy(&kvm_total_used_mmu_pages); out: mmu_destroy_caches(); return ret; @@ -6867,7 +6791,6 @@ void kvm_mmu_destroy(struct kvm_vcpu *vcpu) void kvm_mmu_vendor_module_exit(void) { mmu_destroy_caches(); - percpu_counter_destroy(&kvm_total_used_mmu_pages); unregister_shrinker(&mmu_shrinker); } =20 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BD0D9C64EC4 for ; Mon, 6 Mar 2023 22:41:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230157AbjCFWlt (ORCPT ); Mon, 6 Mar 2023 17:41:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43064 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230113AbjCFWln (ORCPT ); Mon, 6 Mar 2023 17:41:43 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9952E75854 for ; Mon, 6 Mar 2023 14:41:38 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id lm13-20020a170903298d00b0019a8c8a13dfso6633161plb.16 for ; Mon, 06 Mar 2023 14:41:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142498; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=40gADokIJp7LXp0jd/R1DBU/pOdMA934Lm+cHTZ9qZM=; b=T+lSmTcVs1GzNKHcrL6yDPRcrmA/On3+n7qFsig/y3to2l4Di94gZCGr/bCF/QIFCL cLBeX1YxouACoIAPgs5QY9rBE2OkiwAkgbEVHRpOGc3BwbxvIbCvo7tmUGlTyZWsYjaM rcOL2MkUhC/hl7Y55rrGwp+7FG30goIjn4DoVJHcpFT4JcRvPQBwQZiwyp74cNGlB9/G dLwyS0IwCuOH0hP26IbUG5UrvLBfOyWMRTXW1uwn01yE277FOUncIfgzp+cXW4MWpJr2 gy6GNgQW2rkbETRch+20PWhJaRjHq2yMu6JjDYYIH/Bwi7mxegFKsQmmksJVz6W7VuaU nxQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142498; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=40gADokIJp7LXp0jd/R1DBU/pOdMA934Lm+cHTZ9qZM=; b=fnCEO0cZuDnth3OmsgcjwUPNqupKmiSHV2DcIPE5m+pYbzHuBeKPlkhWVsLvv8g5y3 
OL851FL+GdVLBmsuPcqleRiUkPMybqdJ5vltn2zOg8CUmT4XtIUq/7WRrRD2Ui5EDrE+ IIZpvxcG6GRT0hA/qAFJCq9+HSNI0ozjktR4XZfZFOXSUZ7dgccrqh/vhL6TUH6ZFHW5 8xkEOBMm4WIyV8augY7pYp6zhQMZkyiqf7Gnl0+Pv4rEX1VzJaTG/uOIzSegzYwd+Pnm jeKPXvQBY71mv6IOWPQnXgxiKVA/g1S/ZHoXWRYq/bPUkwpjkMmRXX/EFXt7tgyKhoi/ BXpg== X-Gm-Message-State: AO0yUKU+NslgiXzgrZyzr9Dm1jokUYYK/eTWX9a9YHxCDhVuUy+878U7 jIgX8zmjk+IHp1o9XrzoMq2Ggz0heBDo X-Google-Smtp-Source: AK7set9sR45L+LDHCgEH2QOACc5WTVEZlCHYGVdCn/w8i1k+AVwEVcDMNn8+7HF/fl2p/1eVU/27oQXHjqdY X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:90a:d58e:b0:233:fbe0:5ccf with SMTP id v14-20020a17090ad58e00b00233fbe05ccfmr4320085pju.1.1678142498162; Mon, 06 Mar 2023 14:41:38 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:11 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-3-vipinsh@google.com> Subject: [Patch v4 02/18] KVM: x86/mmu: Remove zapped_obsolete_pages from struct kvm_arch{} From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Remove zapped_obsolete_pages from struct kvm_arch{} and use local list in kvm_zap_obsolete_pages(). zapped_obsolete_pages list was used in struct kvm_arch{} to provide pages for KVM MMU shrinker. Since, KVM MMU shrinker is no-op now, this is not needed. Signed-off-by: Vipin Sharma Reviewed-by: David Matlack --- arch/x86/include/asm/kvm_host.h | 1 - arch/x86/kvm/mmu/mmu.c | 8 ++++---- 2 files changed, 4 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 808c292ad3f4..ebbe692acf3f 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1252,7 +1252,6 @@ struct kvm_arch { u8 mmu_valid_gen; struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES]; struct list_head active_mmu_pages; - struct list_head zapped_obsolete_pages; /* * A list of kvm_mmu_page structs that, if zapped, could possibly be * replaced by an NX huge page. A shadow page is on this list if its diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 0d07767f7922..3a452989f5cd 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5947,6 +5947,7 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm) { struct kvm_mmu_page *sp, *node; int nr_zapped, batch =3D 0; + LIST_HEAD(invalid_list); bool unstable; =20 restart: @@ -5979,8 +5980,8 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm) goto restart; } =20 - unstable =3D __kvm_mmu_prepare_zap_page(kvm, sp, - &kvm->arch.zapped_obsolete_pages, &nr_zapped); + unstable =3D __kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list, + &nr_zapped); batch +=3D nr_zapped; =20 if (unstable) @@ -5996,7 +5997,7 @@ static void kvm_zap_obsolete_pages(struct kvm *kvm) * kvm_mmu_load()), and the reload in the caller ensure no vCPUs are * running with an obsolete MMU. 
*/ - kvm_mmu_commit_zap_page(kvm, &kvm->arch.zapped_obsolete_pages); + kvm_mmu_commit_zap_page(kvm, &invalid_list); } =20 /* @@ -6072,7 +6073,6 @@ int kvm_mmu_init_vm(struct kvm *kvm) int r; =20 INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); - INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages); INIT_LIST_HEAD(&kvm->arch.possible_nx_huge_pages); spin_lock_init(&kvm->arch.mmu_unsync_pages_lock); =20 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00A9CC64EC4 for ; Mon, 6 Mar 2023 22:41:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230168AbjCFWl4 (ORCPT ); Mon, 6 Mar 2023 17:41:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43154 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230123AbjCFWlp (ORCPT ); Mon, 6 Mar 2023 17:41:45 -0500 Received: from mail-pf1-x44a.google.com (mail-pf1-x44a.google.com [IPv6:2607:f8b0:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 435637433C for ; Mon, 6 Mar 2023 14:41:40 -0800 (PST) Received: by mail-pf1-x44a.google.com with SMTP id cr10-20020a056a000f0a00b005cfec6c2354so6126904pfb.9 for ; Mon, 06 Mar 2023 14:41:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142500; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=WhzuUwTW42vajQZ5iu4XIp1wWTqvuDE03zxjZV6U7io=; b=MLSRqlBt2QQUmtXc6Hxs6fzWOeClb00AzMS611TeDdDOkU8F9hkTUx46kAoCwz8HAZ 4hDVRQ76rq4hZRHrtxHTV+ooD39DrsfZ4qVkiyRP8OhJdCIE45hhuUtgroZ5EHGvSe3G NY5C3KvmUtPfJj26iS9P9hC3noFbpj92H1ch0flmqE5YHyE/tdUEhZP2guvnyClr0FPV k+XryHmhhGFhy3SWX9+nGAEftqjmccwsytClASnxRXMAuny+G3p3pyRveClOvC30Em+t iBXw9q/hWch7RVcP15vHkSxMkuOmgnKfV8D2J3cQNROQYK6xrMoxokJpRyBHDY9U7ruC cUgw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142500; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=WhzuUwTW42vajQZ5iu4XIp1wWTqvuDE03zxjZV6U7io=; b=qazB+/n2ISK4oMjoGO8b6HxEG7Sf6+BZK0AcPyIm6e20f9flhiZQBKaVtw+0HVnm+1 TQ+TP56xgtpwRqSOiNybFHrFYX7kNQGsSvwTXJxChu7LZYmQ/zOA2ZVC5LVPyfoB3fnk 8k1mNo69bEXtSvz5CpStKZzhqAaKPVv9C5OtIRWZG7S02kdtPRfRXKblQTHMpBlKvEwP Wcc7Mmz74FbpGkMMUo/9CQ/9LnqNM+JF3nXPBQXi1EEUMILe4NtNdBllypDTpbOXR1ip ImUHDoQP0Fl19M5lK6VSic1tf1cDsJAHDwauWJgkwCWPd02FYh6fmdYz7q7VA4ZzUGEH PwHA== X-Gm-Message-State: AO0yUKU1lPsIJ0OG9/DtXWJnldP8aOs2BXkpAdQ2h5cFgHisT3fyQiLI O25/EJ0fRDPR0zrw5QC3yu0nETrRpkSj X-Google-Smtp-Source: AK7set/qH3bVtHjFPWVL+83cwT5iW0k2bU+aH8x4GGlERwtIAW+l8ahUpxAr+DBodXZ292gTaRbeLii3XhR+ X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:902:f809:b0:19a:80b9:78ce with SMTP id ix9-20020a170902f80900b0019a80b978cemr5182747plb.0.1678142499775; Mon, 06 Mar 2023 14:41:39 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:12 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-4-vipinsh@google.com> Subject: [Patch v4 03/18] KVM: x86/mmu: Track count of pages in KVM MMU page caches 
globally From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Create a global counter for total number of pages available in MMU page caches across all VMs. Add mmu_shadow_page_cache pages to this counter. This accounting will be used in future commits to shrink MMU caches via KVM MMU shrinker. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_host.h | 5 ++ arch/x86/kvm/mmu/mmu.c | 90 ++++++++++++++++++++++++++++----- arch/x86/kvm/mmu/mmu_internal.h | 2 + arch/x86/kvm/mmu/paging_tmpl.h | 25 +++++---- arch/x86/kvm/mmu/tdp_mmu.c | 3 +- 5 files changed, 100 insertions(+), 25 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index ebbe692acf3f..4322c7020d5d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -791,6 +791,11 @@ struct kvm_vcpu_arch { struct kvm_mmu_memory_cache mmu_shadowed_info_cache; struct kvm_mmu_memory_cache mmu_page_header_cache; =20 + /* + * Protect allocation and release of pages from mmu_shadow_page_cache. + */ + struct mutex mmu_shadow_page_cache_lock; + /* * QEMU userspace and the guest each have their own FPU state. * In vcpu_run, we switch between the user and guest FPU contexts. diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 3a452989f5cd..13f41b7ac280 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -167,6 +167,11 @@ struct kvm_shadow_walk_iterator { static struct kmem_cache *pte_list_desc_cache; struct kmem_cache *mmu_page_header_cache; =20 +/* + * Global count of unused pages in MMU page caches across all VMs. + */ +static struct percpu_counter kvm_total_unused_cached_pages; + static void mmu_spte_set(u64 *sptep, u64 spte); =20 struct kvm_mmu_role_regs { @@ -667,6 +672,34 @@ static void walk_shadow_page_lockless_end(struct kvm_v= cpu *vcpu) } } =20 +/** + * Caller should hold mutex lock corresponding to cache, if available. + */ +static int mmu_topup_sp_memory_cache(struct kvm_mmu_memory_cache *cache, + int min) +{ + int orig_nobjs, r; + + orig_nobjs =3D cache->nobjs; + r =3D kvm_mmu_topup_memory_cache(cache, min); + if (orig_nobjs !=3D cache->nobjs) + percpu_counter_add(&kvm_total_unused_cached_pages, + (cache->nobjs - orig_nobjs)); + + return r; +} + +/** + * Caller should hold mutex lock corresponding to kvm_mmu_memory_cache, if + * available. 
+ */ +static void mmu_free_sp_memory_cache(struct kvm_mmu_memory_cache *cache) +{ + if (cache->nobjs) + percpu_counter_sub(&kvm_total_unused_cached_pages, cache->nobjs); + kvm_mmu_free_memory_cache(cache); +} + static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indir= ect) { int r; @@ -676,10 +709,11 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *v= cpu, bool maybe_indirect) 1 + PT64_ROOT_MAX_LEVEL + PTE_PREFETCH_NUM); if (r) return r; - r =3D kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadow_page_cache, - PT64_ROOT_MAX_LEVEL); + + r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache, PT64_R= OOT_MAX_LEVEL); if (r) return r; + if (maybe_indirect) { r =3D kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadowed_info_cache, PT64_ROOT_MAX_LEVEL); @@ -693,7 +727,9 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcp= u, bool maybe_indirect) static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) { kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache); - kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache); + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); + mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache); + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_info_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); } @@ -2148,6 +2184,7 @@ struct shadow_page_caches { struct kvm_mmu_memory_cache *page_header_cache; struct kvm_mmu_memory_cache *shadow_page_cache; struct kvm_mmu_memory_cache *shadowed_info_cache; + bool count_shadow_page_allocation; }; =20 static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm *kvm, @@ -2159,7 +2196,8 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page= (struct kvm *kvm, struct kvm_mmu_page *sp; =20 sp =3D kvm_mmu_memory_cache_alloc(caches->page_header_cache); - sp->spt =3D kvm_mmu_memory_cache_alloc(caches->shadow_page_cache); + sp->spt =3D mmu_sp_memory_cache_alloc(caches->shadow_page_cache, + caches->count_shadow_page_allocation); if (!role.direct) sp->shadowed_translation =3D kvm_mmu_memory_cache_alloc(caches->shadowed= _info_cache); =20 @@ -2216,6 +2254,7 @@ static struct kvm_mmu_page *kvm_mmu_get_shadow_page(s= truct kvm_vcpu *vcpu, .page_header_cache =3D &vcpu->arch.mmu_page_header_cache, .shadow_page_cache =3D &vcpu->arch.mmu_shadow_page_cache, .shadowed_info_cache =3D &vcpu->arch.mmu_shadowed_info_cache, + .count_shadow_page_allocation =3D true, }; =20 return __kvm_mmu_get_shadow_page(vcpu->kvm, vcpu, &caches, gfn, role); @@ -4314,29 +4353,32 @@ static int direct_page_fault(struct kvm_vcpu *vcpu,= struct kvm_page_fault *fault if (r !=3D RET_PF_INVALID) return r; =20 + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); r =3D mmu_topup_memory_caches(vcpu, false); if (r) - return r; + goto out_page_cache_unlock; =20 r =3D kvm_faultin_pfn(vcpu, fault, ACC_ALL); if (r !=3D RET_PF_CONTINUE) - return r; + goto out_page_cache_unlock; =20 r =3D RET_PF_RETRY; write_lock(&vcpu->kvm->mmu_lock); =20 if (is_page_fault_stale(vcpu, fault)) - goto out_unlock; + goto out_mmu_unlock; =20 r =3D make_mmu_pages_available(vcpu); if (r) - goto out_unlock; + goto out_mmu_unlock; =20 r =3D direct_map(vcpu, fault); =20 -out_unlock: +out_mmu_unlock: write_unlock(&vcpu->kvm->mmu_lock); kvm_release_pfn_clean(fault->pfn); +out_page_cache_unlock: + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); return r; } =20 @@ -4396,25 +4438,28 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *= vcpu, if (r !=3D RET_PF_INVALID) 
return r; =20 + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); r =3D mmu_topup_memory_caches(vcpu, false); if (r) - return r; + goto out_page_cache_unlock; =20 r =3D kvm_faultin_pfn(vcpu, fault, ACC_ALL); if (r !=3D RET_PF_CONTINUE) - return r; + goto out_page_cache_unlock; =20 r =3D RET_PF_RETRY; read_lock(&vcpu->kvm->mmu_lock); =20 if (is_page_fault_stale(vcpu, fault)) - goto out_unlock; + goto out_mmu_unlock; =20 r =3D kvm_tdp_mmu_map(vcpu, fault); =20 -out_unlock: +out_mmu_unlock: read_unlock(&vcpu->kvm->mmu_lock); kvm_release_pfn_clean(fault->pfn); +out_page_cache_unlock: + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); return r; } #endif @@ -5394,6 +5439,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu) { int r; =20 + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); r =3D mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->root_role.direct); if (r) goto out; @@ -5420,6 +5466,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu) */ static_call(kvm_x86_flush_tlb_current)(vcpu); out: + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); return r; } =20 @@ -5924,6 +5971,7 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) vcpu->arch.mmu_page_header_cache.gfp_zero =3D __GFP_ZERO; =20 vcpu->arch.mmu_shadow_page_cache.gfp_zero =3D __GFP_ZERO; + mutex_init(&vcpu->arch.mmu_shadow_page_cache_lock); =20 vcpu->arch.mmu =3D &vcpu->arch.root_mmu; vcpu->arch.walk_mmu =3D &vcpu->arch.root_mmu; @@ -6769,12 +6817,17 @@ int kvm_mmu_vendor_module_init(void) if (!mmu_page_header_cache) goto out; =20 + if (percpu_counter_init(&kvm_total_unused_cached_pages, 0, GFP_KERNEL)) + goto out; + ret =3D register_shrinker(&mmu_shrinker, "x86-mmu"); if (ret) - goto out; + goto out_shrinker; =20 return 0; =20 +out_shrinker: + percpu_counter_destroy(&kvm_total_unused_cached_pages); out: mmu_destroy_caches(); return ret; @@ -6792,6 +6845,7 @@ void kvm_mmu_vendor_module_exit(void) { mmu_destroy_caches(); unregister_shrinker(&mmu_shrinker); + percpu_counter_destroy(&kvm_total_unused_cached_pages); } =20 /* @@ -6994,3 +7048,11 @@ void kvm_mmu_pre_destroy_vm(struct kvm *kvm) if (kvm->arch.nx_huge_page_recovery_thread) kthread_stop(kvm->arch.nx_huge_page_recovery_thread); } + +void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *shadow_page_c= ache, + bool count_allocation) +{ + if (count_allocation && shadow_page_cache->nobjs) + percpu_counter_dec(&kvm_total_unused_cached_pages); + return kvm_mmu_memory_cache_alloc(shadow_page_cache); +} diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_interna= l.h index cc58631e2336..798cfbf0a36b 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -338,5 +338,7 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cach= e *mc); =20 void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp); void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *s= p); +void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *cache, + bool count_allocation); =20 #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 57f0b75c80f9..1dea9be6849d 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -821,9 +821,10 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, st= ruct kvm_page_fault *fault return RET_PF_EMULATE; } =20 + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); r =3D mmu_topup_memory_caches(vcpu, true); if (r) - return r; + goto out_page_cache_unlock; =20 vcpu->arch.write_fault_to_shadow_pgtable =3D false; 
=20 @@ -837,7 +838,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, str= uct kvm_page_fault *fault =20 r =3D kvm_faultin_pfn(vcpu, fault, walker.pte_access); if (r !=3D RET_PF_CONTINUE) - return r; + goto out_page_cache_unlock; =20 /* * Do not change pte_access if the pfn is a mmio page, otherwise @@ -862,16 +863,18 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, s= truct kvm_page_fault *fault write_lock(&vcpu->kvm->mmu_lock); =20 if (is_page_fault_stale(vcpu, fault)) - goto out_unlock; + goto out_mmu_unlock; =20 r =3D make_mmu_pages_available(vcpu); if (r) - goto out_unlock; + goto out_mmu_unlock; r =3D FNAME(fetch)(vcpu, fault, &walker); =20 -out_unlock: +out_mmu_unlock: write_unlock(&vcpu->kvm->mmu_lock); kvm_release_pfn_clean(fault->pfn); +out_page_cache_unlock: + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); return r; } =20 @@ -897,17 +900,18 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_= t gva, hpa_t root_hpa) =20 vcpu_clear_mmio_info(vcpu, gva); =20 + if (!VALID_PAGE(root_hpa)) { + WARN_ON(1); + return; + } + + mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); /* * No need to check return value here, rmap_can_add() can * help us to skip pte prefetch later. */ mmu_topup_memory_caches(vcpu, true); =20 - if (!VALID_PAGE(root_hpa)) { - WARN_ON(1); - return; - } - write_lock(&vcpu->kvm->mmu_lock); for_each_shadow_entry_using_root(vcpu, root_hpa, gva, iterator) { level =3D iterator.level; @@ -943,6 +947,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t = gva, hpa_t root_hpa) break; } write_unlock(&vcpu->kvm->mmu_lock); + mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); } =20 /* Note, @addr is a GPA when gva_to_gpa() translates an L2 GPA to an L1 GP= A. */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 7c25dbf32ecc..fa6eb1e9101e 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -265,7 +265,8 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm= _vcpu *vcpu) struct kvm_mmu_page *sp; =20 sp =3D kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); - sp->spt =3D kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); + sp->spt =3D mmu_sp_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache, + true); =20 return sp; } --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BFBBC64EC4 for ; Mon, 6 Mar 2023 22:41:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229687AbjCFWlv (ORCPT ); Mon, 6 Mar 2023 17:41:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43156 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230119AbjCFWlp (ORCPT ); Mon, 6 Mar 2023 17:41:45 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3262472B2A for ; Mon, 6 Mar 2023 14:41:42 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id pl10-20020a17090b268a00b00239ed042afcso7068130pjb.4 for ; Mon, 06 Mar 2023 14:41:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142501; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; 
Date: Mon, 6 Mar 2023 14:41:13 -0800
In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com>
References: <20230306224127.1689967-1-vipinsh@google.com>
Message-ID: <20230306224127.1689967-5-vipinsh@google.com>
Subject: [Patch v4 04/18] KVM: x86/mmu: Shrink shadow page caches via MMU shrinker
From: Vipin Sharma
To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com
Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma

Shrink the shadow page caches via the MMU shrinker, based on kvm_total_unused_cached_pages. Traverse each vCPU of every VM, empty its caches, and exit the shrinker once a sufficient number of pages has been freed. Also, move each processed VM to the end of vm_list so that other VMs are tortured first next time.
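To make the scan policy above concrete, here is a minimal, standalone userspace sketch of the same round-robin idea. Every name in it (mock_vm, mock_vcpu, try_empty(), scan(), cursor) is invented for illustration only; the real implementation is the kernel diff below, which walks struct kvm instances with kvm_for_each_vcpu(), takes mmu_shadow_page_cache_lock with mutex_trylock(), and rotates VMs with list_move_tail().

#include <stdio.h>
#include <stdbool.h>

#define NR_VMS   3
#define NR_VCPUS 2

struct mock_vcpu {
        int cache_pages;    /* stands in for mmu_shadow_page_cache.nobjs */
        bool cache_busy;    /* stands in for a held mmu_shadow_page_cache_lock */
};

struct mock_vm {
        struct mock_vcpu vcpu[NR_VCPUS];
};

/* Empty one vCPU cache if nobody is using it; return pages freed. */
static int try_empty(struct mock_vcpu *v)
{
        int freed = 0;

        if (!v->cache_busy) {          /* mutex_trylock() analogue */
                freed = v->cache_pages;
                v->cache_pages = 0;    /* kvm_mmu_empty_memory_cache() analogue */
        }
        return freed;
}

/*
 * Walk VMs starting at *cursor and free cached pages until nr_to_scan is
 * reached.  Advancing the cursor past each processed VM mimics
 * list_move_tail(): the next invocation starts with a different VM, so a
 * single VM is not shrunk over and over.
 */
static int scan(struct mock_vm *vms, int *cursor, int nr_to_scan)
{
        int start = *cursor, freed = 0;

        for (int n = 0; n < NR_VMS; n++) {
                struct mock_vm *vm = &vms[(start + n) % NR_VMS];

                *cursor = (start + n + 1) % NR_VMS;
                for (int i = 0; i < NR_VCPUS; i++) {
                        freed += try_empty(&vm->vcpu[i]);
                        if (freed >= nr_to_scan)
                                return freed;
                }
        }
        return freed;
}

int main(void)
{
        struct mock_vm vms[NR_VMS] = {
                { .vcpu = { { 5, false }, { 3, false } } },
                { .vcpu = { { 4, true  }, { 2, false } } },  /* one cache in use */
                { .vcpu = { { 6, false }, { 1, false } } },
        };
        int cursor = 0;

        printf("first scan freed %d pages\n", scan(vms, &cursor, 8));
        printf("second scan freed %d pages\n", scan(vms, &cursor, 8));
        return 0;
}

The trylock-and-skip behavior means a vCPU that is actively faulting never blocks the shrinker; its cache is simply left alone and revisited on a later scan.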
Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 55 +++++++++++++++++++++++++++++++++++----- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 6 ++++- 3 files changed, 54 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 13f41b7ac280..df8dcb7e5de7 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6693,16 +6693,57 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm,= u64 gen) } } =20 -static unsigned long -mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) -{ - return SHRINK_STOP; +static unsigned long mmu_shrink_scan(struct shrinker *shrink, + struct shrink_control *sc) +{ + struct kvm *kvm, *next_kvm, *first_kvm =3D NULL; + struct kvm_mmu_memory_cache *cache; + unsigned long i, freed =3D 0; + struct mutex *cache_lock; + struct kvm_vcpu *vcpu; + + mutex_lock(&kvm_lock); + list_for_each_entry_safe(kvm, next_kvm, &vm_list, vm_list) { + if (first_kvm =3D=3D kvm) + break; + + if (!first_kvm) + first_kvm =3D kvm; + + list_move_tail(&kvm->vm_list, &vm_list); + + kvm_for_each_vcpu(i, vcpu, kvm) { + cache =3D &vcpu->arch.mmu_shadow_page_cache; + cache_lock =3D &vcpu->arch.mmu_shadow_page_cache_lock; + if (mutex_trylock(cache_lock)) { + if (cache->nobjs) { + freed +=3D cache->nobjs; + kvm_mmu_empty_memory_cache(cache); + } + mutex_unlock(cache_lock); + if (freed >=3D sc->nr_to_scan) + goto out; + } + } + } +out: + mutex_unlock(&kvm_lock); + if (freed) { + percpu_counter_sub(&kvm_total_unused_cached_pages, freed); + return freed; + } else { + return SHRINK_STOP; + } } =20 -static unsigned long -mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc) +static unsigned long mmu_shrink_count(struct shrinker *shrink, + struct shrink_control *sc) { - return SHRINK_EMPTY; + s64 count =3D percpu_counter_sum(&kvm_total_unused_cached_pages); + + WARN_ON(count < 0); + return count <=3D 0 ? 
SHRINK_EMPTY : count; + } =20 static struct shrinker mmu_shrinker =3D { diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 8ada23756b0e..5cfa42c130e0 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1361,6 +1361,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm); int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min); int __kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int capa= city, int min); int kvm_mmu_memory_cache_nr_free_objects(struct kvm_mmu_memory_cache *mc); +void kvm_mmu_empty_memory_cache(struct kvm_mmu_memory_cache *mc); void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc); void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc); #endif diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index d255964ec331..536d8ab6e61f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -430,7 +430,7 @@ int kvm_mmu_memory_cache_nr_free_objects(struct kvm_mmu= _memory_cache *mc) return mc->nobjs; } =20 -void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc) +void kvm_mmu_empty_memory_cache(struct kvm_mmu_memory_cache *mc) { while (mc->nobjs) { if (mc->kmem_cache) @@ -438,7 +438,11 @@ void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_c= ache *mc) else free_page((unsigned long)mc->objects[--mc->nobjs]); } +} =20 +void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc) +{ + kvm_mmu_empty_memory_cache(mc); kvfree(mc->objects); =20 mc->objects =3D NULL; --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05A20C6FD1B for ; Mon, 6 Mar 2023 22:42:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230184AbjCFWmA (ORCPT ); Mon, 6 Mar 2023 17:42:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43278 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230105AbjCFWlr (ORCPT ); Mon, 6 Mar 2023 17:41:47 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DD3B7430C for ; Mon, 6 Mar 2023 14:41:43 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id cl18-20020a17090af69200b0023470d96ae6so127343pjb.1 for ; Mon, 06 Mar 2023 14:41:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142503; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ntsWJChvbXfy8t9EA+eQUOa7LNpcvZXYoCa+kOc7Xxo=; b=QE3wGxcpDkwXukGXapdK3z5AcvymfM9fTDSInE4aMcqqqhApqI13rGs3g8WKQUaPS/ zZ6kNb4R7uz/pdmbRfaqJcYVYuRq+bQgvKsN86OU0tlFIWLibxnJpuqg0yKTzMCyuR1X pL28Sm386yDPzPua4c49qPfANPTKuZn9aWIUurbKMbZucDPpluT8Syfvw708TGu5WMO0 hjDTHUxu7yWc5s4/oRuSPQU74v9qCJL0SWCVXIoKKH6qhy3+hgUFKF39UdJnzPCFwNtA PsAmluDoVGVvbeYF4h5LW3dAFwVLGK0yDNSdG3miH//Ijl6QzEc9pgGGk9eEMrDtuikC CFZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142503; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ntsWJChvbXfy8t9EA+eQUOa7LNpcvZXYoCa+kOc7Xxo=; b=l2jsHc6NQk6UuCnfsVi6I6SMMWzQ4ePzF/94et/4PQ7xzoXnVoJrb4Y1uV1QEte514 
nRA/LYVeBF2f/VGkffi5vu/NwsmYsbH1/QvOnbBNojdHpRsUUDRZBX7qoRI2OYvDdxhm XoLg8/iXdCIsoe6coYANrZjEAwkPgPesM4zOxjZ3Pp6pMlgkSMhpomcC61zSDQuXhxUw /YuWzurcCvMTVH/j4Z/lz0xlP619A0fymciEWWs+AzSZa+j36m6jLfJpf19mL0wTx9X8 tN9C7Aqjb2iCJzgqwBgxMSXxqrD1bGl/15Cz3ScfrdJl896r8Nq1DOg1ngHmN4p7Mmsn zlfg== X-Gm-Message-State: AO0yUKWEgzJm28cby3KOrQI/o/cVOC7kqWD9aFmxb3VaDVoqAUj2UoLM fPt1UkvrkpwUirRCz/diquVXBFi5Vot1 X-Google-Smtp-Source: AK7set+ZI2aBqjIrxBqymLM25KRuDPXvrPAuiXEE5P98Cs7Ib72ELWHQ9A1FvjqxxCOH6L09ow62WHf25Ci6 X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a63:7e11:0:b0:503:913f:77b9 with SMTP id z17-20020a637e11000000b00503913f77b9mr4352737pgc.6.1678142503145; Mon, 06 Mar 2023 14:41:43 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:14 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-6-vipinsh@google.com> Subject: [Patch v4 05/18] KVM: x86/mmu: Add split_shadow_page_cache pages to global count of MMU cache pages From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add pages in split_shadow_page_cache to the global counter kvm_total_unused_cached_pages. These pages will be freed by MMU shrinker in future commit. Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index df8dcb7e5de7..0ebb8a2eaf47 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6149,7 +6149,9 @@ static void mmu_free_vm_memory_caches(struct kvm *kvm) { kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache); kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache); - kvm_mmu_free_memory_cache(&kvm->arch.split_shadow_page_cache); + mutex_lock(&kvm->slots_lock); + mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache); + mutex_unlock(&kvm->slots_lock); } =20 void kvm_mmu_uninit_vm(struct kvm *kvm) @@ -6303,7 +6305,7 @@ static int topup_split_caches(struct kvm *kvm) if (r) return r; =20 - return kvm_mmu_topup_memory_cache(&kvm->arch.split_shadow_page_cache, 1); + return mmu_topup_sp_memory_cache(&kvm->arch.split_shadow_page_cache, 1); } =20 static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u= 64 *huge_sptep) @@ -6328,6 +6330,7 @@ static struct kvm_mmu_page *shadow_mmu_get_sp_for_spl= it(struct kvm *kvm, u64 *hu /* Direct SPs do not require a shadowed_info_cache. */ caches.page_header_cache =3D &kvm->arch.split_page_header_cache; caches.shadow_page_cache =3D &kvm->arch.split_shadow_page_cache; + caches.count_shadow_page_allocation =3D true; =20 /* Safe to pass NULL for vCPU since requesting a direct SP. 
*/ return __kvm_mmu_get_shadow_page(kvm, NULL, &caches, gfn, role); --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F234C61DA4 for ; Mon, 6 Mar 2023 22:42:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230196AbjCFWmD (ORCPT ); Mon, 6 Mar 2023 17:42:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43312 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230137AbjCFWlr (ORCPT ); Mon, 6 Mar 2023 17:41:47 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 99A647C9D9 for ; Mon, 6 Mar 2023 14:41:45 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id l24-20020a25b318000000b007eba3f8e3baso11882909ybj.4 for ; Mon, 06 Mar 2023 14:41:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142505; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=cqnpRtg+h63StLNiA50bWnsxI9nJr8KaP+9nXax4nBE=; b=ACVU2Ak73DEsJyCD1CYnIUNV7DZqssWv57lrpw9xJcDfooRtDa2tY3LBqbvNcDU3KP hhYlgY79kRm0fgbdafmrzxHRgKfPAEwLXCwahNQwzRNxhmAyfBYAPK5few9foF3yaGPd /lcFH9jY5YfuOoMwi8e+ez0gG0qGiMaP8D2CYlANXgrHPRHw1P5HlftE5B8zD84EaxRI q7zgM/XeA3Fe4WbCdFWT+96EPjGGWl6cIhWMfg76bGxYmkk0ltneRvBruzuQ5PqN2XDq 54VqqvM9F19QX4nQqAn3F5PtGzHLbFGu+QsPM4zutD5JATaZxYhRaGPqEj/cObP2REkH 4SGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142505; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=cqnpRtg+h63StLNiA50bWnsxI9nJr8KaP+9nXax4nBE=; b=L1+Vm9EkEPJ2k58ZFdJevX5mCSmOMtNuhE3qMRehEMvi6jU1m9EvzL+QJsztjEr01G 8i74028Ek7CGj08iXP609dFO3bZmg9+qMnde4FhTO99uYA9MCrtFraKcRoerBniSghPn C61lMatsUlloNZtzc2qlKbiPj/61f/i2RzNdJkBfrBP7UleHVLp/UCaex21IevGEgkLx MK7KkoSwSLQrLTq/6ej+hBkqXZUshdqoNjiqP5yJXq7SKkVgfmk4PpWdLjskEN3HOPYu Tn8vuo9OPx7exu32nsOAXVED0LGdfhVlOOPZ9Fm5YYHatvuiuAUGUur38JzTv+zFBzlJ LYiA== X-Gm-Message-State: AO0yUKVnGlIGp3nTjacJOkBuaiP2gHXjJOT4MxckjPbG4VNA9WN6QF+I 4XABbGpN6j/Z3f1ldIq4I/b9cXrNMVs3 X-Google-Smtp-Source: AK7set8kXl/my/i/lLk02KsOkXov6TpYpA7DKlG4AzS8XL9UZfw/8EVyvgNRviAycRks9Qvkwea5EqBRmcfc X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a05:6902:10e:b0:98e:6280:74ca with SMTP id o14-20020a056902010e00b0098e628074camr5314707ybh.1.1678142504847; Mon, 06 Mar 2023 14:41:44 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:15 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-7-vipinsh@google.com> Subject: [Patch v4 06/18] KVM: x86/mmu: Shrink split_shadow_page_cache via MMU shrinker From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 
quoted-printable Content-Type: text/plain; charset="utf-8" Use MMU shrinker to free unused pages in split_shadow_page_cache. Refactor the code and make common function to try emptying the page cache. Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 34 +++++++++++++++++++++------------- 1 file changed, 21 insertions(+), 13 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 0ebb8a2eaf47..73a0ac9c11ce 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6696,13 +6696,24 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm,= u64 gen) } } =20 +static int mmu_memory_cache_try_empty(struct kvm_mmu_memory_cache *cache, + struct mutex *cache_lock) +{ + int freed =3D 0; + + if (mutex_trylock(cache_lock)) { + freed =3D cache->nobjs; + kvm_mmu_empty_memory_cache(cache); + mutex_unlock(cache_lock); + } + return freed; +} + static unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) { struct kvm *kvm, *next_kvm, *first_kvm =3D NULL; - struct kvm_mmu_memory_cache *cache; unsigned long i, freed =3D 0; - struct mutex *cache_lock; struct kvm_vcpu *vcpu; =20 mutex_lock(&kvm_lock); @@ -6716,18 +6727,15 @@ static unsigned long mmu_shrink_scan(struct shrinke= r *shrink, list_move_tail(&kvm->vm_list, &vm_list); =20 kvm_for_each_vcpu(i, vcpu, kvm) { - cache =3D &vcpu->arch.mmu_shadow_page_cache; - cache_lock =3D &vcpu->arch.mmu_shadow_page_cache_lock; - if (mutex_trylock(cache_lock)) { - if (cache->nobjs) { - freed +=3D cache->nobjs; - kvm_mmu_empty_memory_cache(cache); - } - mutex_unlock(cache_lock); - if (freed >=3D sc->nr_to_scan) - goto out; - } + freed +=3D mmu_memory_cache_try_empty(&vcpu->arch.mmu_shadow_page_cache, + &vcpu->arch.mmu_shadow_page_cache_lock); + if (freed >=3D sc->nr_to_scan) + goto out; } + freed +=3D mmu_memory_cache_try_empty(&kvm->arch.split_shadow_page_cache, + &kvm->slots_lock); + if (freed >=3D sc->nr_to_scan) + goto out; } out: mutex_unlock(&kvm_lock); --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 199A8C64EC4 for ; Mon, 6 Mar 2023 22:42:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229973AbjCFWmG (ORCPT ); Mon, 6 Mar 2023 17:42:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43366 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230155AbjCFWlt (ORCPT ); Mon, 6 Mar 2023 17:41:49 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3688575854 for ; Mon, 6 Mar 2023 14:41:46 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id m9-20020a17090a7f8900b0023769205928so7063204pjl.6 for ; Mon, 06 Mar 2023 14:41:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142506; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=fud6El3rNU/znA5Y7aANLknllha9pomZRWuqilbnIcQ=; b=RHrpxvtkoJL3G7M+x0b7vy1LMCEVHhhInmhGl/pZvP+imrCBzb7gpuU9I4dY0Mk4Md E3mJ9rZE0xkYlu56wN7H7+Wv0o0LrN/Lsh+B4Iz7k7L6oemm3rolHrxMC9Xmod2/w4TW 9ZrbsWYDscNpA+wbxAxnZ74ClfSjyspMqWYI5BhgxKLQLACztlfI3fgQ+2/ZBHEfqFYC 
Date: Mon, 6 Mar 2023 14:41:16 -0800
In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com>
References: <20230306224127.1689967-1-vipinsh@google.com>
Message-ID: <20230306224127.1689967-8-vipinsh@google.com>
Subject: [Patch v4 07/18] KVM: x86/mmu: Unconditionally count allocations from MMU page caches
From: Vipin Sharma
To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com
Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma

Remove count_shadow_page_allocation from struct shadow_page_caches{} and remove the count_allocation boolean check from mmu_sp_memory_cache_alloc(). Both split_shadow_page_cache and mmu_shadow_page_cache are now counted in the global count of unused cache pages, so the count_shadow_page_allocation boolean is obsolete and can be removed.
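This builds on the accounting invariant established in the earlier patches: a top-up adds exactly the number of newly cached pages to kvm_total_unused_cached_pages, an allocation from a counted cache subtracts one, and freeing a cache subtracts whatever remains. A tiny standalone sketch of that invariant follows; the types and helpers in it (struct cache, topup(), alloc_one(), free_cache()) are invented for this example and only mirror the roles of mmu_topup_sp_memory_cache(), mmu_sp_memory_cache_alloc() and mmu_free_sp_memory_cache() in the series.

#include <assert.h>
#include <stdio.h>

static long total_unused;               /* stands in for the percpu counter */

struct cache { int nobjs; };

static void topup(struct cache *c, int min)
{
        if (c->nobjs < min) {
                total_unused += min - c->nobjs; /* count only newly added pages */
                c->nobjs = min;
        }
}

static void alloc_one(struct cache *c)
{
        if (c->nobjs) {
                c->nobjs--;
                total_unused--;                 /* page is no longer "unused" */
        }
}

static void free_cache(struct cache *c)
{
        total_unused -= c->nobjs;
        c->nobjs = 0;
}

int main(void)
{
        struct cache shadow = { 0 }, split = { 0 };

        topup(&shadow, 5);
        topup(&split, 1);
        alloc_one(&shadow);                     /* page consumed by a shadow page */
        assert(total_unused == shadow.nobjs + split.nobjs);
        free_cache(&shadow);
        free_cache(&split);
        assert(total_unused == 0);
        printf("invariant holds, total_unused=%ld\n", total_unused);
        return 0;
}

Because every counted cache follows these three rules, the global counter never needs a per-cache opt-in flag, which is what makes the boolean removable.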
Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 11 +++-------- arch/x86/kvm/mmu/mmu_internal.h | 3 +-- arch/x86/kvm/mmu/tdp_mmu.c | 3 +-- 3 files changed, 5 insertions(+), 12 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 73a0ac9c11ce..0a0962d8108b 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2184,7 +2184,6 @@ struct shadow_page_caches { struct kvm_mmu_memory_cache *page_header_cache; struct kvm_mmu_memory_cache *shadow_page_cache; struct kvm_mmu_memory_cache *shadowed_info_cache; - bool count_shadow_page_allocation; }; =20 static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page(struct kvm *kvm, @@ -2196,8 +2195,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page= (struct kvm *kvm, struct kvm_mmu_page *sp; =20 sp =3D kvm_mmu_memory_cache_alloc(caches->page_header_cache); - sp->spt =3D mmu_sp_memory_cache_alloc(caches->shadow_page_cache, - caches->count_shadow_page_allocation); + sp->spt =3D mmu_sp_memory_cache_alloc(caches->shadow_page_cache); if (!role.direct) sp->shadowed_translation =3D kvm_mmu_memory_cache_alloc(caches->shadowed= _info_cache); =20 @@ -2254,7 +2252,6 @@ static struct kvm_mmu_page *kvm_mmu_get_shadow_page(s= truct kvm_vcpu *vcpu, .page_header_cache =3D &vcpu->arch.mmu_page_header_cache, .shadow_page_cache =3D &vcpu->arch.mmu_shadow_page_cache, .shadowed_info_cache =3D &vcpu->arch.mmu_shadowed_info_cache, - .count_shadow_page_allocation =3D true, }; =20 return __kvm_mmu_get_shadow_page(vcpu->kvm, vcpu, &caches, gfn, role); @@ -6330,7 +6327,6 @@ static struct kvm_mmu_page *shadow_mmu_get_sp_for_spl= it(struct kvm *kvm, u64 *hu /* Direct SPs do not require a shadowed_info_cache. */ caches.page_header_cache =3D &kvm->arch.split_page_header_cache; caches.shadow_page_cache =3D &kvm->arch.split_shadow_page_cache; - caches.count_shadow_page_allocation =3D true; =20 /* Safe to pass NULL for vCPU since requesting a direct SP. 
*/ return __kvm_mmu_get_shadow_page(kvm, NULL, &caches, gfn, role); @@ -7101,10 +7097,9 @@ void kvm_mmu_pre_destroy_vm(struct kvm *kvm) kthread_stop(kvm->arch.nx_huge_page_recovery_thread); } =20 -void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *shadow_page_c= ache, - bool count_allocation) +void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *shadow_page_c= ache) { - if (count_allocation && shadow_page_cache->nobjs) + if (shadow_page_cache->nobjs) percpu_counter_dec(&kvm_total_unused_cached_pages); return kvm_mmu_memory_cache_alloc(shadow_page_cache); } diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_interna= l.h index 798cfbf0a36b..a607314348e3 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -338,7 +338,6 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cach= e *mc); =20 void track_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *sp); void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *s= p); -void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *cache, - bool count_allocation); +void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *cache); =20 #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index fa6eb1e9101e..d1e85012a008 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -265,8 +265,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm= _vcpu *vcpu) struct kvm_mmu_page *sp; =20 sp =3D kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); - sp->spt =3D mmu_sp_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache, - true); + sp->spt =3D mmu_sp_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); =20 return sp; } --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A44AAC61DA4 for ; Mon, 6 Mar 2023 22:42:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229680AbjCFWmQ (ORCPT ); Mon, 6 Mar 2023 17:42:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43248 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230181AbjCFWl7 (ORCPT ); Mon, 6 Mar 2023 17:41:59 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E268178C83 for ; Mon, 6 Mar 2023 14:41:48 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id q9-20020a17090a9f4900b00237d026fc55so7076833pjv.3 for ; Mon, 06 Mar 2023 14:41:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142508; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=J8x91/t3dwIfLtqbgCFfhhKLJcllwDKNOmC2E+E2xTE=; b=DYLWpU7tZGbEMTBqF+pAbMWErmZ3leTSXhkNG3iAzMHf9CwPyOH2PG3+hRNKU4dSKk 54VgFhxiQNZ2uL6txzOkCI5k2T2HfkNMs9t/FSAn1IM70FurZUjAiHApnSAFDrGvpjmu lZU9KdqbtDhD2zRkcXX02KbU2rJ1G9vBnthgRufITUMn5EUm3D+/IwUhQMHP/nnc8Qcn /NndLhKuagf9LNHdfQEKWVLnHi++jOuhONrDlH57z+qYjyWcgeyYpVyJxrF3Ufh2XbrM y8ctN7dvwTMZwI2MmJP/RnlK+N7+mSAm7TVDbta0+EPTwyEGj4tMO815hzJooI28zziE +v+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142508; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=J8x91/t3dwIfLtqbgCFfhhKLJcllwDKNOmC2E+E2xTE=; b=WrMj3Q20Yfmm7nvWrvNjIgVzBeEpvBNVg8R0jP7VDMpXLvhnKPk/yi1kSJdci3zPyL SGMaYbjy/BtCYZAy73ENTFsg+yn1kaLVeNEe9QdxcSD4C0R8869mMmYC7ciHkCagKZfl PKDpW+2+175G3sKkE7onnRr9SijwcOqmx8T/JoJiwwc6N9GhgbiZJbmgPrw+5q87sm38 tzbE7L7GxwnstTh7pWgfH2KEAgHrDZEhFezL4YDmEATrbBp8p5VyY4w+Rzvzoj/QNKXb wLL9/E49LuAqXGkEXpnLZSh3E7u837/ysqWTPbV5YfQOTLhtKxKaHsl7XAlZ2LPLqFsM TJpg== X-Gm-Message-State: AO0yUKVw9MHxcLUh6w9ZeBzoS8TZvXzUiKoA972OXSsVsnn2VX5lBPs6 9zsNj+M9q59r29X4k7fqtGzh1FA9cjuv X-Google-Smtp-Source: AK7set+KL1AOCKwf3Kb7nrXUmS3jKvF934DJ+eKPK4nuTqt1uFT5Dh+EO/YoXMmWOFg0NqDRQkatdQds3AQZ X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:90a:7064:b0:230:3b84:9169 with SMTP id f91-20020a17090a706400b002303b849169mr4600337pjk.2.1678142508419; Mon, 06 Mar 2023 14:41:48 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:17 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-9-vipinsh@google.com> Subject: [Patch v4 08/18] KVM: x86/mmu: Track unused mmu_shadowed_info_cache pages count via global counter From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add unused pages in mmu_shadowed_info_cache to global MMU unused page cache counter i.e. kvm_total_unused_cached_pages. These pages will be freed by MMU shrinker in future commit. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/mmu/mmu.c | 8 ++++---- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 4322c7020d5d..185719dbeb81 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -792,7 +792,8 @@ struct kvm_vcpu_arch { struct kvm_mmu_memory_cache mmu_page_header_cache; =20 /* - * Protect allocation and release of pages from mmu_shadow_page_cache. + * Protect allocation and release of pages from mmu_shadow_page_cache + * and mmu_shadowed_info_cache. 
*/ struct mutex mmu_shadow_page_cache_lock; =20 diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 0a0962d8108b..b7ca31b5699c 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -715,8 +715,8 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcp= u, bool maybe_indirect) return r; =20 if (maybe_indirect) { - r =3D kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadowed_info_cache, - PT64_ROOT_MAX_LEVEL); + r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadowed_info_cache, + PT64_ROOT_MAX_LEVEL); if (r) return r; } @@ -729,8 +729,8 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcp= u) kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache); mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache); + mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadowed_info_cache); mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); - kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_info_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); } =20 @@ -2197,7 +2197,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_shadow_page= (struct kvm *kvm, sp =3D kvm_mmu_memory_cache_alloc(caches->page_header_cache); sp->spt =3D mmu_sp_memory_cache_alloc(caches->shadow_page_cache); if (!role.direct) - sp->shadowed_translation =3D kvm_mmu_memory_cache_alloc(caches->shadowed= _info_cache); + sp->shadowed_translation =3D mmu_sp_memory_cache_alloc(caches->shadowed_= info_cache); =20 set_page_private(virt_to_page(sp->spt), (unsigned long)sp); =20 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95DA7C61DA4 for ; Mon, 6 Mar 2023 22:42:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229542AbjCFWmT (ORCPT ); Mon, 6 Mar 2023 17:42:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43802 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229780AbjCFWmF (ORCPT ); Mon, 6 Mar 2023 17:42:05 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 979B180E26 for ; Mon, 6 Mar 2023 14:41:50 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id s8-20020a170902b18800b0019c92f56a8aso6710666plr.22 for ; Mon, 06 Mar 2023 14:41:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142510; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ld4MGsQjtnTx0Dcfr9NtPlok1/1/gYNLFbhbEuUoYrA=; b=krwov+RRGwrmv2jjs5BE/jAirVZA4W+6fdZ3k7jqm0NcyoDyVN0Iy2f7RNdtZV6pL0 taP6SnYbTHXrwcU/w37ORxzzFBrYm+IM3kOf6FMHbIrJEsfgizuWkBijapf8wpi7Ccj8 IJtbaNqH913nKSrQP43U/ZTB0esbAJr1VnGS6Y0n27248m976Z3s9LHepZESK0jcrxYz ya5ruaw6MSO5mNR77y9pEo9E4FVyvmRZyHx/KGRJYnaLsDce4p0oh8njFTrb3lPdYVLi kpoqjKWJSECDlZ/hBKu0b2pGTcIO/OJSdAv2hJC3/3EEP2PCwYDWa9v1UDkMlUQFcY8k y6Ig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142510; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ld4MGsQjtnTx0Dcfr9NtPlok1/1/gYNLFbhbEuUoYrA=; 
b=Vkt544vK0Jeo1XOgYpk9WUqvZNvS8Uj2OcwLwRBEtZFdVn4IRUvjjKCDi9yJwTohDF 7fv1pKFX4Kdx2IyBzpSZ1r94fc+UrpJy7FszfYk6KilKqYhv8RJWS6gu/K9u10f54AVs +G0dMjboZh6y2kH1a2kMDMyZ6XK8M3AhYSitZHvMjyYgPGbCuH7YeELohjXU2PpDvUEt DfHfHnfyM4cPO2/K4U4Qybb98eKHneqi1PDL7onRY9ds6psfIj3+KvaUPAA6hr9BURgj xhIKK7lgY5ZzkKiQPXUCHJvr57rCvpH766TRlpXP4D8wPD/zNa/6dbaSxaaj96czdHIh ZphQ== X-Gm-Message-State: AO0yUKVvBxzzco+ZsAbZ+CDaeOw7ENJfAhaya1iG6xK1D228xIo+ZFfp N7TB9GMSHhzLOjugMsz2RC5VNG0UKbJu X-Google-Smtp-Source: AK7set+nYVXnMVTVXx8L/0tlevRKDekxzEjuxmVVtiFTH9jPAVLUzGzbP9E9MA8O/R60f9kn2yP7smPPz7Ml X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:903:428b:b0:19a:8751:4dfc with SMTP id ju11-20020a170903428b00b0019a87514dfcmr4898541plb.1.1678142510172; Mon, 06 Mar 2023 14:41:50 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:18 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-10-vipinsh@google.com> Subject: [Patch v4 09/18] KVM: x86/mmu: Shrink mmu_shadowed_info_cache via MMU shrinker From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Shrink shadow page cache via MMU shrinker based on kvm_total_unused_cached_pages. Tested by running dirty_log_perf_test while dropping cache via "echo 2 > /proc/sys/vm/drop_caches" at 1 second interval. Global always return to 0. Shrinker also gets invoked to remove pages in cache. 
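For reference, a minimal standalone sketch of the cache-dropping loop used in the test above (not part of this series; it simply mirrors running "echo 2 > /proc/sys/vm/drop_caches" once per second and needs root, while dirty_log_perf_test runs separately):

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	for (;;) {
		int fd = open("/proc/sys/vm/drop_caches", O_WRONLY);

		if (fd < 0)
			return 1;
		/* Writing "2" reclaims slab objects, which invokes registered shrinkers. */
		write(fd, "2", 1);
		close(fd);
		sleep(1);
	}
}
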
Above test were run with three configurations: - EPT=3DN - EPT=3DY, TDP_MMU=3DN - EPT=3DY, TDP_MMU=3DY Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index b7ca31b5699c..a4bf2e433030 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6725,6 +6725,8 @@ static unsigned long mmu_shrink_scan(struct shrinker = *shrink, kvm_for_each_vcpu(i, vcpu, kvm) { freed +=3D mmu_memory_cache_try_empty(&vcpu->arch.mmu_shadow_page_cache, &vcpu->arch.mmu_shadow_page_cache_lock); + freed +=3D mmu_memory_cache_try_empty(&vcpu->arch.mmu_shadowed_info_cac= he, + &vcpu->arch.mmu_shadow_page_cache_lock); if (freed >=3D sc->nr_to_scan) goto out; } --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28343C61DA4 for ; Mon, 6 Mar 2023 22:42:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230190AbjCFWm3 (ORCPT ); Mon, 6 Mar 2023 17:42:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44390 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230219AbjCFWmN (ORCPT ); Mon, 6 Mar 2023 17:42:13 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 900FC82ABF for ; Mon, 6 Mar 2023 14:41:52 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id ju20-20020a170903429400b0019ea5ea044aso4926233plb.21 for ; Mon, 06 Mar 2023 14:41:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142512; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=tPKVdqaDisFdGAfkqGOe8iBVNEZA/ukGWtJHDgVP+Ak=; b=klnvvsrGwCWY0xNtfxRp0OMIN4OSqxCgEdWWZinxsMr8EM4F+vjp47fD1GRBoRu8IE wZMKE+LvsZGr6cs1oHgGJoZFWnvG5xdvsLDp/0NKn/b1AR4vIl5EA8B1J15BkOjB1jLL o6IYaDl/UuNu0obuHxfVfilPm66n989izaFjrKf/m3gcPb3gyNx/SdEC1/akRr4FlvNw JuLuP9ZlyvrcmVrmDT/kaTrfx4pXk1OVn2iy0llzINsLsQHYQYo8jLQt/JYxWJLe3FIy CpN+mlqM8PTTTe5Qk0msOFVot0e31lagshAQzRtr+WP+tHSnhAXhsRtLnEbDZ87L0tfA TufA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142512; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=tPKVdqaDisFdGAfkqGOe8iBVNEZA/ukGWtJHDgVP+Ak=; b=mmJtF6aydOsmMJTttQIJMSvUxwFeO4/b+y1q8pyYmuVNFHILwuFI4IZQfJS4d6ZnBk dgg9pzd7OSytmdW0yqkU8SYf7Uw/KmsByQFVmEz9aclW/HnHN9/kDvKA/uO4Inglgdh1 cfwGeTxki19KItaqJZBwfXmLVD/BZK7WFGiW2w7F2oyDo70I5JV2n9ZMYB8NU5IuEdJP og6Dl8K6CLKYERKcNovszAnjZ+Bts7tzu9AZ1hbsNp9oZQHJeeFYxHlTIXYvRPx5HGq9 rCAo3PlmX756OS84JVBuKZW+OigDLO6ruzeXrkISv+Zm2Jb7HHt5f9J/LqIVmkFSss6k z8Xw== X-Gm-Message-State: AO0yUKVrn3a4Ez/JO8A5HPABCUuMyU6JgZEE1pJidrg9wJ1wjMlytyRa H5CCPACWaIAvmm+s+Zui1JO4ovXuuuPM X-Google-Smtp-Source: AK7set9/b67iH4rOLvz1VS8akjODPn7aZygSqISR8KUp+OH3XeVyxRNJMmsPzTjQ8S9WaZK/QW0Hqkh4C5W5 X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a62:8782:0:b0:60f:b143:8e09 with SMTP id i124-20020a628782000000b0060fb1438e09mr5432145pfe.1.1678142511907; Mon, 06 Mar 2023 14:41:51 -0800 (PST) Date: Mon, 6 
Mar 2023 14:41:19 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-11-vipinsh@google.com> Subject: [Patch v4 10/18] KVM: x86/mmu: Add per VM NUMA aware page table capability From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add KVM_CAP_NUMA_AWARE_PAGE_TABLE capability. This capability enables a VM to allocate its page tables, specifically lower level page tables, on the NUMA node of the underlying leaf physical page pointed to by the page table entry. This patch only adds the option; future patches will use the boolean numa_aware_page_table to allocate page tables on the appropriate NUMA node. For now this capability is x86 only; it can be extended to other architectures in the future if needed. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_host.h | 6 ++++++ arch/x86/kvm/x86.c | 10 ++++++++++ include/uapi/linux/kvm.h | 1 + 3 files changed, 17 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 4322c7020d5d..64de083cd6b9 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1467,6 +1467,12 @@ struct kvm_arch { */ #define SPLIT_DESC_CACHE_MIN_NR_OBJECTS (SPTE_ENT_PER_PAGE + 1) struct kvm_mmu_memory_cache split_desc_cache; + + /* + * If true then allocate page tables near to underlying physical page + * NUMA node.
+ */ + bool numa_aware_page_table; }; =20 struct kvm_vm_stat { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index f706621c35b8..71728abd7f92 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4425,6 +4425,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, lon= g ext) case KVM_CAP_VAPIC: case KVM_CAP_ENABLE_CAP: case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES: + case KVM_CAP_NUMA_AWARE_PAGE_TABLE: r =3D 1; break; case KVM_CAP_EXIT_HYPERCALL: @@ -6391,6 +6392,15 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, } mutex_unlock(&kvm->lock); break; + case KVM_CAP_NUMA_AWARE_PAGE_TABLE: + r =3D -EINVAL; + mutex_lock(&kvm->lock); + if (!kvm->created_vcpus) { + kvm->arch.numa_aware_page_table =3D true; + r =3D 0; + } + mutex_unlock(&kvm->lock); + break; default: r =3D -EINVAL; break; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index d77aef872a0a..5f367a93762a 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1184,6 +1184,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224 #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225 #define KVM_CAP_PMU_EVENT_MASKED_EVENTS 226 +#define KVM_CAP_NUMA_AWARE_PAGE_TABLE 227 =20 #ifdef KVM_CAP_IRQ_ROUTING =20 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4FF1CC6FD1A for ; Mon, 6 Mar 2023 22:42:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230257AbjCFWmi (ORCPT ); Mon, 6 Mar 2023 17:42:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44492 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230253AbjCFWmQ (ORCPT ); Mon, 6 Mar 2023 17:42:16 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D62D82AA4 for ; Mon, 6 Mar 2023 14:41:54 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id q24-20020a17090a2e1800b00237c37964d4so7060816pjd.8 for ; Mon, 06 Mar 2023 14:41:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142513; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=r1N1Z8zVIXipYiULB/HP9W6AmAVLLJyV6vBxJYQxV+M=; b=FfApUJ58/AuBXxawVuRox/02qb6FsGRX+7YTDtxPU77jaZJBO4CnYJ1tmTBRvQoTf4 1WxdM0Of9MHAMx7Iss3kdWr8ZALDta22kOBWVqiTaxdJ74ntgdGRN2qB0TvXnC7I/0DJ DKAOzvDiaXDO5KEGXKY8UECBsHAYEqJ3tyk786Zz1nA9hk88HjSUnQTCvIGVNTjyVyy+ 5Y3kjVTQAHALJm6tEh1ATDL4UpbYfcP1+OeksxLByOk0BGJfcJ+oj1jxbAPaJfx7D6EB 3j0pXjmg8Bmy4YsD37thgFNRRjZd3v7whTdsYWmtsVHeTVu+dL5SruQ2kAGNnnPXe40q Vveg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142513; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=r1N1Z8zVIXipYiULB/HP9W6AmAVLLJyV6vBxJYQxV+M=; b=R9EHvJVqo3OPUjzE8Khxwz3DASUnVIxU1Q4D3HxPge6B9ttBQ0OPBiN5gXza7MOzik J6eFmxNPoBThI/g4v+6czX3AsRyETE6QYYYS0sGJdyIOWS5+mKniQuY5dbhmSKCBiJJV dfdgAWVrLHbDvyvJEq3Ui0fS4r462dXt/zOEzjGxSmdTN2puyPeCLOftNftpCCnKy2rg JGwmcg+OBsSJB4PRsb37zDsfxZY0TdAOwXCJ5M7OqmDd3+dMP/bc9RbC93EEznwyeISZ kQ1ehzWNt+0o3P8wJcsuuDS2KaNI/bCFuyeHsSVcLIEjshoB1fWyxcBMkEpocDsd4Lst 18Yg== 
X-Gm-Message-State: AO0yUKX3gFZKvnDC5PhFPi8dgrw2PQY+oLwybEcpfFngmRQN0wErcs2I 5d+6R+KdrZk4M21TG3hnUHmWH/XYggBw X-Google-Smtp-Source: AK7set8snFnyvhJ+XXtlE8UT1yhT6chY+2StfCzR+MgJQQLYlG/aXCTvrRYV3Hglo9Y68BVO2tl7+OkABeGg X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a62:8245:0:b0:5a8:b093:ff67 with SMTP id w66-20020a628245000000b005a8b093ff67mr5451354pfd.4.1678142513657; Mon, 06 Mar 2023 14:41:53 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:20 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-12-vipinsh@google.com> Subject: [Patch v4 11/18] KVM: x86/mmu: Add documentation of NUMA aware page table capability From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add documentation for KVM_CAP_NUMA_AWARE_PAGE_TABLE capability and explain why it is needed. Signed-off-by: Vipin Sharma --- Documentation/virt/kvm/api.rst | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 62de0768d6aa..7e3a1299ca8e 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7669,6 +7669,35 @@ This capability is aimed to mitigate the threat that= malicious VMs can cause CPU stuck (due to event windows don't open up) and make the CPU unavailable to host or other VMs. =20 +7.34 KVM_CAP_NUMA_AWARE_PAGE_TABLE +------------------------------ + +:Architectures: x86 +:Target: VM +:Returns: 0 on success, -EINVAL if vCPUs are already created. + +This capability allows userspace to enable NUMA aware page tables allocati= ons. +NUMA aware page tables are disabled by default. Once enabled, prior to vCPU +creation, any page table allocated during the life of a VM will be allocat= ed +preferably from the NUMA node of the leaf page. + +Without this capability, default feature is to use current thread mempolic= y and +allocate page table based on that. + +This capability is useful to improve page accesses by a guest. For example= , an +initialization thread which access lots of remote memory and ends up creat= ing +page tables on local NUMA node, or some service thread allocates memory on +remote NUMA nodes and later worker/background threads accessing that memory +will end up accessing remote NUMA node page tables. So, a multi NUMA node +guest, can with high confidence access local memory faster instead of going +through remote page tables first. + +This capability is also helpful for host to reduce live migration impact w= hen +splitting huge pages during dirty log operations. If the thread splitting = huge +page is on remote NUMA node it will create page tables on remote node. Eve= n if +guest is careful in making sure that it only access local memory they will= end +up accessing remote page tables. + 8. Other capabilities. 
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =20 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54A4FC61DA4 for ; Mon, 6 Mar 2023 22:42:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229958AbjCFWmu (ORCPT ); Mon, 6 Mar 2023 17:42:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44630 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230212AbjCFWmU (ORCPT ); Mon, 6 Mar 2023 17:42:20 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2321D8616F for ; Mon, 6 Mar 2023 14:41:57 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id gf1-20020a17090ac7c100b002369bf87b7aso2981902pjb.8 for ; Mon, 06 Mar 2023 14:41:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142515; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=IyMdgcKokNAQM99PayI8bLjHWwV0ForFIy4aNy3o9lk=; b=KgBV2fZBDR75hRmr5WoQIaFK5nxiY1MuIyQqq/Lz0E/QkDMmYsjZW3MRa34kuMgQMb WFHu0aMlW0ZFuhTdf+BmvoqZ42p2tY1SE+KT3f1mhiQULkycdP5rOo+nnu4LzHCaooJ7 c6KkY1/mkucfLXMoqhCsBuMbnzaO5qx/JDFT0cJHALyKeLvsY4NyqQuiVdy85EjXLsLj VqtvdmAWINjXm8+4BEkwg0o7gzzCctUf18vvRdgNMpLF7UkstTX9iKrOdH3J4OBw8Q5n YV4u2mNlr9CEFSbUMSh3uexUnEJUUQJOO7ivnf1EBJMSHq80MICp48/sQvCz3yCvWo84 R6tw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142515; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=IyMdgcKokNAQM99PayI8bLjHWwV0ForFIy4aNy3o9lk=; b=Gi1hyW6g20ninlaU5eCMqjj0rJbRVVMPI27UTuW6yPFRkOaqgn7Yl6NMokkXW4Yxzb KUaRRPpI3j5lg9pNCVX8SB9iIqdEY9rCViMveQEU+YY5p8Pk/z7VJMUwfjTVvQOH+Omf HsNvT3DK4ILbeYpSQ8EyUZlfKbgVbINrnIw0dMtM5DjUMyRO/G86bQqGSStq9LlQVdiJ KiaXjEJ5IBYtGyowm1JwtH53wq6tyYQYNhTySoVEwMr/78dV1d9lXOk7wJfAabhyYN2d 6WK3QlYzLodnEhe191/hz0G7kLqCjH773xDgGJ+21TMHt0MwIxgpcQU3LvQ9uG2QAnzT 2Otg== X-Gm-Message-State: AO0yUKVHP0R0AO+XEqduf9m8E+I5yluCeFsweWxXPU2UKOCdaSmBK/tY UTtmkQtJwj1In9pWiYaqTpiT62LCUvVs X-Google-Smtp-Source: AK7set9+eOiqb/vToH7alNWaohc4OhzJSaVaQWex5JU8jZbtLK/hP7rjgkLPv4lw0xP+Po/bflIraSs++1b1 X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:90a:5993:b0:233:b520:1544 with SMTP id l19-20020a17090a599300b00233b5201544mr6781101pji.0.1678142515516; Mon, 06 Mar 2023 14:41:55 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:21 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-13-vipinsh@google.com> Subject: [Patch v4 12/18] KVM: x86/mmu: Allocate NUMA aware page tables on TDP huge page splits From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 
quoted-printable Content-Type: text/plain; charset="utf-8" When splitting a huge page, try to allocate new lower level page tables on the same NUMA node of the huge page. Only do NUMA aware page splits if KVM has enabled NUMA aware page table for the VM else fall back to the default method of using current thread mempolicy. When huge pages are split for dirty log, new page tables are created based on the current thread mempolicy, which by default will be the NUMA node of the pCPU executing the thread. If thread enabling dirty log is on a remote NUMA node than the huge page NUMA node then it will create all page tables mapping 4KiB pages of that huge page on the remote node. This reduces performances of the vCPUs which are NUMA bound and are only accessing local NUMA memory as they will access remote NUMA node page tables to access their local NUMA node memory. Tested this feature on synthetic read-write-heavy workload in a 416 vCPU VM on a 8 NUMA node host. This workload creates multiple threads, partitions data in equal sizes and assigns them to each thread. Each thread iterates over its own data in strides, reads and writes value in its partitions. While executing, this workload continuously outputs combined rates at which it is performing operations. When dirty log is enabled in WRPROT mode, workload's performance: - Without NUMA aware page table drops by ~75% - With NUMA aware page table drops by ~20% Raw data from one example run: 1. Without NUMA aware page table Before dirty log: ~2750000 accesses/sec After dirty log: ~700000 accesses/sec 2. With NUMA aware page table Before dirty log: ~2750000 accesses/sec After dirty log: ~2250000 accesses/sec NUMA aware page table improved performance by more than 200% Signed-off-by: Vipin Sharma --- arch/x86/kvm/mmu/mmu_internal.h | 15 +++++++++++++++ arch/x86/kvm/mmu/tdp_mmu.c | 9 +++++---- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 16 ++++++++++++++++ 4 files changed, 37 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_interna= l.h index a607314348e3..b9d0e09ae974 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -340,4 +340,19 @@ void track_possible_nx_huge_page(struct kvm *kvm, stru= ct kvm_mmu_page *sp); void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *s= p); void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *cache); =20 +static inline int kvm_pfn_to_page_table_nid(struct kvm *kvm, kvm_pfn_t pfn) +{ + struct page *page; + + if (!kvm->arch.numa_aware_page_table) + return NUMA_NO_NODE; + + page =3D kvm_pfn_to_refcounted_page(pfn); + + if (page) + return page_to_nid(page); + else + return numa_mem_id(); +} + #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index d1e85012a008..61fd9c177694 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1412,7 +1412,7 @@ bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm, return spte_set; } =20 -static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp) +static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp, int ni= d) { struct kvm_mmu_page *sp; =20 @@ -1422,7 +1422,7 @@ static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_sp= lit(gfp_t gfp) if (!sp) return NULL; =20 - sp->spt =3D (void *)__get_free_page(gfp); + sp->spt =3D kvm_mmu_get_free_page(gfp, nid); if (!sp->spt) { kmem_cache_free(mmu_page_header_cache, sp); return NULL; @@ -1435,6 +1435,7 @@ static struct kvm_mmu_page 
*tdp_mmu_alloc_sp_for_spli= t(struct kvm *kvm, struct tdp_iter *iter, bool shared) { + int nid =3D kvm_pfn_to_page_table_nid(kvm, spte_to_pfn(iter->old_spte)); struct kvm_mmu_page *sp; =20 /* @@ -1446,7 +1447,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_spli= t(struct kvm *kvm, * If this allocation fails we drop the lock and retry with reclaim * allowed. */ - sp =3D __tdp_mmu_alloc_sp_for_split(GFP_NOWAIT | __GFP_ACCOUNT); + sp =3D __tdp_mmu_alloc_sp_for_split(GFP_NOWAIT | __GFP_ACCOUNT, nid); if (sp) return sp; =20 @@ -1458,7 +1459,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_spli= t(struct kvm *kvm, write_unlock(&kvm->mmu_lock); =20 iter->yielded =3D true; - sp =3D __tdp_mmu_alloc_sp_for_split(GFP_KERNEL_ACCOUNT); + sp =3D __tdp_mmu_alloc_sp_for_split(GFP_KERNEL_ACCOUNT, nid); =20 if (shared) read_lock(&kvm->mmu_lock); diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 5cfa42c130e0..31586a65e346 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1358,6 +1358,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu, bool yie= ld_to_kernel_mode); void kvm_flush_remote_tlbs(struct kvm *kvm); =20 #ifdef KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE +void *kvm_mmu_get_free_page(gfp_t gfp, int nid); int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min); int __kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int capa= city, int min); int kvm_mmu_memory_cache_nr_free_objects(struct kvm_mmu_memory_cache *mc); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 536d8ab6e61f..47006d209309 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -377,6 +377,22 @@ static void kvm_flush_shadow_all(struct kvm *kvm) } =20 #ifdef KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE + +void *kvm_mmu_get_free_page(gfp_t gfp, int nid) +{ +#ifdef CONFIG_NUMA + struct page *page; + + if (nid !=3D NUMA_NO_NODE) { + page =3D alloc_pages_node(nid, gfp, 0); + if (!page) + return (void *)0; + return page_address(page); + } +#endif /* CONFIG_NUMA */ + return (void *)__get_free_page(gfp); +} + static inline void *mmu_memory_cache_alloc_obj(struct kvm_mmu_memory_cache= *mc, gfp_t gfp_flags) { --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ADB6DC64EC4 for ; Mon, 6 Mar 2023 22:42:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230297AbjCFWm4 (ORCPT ); Mon, 6 Mar 2023 17:42:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43280 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230295AbjCFWm1 (ORCPT ); Mon, 6 Mar 2023 17:42:27 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 570FA7C9EA for ; Mon, 6 Mar 2023 14:41:59 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id 29-20020a63125d000000b005039a1e2a17so2429357pgs.8 for ; Mon, 06 Mar 2023 14:41:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142517; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Wxg15xdE/bH7bhI4zmFRH7O/lZGJ/2R6ueAvOgk8JGw=; b=kd5ycUoFKU+axxo+AJDPdY5gK23AltHJWQ9PllQmiKCr/9wWHIlr51VYjyPnjMbIB3 
FXWYI9fGg+ZzUIWcjccLPYyucKzNdEUyOya/oP/cbH+iyIrK32foylhJTbkbD2qPJisu ivHY/JoeYd3ThHTDf00nDXN/B1QztwsfcfAXwSkxReVyCx4teZGfdwRKj/hYU5XG+JpV 1QmhVrO7NFFOo4qDizBBQ+fliilDtPjyQU/4k7Wmo2jdcwZOAHfCoXo6IuwN71WBBOlE /1BSTLosLDYihhcNdVXIY8IwL/iqX+Nb4Xyc2xmcy1/jK8dBxK4LC/l4EflCSu/zh1/3 Sc1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142517; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Wxg15xdE/bH7bhI4zmFRH7O/lZGJ/2R6ueAvOgk8JGw=; b=LWkjo9L2dm4SW/poIm/weIxB4Md2WreZXWdDsgHWIj+Dfn3FP7412AqS5em+CKmVrV JbmbMnVZKmuTVlblictWOVrqTd0HUuK4HHSTFzsVdX5E+Tcv1M2iJqsG/glwV1vEKkLY cp3qRZyeRwl/vFtalahBWhjxeKbOV3Qb7riAxhpAdEj1DbhLITEb/5ikw40vakwCquh4 Bp1hhH6P+vZ1U6RYR3AmN3+Rz5nWljxBEyL6vsBuuG0ZoSkFn7lwkgjgxdj3mNKKiuip 4tLGaLnDvYo5af6Bb49EuuOj0tZGRRiFG+RhFDzm9q2diNqB64rJnAnO0CzJHYprQ9uI QAYQ== X-Gm-Message-State: AO0yUKWCfQ28zWj8OC4aFUfk6C9wDd6/ps3WwEiNYFDqQHBjF4GpUk5y ZSnBgZCvvy2H+pxUdi437BcicsrwyraY X-Google-Smtp-Source: AK7set8zyLRGjCJT715lEA2X1BTzblUJpfiGrVMGO6a2tbRTxhr7dXOJw/2D/EY+UNBggv7NLNT+hSuz8LTV X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a05:6a00:2253:b0:603:51de:c0dd with SMTP id i19-20020a056a00225300b0060351dec0ddmr5311084pfu.6.1678142517261; Mon, 06 Mar 2023 14:41:57 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:22 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-14-vipinsh@google.com> Subject: [Patch v4 13/18] KVM: mmu: Add common initialization logic for struct kvm_mmu_memory_cache{} From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add macros and function to make common logic for struct kvm_mmu_memory_cache{} declaration and initialization. Any user which wants different values in struct kvm_mmu_memory_cache{} will overwrite the default values explicitly after the initialization. 
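A minimal sketch of the two forms added here (the override lines mirror the x86 hunk below; callers that need non-default values simply overwrite the relevant fields after initialization):

	/* Standalone cache declared with the common defaults. */
	KVM_MMU_MEMORY_CACHE(cache);

	/* Embedded cache: initialize to the defaults, then override as needed. */
	INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_pte_list_desc_cache);
	vcpu->arch.mmu_pte_list_desc_cache.kmem_cache = pte_list_desc_cache;
	vcpu->arch.mmu_pte_list_desc_cache.gfp_zero = __GFP_ZERO;
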
Suggested-by: David Matlack Signed-off-by: Vipin Sharma --- arch/arm64/kvm/arm.c | 1 + arch/arm64/kvm/mmu.c | 3 ++- arch/riscv/kvm/mmu.c | 9 +++++---- arch/riscv/kvm/vcpu.c | 1 + arch/x86/kvm/mmu/mmu.c | 8 ++++++++ include/linux/kvm_types.h | 10 ++++++++++ 6 files changed, 27 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 3bd732eaf087..2b3d88e4ace8 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -330,6 +330,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) vcpu->arch.target =3D -1; bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES); =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache); vcpu->arch.mmu_page_cache.gfp_zero =3D __GFP_ZERO; =20 /* diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 7113587222ff..8a56f071ca66 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -895,7 +895,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t = guest_ipa, { phys_addr_t addr; int ret =3D 0; - struct kvm_mmu_memory_cache cache =3D { .gfp_zero =3D __GFP_ZERO }; + KVM_MMU_MEMORY_CACHE(cache); struct kvm_pgtable *pgt =3D kvm->arch.mmu.pgt; enum kvm_pgtable_prot prot =3D KVM_PGTABLE_PROT_DEVICE | KVM_PGTABLE_PROT_R | @@ -904,6 +904,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t = guest_ipa, if (is_protected_kvm_enabled()) return -EPERM; =20 + cache.gfp_zero =3D __GFP_ZERO; size +=3D offset_in_page(guest_ipa); guest_ipa &=3D PAGE_MASK; =20 diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c index 78211aed36fa..bdd8c17958dd 100644 --- a/arch/riscv/kvm/mmu.c +++ b/arch/riscv/kvm/mmu.c @@ -351,10 +351,11 @@ int kvm_riscv_gstage_ioremap(struct kvm *kvm, gpa_t g= pa, int ret =3D 0; unsigned long pfn; phys_addr_t addr, end; - struct kvm_mmu_memory_cache pcache =3D { - .gfp_custom =3D (in_atomic) ? 
GFP_ATOMIC | __GFP_ACCOUNT : 0, - .gfp_zero =3D __GFP_ZERO, - }; + KVM_MMU_MEMORY_CACHE(pcache); + + pcache.gfp_zero =3D __GFP_ZERO; + if (in_atomic) + pcache.gfp_custom =3D GFP_ATOMIC | __GFP_ACCOUNT; =20 end =3D (gpa + size + PAGE_SIZE - 1) & PAGE_MASK; pfn =3D __phys_to_pfn(hpa); diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index 7d010b0be54e..bc743e9122d1 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -163,6 +163,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) =20 /* Mark this VCPU never ran */ vcpu->arch.ran_atleast_once =3D false; + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache); vcpu->arch.mmu_page_cache.gfp_zero =3D __GFP_ZERO; bitmap_zero(vcpu->arch.isa, RISCV_ISA_EXT_MAX); =20 diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index a4bf2e433030..b706087ef74e 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5961,15 +5961,20 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) { int ret; =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_pte_list_desc_cache); vcpu->arch.mmu_pte_list_desc_cache.kmem_cache =3D pte_list_desc_cache; vcpu->arch.mmu_pte_list_desc_cache.gfp_zero =3D __GFP_ZERO; =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_header_cache); vcpu->arch.mmu_page_header_cache.kmem_cache =3D mmu_page_header_cache; vcpu->arch.mmu_page_header_cache.gfp_zero =3D __GFP_ZERO; =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadow_page_cache); vcpu->arch.mmu_shadow_page_cache.gfp_zero =3D __GFP_ZERO; mutex_init(&vcpu->arch.mmu_shadow_page_cache_lock); =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadowed_info_cache); + vcpu->arch.mmu =3D &vcpu->arch.root_mmu; vcpu->arch.walk_mmu =3D &vcpu->arch.root_mmu; =20 @@ -6131,11 +6136,14 @@ int kvm_mmu_init_vm(struct kvm *kvm) node->track_flush_slot =3D kvm_mmu_invalidate_zap_pages_in_memslot; kvm_page_track_register_notifier(kvm, node); =20 + INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_page_header_cache); kvm->arch.split_page_header_cache.kmem_cache =3D mmu_page_header_cache; kvm->arch.split_page_header_cache.gfp_zero =3D __GFP_ZERO; =20 + INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_shadow_page_cache); kvm->arch.split_shadow_page_cache.gfp_zero =3D __GFP_ZERO; =20 + INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_desc_cache); kvm->arch.split_desc_cache.kmem_cache =3D pte_list_desc_cache; kvm->arch.split_desc_cache.gfp_zero =3D __GFP_ZERO; =20 diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index 2728d49bbdf6..192516eeccac 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -98,6 +98,16 @@ struct kvm_mmu_memory_cache { int capacity; void **objects; }; + +#define KVM_MMU_MEMORY_CACHE_INIT() { } + +#define KVM_MMU_MEMORY_CACHE(_name) \ + struct kvm_mmu_memory_cache _name =3D KVM_MMU_MEMORY_CACHE_INIT() + +static inline void INIT_KVM_MMU_MEMORY_CACHE(struct kvm_mmu_memory_cache *= cache) +{ + *cache =3D (struct kvm_mmu_memory_cache)KVM_MMU_MEMORY_CACHE_INIT(); +} #endif =20 #define HALT_POLL_HIST_COUNT 32 --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49284C64EC4 for ; Mon, 6 Mar 2023 22:43:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229868AbjCFWnG (ORCPT ); Mon, 6 Mar 2023 17:43:06 -0500 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:44506 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230300AbjCFWmh (ORCPT ); Mon, 6 Mar 2023 17:42:37 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 262B586175 for ; Mon, 6 Mar 2023 14:42:06 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id t185-20020a635fc2000000b00502e332493fso2489940pgb.12 for ; Mon, 06 Mar 2023 14:42:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142519; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=9itEUO5Up+fI8iZk0iLC4qpDns9vSkVb4A516EJM+I0=; b=at6TQkK22xH96lnLyAJmubroPcHciUudqTHcykVv1J4vGrW/W+Wm443aAB9k+s4VL6 QhJ271wtkDKYq+yN0qOuZEsZhFa7kpf1EYRE9YM+4gPWg5IHwL4hklbYs1zbUgsk7Bbk 3TcmaPFIIgEvnaZO3k4WnSCc9ZFBZvKPoYHHSFBvgDLaqrbu2ks/AokOLJOKlI1j/4HE AfdPl8TZkPDz4VBBwPbmvfYoWUHigzQM0524TwkBQBlm8NzIMPuKiJEruvOwxb7pvzzS Iuja/F2k+J1CiXNukSTAtlXOklCvIeUEOyDDUU+xou9hV6faTKP4unCqrQIVi3pzdUsT qK3A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142519; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=9itEUO5Up+fI8iZk0iLC4qpDns9vSkVb4A516EJM+I0=; b=3pZhpn4VQ+2aq2icSz24gn4lSH2SW+QaMUR6VMDRH0EMh0rdM93/pT9TBs0rjxyKcO PFO6Gdr8RKrBWy7/VFce2zYW8F6tC+LufUDiVIA8KQctdB3nudnyP5+dRQ/sfEqRxQ5w xH1RyAYL6zthMFfpoxqKlXdwi4c4OIkyOTGQBjXPiJ9Sqmr7C4fRbsJpM2WqiDVHVFUV rnbfM5+Z3GGtFpiMQwMpnAC1aeyO1/zMwfGRnO5ZCgvX29GQMp7XqPaiVsU5C487kSyM HaXSHaAgtpcASAtL42jj8MFiBufSLamuSeYqseEE40gQukrTL0pwVYsYV9WMCTZbWYiP ETmQ== X-Gm-Message-State: AO0yUKUvlC0kaORWXw0Z3EWceXu+0CfqmNqHxdLk9LZB2iiL1CJBj5Lw A2Bah6PBgV8r5rO6x+l4KTW6OJNV1qBd X-Google-Smtp-Source: AK7set9LZnkcT+4mc88Pdxib2KGSekMwjKPwN1F6NV9DbxpH+EWfu7Z5xTPOZLDmNsyKAGya/UsFECxhTM3s X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:902:9a03:b0:19a:afc4:2300 with SMTP id v3-20020a1709029a0300b0019aafc42300mr5089548plp.6.1678142518978; Mon, 06 Mar 2023 14:41:58 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:23 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-15-vipinsh@google.com> Subject: [Patch v4 14/18] KVM: mmu: Initialize kvm_mmu_memory_cache.gfp_zero to __GFP_ZERO by default From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Set __GFP_ZERO to gfp_zero in default initizliation of struct kvm_mmu_memory_cache{} All of the users of default initialization code of struct kvm_mmu_memory_cache{} explicitly sets gfp_zero to __GFP_ZERO. This can be moved to common initialization logic. 
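The resulting pattern, sketched: callers rely on the default, and a cache that must not zero its objects clears the flag explicitly after initialization (MIPS does exactly this in a later patch of this series):

	INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache);
	/* Opt out of the __GFP_ZERO default. */
	vcpu->arch.mmu_page_cache.gfp_zero = 0;
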
Signed-off-by: Vipin Sharma --- arch/arm64/kvm/arm.c | 1 - arch/arm64/kvm/mmu.c | 1 - arch/riscv/kvm/mmu.c | 1 - arch/riscv/kvm/vcpu.c | 1 - arch/x86/kvm/mmu/mmu.c | 6 ------ include/linux/kvm_types.h | 4 +++- 6 files changed, 3 insertions(+), 11 deletions(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 2b3d88e4ace8..b4243978d962 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -331,7 +331,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES); =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache); - vcpu->arch.mmu_page_cache.gfp_zero =3D __GFP_ZERO; =20 /* * Default value for the FP state, will be overloaded at load diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 8a56f071ca66..133eba96c41f 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -904,7 +904,6 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t = guest_ipa, if (is_protected_kvm_enabled()) return -EPERM; =20 - cache.gfp_zero =3D __GFP_ZERO; size +=3D offset_in_page(guest_ipa); guest_ipa &=3D PAGE_MASK; =20 diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c index bdd8c17958dd..62550fd91c70 100644 --- a/arch/riscv/kvm/mmu.c +++ b/arch/riscv/kvm/mmu.c @@ -353,7 +353,6 @@ int kvm_riscv_gstage_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t addr, end; KVM_MMU_MEMORY_CACHE(pcache); =20 - pcache.gfp_zero =3D __GFP_ZERO; if (in_atomic) pcache.gfp_custom =3D GFP_ATOMIC | __GFP_ACCOUNT; =20 diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c index bc743e9122d1..f5a96ed1e426 100644 --- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -164,7 +164,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) /* Mark this VCPU never ran */ vcpu->arch.ran_atleast_once =3D false; INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache); - vcpu->arch.mmu_page_cache.gfp_zero =3D __GFP_ZERO; bitmap_zero(vcpu->arch.isa, RISCV_ISA_EXT_MAX); =20 /* Setup ISA features available to VCPU */ diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index b706087ef74e..d96afc849ee8 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5963,14 +5963,11 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_pte_list_desc_cache); vcpu->arch.mmu_pte_list_desc_cache.kmem_cache =3D pte_list_desc_cache; - vcpu->arch.mmu_pte_list_desc_cache.gfp_zero =3D __GFP_ZERO; =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_header_cache); vcpu->arch.mmu_page_header_cache.kmem_cache =3D mmu_page_header_cache; - vcpu->arch.mmu_page_header_cache.gfp_zero =3D __GFP_ZERO; =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadow_page_cache); - vcpu->arch.mmu_shadow_page_cache.gfp_zero =3D __GFP_ZERO; mutex_init(&vcpu->arch.mmu_shadow_page_cache_lock); =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadowed_info_cache); @@ -6138,14 +6135,11 @@ int kvm_mmu_init_vm(struct kvm *kvm) =20 INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_page_header_cache); kvm->arch.split_page_header_cache.kmem_cache =3D mmu_page_header_cache; - kvm->arch.split_page_header_cache.gfp_zero =3D __GFP_ZERO; =20 INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_shadow_page_cache); - kvm->arch.split_shadow_page_cache.gfp_zero =3D __GFP_ZERO; =20 INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_desc_cache); kvm->arch.split_desc_cache.kmem_cache =3D pte_list_desc_cache; - kvm->arch.split_desc_cache.gfp_zero =3D __GFP_ZERO; =20 return 0; } diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index 192516eeccac..5da7953532ce 100644 --- 
a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -99,7 +99,9 @@ struct kvm_mmu_memory_cache { void **objects; }; =20 -#define KVM_MMU_MEMORY_CACHE_INIT() { } +#define KVM_MMU_MEMORY_CACHE_INIT() { \ + .gfp_zero =3D __GFP_ZERO, \ +} =20 #define KVM_MMU_MEMORY_CACHE(_name) \ struct kvm_mmu_memory_cache _name =3D KVM_MMU_MEMORY_CACHE_INIT() --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8CA82C61DA4 for ; Mon, 6 Mar 2023 22:43:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230176AbjCFWnP (ORCPT ); Mon, 6 Mar 2023 17:43:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44364 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229975AbjCFWmq (ORCPT ); Mon, 6 Mar 2023 17:42:46 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5146F79B3C for ; Mon, 6 Mar 2023 14:42:13 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id iw4-20020a170903044400b0019ccafc1fbeso6574394plb.3 for ; Mon, 06 Mar 2023 14:42:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142520; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=QF0YVo49bCSm6bcyFWH7OqAKf22akzrbGA/pGOKX2s4=; b=j70Rx6ona9KIAnQ4HWSz5x9GD68BX0x1weHZP2/r+WzR2h1/SSUiCMoGT0/LMn8Krl hDwgLLwODSiQraOXzkU5foUjHwl43UOCrK1AKg2uMA8DJsB9kMN/FpRAawq/Jm/VX/cJ LD8cxRl72IW9wCnhEe3Tet/54wvO6hOSZMHRw53rLydqkVnq/i9RrbKZqXUkjhisFLgo MwQy/8x+BSK/8fCADVgj6wpzAz8cfMGDyyHWx55HQAg+EUJ4PFbAHSh2HUhxjhsnfQjL g901dK4CRZerMpvmU1+ZNo/lW5rSjLY+I7K/GMlMMlDpjMd2e9OzGaB7LM7xYGdRwzrT ePNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142520; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=QF0YVo49bCSm6bcyFWH7OqAKf22akzrbGA/pGOKX2s4=; b=IE2/H5Gf5/gIqu2n6ic3oU8hRi7uUMfAxJRl3j8yXGsWf+SZKOd5jLGMYoPKZGz8zI /l+uzn7XBlVJXIUrVc7SOUnT9hwIXmIKEOlNI6IP2+32c8Rg3S92OBXPQUNhMiZTOxkV 3bQ8z19e+vfa4qYfJxyAIqn3RerCbkW+mQKZ7rlrgdyMqPq18qe8tamVz4gszfVtNcTO P+iMsBrCSnrjqlGHlggxlw1kIETcHaXI+qEb/AhJM1g4lSTaXkKCO7itLzZSVUSJgWFQ 2gL6Xw0jkFnZuGglTtsXK2tiFxB7g+FpnQGbymkate6QTjL01coR+I5R33GbYDxejN5b wyhw== X-Gm-Message-State: AO0yUKVNrckwI8kExR9QyqNbc+vEdrcspPq81oqir3pMZ79FadWd+7Ac JAZ9LWcxGkKDW6jK7icvptxg7VxpinEy X-Google-Smtp-Source: AK7set9V/EfuOapXLc/xwlEXZ52sVooC/583kWWlWB81qUshsgV2Atlufd/VpVQMEq1cD94C2YRFlFJFN9P5 X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a63:3747:0:b0:503:2d50:5bf1 with SMTP id g7-20020a633747000000b005032d505bf1mr4047163pgn.7.1678142520704; Mon, 06 Mar 2023 14:42:00 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:24 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-16-vipinsh@google.com> Subject: [Patch v4 15/18] KVM: mmu: Add NUMA node support in struct kvm_mmu_memory_cache{} From: Vipin Sharma To: 
seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add a NUMA node id variable in struct kvm_mmu_memory_cache{}. This variable denotes the preferred NUMA node from which memory will be allocated under this memory cache. Set this variable to NUMA_NO_NODE if there is no preferred node. MIPS doesn't do any sort of initialization of struct kvm_mmu_memory_cache{}. Keep MIPS behavior unchanged by setting gfp_zero back to 0, since INIT_KVM_MMU_MEMORY_CACHE() would otherwise initialize it to __GFP_ZERO. "node" cannot be left as 0, as 0 is a valid NUMA node value. Signed-off-by: Vipin Sharma --- arch/mips/kvm/mips.c | 3 +++ include/linux/kvm_types.h | 3 +++ 2 files changed, 6 insertions(+) diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c index 36c8991b5d39..5ec5ce919918 100644 --- a/arch/mips/kvm/mips.c +++ b/arch/mips/kvm/mips.c @@ -294,6 +294,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) HRTIMER_MODE_REL); vcpu->arch.comparecount_timer.function =3D kvm_mips_comparecount_wakeup; =20 + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_cache); + vcpu->arch.mmu_page_cache.gfp_zero =3D 0; + /* * Allocate space for host mode exception handlers that handle * guest mode exits diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index 5da7953532ce..b2a405c8e629 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -97,10 +97,13 @@ struct kvm_mmu_memory_cache { struct kmem_cache *kmem_cache; int capacity; void **objects; + /* Preferred NUMA node of memory allocation.
*/ + int node; }; =20 #define KVM_MMU_MEMORY_CACHE_INIT() { \ .gfp_zero =3D __GFP_ZERO, \ + .node =3D NUMA_NO_NODE, \ } =20 #define KVM_MMU_MEMORY_CACHE(_name) \ --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3963C6FD1B for ; Mon, 6 Mar 2023 22:43:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230287AbjCFWnS (ORCPT ); Mon, 6 Mar 2023 17:43:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44398 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229932AbjCFWmt (ORCPT ); Mon, 6 Mar 2023 17:42:49 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DF18286DDA for ; Mon, 6 Mar 2023 14:42:17 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id m9-20020a17090a7f8900b0023769205928so7063446pjl.6 for ; Mon, 06 Mar 2023 14:42:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142522; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=6vNzM0GO5MbtnDqLqku67MPZbUcarzFcKpTc3q9nP3k=; b=gazM+TYEEbP14W+rWRFTwfUrHyN3KdQf76sB5eH96SwCJUfaR6rpxh8xN+WanYYVzF QBZLrpT4BSTNPefyyG4FPpaD7f79PmPGnD6/az01XAnLI2irX1Ygk7x4FjdbcR47ltI8 3S4ScXjzy7/oo/VV+42a0ZFlbEIXdD0vA+eR8nbC1IkTPxJkfrj4/7xDVwjG32OXY1lb sOD+UhczN6cARoP51qOo89jeac+Oj5ykbaYZMttd8P4ADc9KG2JizR3WkTNQtYjmDzAn RiX7UM/yTffI3ZfCP0ekk4ug+KVybUSSaucVFneuKSZsWPQGeOC3eaia9hWC5MzitqYz IvhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142522; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=6vNzM0GO5MbtnDqLqku67MPZbUcarzFcKpTc3q9nP3k=; b=FFumZAtTs2mXlU94fNlTStX+u2TSmg04k/KmILdRAQAFRELbnUhAD5TqhTwYQXsANg oSbWEHAGq3MsehZ7hDQIOa3CuxNA6Blu1uxX8nK3aJbFiHGaQOPOgO8zkmr0L3s0yLgd yGD0K5d1T8B0tuVXMXEi+6dK4jidugx71/FEbvUriavPIY7GRrST+id2iOTuQ/eNU2lu sl3LmzRYfkpFc3VQwN93VWLuzXyYPveTXZpsUtgSNTW3cSICyFXfqoYO97oP6fcxFuuo tYZ6rejyMKuiMPxGI4Eh5lR4L363/ZfDeUATVJvvTYBOyvZ55R/apL0f+/vn2i42/EwH rx5w== X-Gm-Message-State: AO0yUKU8uZn8MBsGF9mRz/FyGn/SBRPK3LidgKzEemm0VrPcYYrRhNRp 7Y72o0QEBnhLP2TwSHfrbYhTg6dApBOw X-Google-Smtp-Source: AK7set/eezTIEbASuhG8LhM6PyFQFMMwDXgH1qzjZn8+eIggy+CHhD62jLS147rj2/81GVxpa6dZKyGYL5T3 X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:90a:c217:b0:234:b8cb:5133 with SMTP id e23-20020a17090ac21700b00234b8cb5133mr4591584pjt.7.1678142522518; Mon, 06 Mar 2023 14:42:02 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:25 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-17-vipinsh@google.com> Subject: [Patch v4 16/18] KVM: x86/mmu: Allocate numa aware page tables during page fault From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: 
bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Allocate page tables on the preferred NUMA node via memory cache during page faults. If memory cache doesn't have a preferred NUMA node (node value is set to NUMA_NO_NODE) then fallback to the default logic where pages are selected based on thread's mempolicy. Also, free NUMA aware page caches, mmu_shadow_page_cache, when memory shrinker is invoked. Allocate root pages based on the current thread's NUMA node as there is no way to know which will be the ideal NUMA node in long run. This commit allocate page tables to be on the same NUMA node as the physical page pointed by them, even if a vCPU causing page fault is on a different NUMA node. If memory is not available on the requested NUMA node then the other nearest NUMA node is selected by default. NUMA aware page tables can be beneficial in cases where a thread touches lot of far memory initially and then divide work among multiple threads. VMs generally take advantage of NUMA architecture for faster memory access by moving threads to the NUMA node of the memory they are accessing. This change will help them in accessing pages faster. Downside of this change is that an experimental workload can be created where a guest threads are always accessing remote memory and not the one local to them. This will cause performance to degrade compared to VMs where numa aware page tables are not enabled. Ideally, these VMs when using non-uniform memory access machine should generally be taking advantage of NUMA architecture to improve their performance in the first place. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu/mmu.c | 63 ++++++++++++++++++++++++--------- arch/x86/kvm/mmu/mmu_internal.h | 24 ++++++++++++- arch/x86/kvm/mmu/paging_tmpl.h | 4 +-- arch/x86/kvm/mmu/tdp_mmu.c | 14 +++++--- include/linux/kvm_types.h | 6 ++++ virt/kvm/kvm_main.c | 2 +- 7 files changed, 88 insertions(+), 27 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 64de083cd6b9..77d3aa368e5e 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -787,7 +787,7 @@ struct kvm_vcpu_arch { struct kvm_mmu *walk_mmu; =20 struct kvm_mmu_memory_cache mmu_pte_list_desc_cache; - struct kvm_mmu_memory_cache mmu_shadow_page_cache; + struct kvm_mmu_memory_cache mmu_shadow_page_cache[MAX_NUMNODES]; struct kvm_mmu_memory_cache mmu_shadowed_info_cache; struct kvm_mmu_memory_cache mmu_page_header_cache; =20 diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index d96afc849ee8..86f0d74d35ed 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -702,7 +702,7 @@ static void mmu_free_sp_memory_cache(struct kvm_mmu_mem= ory_cache *cache) =20 static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indir= ect) { - int r; + int r, nid =3D KVM_MMU_DEFAULT_CACHE_INDEX; =20 /* 1 rmap, 1 parent PTE per level, and the prefetched rmaps. 
*/ r =3D kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache, @@ -710,7 +710,16 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vc= pu, bool maybe_indirect) if (r) return r; =20 - r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache, PT64_R= OOT_MAX_LEVEL); + if (kvm_numa_aware_page_table_enabled(vcpu->kvm)) { + for_each_online_node(nid) { + r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache[nid], + PT64_ROOT_MAX_LEVEL); + } + } else { + r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache[nid], + PT64_ROOT_MAX_LEVEL); + } + if (r) return r; =20 @@ -726,9 +735,12 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vc= pu, bool maybe_indirect) =20 static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) { + int nid; + kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache); mutex_lock(&vcpu->arch.mmu_shadow_page_cache_lock); - mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache); + for_each_node(nid) + mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache[nid]); mmu_free_sp_memory_cache(&vcpu->arch.mmu_shadowed_info_cache); mutex_unlock(&vcpu->arch.mmu_shadow_page_cache_lock); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); @@ -2245,12 +2257,12 @@ static struct kvm_mmu_page *__kvm_mmu_get_shadow_pa= ge(struct kvm *kvm, } =20 static struct kvm_mmu_page *kvm_mmu_get_shadow_page(struct kvm_vcpu *vcpu, - gfn_t gfn, + gfn_t gfn, int nid, union kvm_mmu_page_role role) { struct shadow_page_caches caches =3D { .page_header_cache =3D &vcpu->arch.mmu_page_header_cache, - .shadow_page_cache =3D &vcpu->arch.mmu_shadow_page_cache, + .shadow_page_cache =3D &vcpu->arch.mmu_shadow_page_cache[nid], .shadowed_info_cache =3D &vcpu->arch.mmu_shadowed_info_cache, }; =20 @@ -2305,15 +2317,18 @@ static union kvm_mmu_page_role kvm_mmu_child_role(u= 64 *sptep, bool direct, =20 static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn, - bool direct, unsigned int access) + bool direct, unsigned int access, + kvm_pfn_t pfn) { union kvm_mmu_page_role role; + int nid; =20 if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep)) return ERR_PTR(-EEXIST); =20 role =3D kvm_mmu_child_role(sptep, direct, access); - return kvm_mmu_get_shadow_page(vcpu, gfn, role); + nid =3D kvm_pfn_to_mmu_cache_nid(vcpu->kvm, pfn); + return kvm_mmu_get_shadow_page(vcpu, gfn, nid, role); } =20 static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *i= terator, @@ -3205,7 +3220,8 @@ static int direct_map(struct kvm_vcpu *vcpu, struct k= vm_page_fault *fault) if (it.level =3D=3D fault->goal_level) break; =20 - sp =3D kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, true, ACC_ALL); + sp =3D kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, true, + ACC_ALL, fault->pfn); if (sp =3D=3D ERR_PTR(-EEXIST)) continue; =20 @@ -3625,6 +3641,7 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gf= n_t gfn, int quadrant, { union kvm_mmu_page_role role =3D vcpu->arch.mmu->root_role; struct kvm_mmu_page *sp; + int nid; =20 role.level =3D level; role.quadrant =3D quadrant; @@ -3632,7 +3649,8 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gf= n_t gfn, int quadrant, WARN_ON_ONCE(quadrant && !role.has_4_byte_gpte); WARN_ON_ONCE(role.direct && role.has_4_byte_gpte); =20 - sp =3D kvm_mmu_get_shadow_page(vcpu, gfn, role); + nid =3D kvm_mmu_root_page_cache_nid(vcpu->kvm); + sp =3D kvm_mmu_get_shadow_page(vcpu, gfn, nid, role); ++sp->root_count; =20 return __pa(sp->spt); @@ -5959,7 +5977,7 @@ static int 
__kvm_mmu_create(struct kvm_vcpu *vcpu, st= ruct kvm_mmu *mmu) =20 int kvm_mmu_create(struct kvm_vcpu *vcpu) { - int ret; + int ret, nid; =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_pte_list_desc_cache); vcpu->arch.mmu_pte_list_desc_cache.kmem_cache =3D pte_list_desc_cache; @@ -5967,7 +5985,12 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu) INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_page_header_cache); vcpu->arch.mmu_page_header_cache.kmem_cache =3D mmu_page_header_cache; =20 - INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadow_page_cache); + for_each_node(nid) { + INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadow_page_cache[nid]); + if (kvm_numa_aware_page_table_enabled(vcpu->kvm)) + vcpu->arch.mmu_shadow_page_cache[nid].node =3D nid; + } + mutex_init(&vcpu->arch.mmu_shadow_page_cache_lock); =20 INIT_KVM_MMU_MEMORY_CACHE(&vcpu->arch.mmu_shadowed_info_cache); @@ -6695,13 +6718,17 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm,= u64 gen) } =20 static int mmu_memory_cache_try_empty(struct kvm_mmu_memory_cache *cache, - struct mutex *cache_lock) + int cache_count, struct mutex *cache_lock) { - int freed =3D 0; + int freed =3D 0, nid; =20 if (mutex_trylock(cache_lock)) { - freed =3D cache->nobjs; - kvm_mmu_empty_memory_cache(cache); + for (nid =3D 0; nid < cache_count; nid++) { + if (!cache[nid].nobjs) + continue; + freed +=3D cache[nid].nobjs; + kvm_mmu_empty_memory_cache(&cache[nid]); + } mutex_unlock(cache_lock); } return freed; @@ -6725,15 +6752,17 @@ static unsigned long mmu_shrink_scan(struct shrinke= r *shrink, list_move_tail(&kvm->vm_list, &vm_list); =20 kvm_for_each_vcpu(i, vcpu, kvm) { - freed +=3D mmu_memory_cache_try_empty(&vcpu->arch.mmu_shadow_page_cache, + freed +=3D mmu_memory_cache_try_empty(vcpu->arch.mmu_shadow_page_cache, + MAX_NUMNODES, &vcpu->arch.mmu_shadow_page_cache_lock); freed +=3D mmu_memory_cache_try_empty(&vcpu->arch.mmu_shadowed_info_cac= he, + 1, &vcpu->arch.mmu_shadow_page_cache_lock); if (freed >=3D sc->nr_to_scan) goto out; } freed +=3D mmu_memory_cache_try_empty(&kvm->arch.split_shadow_page_cache, - &kvm->slots_lock); + 1, &kvm->slots_lock); if (freed >=3D sc->nr_to_scan) goto out; } diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_interna= l.h index b9d0e09ae974..652fd0c2bcba 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -340,11 +340,16 @@ void track_possible_nx_huge_page(struct kvm *kvm, str= uct kvm_mmu_page *sp); void untrack_possible_nx_huge_page(struct kvm *kvm, struct kvm_mmu_page *s= p); void *mmu_sp_memory_cache_alloc(struct kvm_mmu_memory_cache *cache); =20 +static inline bool kvm_numa_aware_page_table_enabled(struct kvm *kvm) +{ + return kvm->arch.numa_aware_page_table; +} + static inline int kvm_pfn_to_page_table_nid(struct kvm *kvm, kvm_pfn_t pfn) { struct page *page; =20 - if (!kvm->arch.numa_aware_page_table) + if (!kvm_numa_aware_page_table_enabled(kvm)) return NUMA_NO_NODE; =20 page =3D kvm_pfn_to_refcounted_page(pfn); @@ -355,4 +360,21 @@ static inline int kvm_pfn_to_page_table_nid(struct kvm= *kvm, kvm_pfn_t pfn) return numa_mem_id(); } =20 +static inline int kvm_pfn_to_mmu_cache_nid(struct kvm *kvm, kvm_pfn_t pfn) +{ + int index =3D kvm_pfn_to_page_table_nid(kvm, pfn); + + if (index =3D=3D NUMA_NO_NODE) + return KVM_MMU_DEFAULT_CACHE_INDEX; + + return index; +} + +static inline int kvm_mmu_root_page_cache_nid(struct kvm *kvm) +{ + if (kvm_numa_aware_page_table_enabled(kvm)) + return numa_mem_id(); + + return KVM_MMU_DEFAULT_CACHE_INDEX; +} #endif /* __KVM_X86_MMU_INTERNAL_H */ 
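[Editorial illustration, not part of the patch.] The helpers added to mmu_internal.h reduce the cache choice to a small decision: a VM without NUMA aware page tables always uses index KVM_MMU_DEFAULT_CACHE_INDEX, while a NUMA aware VM indexes mmu_shadow_page_cache[] by the node of the faulting pfn for child page tables and by the current thread's node for roots. The stand-alone C sketch below models that selection outside the kernel; pfn_node and cur_node are placeholder inputs standing in for kvm_pfn_to_page_table_nid() and numa_mem_id(), not real kernel calls.

#include <stdbool.h>
#include <stdio.h>

#define NUMA_NO_NODE                (-1)
#define KVM_MMU_DEFAULT_CACHE_INDEX 0

/* Cache index for a child page table: follow the pfn's node when known. */
static int child_cache_index(bool numa_aware, int pfn_node, int cur_node)
{
	if (!numa_aware)
		return KVM_MMU_DEFAULT_CACHE_INDEX;
	return pfn_node == NUMA_NO_NODE ? cur_node : pfn_node;
}

/* Cache index for a root page: follow the allocating thread's node. */
static int root_cache_index(bool numa_aware, int cur_node)
{
	return numa_aware ? cur_node : KVM_MMU_DEFAULT_CACHE_INDEX;
}

int main(void)
{
	printf("disabled            -> %d\n", child_cache_index(false, 3, 1));            /* 0 */
	printf("enabled, pfn node 3 -> %d\n", child_cache_index(true, 3, 1));             /* 3 */
	printf("enabled, unknown    -> %d\n", child_cache_index(true, NUMA_NO_NODE, 1));  /* 1 */
	printf("root, enabled       -> %d\n", root_cache_index(true, 1));                 /* 1 */
	return 0;
}
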
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 1dea9be6849d..9db8b3df434d 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -652,7 +652,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct k= vm_page_fault *fault, table_gfn =3D gw->table_gfn[it.level - 2]; access =3D gw->pt_access[it.level - 2]; sp =3D kvm_mmu_get_child_sp(vcpu, it.sptep, table_gfn, - false, access); + false, access, fault->pfn); =20 if (sp !=3D ERR_PTR(-EEXIST)) { /* @@ -706,7 +706,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct k= vm_page_fault *fault, validate_direct_spte(vcpu, it.sptep, direct_access); =20 sp =3D kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, - true, direct_access); + true, direct_access, fault->pfn); if (sp =3D=3D ERR_PTR(-EEXIST)) continue; =20 diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 61fd9c177694..63113a66f560 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -260,12 +260,12 @@ static struct kvm_mmu_page *tdp_mmu_next_root(struct = kvm *kvm, kvm_mmu_page_as_id(_root) !=3D _as_id) { \ } else =20 -static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu) +static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu, int ni= d) { struct kvm_mmu_page *sp; =20 sp =3D kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); - sp->spt =3D mmu_sp_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); + sp->spt =3D mmu_sp_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache[n= id]); =20 return sp; } @@ -304,6 +304,7 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vc= pu) union kvm_mmu_page_role role =3D vcpu->arch.mmu->root_role; struct kvm *kvm =3D vcpu->kvm; struct kvm_mmu_page *root; + int nid; =20 lockdep_assert_held_write(&kvm->mmu_lock); =20 @@ -317,7 +318,8 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vc= pu) goto out; } =20 - root =3D tdp_mmu_alloc_sp(vcpu); + nid =3D kvm_mmu_root_page_cache_nid(vcpu->kvm); + root =3D tdp_mmu_alloc_sp(vcpu, nid); tdp_mmu_init_sp(root, NULL, 0, role); =20 refcount_set(&root->tdp_mmu_root_count, 1); @@ -1149,12 +1151,14 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct k= vm_page_fault *fault) struct kvm *kvm =3D vcpu->kvm; struct tdp_iter iter; struct kvm_mmu_page *sp; - int ret =3D RET_PF_RETRY; + int ret =3D RET_PF_RETRY, nid; =20 kvm_mmu_hugepage_adjust(vcpu, fault); =20 trace_kvm_mmu_spte_requested(fault); =20 + nid =3D kvm_pfn_to_mmu_cache_nid(kvm, fault->pfn); + rcu_read_lock(); =20 tdp_mmu_for_each_pte(iter, mmu, fault->gfn, fault->gfn + 1) { @@ -1182,7 +1186,7 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm= _page_fault *fault) * The SPTE is either non-present or points to a huge page that * needs to be split. */ - sp =3D tdp_mmu_alloc_sp(vcpu); + sp =3D tdp_mmu_alloc_sp(vcpu, nid); tdp_mmu_init_child_sp(sp, &iter); =20 sp->nx_huge_page_disallowed =3D fault->huge_page_disallowed; diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index b2a405c8e629..13032da2ddfc 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -113,6 +113,12 @@ static inline void INIT_KVM_MMU_MEMORY_CACHE(struct kv= m_mmu_memory_cache *cache) { *cache =3D (struct kvm_mmu_memory_cache)KVM_MMU_MEMORY_CACHE_INIT(); } + +/* + * When NUMA aware page table option is disabled for a VM then use cache a= t the + * below index in the array of NUMA caches. 
+ */ +#define KVM_MMU_DEFAULT_CACHE_INDEX 0 #endif =20 #define HALT_POLL_HIST_COUNT 32 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 47006d209309..25a549705c8e 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -401,7 +401,7 @@ static inline void *mmu_memory_cache_alloc_obj(struct k= vm_mmu_memory_cache *mc, if (mc->kmem_cache) return kmem_cache_alloc(mc->kmem_cache, gfp_flags); else - return (void *)__get_free_page(gfp_flags); + return kvm_mmu_get_free_page(gfp_flags, mc->node); } =20 int __kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int capa= city, int min) --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C19E4C64EC4 for ; Mon, 6 Mar 2023 22:43:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229886AbjCFWnU (ORCPT ); Mon, 6 Mar 2023 17:43:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44414 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230281AbjCFWmu (ORCPT ); Mon, 6 Mar 2023 17:42:50 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 963127D548 for ; Mon, 6 Mar 2023 14:42:18 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-536c02ed619so116972997b3.8 for ; Mon, 06 Mar 2023 14:42:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142524; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=VEQuTMy3BKS8pVuX5PwbgXlx8W41CrsAU35jTKJjWKM=; b=PQs6IbkXsJ9lld1VVn0ou53Tj+mD5mN/JMwKSFBqww8CU+k++n9JdjO9+5RKeOjY3j z1/gwkem0OA83eaGuvDnX+LX/N5AufzHntpp7rjRkWPcijI5hk20sgV7bwLAddkmSCj3 ac5RV0oP8KcSnkDPK9DDwAJWAsCw+Reb0+qRKsPT0kFJhdhm6SUGV//tcG3x0gh1lnqt EghVyHg1PbXKS8G87VUngcQN+Db7y8HpZLnEKYV83J/P4Uzk7ml77lO9R/1zfUJoGsbY GHY2XVCWssNUmg+Ccd1UHPOAasmBZKc7W5I6uIRMz8lPIACX4uuk7GRHJvvYhb8cYleT eHKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142524; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=VEQuTMy3BKS8pVuX5PwbgXlx8W41CrsAU35jTKJjWKM=; b=BAFHJ3gvlGCwU1wlqqEeq2t5ors2glLYcsbDzRUWydcm73w9+YHxgr5ee1qGuNdL5g jEq3YE7KvA+Nt2lfhLnRyPCV05TGDi2tHkzD/NSVS7IC+qfxifI04VNgI04vGvI33RD9 r8kArN4gQjWVBIJTGy8Ove1QprsCYsI3KvZaBpwRS10dKF5k2vji9SAmjSPOffXKauXF E8OVAKBBnCJSW0IUk22B97u4YnVNCa1Xf940DoO7YXCWV3Eq2LciOLMmFbRFy460UEbk DYrcsgnmhiE3VxFxj4kxBCJm+RuCS73iUDPbBLXtEY2hoj3LNStlo5K7jyu1i2K1ZTrJ Ea9Q== X-Gm-Message-State: AO0yUKU6hh+jAGpX5+YQwtqm7pXpgVPrf9aDUYBwsS66sX1jUdHYj9Fo 1wSkJWpEG0W6w0KbeUtG7XxgxQys3FNQ X-Google-Smtp-Source: AK7set8EdZCOmDm00QFWHsxEQuQ2fSZe9k2hobygiUXFHmNiri0mCqYuv/hf7wRRFK4ZJJIrTgckG/2/OuZ/ X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a05:6902:c3:b0:9f1:6c48:f95f with SMTP id i3-20020a05690200c300b009f16c48f95fmr5863124ybs.5.1678142524427; Mon, 06 Mar 2023 14:42:04 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:26 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: 
<20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-18-vipinsh@google.com> Subject: [Patch v4 17/18] KVM: x86/mmu: Allocate shadow mmu page table on huge page split on the same NUMA node From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When splitting a huge page and NUMA aware page split option is enabled, try to allocate new lower level page tables on the same NUMA node of the huge page. If NUMA aware page split is disabled then fallback to default policy of using current thread's mempolicy. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu/mmu.c | 42 ++++++++++++++++++++------------- arch/x86/kvm/x86.c | 8 ++++++- 3 files changed, 33 insertions(+), 19 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 77d3aa368e5e..041302d6132c 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1453,7 +1453,7 @@ struct kvm_arch { * * Protected by kvm->slots_lock. */ - struct kvm_mmu_memory_cache split_shadow_page_cache; + struct kvm_mmu_memory_cache split_shadow_page_cache[MAX_NUMNODES]; struct kvm_mmu_memory_cache split_page_header_cache; =20 /* diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 86f0d74d35ed..6d44a4e08328 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6140,7 +6140,7 @@ static void kvm_mmu_invalidate_zap_pages_in_memslot(s= truct kvm *kvm, int kvm_mmu_init_vm(struct kvm *kvm) { struct kvm_page_track_notifier_node *node =3D &kvm->arch.mmu_sp_tracker; - int r; + int r, nid; =20 INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); INIT_LIST_HEAD(&kvm->arch.possible_nx_huge_pages); @@ -6159,7 +6159,9 @@ int kvm_mmu_init_vm(struct kvm *kvm) INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_page_header_cache); kvm->arch.split_page_header_cache.kmem_cache =3D mmu_page_header_cache; =20 - INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_shadow_page_cache); + for_each_node(nid) + INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_shadow_page_cache[nid]); + =20 INIT_KVM_MMU_MEMORY_CACHE(&kvm->arch.split_desc_cache); kvm->arch.split_desc_cache.kmem_cache =3D pte_list_desc_cache; @@ -6169,10 +6171,13 @@ int kvm_mmu_init_vm(struct kvm *kvm) =20 static void mmu_free_vm_memory_caches(struct kvm *kvm) { + int nid; + kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache); kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache); mutex_lock(&kvm->slots_lock); - mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache); + for_each_node(nid) + mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache[nid]); mutex_unlock(&kvm->slots_lock); } =20 @@ -6282,7 +6287,7 @@ static inline bool need_topup(struct kvm_mmu_memory_c= ache *cache, int min) return kvm_mmu_memory_cache_nr_free_objects(cache) < min; } =20 -static bool need_topup_split_caches_or_resched(struct kvm *kvm) +static bool need_topup_split_caches_or_resched(struct kvm *kvm, int nid) { if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) return true; @@ -6294,10 +6299,10 @@ static bool need_topup_split_caches_or_resched(stru= ct kvm *kvm) */ return need_topup(&kvm->arch.split_desc_cache, 
SPLIT_DESC_CACHE_MIN_NR_OB= JECTS) || need_topup(&kvm->arch.split_page_header_cache, 1) || - need_topup(&kvm->arch.split_shadow_page_cache, 1); + need_topup(&kvm->arch.split_shadow_page_cache[nid], 1); } =20 -static int topup_split_caches(struct kvm *kvm) +static int topup_split_caches(struct kvm *kvm, int nid) { /* * Allocating rmap list entries when splitting huge pages for nested @@ -6327,10 +6332,11 @@ static int topup_split_caches(struct kvm *kvm) if (r) return r; =20 - return mmu_topup_sp_memory_cache(&kvm->arch.split_shadow_page_cache, 1); + return mmu_topup_sp_memory_cache(&kvm->arch.split_shadow_page_cache[nid],= 1); } =20 -static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u= 64 *huge_sptep) +static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u= 64 *huge_sptep, + int nid) { struct kvm_mmu_page *huge_sp =3D sptep_to_sp(huge_sptep); struct shadow_page_caches caches =3D {}; @@ -6351,7 +6357,7 @@ static struct kvm_mmu_page *shadow_mmu_get_sp_for_spl= it(struct kvm *kvm, u64 *hu =20 /* Direct SPs do not require a shadowed_info_cache. */ caches.page_header_cache =3D &kvm->arch.split_page_header_cache; - caches.shadow_page_cache =3D &kvm->arch.split_shadow_page_cache; + caches.shadow_page_cache =3D &kvm->arch.split_shadow_page_cache[nid]; =20 /* Safe to pass NULL for vCPU since requesting a direct SP. */ return __kvm_mmu_get_shadow_page(kvm, NULL, &caches, gfn, role); @@ -6359,7 +6365,7 @@ static struct kvm_mmu_page *shadow_mmu_get_sp_for_spl= it(struct kvm *kvm, u64 *hu =20 static void shadow_mmu_split_huge_page(struct kvm *kvm, const struct kvm_memory_slot *slot, - u64 *huge_sptep) + u64 *huge_sptep, int nid) =20 { struct kvm_mmu_memory_cache *cache =3D &kvm->arch.split_desc_cache; @@ -6370,7 +6376,7 @@ static void shadow_mmu_split_huge_page(struct kvm *kv= m, gfn_t gfn; int index; =20 - sp =3D shadow_mmu_get_sp_for_split(kvm, huge_sptep); + sp =3D shadow_mmu_get_sp_for_split(kvm, huge_sptep, nid); =20 for (index =3D 0; index < SPTE_ENT_PER_PAGE; index++) { sptep =3D &sp->spt[index]; @@ -6408,7 +6414,7 @@ static int shadow_mmu_try_split_huge_page(struct kvm = *kvm, u64 *huge_sptep) { struct kvm_mmu_page *huge_sp =3D sptep_to_sp(huge_sptep); - int level, r =3D 0; + int level, r =3D 0, nid; gfn_t gfn; u64 spte; =20 @@ -6422,7 +6428,9 @@ static int shadow_mmu_try_split_huge_page(struct kvm = *kvm, goto out; } =20 - if (need_topup_split_caches_or_resched(kvm)) { + nid =3D kvm_pfn_to_mmu_cache_nid(kvm, spte_to_pfn(spte)); + + if (need_topup_split_caches_or_resched(kvm, nid)) { write_unlock(&kvm->mmu_lock); cond_resched(); /* @@ -6430,12 +6438,12 @@ static int shadow_mmu_try_split_huge_page(struct kv= m *kvm, * rmap iterator should be restarted because the MMU lock was * dropped. 
*/ - r =3D topup_split_caches(kvm) ?: -EAGAIN; + r =3D topup_split_caches(kvm, nid) ?: -EAGAIN; write_lock(&kvm->mmu_lock); goto out; } =20 - shadow_mmu_split_huge_page(kvm, slot, huge_sptep); + shadow_mmu_split_huge_page(kvm, slot, huge_sptep, nid); =20 out: trace_kvm_mmu_split_huge_page(gfn, spte, level, r); @@ -6761,8 +6769,8 @@ static unsigned long mmu_shrink_scan(struct shrinker = *shrink, if (freed >=3D sc->nr_to_scan) goto out; } - freed +=3D mmu_memory_cache_try_empty(&kvm->arch.split_shadow_page_cache, - 1, &kvm->slots_lock); + freed +=3D mmu_memory_cache_try_empty(kvm->arch.split_shadow_page_cache, + MAX_NUMNODES, &kvm->slots_lock); if (freed >=3D sc->nr_to_scan) goto out; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 71728abd7f92..d8ea39b248cd 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6176,7 +6176,7 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm= _irq_level *irq_event, int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap) { - int r; + int r, nid; =20 if (cap->flags) return -EINVAL; @@ -6397,6 +6397,12 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, mutex_lock(&kvm->lock); if (!kvm->created_vcpus) { kvm->arch.numa_aware_page_table =3D true; + + mutex_lock(&kvm->slots_lock); + for_each_node(nid) { + kvm->arch.split_shadow_page_cache[nid].node =3D nid; + } + mutex_unlock(&kvm->slots_lock); r =3D 0; } mutex_unlock(&kvm->lock); --=20 2.40.0.rc0.216.gc4246ad0f0-goog From nobody Thu Sep 18 23:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C5A18C6FD1A for ; Mon, 6 Mar 2023 22:43:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229878AbjCFWnX (ORCPT ); Mon, 6 Mar 2023 17:43:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44150 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230301AbjCFWm4 (ORCPT ); Mon, 6 Mar 2023 17:42:56 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 377B17C9E3 for ; Mon, 6 Mar 2023 14:42:24 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id s15-20020a170902ea0f00b0019d0c7a83dfso6628243plg.14 for ; Mon, 06 Mar 2023 14:42:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1678142526; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=OeL8Ykd+iCS6JS8scOAtZXT6SgE3Ouj+/URgOQ2sX3s=; b=hk8vRpbEWBqSY4vpUw9bpWUeCVvQ/YeUcq66u5udQO+XKtmkRe5XynuiaC71UhBsmb MJX5G46cSivSg2VyHO2BLfr3zi+MPvsQQPrlUg0fklhseJRi492oFo6Omn0JeoMCiIDA ehcsBIttFvLE0M6Z0VE3ZVD8e2ATxMcgmx5gk5b88p/1DDvplGFnZ99N1VQ7grjJ2hX1 m4Rs2BUrBU3Yjazwt4qvtCMRJMwPuvSnaYwonWVuEmnwDyA2W/r/85gkfWZjutvxQf7J yo2LwCs1SLZHtKalYgNMvcaZgVakuyiUlG+IUprqGugBQTirUZbsBqM8O4BeJRmWkQO4 hLDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678142526; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=OeL8Ykd+iCS6JS8scOAtZXT6SgE3Ouj+/URgOQ2sX3s=; b=Pa106kt+KaH/ZjiqjsDWRMcLhSdmw/b7XSM4lV8HiEOzJx0gwpa7Fh+OqEbn8URpEu bK8x6Ve2VFPH16a2lOVtGlvrm/nme1dtkjZrjwcg6oE8xNMfofUPIsmOQtQgN+vQDPMF 
WRs727nJ2wkrs8Qa3DbXniAosN19ODLO1/5iMK+4hoZ0dLBl8XHvYKSPwT2mC3ZjGZbj RC5LOL3pO+v0yNy4XM46jE2txsZdou5OWahHTkSkEynJZ+xe5zbEPmxsLUI2881j2EkA 3ooZyDguEabwLHNxtfl2oKJsNktIk4HaYgfSAdrgzf/YygbcxypdQgH35c9UBnxpAX5S 623g== X-Gm-Message-State: AO0yUKXlv2jdpACDmxalxerY8d0mRmtb1Jjc2tbgD4SuB3QuvdARnEge 5wsG9Ah3Zwz1jANwbo3bIDnPlGHurJGa X-Google-Smtp-Source: AK7set/bN7t34vpFP3b5g8As+ZIP6EwS/0kB74bTqu6kV7sDF35fTifkklq2UmD/u6GAqDmSpVbet0qAjEWc X-Received: from vipin.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:479f]) (user=vipinsh job=sendgmr) by 2002:a17:90b:504:b0:233:df5f:4778 with SMTP id r4-20020a17090b050400b00233df5f4778mr4600717pjz.6.1678142526350; Mon, 06 Mar 2023 14:42:06 -0800 (PST) Date: Mon, 6 Mar 2023 14:41:27 -0800 In-Reply-To: <20230306224127.1689967-1-vipinsh@google.com> Mime-Version: 1.0 References: <20230306224127.1689967-1-vipinsh@google.com> X-Mailer: git-send-email 2.40.0.rc0.216.gc4246ad0f0-goog Message-ID: <20230306224127.1689967-19-vipinsh@google.com> Subject: [Patch v4 18/18] KVM: x86/mmu: Reduce default mmu memory cache size From: Vipin Sharma To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: jmattson@google.com, mizhang@google.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Reduce KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE to PT64_ROOT_MAX_LEVEL - 1. Opportunistically, use this reduced value for topping up caches. There was no specific reason to set this value to 40. With addition of multi NUMA node caches, it is good to save space and make these cachees lean. Signed-off-by: Vipin Sharma --- arch/x86/include/asm/kvm_types.h | 6 +++++- arch/x86/kvm/mmu/mmu.c | 8 ++++---- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_types.h b/arch/x86/include/asm/kvm_ty= pes.h index 08f1b57d3b62..80aff231b708 100644 --- a/arch/x86/include/asm/kvm_types.h +++ b/arch/x86/include/asm/kvm_types.h @@ -2,6 +2,10 @@ #ifndef _ASM_X86_KVM_TYPES_H #define _ASM_X86_KVM_TYPES_H =20 -#define KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE 40 +/* + * For each fault only PT64_ROOT_MAX_LEVEL - 1 pages are needed. Root + * page is allocated in a separate flow. 
+ */ +#define KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE (PT64_ROOT_MAX_LEVEL - 1) =20 #endif /* _ASM_X86_KVM_TYPES_H */ diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 6d44a4e08328..5463ce6e52fa 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -713,11 +713,11 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *v= cpu, bool maybe_indirect) if (kvm_numa_aware_page_table_enabled(vcpu->kvm)) { for_each_online_node(nid) { r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache[nid], - PT64_ROOT_MAX_LEVEL); + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE); } } else { r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadow_page_cache[nid], - PT64_ROOT_MAX_LEVEL); + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE); } =20 if (r) @@ -725,12 +725,12 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *v= cpu, bool maybe_indirect) =20 if (maybe_indirect) { r =3D mmu_topup_sp_memory_cache(&vcpu->arch.mmu_shadowed_info_cache, - PT64_ROOT_MAX_LEVEL); + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE); if (r) return r; } return kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_page_header_cache, - PT64_ROOT_MAX_LEVEL); + KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE); } =20 static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) --=20 2.40.0.rc0.216.gc4246ad0f0-goog
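[Editorial illustration, not part of the patch.] A rough worst-case estimate shows why the smaller per-cache cap matters once the caches are replicated per NUMA node: each cached object is a 4 KiB page, and every vCPU now carries one mmu_shadow_page_cache per node. The sketch below does that arithmetic under assumed example values (PT64_ROOT_MAX_LEVEL = 5, 8 nodes, 64 vCPUs); it is a back-of-envelope illustration, not measured data, and it counts only the per-vCPU caches, not the per-VM split caches that are also node-indexed after this series.

#include <stdio.h>

#define PAGE_KIB            4
#define PT64_ROOT_MAX_LEVEL 5                         /* assumed value      */
#define OLD_OBJS            40                        /* previous cap       */
#define NEW_OBJS            (PT64_ROOT_MAX_LEVEL - 1) /* cap after this series */

int main(void)
{
	long vcpus = 64, nodes = 8;  /* example VM shape, not from the patch */
	long old_kib = vcpus * nodes * OLD_OBJS * PAGE_KIB;
	long new_kib = vcpus * nodes * NEW_OBJS * PAGE_KIB;

	/* 64 vCPUs * 8 nodes: ~80 MiB worst case before, ~8 MiB after. */
	printf("old cap: %ld KiB of cached shadow pages\n", old_kib);
	printf("new cap: %ld KiB of cached shadow pages\n", new_kib);
	return 0;
}
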