From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Huang Hang,
	Lai Jiangshan
Subject: [PATCH v2 3/3] KVM: x86/mmu: Remove FNAME(is_self_change_mapping)
Date: Thu, 2 Feb 2023 18:28:17 +0000
Message-ID: <20230202182817.407394-4-seanjc@google.com>
In-Reply-To: <20230202182817.407394-1-seanjc@google.com>
References: <20230202182817.407394-1-seanjc@google.com>

From: Lai Jiangshan

Drop FNAME(is_self_change_mapping) and instead rely on
kvm_mmu_hugepage_adjust() to adjust the hugepage
accordingly.  Prior to commit 4cd071d13c5c ("KVM: x86/mmu: Move calls
to thp_adjust() down a level"), the hugepage adjustment was done before
allocating new shadow pages, i.e. failed to restrict the hugepage sizes
if a new shadow page resulted in account_shadowed() changing the
disallowed hugepage tracking.

Removing FNAME(is_self_change_mapping) fixes a bug reported by Huang
Hang where KVM unnecessarily forces a 4KiB page.
FNAME(is_self_change_mapping) has a defect in that it blindly disables
_all_ hugepage mappings rather than trying to reduce the size of the
hugepage.  If the guest is writing to a 1GiB page and the 1GiB is
self-referential but a 2MiB page is not, then KVM can and should
create a 2MiB mapping.

Add a comment above the call to kvm_mmu_hugepage_adjust() to call out
the new dependency on adjusting the hugepage size after walking
indirect PTEs.

Reported-by: Huang Hang
Signed-off-by: Lai Jiangshan
Link: https://lore.kernel.org/r/20221213125538.81209-1-jiangshanlai@gmail.com
[sean: rework changelog after separating out the emulator change]
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/mmu/paging_tmpl.h | 51 +++++-----------------------------
 1 file changed, 7 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index f57d9074fb9b..a056f2773dd9 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -690,6 +690,12 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 		fault->write_fault_to_shadow_pgtable = true;
 	}
 
+	/*
+	 * Adjust the hugepage size _after_ resolving indirect shadow pages.
+	 * KVM doesn't support mapping hugepages into the guest for gfns that
+	 * are being shadowed by KVM, i.e. allocating a new shadow page may
+	 * affect the allowed hugepage size.
+	 */
 	kvm_mmu_hugepage_adjust(vcpu, fault);
 
 	trace_kvm_mmu_spte_requested(fault);
@@ -734,41 +740,6 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
 	return RET_PF_RETRY;
 }
 
-/*
- * To see whether the mapped gfn can write its page table in the current
- * mapping.
- *
- * It is the helper function of FNAME(page_fault). When guest uses large page
- * size to map the writable gfn which is used as current page table, we should
- * force kvm to use small page size to map it because new shadow page will be
- * created when kvm establishes shadow page table that stop kvm using large
- * page size. Do it early can avoid unnecessary #PF and emulation.
- *
- * Note: the PDPT page table is not checked for PAE-32 bit guest. It is ok
- * since the PDPT is always shadowed, that means, we can not use large page
- * size to map the gfn which is used as PDPT.
- */
-static bool
-FNAME(is_self_change_mapping)(struct kvm_vcpu *vcpu,
-			      struct guest_walker *walker, bool user_fault)
-{
-	int level;
-	gfn_t mask = ~(KVM_PAGES_PER_HPAGE(walker->level) - 1);
-	bool self_changed = false;
-
-	if (!(walker->pte_access & ACC_WRITE_MASK ||
-	      (!is_cr0_wp(vcpu->arch.mmu) && !user_fault)))
-		return false;
-
-	for (level = walker->level; level <= walker->max_level; level++) {
-		gfn_t gfn = walker->gfn ^ walker->table_gfn[level - 1];
-
-		self_changed |= !(gfn & mask);
-	}
-
-	return self_changed;
-}
-
 /*
  * Page fault handler.  There are several causes for a page fault:
  *   - there is no shadow pte for the guest pte
@@ -787,7 +758,6 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 {
 	struct guest_walker walker;
 	int r;
-	bool is_self_change_mapping;
 
 	pgprintk("%s: addr %lx err %x\n", __func__, fault->addr, fault->error_code);
 	WARN_ON_ONCE(fault->is_tdp);
@@ -812,6 +782,7 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	}
 
 	fault->gfn = walker.gfn;
+	fault->max_level = walker.level;
 	fault->slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn);
 
 	if (page_fault_handle_page_track(vcpu, fault)) {
@@ -823,14 +794,6 @@ static int FNAME(page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault
 	if (r)
 		return r;
 
-	is_self_change_mapping = FNAME(is_self_change_mapping)(vcpu,
-	      &walker, fault->user);
-
-	if (is_self_change_mapping)
-		fault->max_level = PG_LEVEL_4K;
-	else
-		fault->max_level = walker.level;
-
 	r = kvm_faultin_pfn(vcpu, fault, walker.pte_access);
 	if (r != RET_PF_CONTINUE)
 		return r;
-- 
2.39.1.519.gcb327c4b5f-goog
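
For readers less familiar with the self-referential check, below is a
minimal, standalone userspace sketch of why reducing the hugepage level,
which this patch enables via kvm_mmu_hugepage_adjust(), beats forcing a
4KiB mapping outright.  It is NOT kernel code: pages_per_hpage(),
is_self_referential(), and adjust_level() are simplified, hypothetical
stand-ins for KVM_PAGES_PER_HPAGE() and KVM's level-clamping logic.

/* Build with: gcc -Wall -o sketch sketch.c */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t gfn_t;

/* Number of 4KiB pages covered by a hugepage at @level (1=4K, 2=2M, 3=1G). */
static gfn_t pages_per_hpage(int level)
{
	return (gfn_t)1 << ((level - 1) * 9);	/* 512 entries per table */
}

/* True if @gfn and @table_gfn fall within the same hugepage at @level. */
static bool is_self_referential(gfn_t gfn, gfn_t table_gfn, int level)
{
	gfn_t mask = ~(pages_per_hpage(level) - 1);

	return !((gfn ^ table_gfn) & mask);
}

/*
 * Walk down from the requested level and return the largest level at
 * which the mapping would not cover any of the guest's page tables.
 * The removed FNAME(is_self_change_mapping) instead forced level 1
 * (4KiB) on a hit at _any_ level.
 */
static int adjust_level(gfn_t gfn, const gfn_t *table_gfns, int nr_tables,
			int req_level)
{
	int level, i;

	for (level = req_level; level > 1; level--) {
		bool conflict = false;

		for (i = 0; i < nr_tables; i++)
			conflict |= is_self_referential(gfn, table_gfns[i], level);
		if (!conflict)
			return level;
	}
	return 1;
}

int main(void)
{
	/* The guest writes gfn 0x40200; its page table lives at gfn 0x40000. */
	gfn_t table_gfns[] = { 0x40000 };

	/* A 1GiB mapping (level 3) would cover the table: self-referential. */
	printf("1GiB self-referential: %d\n",
	       is_self_referential(0x40200, 0x40000, 3));

	/* A 2MiB mapping (level 2) would not, so level 2 is usable. */
	printf("chosen level: %d\n", adjust_level(0x40200, table_gfns, 1, 3));
	return 0;
}

With these inputs the sketch prints "1GiB self-referential: 1" and
"chosen level: 2", mirroring the 1GiB-vs-2MiB example in the changelog:
the old all-or-nothing check would have forced 4KiB, while per-level
adjustment settles on a 2MiB mapping.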