From nobody Mon Apr  6 14:55:51 2026
Return-Path: <linux-kernel-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 37838C433F5
	for <linux-kernel@archiver.kernel.org>; Fri, 30 Sep 2022 23:49:20 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S232051AbiI3XtR (ORCPT <rfc822;linux-kernel@archiver.kernel.org>);
        Fri, 30 Sep 2022 19:49:17 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53818 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S231487AbiI3XtK (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Fri, 30 Sep 2022 19:49:10 -0400
Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com
 [IPv6:2607:f8b0:4864:20::b49])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4A07A1A9A6C
        for <linux-kernel@vger.kernel.org>;
 Fri, 30 Sep 2022 16:49:03 -0700 (PDT)
Received: by mail-yb1-xb49.google.com with SMTP id
 v17-20020a259d91000000b006b4c31c0640so5095703ybp.18
        for <linux-kernel@vger.kernel.org>;
 Fri, 30 Sep 2022 16:49:03 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20210112;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:reply-to:from:to:cc:subject:date;
        bh=hEZ7EhPPJ9sYhcGZyIvx4H0M1XfuwlYetx5I1xX+zQQ=;
        b=rh+TSc3TeCF5RvDzhRIawAEd9A+/5cLghuQVQHt1Ng2nSTDVHS2zKr9WYjynN4jQDH
         37I2YQypl/qJYRyo7RbcWDooNiSZMi9+CoVqC7DQc3vTW7wbJ9KfBsXFRJWP7DpoiePx
         uK/DtSFnOPL4vAf/DpXrE+B9OihGVgzILcpbYlNwTxV+ha9XxPghlwvj/lVAckIjz2DC
         Ea9jgQObUno36NlCYh8Or14TDa2x6E8JZ91/JlgwYFt7/yLIujwVktRXzlxV+5M+a/v3
         sZ0BODS9WUwsO1SmJAe7nKQFZEFsQ9DBuawmKRff0NyJuLe0tsQhQ4hCF1H2T22JNyU3
         N3rQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:reply-to:x-gm-message-state:from:to:cc:subject:date;
        bh=hEZ7EhPPJ9sYhcGZyIvx4H0M1XfuwlYetx5I1xX+zQQ=;
        b=O7GMJi2+GJuHS2RSJdGFg4tBRXEKn5NU2uZ55IYLHhBEgm8RNVat8JDm/Wo3eTnaXh
         R1+5Z88Fn4cPhqXwDVO0xSUbM7DBaa8mFBd61fP3fFVtUGgL/qhmHfISvZN3O220c8Se
         gudhybtc6ODzx/lUaHZxBzInGJJzmrEQ2p82XvG94hNeF3irbUEYRt2YpMfEnxDkgMlB
         bwDSo8+UCdplGPtkQS09vRD5W7NF2XImqSwN6++UNJujEzSem/C5oZ0LLwkmQVXgT9g9
         zdLY4NKhHgQQ49EppgeNigQo7+B5bD9be5G1TL3/cv6NRGKbxO477MwdaIoT/C33UMaZ
         /EDA==
X-Gm-Message-State: ACrzQf1NGhPU7q0KS1TFbDeaINx3qRJaGuKuzCJsRjc7vddbd557pvQ6
        wRPabZSctcqxpSoVsH/oMEug+kOHV8c=
X-Google-Smtp-Source: 
 AMsMyM5bsLXkM66r5cC6Mt45xpUnsgEV7OZm/HrcPlrZ55dftAtJt11/32ew9qtm/XagC0CXj6ykYcdPhSc=
X-Received: from zagreus.c.googlers.com
 ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37])
 (user=seanjc job=sendgmr) by 2002:a0d:df4c:0:b0:357:2dd5:8821 with SMTP id
 i73-20020a0ddf4c000000b003572dd58821mr2254859ywe.228.1664581742470; Fri, 30
 Sep 2022 16:49:02 -0700 (PDT)
Reply-To: Sean Christopherson <seanjc@google.com>
Date: Fri, 30 Sep 2022 23:48:50 +0000
In-Reply-To: <20220930234854.1739690-1-seanjc@google.com>
Mime-Version: 1.0
References: <20220930234854.1739690-1-seanjc@google.com>
X-Mailer: git-send-email 2.38.0.rc1.362.ged0d419d3c-goog
Message-ID: <20220930234854.1739690-4-seanjc@google.com>
Subject: [PATCH v5 3/7] KVM: x86/mmu: Properly account NX huge page workaround
 for nonpaging MMUs
From: Sean Christopherson <seanjc@google.com>
To: Sean Christopherson <seanjc@google.com>,
        Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
        Mingwei Zhang <mizhang@google.com>,
        David Matlack <dmatlack@google.com>,
        Yan Zhao <yan.y.zhao@intel.com>,
        Ben Gardon <bgardon@google.com>
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Account and track NX huge pages for nonpaging MMUs so that a future
enhancement to precisely check if a shadow page can't be replaced by a NX
huge page doesn't get false positives.  Without correct tracking, KVM can
get stuck in a loop if an instruction is fetching and writing data on the
same huge page, e.g. KVM installs a small executable page on the fetch
fault, replaces it with an NX huge page on the write fault, and faults
again on the fetch.

Alternatively, and perhaps ideally, KVM would simply not enforce the
workaround for nonpaging MMUs.  The guest has no page tables to abuse
and KVM is guaranteed to switch to a different MMU on CR0.PG being
toggled so there's no security or performance concerns.  However, getting
make_spte() to play nice now and in the future is unnecessarily complex.

In the current code base, make_spte() can enforce the mitigation if TDP
is enabled or the MMU is indirect, but make_spte() may not always have a
vCPU/MMU to work with, e.g. if KVM were to support in-line huge page
promotion when disabling dirty logging.

Without a vCPU/MMU, KVM could either pass in the correct information
and/or derive it from the shadow page, but the former is ugly and the
latter subtly non-trivial due to the possibility of direct shadow pages
in indirect MMUs.  Given that using shadow paging with an unpaged guest
is far from top priority _and_ has been subjected to the workaround since
its inception, keep it simple and just fix the accounting glitch.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Mingwei Zhang <mizhang@google.com>
---
 arch/x86/kvm/mmu/mmu.c  |  2 +-
 arch/x86/kvm/mmu/spte.c | 12 ++++++++++++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 5744c63f16d4..431c88617cae 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3141,7 +3141,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct=
 kvm_page_fault *fault)
 			continue;
=20
 		link_shadow_page(vcpu, it.sptep, sp);
-		if (fault->is_tdp && fault->huge_page_disallowed)
+		if (fault->huge_page_disallowed)
 			account_nx_huge_page(vcpu->kvm, sp,
 					     fault->req_level >=3D it.level);
 	}
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 2e08b2a45361..c0fd7e049b4e 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -161,6 +161,18 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_p=
age *sp,
 	if (!prefetch)
 		spte |=3D spte_shadow_accessed_mask(spte);
=20
+	/*
+	 * For simplicity, enforce the NX huge page mitigation even if not
+	 * strictly necessary.  KVM could ignore the mitigation if paging is
+	 * disabled in the guest, as the guest doesn't have an page tables to
+	 * abuse.  But to safely ignore the mitigation, KVM would have to
+	 * ensure a new MMU is loaded (or all shadow pages zapped) when CR0.PG
+	 * is toggled on, and that's a net negative for performance when TDP is
+	 * enabled.  When TDP is disabled, KVM will always switch to a new MMU
+	 * when CR0.PG is toggled, but leveraging that to ignore the mitigation
+	 * would tie make_spte() further to vCPU/MMU state, and add complexity
+	 * just to optimize a mode that is anything but performance critical.
+	 */
 	if (level > PG_LEVEL_4K && (pte_access & ACC_EXEC_MASK) &&
 	    is_nx_huge_page_enabled(vcpu->kvm)) {
 		pte_access &=3D ~ACC_EXEC_MASK;
--=20
2.38.0.rc1.362.ged0d419d3c-goog