From nobody Sun Feb 8 01:33:40 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 13:04:14 -0700
In-Reply-To:
 <20251016200417.97003-1-seanjc@google.com>
X-Mailer: git-send-email 2.51.0.858.gf9c4a03a3a-goog
Message-ID: <20251016200417.97003-2-seanjc@google.com>
Subject: [PATCH v3 1/4] KVM: VMX: Flush CPU buffers as needed if L1D cache flush is skipped
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pawan Gupta, Brendan Jackman

If the L1D flush for L1TF is conditionally enabled, flush CPU buffers to
mitigate MMIO Stale Data as needed if KVM skips the L1D flush, e.g. because
none of the "heavy" paths that trigger an L1D flush were tripped since the
last VM-Enter.

Note, the flaw goes back to the introduction of the MDS mitigation.  The
MDS mitigation was inadvertently fixed by commit 43fb862de8f6 ("KVM/VMX:
Move VERW closer to VMentry for MDS mitigation"), but previous kernels that
flush CPU buffers in vmx_vcpu_enter_exit() are affected.

Fixes: 650b68a0622f ("x86/kvm/vmx: Add MDS protection when L1D Flush is not active")
Cc: Pawan Gupta
Signed-off-by: Sean Christopherson
Reviewed-by: Brendan Jackman
---
 arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f87c216d976d..ce556d5dc39b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6663,7 +6663,7 @@ int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
  * information but as all relevant affected CPUs have 32KiB L1D cache size
  * there is no point in doing so.
  */
-static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
+static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
 {
 	int size = PAGE_SIZE << L1D_CACHE_ORDER;
 
@@ -6691,14 +6691,14 @@ static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
 		kvm_clear_cpu_l1tf_flush_l1d();
 
 		if (!flush_l1d)
-			return;
+			return false;
 	}
 
 	vcpu->stat.l1d_flush++;
 
 	if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
 		native_wrmsrq(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
-		return;
+		return true;
 	}
 
 	asm volatile(
@@ -6722,6 +6722,7 @@ static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
 		:: [flush_pages] "r" (vmx_l1d_flush_pages),
 		    [size] "r" (size)
 		: "eax", "ebx", "ecx", "edx");
+	return true;
 }
 
 void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
@@ -7330,8 +7331,9 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	 * and is affected by MMIO Stale Data. In such cases mitigation in only
 	 * needed against an MMIO capable guest.
 	 */
-	if (static_branch_unlikely(&vmx_l1d_should_flush))
-		vmx_l1d_flush(vcpu);
+	if (static_branch_unlikely(&vmx_l1d_should_flush) &&
+	    vmx_l1d_flush(vcpu))
+		;
 	else if (static_branch_unlikely(&cpu_buf_vm_clear) &&
 		 (flags & VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO))
 		x86_clear_cpu_buffers();
-- 
2.51.0.858.gf9c4a03a3a-goog
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 13:04:15 -0700
In-Reply-To: <20251016200417.97003-1-seanjc@google.com>
Message-ID: <20251016200417.97003-3-seanjc@google.com>
Subject: [PATCH v3 2/4] KVM: VMX: Bundle all L1 data cache flush mitigation code together
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pawan Gupta, Brendan Jackman

Move vmx_l1d_flush(),
vmx_cleanup_l1d_flush(), and the vmentry_l1d_flush param code up in vmx.c
so that all of the L1 data cache flushing code is bundled together.  This
will allow conditioning the mitigation code on CONFIG_CPU_MITIGATIONS=y
with minimal #ifdefs.

No functional change intended.

Signed-off-by: Sean Christopherson
Reviewed-by: Brendan Jackman
---
 arch/x86/kvm/vmx/vmx.c | 176 ++++++++++++++++++++---------------------
 1 file changed, 88 insertions(+), 88 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ce556d5dc39b..cd8ae1b2ae55 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -302,6 +302,16 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 	return 0;
 }
 
+static void vmx_cleanup_l1d_flush(void)
+{
+	if (vmx_l1d_flush_pages) {
+		free_pages((unsigned long)vmx_l1d_flush_pages, L1D_CACHE_ORDER);
+		vmx_l1d_flush_pages = NULL;
+	}
+	/* Restore state so sysfs ignores VMX */
+	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
+}
+
 static int vmentry_l1d_flush_parse(const char *s)
 {
 	unsigned int i;
@@ -352,6 +362,84 @@ static int vmentry_l1d_flush_get(char *s, const struct kernel_param *kp)
 	return sysfs_emit(s, "%s\n", vmentry_l1d_param[l1tf_vmx_mitigation].option);
 }
 
+/*
+ * Software based L1D cache flush which is used when microcode providing
+ * the cache control MSR is not loaded.
+ *
+ * The L1D cache is 32 KiB on Nehalem and later microarchitectures, but to
+ * flush it is required to read in 64 KiB because the replacement algorithm
+ * is not exactly LRU. This could be sized at runtime via topology
+ * information but as all relevant affected CPUs have 32KiB L1D cache size
+ * there is no point in doing so.
+ */
+static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
+{
+	int size = PAGE_SIZE << L1D_CACHE_ORDER;
+
+	/*
+	 * This code is only executed when the flush mode is 'cond' or
+	 * 'always'
+	 */
+	if (static_branch_likely(&vmx_l1d_flush_cond)) {
+		bool flush_l1d;
+
+		/*
+		 * Clear the per-vcpu flush bit, it gets set again if the vCPU
+		 * is reloaded, i.e. if the vCPU is scheduled out or if KVM
+		 * exits to userspace, or if KVM reaches one of the unsafe
+		 * VMEXIT handlers, e.g. if KVM calls into the emulator.
+		 */
+		flush_l1d = vcpu->arch.l1tf_flush_l1d;
+		vcpu->arch.l1tf_flush_l1d = false;
+
+		/*
+		 * Clear the per-cpu flush bit, it gets set again from
+		 * the interrupt handlers.
+		 */
+		flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
+		kvm_clear_cpu_l1tf_flush_l1d();
+
+		if (!flush_l1d)
+			return false;
+	}
+
+	vcpu->stat.l1d_flush++;
+
+	if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
+		native_wrmsrq(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
+		return true;
+	}
+
+	asm volatile(
+		/* First ensure the pages are in the TLB */
+		"xorl %%eax, %%eax\n"
+		".Lpopulate_tlb:\n\t"
+		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
+		"addl $4096, %%eax\n\t"
+		"cmpl %%eax, %[size]\n\t"
+		"jne .Lpopulate_tlb\n\t"
+		"xorl %%eax, %%eax\n\t"
+		"cpuid\n\t"
+		/* Now fill the cache */
+		"xorl %%eax, %%eax\n"
+		".Lfill_cache:\n"
+		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
+		"addl $64, %%eax\n\t"
+		"cmpl %%eax, %[size]\n\t"
+		"jne .Lfill_cache\n\t"
+		"lfence\n"
+		:: [flush_pages] "r" (vmx_l1d_flush_pages),
+		    [size] "r" (size)
+		: "eax", "ebx", "ecx", "edx");
+	return true;
+}
+
+static const struct kernel_param_ops vmentry_l1d_flush_ops = {
+	.set = vmentry_l1d_flush_set,
+	.get = vmentry_l1d_flush_get,
+};
+module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
+
 static __always_inline void vmx_disable_fb_clear(struct vcpu_vmx *vmx)
 {
 	u64 msr;
@@ -404,12 +492,6 @@ static void vmx_update_fb_clear_dis(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx)
 	vmx->disable_fb_clear = false;
 }
 
-static const struct kernel_param_ops vmentry_l1d_flush_ops = {
-	.set = vmentry_l1d_flush_set,
-	.get = vmentry_l1d_flush_get,
-};
-module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
-
 static u32 vmx_segment_access_rights(struct kvm_segment *var);
 
 void vmx_vmexit(void);
@@ -6653,78 +6735,6 @@ int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 	return ret;
 }
 
-/*
- * Software based L1D cache flush which is used when microcode providing
- * the cache control MSR is not loaded.
- *
- * The L1D cache is 32 KiB on Nehalem and later microarchitectures, but to
- * flush it is required to read in 64 KiB because the replacement algorithm
- * is not exactly LRU. This could be sized at runtime via topology
- * information but as all relevant affected CPUs have 32KiB L1D cache size
- * there is no point in doing so.
- */
-static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
-{
-	int size = PAGE_SIZE << L1D_CACHE_ORDER;
-
-	/*
-	 * This code is only executed when the flush mode is 'cond' or
-	 * 'always'
-	 */
-	if (static_branch_likely(&vmx_l1d_flush_cond)) {
-		bool flush_l1d;
-
-		/*
-		 * Clear the per-vcpu flush bit, it gets set again if the vCPU
-		 * is reloaded, i.e. if the vCPU is scheduled out or if KVM
-		 * exits to userspace, or if KVM reaches one of the unsafe
-		 * VMEXIT handlers, e.g. if KVM calls into the emulator.
-		 */
-		flush_l1d = vcpu->arch.l1tf_flush_l1d;
-		vcpu->arch.l1tf_flush_l1d = false;
-
-		/*
-		 * Clear the per-cpu flush bit, it gets set again from
-		 * the interrupt handlers.
-		 */
-		flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
-		kvm_clear_cpu_l1tf_flush_l1d();
-
-		if (!flush_l1d)
-			return false;
-	}
-
-	vcpu->stat.l1d_flush++;
-
-	if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
-		native_wrmsrq(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
-		return true;
-	}
-
-	asm volatile(
-		/* First ensure the pages are in the TLB */
-		"xorl %%eax, %%eax\n"
-		".Lpopulate_tlb:\n\t"
-		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
-		"addl $4096, %%eax\n\t"
-		"cmpl %%eax, %[size]\n\t"
-		"jne .Lpopulate_tlb\n\t"
-		"xorl %%eax, %%eax\n\t"
-		"cpuid\n\t"
-		/* Now fill the cache */
-		"xorl %%eax, %%eax\n"
-		".Lfill_cache:\n"
-		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
-		"addl $64, %%eax\n\t"
-		"cmpl %%eax, %[size]\n\t"
-		"jne .Lfill_cache\n\t"
-		"lfence\n"
-		:: [flush_pages] "r" (vmx_l1d_flush_pages),
-		    [size] "r" (size)
-		: "eax", "ebx", "ecx", "edx");
-	return true;
-}
-
 void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
 {
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
@@ -8673,16 +8683,6 @@ __init int vmx_hardware_setup(void)
 	return r;
 }
 
-static void vmx_cleanup_l1d_flush(void)
-{
-	if (vmx_l1d_flush_pages) {
-		free_pages((unsigned long)vmx_l1d_flush_pages, L1D_CACHE_ORDER);
-		vmx_l1d_flush_pages = NULL;
-	}
-	/* Restore state so sysfs ignores VMX */
-	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
-}
-
 void vmx_exit(void)
 {
 	allow_smaller_maxphyaddr = false;
-- 
2.51.0.858.gf9c4a03a3a-goog
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 13:04:16 -0700
In-Reply-To: <20251016200417.97003-1-seanjc@google.com>
Message-ID: <20251016200417.97003-4-seanjc@google.com>
Subject: [PATCH v3 3/4] KVM: VMX: Disable L1TF L1 data cache flush if CONFIG_CPU_MITIGATIONS=n
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc:
 kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pawan Gupta, Brendan Jackman

Disable support for flushing the L1 data cache to mitigate L1TF if CPU
mitigations are disabled for the entire kernel.  KVM's mitigation of L1TF
is in no way special enough to justify ignoring CONFIG_CPU_MITIGATIONS=n.

Deliberately use CPU_MITIGATIONS instead of the more precise
MITIGATION_L1TF, as MITIGATION_L1TF only controls the default behavior,
i.e. CONFIG_MITIGATION_L1TF=n doesn't completely disable L1TF mitigations
in the kernel.

Keep the vmentry_l1d_flush module param to avoid breaking existing setups,
and leverage the .set path to alert the user to the fact that
vmentry_l1d_flush will be ignored.  Don't bother validating the incoming
value; if an admin misconfigures vmentry_l1d_flush, the fact that the bad
configuration won't be detected when running with CONFIG_CPU_MITIGATIONS=n
is likely the least of their worries.

Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/hardirq.h |  4 +--
 arch/x86/kvm/vmx/vmx.c         | 56 ++++++++++++++++++++++++++--------
 2 files changed, 46 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index f00c09ffe6a9..6b6d472baa0b 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -5,7 +5,7 @@
 #include
 
 typedef struct {
-#if IS_ENABLED(CONFIG_KVM_INTEL)
+#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
 	u8 kvm_cpu_l1tf_flush_l1d;
 #endif
 	unsigned int __nmi_count;	/* arch dependent */
@@ -68,7 +68,7 @@ extern u64 arch_irq_stat(void);
 DECLARE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
 #define local_softirq_pending_ref	__softirq_pending
 
-#if IS_ENABLED(CONFIG_KVM_INTEL)
+#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
 /*
  * This function is called from noinstr interrupt contexts
  * and must be inlined to not get instrumentation.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cd8ae1b2ae55..e91d99211efe 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -203,6 +203,7 @@ module_param(pt_mode, int, S_IRUGO);
 
 struct x86_pmu_lbr __ro_after_init vmx_lbr_caps;
 
+#ifdef CONFIG_CPU_MITIGATIONS
 static DEFINE_STATIC_KEY_FALSE(vmx_l1d_should_flush);
 static DEFINE_STATIC_KEY_FALSE(vmx_l1d_flush_cond);
 static DEFINE_MUTEX(vmx_l1d_flush_mutex);
@@ -225,7 +226,7 @@ static const struct {
 #define L1D_CACHE_ORDER 4
 static void *vmx_l1d_flush_pages;
 
-static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
+static int __vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 {
 	struct page *page;
 	unsigned int i;
@@ -302,6 +303,16 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 	return 0;
 }
 
+static int vmx_setup_l1d_flush(void)
+{
+	/*
+	 * Hand the parameter mitigation value in which was stored in the pre
+	 * module init parser. If no parameter was given, it will contain
+	 * 'auto' which will be turned into the default 'cond' mitigation mode.
+	 */
+	return __vmx_setup_l1d_flush(vmentry_l1d_flush_param);
+}
+
 static void vmx_cleanup_l1d_flush(void)
 {
 	if (vmx_l1d_flush_pages) {
@@ -349,7 +360,7 @@ static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp)
 	}
 
 	mutex_lock(&vmx_l1d_flush_mutex);
-	ret = vmx_setup_l1d_flush(l1tf);
+	ret = __vmx_setup_l1d_flush(l1tf);
 	mutex_unlock(&vmx_l1d_flush_mutex);
 	return ret;
 }
@@ -376,6 +387,9 @@ static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
 {
 	int size = PAGE_SIZE << L1D_CACHE_ORDER;
 
+	if (!static_branch_unlikely(&vmx_l1d_should_flush))
+		return false;
+
 	/*
 	 * This code is only executed when the flush mode is 'cond' or
 	 * 'always'
@@ -434,6 +448,31 @@ static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+#else /* CONFIG_CPU_MITIGATIONS */
+static int vmx_setup_l1d_flush(void)
+{
+	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NEVER;
+	return 0;
+}
+static void vmx_cleanup_l1d_flush(void)
+{
+	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
+}
+static __always_inline bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp)
+{
+	pr_warn_once("Kernel compiled without mitigations, ignoring vmentry_l1d_flush\n");
+	return 0;
+}
+static int vmentry_l1d_flush_get(char *s, const struct kernel_param *kp)
+{
+	return sysfs_emit(s, "never\n");
+}
+#endif
+
 static const struct kernel_param_ops vmentry_l1d_flush_ops = {
 	.set = vmentry_l1d_flush_set,
 	.get = vmentry_l1d_flush_get,
@@ -7341,8 +7380,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	 * and is affected by MMIO Stale Data. In such cases mitigation in only
 	 * needed against an MMIO capable guest.
 	 */
-	if (static_branch_unlikely(&vmx_l1d_should_flush) &&
-	    vmx_l1d_flush(vcpu))
+	if (vmx_l1d_flush(vcpu))
 		;
 	else if (static_branch_unlikely(&cpu_buf_vm_clear) &&
 		 (flags & VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO))
 		x86_clear_cpu_buffers();
@@ -8718,14 +8756,8 @@ int __init vmx_init(void)
 	if (r)
 		return r;
 
-	/*
-	 * Must be called after common x86 init so enable_ept is properly set
-	 * up. Hand the parameter mitigation value in which was stored in
-	 * the pre module init parser. If no parameter was given, it will
-	 * contain 'auto' which will be turned into the default 'cond'
-	 * mitigation mode.
-	 */
-	r = vmx_setup_l1d_flush(vmentry_l1d_flush_param);
+	/* Must be called after common x86 init so enable_ept is setup. */
+	r = vmx_setup_l1d_flush();
 	if (r)
 		goto err_l1d_flush;
 
-- 
2.51.0.858.gf9c4a03a3a-goog
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 13:04:17 -0700
In-Reply-To: <20251016200417.97003-1-seanjc@google.com>
Message-ID: <20251016200417.97003-5-seanjc@google.com>
Subject: [PATCH v3 4/4] KVM: x86: Unify L1TF flushing under per-CPU variable
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pawan Gupta, Brendan Jackman

From: Brendan Jackman

Currently, the need to flush L1D for L1TF is tracked by two bits: one
per-CPU and one per-vCPU.  The per-vCPU bit is always set when the vCPU
shows up on a core, so there is no interesting state that's truly
per-vCPU.  Indeed, this is a requirement, since L1D is a part of the
physical CPU.  So simplify this by combining the two bits.

The vCPU bit was being written from preemption-enabled regions.
To play nice with those cases, wrap all calls from KVM and use a raw
write so that requesting a flush with preemption enabled doesn't trigger
what would effectively be DEBUG_PREEMPT false positives. Preemption
doesn't need to be disabled, as kvm_arch_vcpu_load() will mark the new
CPU as needing a flush if the vCPU task is migrated, or if userspace
runs the vCPU on a different task.

Signed-off-by: Brendan Jackman
[sean: put raw write in KVM instead of in a hardirq.h variant]
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h |  3 ---
 arch/x86/kvm/mmu/mmu.c          |  2 +-
 arch/x86/kvm/vmx/nested.c       |  2 +-
 arch/x86/kvm/vmx/vmx.c          | 20 +++++---------------
 arch/x86/kvm/x86.c              |  6 +++---
 arch/x86/kvm/x86.h              | 14 ++++++++++++++
 6 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 48598d017d6f..fcdc65ab13d8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1055,9 +1055,6 @@ struct kvm_vcpu_arch {
 	/* be preempted when it's in kernel-mode(cpl=0) */
 	bool preempted_in_kernel;
 
-	/* Flush the L1 Data cache for L1TF mitigation on VMENTER */
-	bool l1tf_flush_l1d;
-
 	/* Host CPU on which VM-entry was most recently attempted */
 	int last_vmentry_cpu;
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 18d69d48bc55..4e016582adc7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4859,7 +4859,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 	 */
 	BUILD_BUG_ON(lower_32_bits(PFERR_SYNTHETIC_MASK));
 
-	vcpu->arch.l1tf_flush_l1d = true;
+	kvm_request_l1tf_flush_l1d();
 	if (!flags) {
 		trace_kvm_page_fault(vcpu, fault_address, error_code);
 
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 3fca63a261f5..468a013d9ef3 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3880,7 +3880,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 		goto vmentry_failed;
 
 	/* Hide L1D cache contents from the nested guest.  */
-	vcpu->arch.l1tf_flush_l1d = true;
+	kvm_request_l1tf_flush_l1d();
 
 	/*
 	 * Must happen outside of nested_vmx_enter_non_root_mode() as it will
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e91d99211efe..0347d321a86e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -395,26 +395,16 @@ static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
 	 * 'always'
 	 */
 	if (static_branch_likely(&vmx_l1d_flush_cond)) {
-		bool flush_l1d;
-
 		/*
-		 * Clear the per-vcpu flush bit, it gets set again if the vCPU
+		 * Clear the per-cpu flush bit, it gets set again if the vCPU
 		 * is reloaded, i.e. if the vCPU is scheduled out or if KVM
 		 * exits to userspace, or if KVM reaches one of the unsafe
-		 * VMEXIT handlers, e.g. if KVM calls into the emulator.
+		 * VMEXIT handlers, e.g. if KVM calls into the emulator,
+		 * or from the interrupt handlers.
		 */
-		flush_l1d = vcpu->arch.l1tf_flush_l1d;
-		vcpu->arch.l1tf_flush_l1d = false;
-
-		/*
-		 * Clear the per-cpu flush bit, it gets set again from
-		 * the interrupt handlers.
-		 */
-		flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
+		if (!kvm_get_cpu_l1tf_flush_l1d())
+			return;
 		kvm_clear_cpu_l1tf_flush_l1d();
-
-		if (!flush_l1d)
-			return false;
 	}
 
 	vcpu->stat.l1d_flush++;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4b5d2d09634..851f078cd5ca 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5189,7 +5189,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
-	vcpu->arch.l1tf_flush_l1d = true;
+	kvm_request_l1tf_flush_l1d();
 
 	if (vcpu->scheduled_out && pmu->version && pmu->event_count) {
 		pmu->need_cleanup = true;
@@ -7999,7 +7999,7 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
 			       unsigned int bytes, struct x86_exception *exception)
 {
 	/* kvm_write_guest_virt_system can pull in tons of pages. */
-	vcpu->arch.l1tf_flush_l1d = true;
+	kvm_request_l1tf_flush_l1d();
 
 	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
 					   PFERR_WRITE_MASK, exception);
@@ -9395,7 +9395,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		return handle_emulation_failure(vcpu, emulation_type);
 	}
 
-	vcpu->arch.l1tf_flush_l1d = true;
+	kvm_request_l1tf_flush_l1d();
 
 	if (!(emulation_type & EMULTYPE_NO_DECODE)) {
 		kvm_clear_exception_queue(vcpu);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index f3dc77f006f9..cd67ccbb747f 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -420,6 +420,20 @@ static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk)
 	return !(kvm->arch.disabled_quirks & quirk);
 }
 
+static __always_inline void kvm_request_l1tf_flush_l1d(void)
+{
+#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
+	/*
+	 * Use a raw write to set the per-CPU flag, as KVM will ensure a flush
+	 * even if preemption is currently enabled. If the current vCPU task
+	 * is migrated to a different CPU (or userspace runs the vCPU on a
+	 * different task) before the next VM-Entry, then kvm_arch_vcpu_load()
+	 * will request a flush on the new CPU.
+	 */
+	raw_cpu_write(irq_stat.kvm_cpu_l1tf_flush_l1d, 1);
+#endif
+}
+
 void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);
 
 u64 get_kvmclock_ns(struct kvm *kvm);
-- 
2.51.0.858.gf9c4a03a3a-goog