From nobody Sun Feb 8 01:33:40 2026
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 13:04:14 -0700
In-Reply-To:
 <20251016200417.97003-1-seanjc@google.com>
X-Mailer: git-send-email 2.51.0.858.gf9c4a03a3a-goog
Message-ID: <20251016200417.97003-2-seanjc@google.com>
Subject: [PATCH v3 1/4] KVM: VMX: Flush CPU buffers as needed if L1D cache flush is skipped
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pawan Gupta, Brendan Jackman

If the L1D flush for L1TF is conditionally enabled, flush CPU buffers to
mitigate MMIO Stale Data as needed if KVM skips the L1D flush, e.g. because
none of the "heavy" paths that trigger an L1D flush were tripped since the
last VM-Enter.

Note, the flaw goes back to the introduction of the MDS mitigation.  The
MDS mitigation was inadvertently fixed by commit 43fb862de8f6 ("KVM/VMX:
Move VERW closer to VMentry for MDS mitigation"), but previous kernels that
flush CPU buffers in vmx_vcpu_enter_exit() are affected.

Fixes: 650b68a0622f ("x86/kvm/vmx: Add MDS protection when L1D Flush is not active")
Cc: Pawan Gupta
Signed-off-by: Sean Christopherson
Reviewed-by: Brendan Jackman
---
 arch/x86/kvm/vmx/vmx.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f87c216d976d..ce556d5dc39b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6663,7 +6663,7 @@ int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
  * information but as all relevant affected CPUs have 32KiB L1D cache size
  * there is no point in doing so.
  */
-static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
+static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
 {
 	int size = PAGE_SIZE << L1D_CACHE_ORDER;
 
@@ -6691,14 +6691,14 @@ static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
 		kvm_clear_cpu_l1tf_flush_l1d();
 
 		if (!flush_l1d)
-			return;
+			return false;
 	}
 
 	vcpu->stat.l1d_flush++;
 
 	if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
 		native_wrmsrq(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
-		return;
+		return true;
 	}
 
 	asm volatile(
@@ -6722,6 +6722,7 @@ static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
 		:: [flush_pages] "r" (vmx_l1d_flush_pages),
 		    [size] "r" (size)
 		: "eax", "ebx", "ecx", "edx");
+	return true;
 }
 
 void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
@@ -7330,8 +7331,9 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	 * and is affected by MMIO Stale Data. In such cases mitigation in only
 	 * needed against an MMIO capable guest.
 	 */
-	if (static_branch_unlikely(&vmx_l1d_should_flush))
-		vmx_l1d_flush(vcpu);
+	if (static_branch_unlikely(&vmx_l1d_should_flush) &&
+	    vmx_l1d_flush(vcpu))
+		;
 	else if (static_branch_unlikely(&cpu_buf_vm_clear) &&
 		 (flags & VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO))
 		x86_clear_cpu_buffers();
-- 
2.51.0.858.gf9c4a03a3a-goog
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 13:04:15 -0700
In-Reply-To: <20251016200417.97003-1-seanjc@google.com>
Message-ID: <20251016200417.97003-3-seanjc@google.com>
Subject: [PATCH v3 2/4] KVM: VMX: Bundle all L1 data cache flush mitigation code together
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pawan Gupta, Brendan Jackman

Move vmx_l1d_flush(),
vmx_cleanup_l1d_flush(), and the vmentry_l1d_flush param code up in vmx.c
so that all of the L1 data cache flushing code is bundled together.  This
will allow conditioning the mitigation code on CONFIG_CPU_MITIGATIONS=y
with minimal #ifdefs.

No functional change intended.

Signed-off-by: Sean Christopherson
Reviewed-by: Brendan Jackman
---
 arch/x86/kvm/vmx/vmx.c | 176 ++++++++++++++++++++---------------------
 1 file changed, 88 insertions(+), 88 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ce556d5dc39b..cd8ae1b2ae55 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -302,6 +302,16 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 	return 0;
 }
 
+static void vmx_cleanup_l1d_flush(void)
+{
+	if (vmx_l1d_flush_pages) {
+		free_pages((unsigned long)vmx_l1d_flush_pages, L1D_CACHE_ORDER);
+		vmx_l1d_flush_pages = NULL;
+	}
+	/* Restore state so sysfs ignores VMX */
+	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
+}
+
 static int vmentry_l1d_flush_parse(const char *s)
 {
 	unsigned int i;
@@ -352,6 +362,84 @@ static int vmentry_l1d_flush_get(char *s, const struct kernel_param *kp)
 	return sysfs_emit(s, "%s\n", vmentry_l1d_param[l1tf_vmx_mitigation].option);
 }
 
+/*
+ * Software based L1D cache flush which is used when microcode providing
+ * the cache control MSR is not loaded.
+ *
+ * The L1D cache is 32 KiB on Nehalem and later microarchitectures, but to
+ * flush it is required to read in 64 KiB because the replacement algorithm
+ * is not exactly LRU. This could be sized at runtime via topology
+ * information but as all relevant affected CPUs have 32KiB L1D cache size
+ * there is no point in doing so.
+ */
+static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
+{
+	int size = PAGE_SIZE << L1D_CACHE_ORDER;
+
+	/*
+	 * This code is only executed when the flush mode is 'cond' or
+	 * 'always'
+	 */
+	if (static_branch_likely(&vmx_l1d_flush_cond)) {
+		bool flush_l1d;
+
+		/*
+		 * Clear the per-vcpu flush bit, it gets set again if the vCPU
+		 * is reloaded, i.e. if the vCPU is scheduled out or if KVM
+		 * exits to userspace, or if KVM reaches one of the unsafe
+		 * VMEXIT handlers, e.g. if KVM calls into the emulator.
+		 */
+		flush_l1d = vcpu->arch.l1tf_flush_l1d;
+		vcpu->arch.l1tf_flush_l1d = false;
+
+		/*
+		 * Clear the per-cpu flush bit, it gets set again from
+		 * the interrupt handlers.
+		 */
+		flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
+		kvm_clear_cpu_l1tf_flush_l1d();
+
+		if (!flush_l1d)
+			return false;
+	}
+
+	vcpu->stat.l1d_flush++;
+
+	if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
+		native_wrmsrq(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
+		return true;
+	}
+
+	asm volatile(
+		/* First ensure the pages are in the TLB */
+		"xorl %%eax, %%eax\n"
+		".Lpopulate_tlb:\n\t"
+		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
+		"addl $4096, %%eax\n\t"
+		"cmpl %%eax, %[size]\n\t"
+		"jne .Lpopulate_tlb\n\t"
+		"xorl %%eax, %%eax\n\t"
+		"cpuid\n\t"
+		/* Now fill the cache */
+		"xorl %%eax, %%eax\n"
+		".Lfill_cache:\n"
+		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
+		"addl $64, %%eax\n\t"
+		"cmpl %%eax, %[size]\n\t"
+		"jne .Lfill_cache\n\t"
+		"lfence\n"
+		:: [flush_pages] "r" (vmx_l1d_flush_pages),
+		    [size] "r" (size)
+		: "eax", "ebx", "ecx", "edx");
+	return true;
+}
+
+static const struct kernel_param_ops vmentry_l1d_flush_ops = {
+	.set = vmentry_l1d_flush_set,
+	.get = vmentry_l1d_flush_get,
+};
+module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
+
 static __always_inline void vmx_disable_fb_clear(struct vcpu_vmx *vmx)
 {
 	u64 msr;
@@ -404,12 +492,6 @@ static void vmx_update_fb_clear_dis(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx)
 	vmx->disable_fb_clear = false;
 }
 
-static const struct kernel_param_ops vmentry_l1d_flush_ops = {
-	.set = vmentry_l1d_flush_set,
-	.get = vmentry_l1d_flush_get,
-};
-module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
-
 static u32 vmx_segment_access_rights(struct kvm_segment *var);
 
 void vmx_vmexit(void);
@@ -6653,78 +6735,6 @@ int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 	return ret;
 }
 
-/*
- * Software based L1D cache flush which is used when microcode providing
- * the cache control MSR is not loaded.
- *
- * The L1D cache is 32 KiB on Nehalem and later microarchitectures, but to
- * flush it is required to read in 64 KiB because the replacement algorithm
- * is not exactly LRU. This could be sized at runtime via topology
- * information but as all relevant affected CPUs have 32KiB L1D cache size
- * there is no point in doing so.
- */
-static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
-{
-	int size = PAGE_SIZE << L1D_CACHE_ORDER;
-
-	/*
-	 * This code is only executed when the flush mode is 'cond' or
-	 * 'always'
-	 */
-	if (static_branch_likely(&vmx_l1d_flush_cond)) {
-		bool flush_l1d;
-
-		/*
-		 * Clear the per-vcpu flush bit, it gets set again if the vCPU
-		 * is reloaded, i.e. if the vCPU is scheduled out or if KVM
-		 * exits to userspace, or if KVM reaches one of the unsafe
-		 * VMEXIT handlers, e.g. if KVM calls into the emulator.
-		 */
-		flush_l1d = vcpu->arch.l1tf_flush_l1d;
-		vcpu->arch.l1tf_flush_l1d = false;
-
-		/*
-		 * Clear the per-cpu flush bit, it gets set again from
-		 * the interrupt handlers.
-		 */
-		flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
-		kvm_clear_cpu_l1tf_flush_l1d();
-
-		if (!flush_l1d)
-			return false;
-	}
-
-	vcpu->stat.l1d_flush++;
-
-	if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
-		native_wrmsrq(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
-		return true;
-	}
-
-	asm volatile(
-		/* First ensure the pages are in the TLB */
-		"xorl %%eax, %%eax\n"
-		".Lpopulate_tlb:\n\t"
-		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
-		"addl $4096, %%eax\n\t"
-		"cmpl %%eax, %[size]\n\t"
-		"jne .Lpopulate_tlb\n\t"
-		"xorl %%eax, %%eax\n\t"
-		"cpuid\n\t"
-		/* Now fill the cache */
-		"xorl %%eax, %%eax\n"
-		".Lfill_cache:\n"
-		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
-		"addl $64, %%eax\n\t"
-		"cmpl %%eax, %[size]\n\t"
-		"jne .Lfill_cache\n\t"
-		"lfence\n"
-		:: [flush_pages] "r" (vmx_l1d_flush_pages),
-		    [size] "r" (size)
-		: "eax", "ebx", "ecx", "edx");
-	return true;
-}
-
 void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
 {
 	struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
@@ -8673,16 +8683,6 @@ __init int vmx_hardware_setup(void)
 	return r;
 }
 
-static void vmx_cleanup_l1d_flush(void)
-{
-	if (vmx_l1d_flush_pages) {
-		free_pages((unsigned long)vmx_l1d_flush_pages, L1D_CACHE_ORDER);
-		vmx_l1d_flush_pages = NULL;
-	}
-	/* Restore state so sysfs ignores VMX */
-	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
-}
-
 void vmx_exit(void)
 {
 	allow_smaller_maxphyaddr = false;
-- 
2.51.0.858.gf9c4a03a3a-goog
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 13:04:16 -0700
In-Reply-To: <20251016200417.97003-1-seanjc@google.com>
Message-ID: <20251016200417.97003-4-seanjc@google.com>
Subject: [PATCH v3 3/4] KVM: VMX: Disable L1TF L1 data cache flush if CONFIG_CPU_MITIGATIONS=n
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc:
 kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pawan Gupta, Brendan Jackman

Disable support for flushing the L1 data cache to mitigate L1TF if CPU
mitigations are disabled for the entire kernel.  KVM's mitigation of L1TF
is in no way special enough to justify ignoring CONFIG_CPU_MITIGATIONS=n.

Deliberately use CPU_MITIGATIONS instead of the more precise
MITIGATION_L1TF, as MITIGATION_L1TF only controls the default behavior,
i.e. CONFIG_MITIGATION_L1TF=n doesn't completely disable L1TF mitigations
in the kernel.

Keep the vmentry_l1d_flush module param to avoid breaking existing setups,
and leverage the .set path to alert the user to the fact that
vmentry_l1d_flush will be ignored.  Don't bother validating the incoming
value; if an admin misconfigures vmentry_l1d_flush, the fact that the bad
configuration won't be detected when running with CONFIG_CPU_MITIGATIONS=n
is likely the least of their worries.

Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/hardirq.h |  4 +--
 arch/x86/kvm/vmx/vmx.c         | 56 ++++++++++++++++++++++++++--------
 2 files changed, 46 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index f00c09ffe6a9..6b6d472baa0b 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -5,7 +5,7 @@
 #include
 
 typedef struct {
-#if IS_ENABLED(CONFIG_KVM_INTEL)
+#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
 	u8 kvm_cpu_l1tf_flush_l1d;
 #endif
 	unsigned int __nmi_count;	/* arch dependent */
@@ -68,7 +68,7 @@ extern u64 arch_irq_stat(void);
 DECLARE_PER_CPU_CACHE_HOT(u16, __softirq_pending);
 #define local_softirq_pending_ref	__softirq_pending
 
-#if IS_ENABLED(CONFIG_KVM_INTEL)
+#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
 /*
  * This function is called from noinstr interrupt contexts
  * and must be inlined to not get instrumentation.
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cd8ae1b2ae55..e91d99211efe 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -203,6 +203,7 @@ module_param(pt_mode, int, S_IRUGO);
 
 struct x86_pmu_lbr __ro_after_init vmx_lbr_caps;
 
+#ifdef CONFIG_CPU_MITIGATIONS
 static DEFINE_STATIC_KEY_FALSE(vmx_l1d_should_flush);
 static DEFINE_STATIC_KEY_FALSE(vmx_l1d_flush_cond);
 static DEFINE_MUTEX(vmx_l1d_flush_mutex);
@@ -225,7 +226,7 @@ static const struct {
 #define L1D_CACHE_ORDER 4
 static void *vmx_l1d_flush_pages;
 
-static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
+static int __vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 {
 	struct page *page;
 	unsigned int i;
@@ -302,6 +303,16 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 	return 0;
 }
 
+static int vmx_setup_l1d_flush(void)
+{
+	/*
+	 * Hand the parameter mitigation value in which was stored in the pre
+	 * module init parser. If no parameter was given, it will contain
+	 * 'auto' which will be turned into the default 'cond' mitigation mode.
+	 */
+	return __vmx_setup_l1d_flush(vmentry_l1d_flush_param);
+}
+
 static void vmx_cleanup_l1d_flush(void)
 {
 	if (vmx_l1d_flush_pages) {
@@ -349,7 +360,7 @@ static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp)
 	}
 
 	mutex_lock(&vmx_l1d_flush_mutex);
-	ret = vmx_setup_l1d_flush(l1tf);
+	ret = __vmx_setup_l1d_flush(l1tf);
 	mutex_unlock(&vmx_l1d_flush_mutex);
 	return ret;
 }
@@ -376,6 +387,9 @@ static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
 {
 	int size = PAGE_SIZE << L1D_CACHE_ORDER;
 
+	if (!static_branch_unlikely(&vmx_l1d_should_flush))
+		return false;
+
 	/*
 	 * This code is only executed when the flush mode is 'cond' or
 	 * 'always'
@@ -434,6 +448,31 @@ static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+#else /* CONFIG_CPU_MITIGATIONS */
+static int vmx_setup_l1d_flush(void)
+{
+	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_NEVER;
+	return 0;
+}
+static void vmx_cleanup_l1d_flush(void)
+{
+	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
+}
+static __always_inline bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+static int vmentry_l1d_flush_set(const char *s, const struct kernel_param *kp)
+{
+	pr_warn_once("Kernel compiled without mitigations, ignoring vmentry_l1d_flush\n");
+	return 0;
+}
+static int vmentry_l1d_flush_get(char *s, const struct kernel_param *kp)
+{
+	return sysfs_emit(s, "never\n");
+}
+#endif
+
 static const struct kernel_param_ops vmentry_l1d_flush_ops = {
 	.set = vmentry_l1d_flush_set,
 	.get = vmentry_l1d_flush_get,
@@ -7341,8 +7380,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	 * and is affected by MMIO Stale Data. In such cases mitigation in only
 	 * needed against an MMIO capable guest.
 	 */
-	if (static_branch_unlikely(&vmx_l1d_should_flush) &&
-	    vmx_l1d_flush(vcpu))
+	if (vmx_l1d_flush(vcpu))
 		;
 	else if (static_branch_unlikely(&cpu_buf_vm_clear) &&
 		 (flags & VMX_RUN_CLEAR_CPU_BUFFERS_FOR_MMIO))
 		x86_clear_cpu_buffers();
@@ -8718,14 +8756,8 @@ int __init vmx_init(void)
 	if (r)
 		return r;
 
-	/*
-	 * Must be called after common x86 init so enable_ept is properly set
-	 * up. Hand the parameter mitigation value in which was stored in
-	 * the pre module init parser. If no parameter was given, it will
-	 * contain 'auto' which will be turned into the default 'cond'
-	 * mitigation mode.
-	 */
-	r = vmx_setup_l1d_flush(vmentry_l1d_flush_param);
+	/* Must be called after common x86 init so enable_ept is setup. */
+	r = vmx_setup_l1d_flush();
 	if (r)
 		goto err_l1d_flush;
 
-- 
2.51.0.858.gf9c4a03a3a-goog
Reply-To: Sean Christopherson
Date: Thu, 16 Oct 2025 13:04:17 -0700
In-Reply-To: <20251016200417.97003-1-seanjc@google.com>
Message-ID: <20251016200417.97003-5-seanjc@google.com>
Subject: [PATCH v3 4/4] KVM: x86: Unify L1TF flushing under per-CPU variable
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pawan Gupta, Brendan Jackman

From: Brendan Jackman

Currently, the need to flush L1D for L1TF is tracked by two bits: one
per-CPU and one per-vCPU.  The per-vCPU bit is always set when the vCPU
shows up on a core, so there is no interesting state that's truly
per-vCPU.  Indeed, this is a requirement, since L1D is a part of the
physical CPU.  So simplify this by combining the two bits.

The vCPU bit was being written from preemption-enabled regions.
To play nice with those cases, wrap all calls from KVM and use a raw
write so that requesting a flush with preemption enabled doesn't trigger
what would effectively be DEBUG_PREEMPT false positives. Preemption
doesn't need to be disabled, as kvm_arch_vcpu_load() will mark the new
CPU as needing a flush if the vCPU task is migrated, or if userspace
runs the vCPU on a different task.

Signed-off-by: Brendan Jackman
[sean: put raw write in KVM instead of in a hardirq.h variant]
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h |  3 ---
 arch/x86/kvm/mmu/mmu.c          |  2 +-
 arch/x86/kvm/vmx/nested.c       |  2 +-
 arch/x86/kvm/vmx/vmx.c          | 20 +++++---------------
 arch/x86/kvm/x86.c              |  6 +++---
 arch/x86/kvm/x86.h              | 14 ++++++++++++++
 6 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 48598d017d6f..fcdc65ab13d8 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1055,9 +1055,6 @@ struct kvm_vcpu_arch {
 	/* be preempted when it's in kernel-mode(cpl=0) */
 	bool preempted_in_kernel;
 
-	/* Flush the L1 Data cache for L1TF mitigation on VMENTER */
-	bool l1tf_flush_l1d;
-
 	/* Host CPU on which VM-entry was most recently attempted */
 	int last_vmentry_cpu;
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 18d69d48bc55..4e016582adc7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4859,7 +4859,7 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
 	 */
 	BUILD_BUG_ON(lower_32_bits(PFERR_SYNTHETIC_MASK));
 
-	vcpu->arch.l1tf_flush_l1d = true;
+	kvm_request_l1tf_flush_l1d();
 	if (!flags) {
 		trace_kvm_page_fault(vcpu, fault_address, error_code);
 
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 3fca63a261f5..468a013d9ef3 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3880,7 +3880,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 		goto vmentry_failed;
 
 	/* Hide L1D cache contents from the nested guest.  */
-	vcpu->arch.l1tf_flush_l1d = true;
+	kvm_request_l1tf_flush_l1d();
 
 	/*
 	 * Must happen outside of nested_vmx_enter_non_root_mode() as it will
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e91d99211efe..0347d321a86e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -395,26 +395,16 @@ static noinstr bool vmx_l1d_flush(struct kvm_vcpu *vcpu)
 	 * 'always'
 	 */
 	if (static_branch_likely(&vmx_l1d_flush_cond)) {
-		bool flush_l1d;
-
 		/*
-		 * Clear the per-vcpu flush bit, it gets set again if the vCPU
+		 * Clear the per-cpu flush bit, it gets set again if the vCPU
 		 * is reloaded, i.e. if the vCPU is scheduled out or if KVM
 		 * exits to userspace, or if KVM reaches one of the unsafe
-		 * VMEXIT handlers, e.g. if KVM calls into the emulator.
+		 * VMEXIT handlers, e.g. if KVM calls into the emulator,
+		 * or from the interrupt handlers.
		 */
-		flush_l1d = vcpu->arch.l1tf_flush_l1d;
-		vcpu->arch.l1tf_flush_l1d = false;
-
-		/*
-		 * Clear the per-cpu flush bit, it gets set again from
-		 * the interrupt handlers.
-		 */
-		flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
+		if (!kvm_get_cpu_l1tf_flush_l1d())
+			return;
 		kvm_clear_cpu_l1tf_flush_l1d();
-
-		if (!flush_l1d)
-			return false;
 	}
 
 	vcpu->stat.l1d_flush++;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4b5d2d09634..851f078cd5ca 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5189,7 +5189,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
-	vcpu->arch.l1tf_flush_l1d = true;
+	kvm_request_l1tf_flush_l1d();
 
 	if (vcpu->scheduled_out && pmu->version && pmu->event_count) {
 		pmu->need_cleanup = true;
@@ -7999,7 +7999,7 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
 			       unsigned int bytes, struct x86_exception *exception)
 {
 	/* kvm_write_guest_virt_system can pull in tons of pages. */
-	vcpu->arch.l1tf_flush_l1d = true;
+	kvm_request_l1tf_flush_l1d();
 
 	return kvm_write_guest_virt_helper(addr, val, bytes, vcpu,
 					   PFERR_WRITE_MASK, exception);
@@ -9395,7 +9395,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 		return handle_emulation_failure(vcpu, emulation_type);
 	}
 
-	vcpu->arch.l1tf_flush_l1d = true;
+	kvm_request_l1tf_flush_l1d();
 
 	if (!(emulation_type & EMULTYPE_NO_DECODE)) {
 		kvm_clear_exception_queue(vcpu);
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index f3dc77f006f9..cd67ccbb747f 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -420,6 +420,20 @@ static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk)
 	return !(kvm->arch.disabled_quirks & quirk);
 }
 
+static __always_inline void kvm_request_l1tf_flush_l1d(void)
+{
+#if IS_ENABLED(CONFIG_CPU_MITIGATIONS) && IS_ENABLED(CONFIG_KVM_INTEL)
+	/*
+	 * Use a raw write to set the per-CPU flag, as KVM will ensure a flush
+	 * even if preemption is currently enabled. If the current vCPU task
+	 * is migrated to a different CPU (or userspace runs the vCPU on a
+	 * different task) before the next VM-Entry, then kvm_arch_vcpu_load()
+	 * will request a flush on the new CPU.
+	 */
+	raw_cpu_write(irq_stat.kvm_cpu_l1tf_flush_l1d, 1);
+#endif
+}
+
 void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);
 
 u64 get_kvmclock_ns(struct kvm *kvm);
-- 
2.51.0.858.gf9c4a03a3a-goog