From: Sean Christopherson
Date: Thu, 30 Oct 2025 17:30:38 -0700
Subject: [PATCH v4 6/8] KVM: VMX: Bundle all L1 data cache flush mitigation code together
To: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Josh Poimboeuf
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Pawan Gupta, Brendan Jackman
Message-ID: <20251031003040.3491385-7-seanjc@google.com>
In-Reply-To: <20251031003040.3491385-1-seanjc@google.com>
References: <20251031003040.3491385-1-seanjc@google.com>
X-Mailer: git-send-email 2.51.1.930.gacf6e81ea2-goog

Move vmx_l1d_flush(), vmx_cleanup_l1d_flush(), and the vmentry_l1d_flush
param code up in vmx.c so that all of the L1 data cache flushing code is
bundled together.  This will allow conditioning the mitigation code on
CONFIG_CPU_MITIGATIONS=y with minimal #ifdefs.

No functional change intended.

Reviewed-by: Brendan Jackman
Reviewed-by: Pawan Gupta
Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/vmx/vmx.c | 174 ++++++++++++++++++++---------------------
 1 file changed, 87 insertions(+), 87 deletions(-)
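Note: as a rough illustration of what this bundling enables (a
hypothetical sketch only, not part of this diff and not necessarily the
shape of the follow-up patch; the function names and signatures are
taken from the hunks below), the now-contiguous mitigation block can be
compiled out with a single #ifdef:

#ifdef CONFIG_CPU_MITIGATIONS
/*
 * vmx_setup_l1d_flush(), vmx_cleanup_l1d_flush(), vmx_l1d_flush(), and
 * the vmentry_l1d_flush module param all live here, in one block.
 */
#else
/* Illustrative stubs: report nothing to set up, and flush nothing. */
static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
{
	return 0;
}
static void vmx_cleanup_l1d_flush(void) {}
static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu) {}
#endif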
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5af2338c7cb8..55962146fc34 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -302,6 +302,16 @@ static int vmx_setup_l1d_flush(enum vmx_l1d_flush_state l1tf)
 	return 0;
 }
 
+static void vmx_cleanup_l1d_flush(void)
+{
+	if (vmx_l1d_flush_pages) {
+		free_pages((unsigned long)vmx_l1d_flush_pages, L1D_CACHE_ORDER);
+		vmx_l1d_flush_pages = NULL;
+	}
+	/* Restore state so sysfs ignores VMX */
+	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
+}
+
 static int vmentry_l1d_flush_parse(const char *s)
 {
 	unsigned int i;
@@ -352,6 +362,83 @@ static int vmentry_l1d_flush_get(char *s, const struct kernel_param *kp)
 	return sysfs_emit(s, "%s\n", vmentry_l1d_param[l1tf_vmx_mitigation].option);
 }
 
+/*
+ * Software based L1D cache flush which is used when microcode providing
+ * the cache control MSR is not loaded.
+ *
+ * The L1D cache is 32 KiB on Nehalem and later microarchitectures, but to
+ * flush it is required to read in 64 KiB because the replacement algorithm
+ * is not exactly LRU. This could be sized at runtime via topology
+ * information but as all relevant affected CPUs have 32KiB L1D cache size
+ * there is no point in doing so.
+ */
+static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu)
+{
+	int size = PAGE_SIZE << L1D_CACHE_ORDER;
+
+	/*
+	 * This code is only executed when the flush mode is 'cond' or
+	 * 'always'
+	 */
+	if (static_branch_likely(&vmx_l1d_flush_cond)) {
+		bool flush_l1d;
+
+		/*
+		 * Clear the per-vcpu flush bit, it gets set again if the vCPU
+		 * is reloaded, i.e. if the vCPU is scheduled out or if KVM
+		 * exits to userspace, or if KVM reaches one of the unsafe
+		 * VMEXIT handlers, e.g. if KVM calls into the emulator.
+		 */
+		flush_l1d = vcpu->arch.l1tf_flush_l1d;
+		vcpu->arch.l1tf_flush_l1d = false;
+
+		/*
+		 * Clear the per-cpu flush bit, it gets set again from
+		 * the interrupt handlers.
+		 */
+		flush_l1d |= kvm_get_cpu_l1tf_flush_l1d();
+		kvm_clear_cpu_l1tf_flush_l1d();
+
+		if (!flush_l1d)
+			return;
+	}
+
+	vcpu->stat.l1d_flush++;
+
+	if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
+		native_wrmsrq(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
+		return;
+	}
+
+	asm volatile(
+		/* First ensure the pages are in the TLB */
+		"xorl %%eax, %%eax\n"
+		".Lpopulate_tlb:\n\t"
+		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
+		"addl $4096, %%eax\n\t"
+		"cmpl %%eax, %[size]\n\t"
+		"jne .Lpopulate_tlb\n\t"
+		"xorl %%eax, %%eax\n\t"
+		"cpuid\n\t"
+		/* Now fill the cache */
+		"xorl %%eax, %%eax\n"
+		".Lfill_cache:\n"
+		"movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t"
+		"addl $64, %%eax\n\t"
+		"cmpl %%eax, %[size]\n\t"
+		"jne .Lfill_cache\n\t"
+		"lfence\n"
+		:: [flush_pages] "r" (vmx_l1d_flush_pages),
+		   [size] "r" (size)
+		: "eax", "ebx", "ecx", "edx");
+}
+
+static const struct kernel_param_ops vmentry_l1d_flush_ops = {
+	.set = vmentry_l1d_flush_set,
+	.get = vmentry_l1d_flush_get,
+};
+module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
+
 static __always_inline void vmx_disable_fb_clear(struct vcpu_vmx *vmx)
 {
 	u64 msr;
@@ -404,12 +491,6 @@ static void vmx_update_fb_clear_dis(struct kvm_vcpu *vcpu, struct vcpu_vmx *vmx)
 	vmx->disable_fb_clear = false;
 }
 
-static const struct kernel_param_ops vmentry_l1d_flush_ops = {
-	.set = vmentry_l1d_flush_set,
-	.get = vmentry_l1d_flush_get,
-};
-module_param_cb(vmentry_l1d_flush, &vmentry_l1d_flush_ops, NULL, 0644);
-
 static u32 vmx_segment_access_rights(struct kvm_segment *var);
 
 void vmx_vmexit(void);
- */ - flush_l1d |=3D kvm_get_cpu_l1tf_flush_l1d(); - kvm_clear_cpu_l1tf_flush_l1d(); - - if (!flush_l1d) - return; - } - - vcpu->stat.l1d_flush++; - - if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) { - native_wrmsrq(MSR_IA32_FLUSH_CMD, L1D_FLUSH); - return; - } - - asm volatile( - /* First ensure the pages are in the TLB */ - "xorl %%eax, %%eax\n" - ".Lpopulate_tlb:\n\t" - "movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t" - "addl $4096, %%eax\n\t" - "cmpl %%eax, %[size]\n\t" - "jne .Lpopulate_tlb\n\t" - "xorl %%eax, %%eax\n\t" - "cpuid\n\t" - /* Now fill the cache */ - "xorl %%eax, %%eax\n" - ".Lfill_cache:\n" - "movzbl (%[flush_pages], %%" _ASM_AX "), %%ecx\n\t" - "addl $64, %%eax\n\t" - "cmpl %%eax, %[size]\n\t" - "jne .Lfill_cache\n\t" - "lfence\n" - :: [flush_pages] "r" (vmx_l1d_flush_pages), - [size] "r" (size) - : "eax", "ebx", "ecx", "edx"); -} - void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr) { struct vmcs12 *vmcs12 =3D get_vmcs12(vcpu); @@ -8677,16 +8687,6 @@ __init int vmx_hardware_setup(void) return r; } =20 -static void vmx_cleanup_l1d_flush(void) -{ - if (vmx_l1d_flush_pages) { - free_pages((unsigned long)vmx_l1d_flush_pages, L1D_CACHE_ORDER); - vmx_l1d_flush_pages =3D NULL; - } - /* Restore state so sysfs ignores VMX */ - l1tf_vmx_mitigation =3D VMENTER_L1D_FLUSH_AUTO; -} - void vmx_exit(void) { allow_smaller_maxphyaddr =3D false; --=20 2.51.1.930.gacf6e81ea2-goog