From nobody Sat Feb 7 21:48:00 2026
Date: Wed, 15 Oct 2025 18:51:55 -0700
From: Pawan Gupta
To: x86@kernel.org, "H. Peter Anvin", Josh Poimboeuf, David Kaplan, Sean Christopherson, Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Asit Mallick, Tao Zhang
Subject: [PATCH v2 1/3] x86/bhi: Add BHB clearing for CPUs with larger branch history
Message-ID: <20251015-vmscape-bhb-v2-1-91cbdd9c3a96@linux.intel.com>
In-Reply-To: <20251015-vmscape-bhb-v2-0-91cbdd9c3a96@linux.intel.com>

Add a version of clear_bhb_loop() that works on CPUs with a larger
branch history table, such as Alder Lake and newer.
This could serve as a cheaper alternative to the IBPB mitigation for
VMSCAPE.

clear_bhb_loop() and the new clear_bhb_long_loop() only differ in the
loop counts. Convert the asm implementation of clear_bhb_loop() into a
macro that is used by both variants, passing the loop counts as
arguments.

There is no difference in the output of:

  $ objdump --disassemble=clear_bhb_loop vmlinux

before and after this commit.

Signed-off-by: Pawan Gupta
---
 arch/x86/entry/entry_64.S            | 47 ++++++++++++++++++++++----------
 arch/x86/include/asm/nospec-branch.h |  3 +++
 2 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ed04a968cc7d0095ab0185b2e3b5beffb7680afd..f5f62af080d8ec6fe81e4dbe78ce44d08e62aa59 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1499,11 +1499,6 @@ SYM_CODE_END(rewind_stack_and_make_dead)
  * from the branch history tracker in the Branch Predictor, therefore removing
  * user influence on subsequent BTB lookups.
  *
- * It should be used on parts prior to Alder Lake. Newer parts should use the
- * BHI_DIS_S hardware control instead. If a pre-Alder Lake part is being
- * virtualized on newer hardware the VMM should protect against BHI attacks by
- * setting BHI_DIS_S for the guests.
- *
  * CALLs/RETs are necessary to prevent Loop Stream Detector(LSD) from engaging
  * and not clearing the branch history. The call tree looks like:
  *
@@ -1529,11 +1524,12 @@ SYM_CODE_END(rewind_stack_and_make_dead)
  * that all RETs are in the second half of a cacheline to mitigate Indirect
  * Target Selection, rather than taking the slowpath via its_return_thunk.
  */
-SYM_FUNC_START(clear_bhb_loop)
+.macro __CLEAR_BHB_LOOP outer_loop_count:req, inner_loop_count:req
 	ANNOTATE_NOENDBR
 	push %rbp
 	mov %rsp, %rbp
-	movl $5, %ecx
+
+	movl $\outer_loop_count, %ecx
 	ANNOTATE_INTRA_FUNCTION_CALL
 	call 1f
 	jmp 5f
@@ -1542,29 +1538,54 @@ SYM_FUNC_START(clear_bhb_loop)
 	 * Shift instructions so that the RET is in the upper half of the
 	 * cacheline and don't take the slowpath to its_return_thunk.
 	 */
-	.skip 32 - (.Lret1 - 1f), 0xcc
+	.skip 32 - (.Lret1_\@ - 1f), 0xcc
 	ANNOTATE_INTRA_FUNCTION_CALL
1:	call 2f
-.Lret1:	RET
+.Lret1_\@:
+	RET
 	.align 64, 0xcc
 	/*
-	 * As above shift instructions for RET at .Lret2 as well.
+	 * As above shift instructions for RET at .Lret2_\@ as well.
	 *
-	 * This should be ideally be: .skip 32 - (.Lret2 - 2f), 0xcc
+	 * This should be ideally be: .skip 32 - (.Lret2_\@ - 2f), 0xcc
	 * but some Clang versions (e.g. 18) don't like this.
	 */
 	.skip 32 - 18, 0xcc
-2:	movl $5, %eax
+2:	movl $\inner_loop_count, %eax
3:	jmp 4f
 	nop
4:	sub $1, %eax
 	jnz 3b
 	sub $1, %ecx
 	jnz 1b
-.Lret2:	RET
+.Lret2_\@:
+	RET
5:	lfence
+
 	pop %rbp
 	RET
+.endm
+
+/*
+ * This should be used on parts prior to Alder Lake. Newer parts should use the
+ * BHI_DIS_S hardware control instead. If a pre-Alder Lake part is being
+ * virtualized on newer hardware the VMM should protect against BHI attacks by
+ * setting BHI_DIS_S for the guests.
+ */
+SYM_FUNC_START(clear_bhb_loop)
+	__CLEAR_BHB_LOOP 5, 5
 SYM_FUNC_END(clear_bhb_loop)
 EXPORT_SYMBOL_GPL(clear_bhb_loop)
 STACK_FRAME_NON_STANDARD(clear_bhb_loop)
+
+/*
+ * A longer version of clear_bhb_loop to ensure that the BHB is cleared on CPUs
+ * with larger branch history tables (i.e. Alder Lake and newer). BHI_DIS_S
+ * protects the kernel, but to mitigate the guest influence on the host
+ * userspace either IBPB or this sequence should be used. See VMSCAPE bug.
+ */
+SYM_FUNC_START(clear_bhb_long_loop)
+	__CLEAR_BHB_LOOP 12, 7
+SYM_FUNC_END(clear_bhb_long_loop)
+EXPORT_SYMBOL_GPL(clear_bhb_long_loop)
+STACK_FRAME_NON_STANDARD(clear_bhb_long_loop)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 08ed5a2e46a5fd790bcb1b73feb6469518809c06..49707e563bdf71bdd05d3827f10dd2b8ac6bca2c 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -388,6 +388,9 @@ extern void write_ibpb(void);
 
 #ifdef CONFIG_X86_64
 extern void clear_bhb_loop(void);
+extern void clear_bhb_long_loop(void);
+#else
+static inline void clear_bhb_long_loop(void) {}
 #endif
 
 extern void (*x86_return_thunk)(void);
-- 
2.34.1
From nobody Sat Feb 7 21:48:00 2026
Date: Wed, 15 Oct 2025 18:52:11 -0700
From: Pawan Gupta
To: x86@kernel.org, "H. Peter Anvin", Josh Poimboeuf, David Kaplan, Sean Christopherson, Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Asit Mallick, Tao Zhang
Subject: [PATCH v2 2/3] x86/vmscape: Replace IBPB with branch history clear on exit to userspace
Message-ID: <20251015-vmscape-bhb-v2-2-91cbdd9c3a96@linux.intel.com>
In-Reply-To: <20251015-vmscape-bhb-v2-0-91cbdd9c3a96@linux.intel.com>

The IBPB mitigation for VMSCAPE is overkill for CPUs that are only
affected by the BHI variant of VMSCAPE. On such CPUs, eIBRS already
provides indirect branch isolation between the guest and host userspace,
but a guest could still poison the branch history. To mitigate that, use
the recently added clear_bhb_long_loop() to isolate the branch history
between the guest and userspace.

Add the cmdline option 'vmscape=on' that automatically selects the
appropriate mitigation based on the CPU.
Signed-off-by: Pawan Gupta
---
 Documentation/admin-guide/hw-vuln/vmscape.rst   |  8 ++++
 Documentation/admin-guide/kernel-parameters.txt |  4 +-
 arch/x86/include/asm/cpufeatures.h              |  1 +
 arch/x86/include/asm/entry-common.h             | 12 +++---
 arch/x86/include/asm/nospec-branch.h            |  2 +-
 arch/x86/kernel/cpu/bugs.c                      | 53 ++++++++++++++--------
 arch/x86/kvm/x86.c                              |  5 ++-
 7 files changed, 61 insertions(+), 24 deletions(-)

diff --git a/Documentation/admin-guide/hw-vuln/vmscape.rst b/Documentation/admin-guide/hw-vuln/vmscape.rst
index d9b9a2b6c114c05a7325e5f3c9d42129339b870b..580f288ae8bfc601ff000d6d95d711bb9084459e 100644
--- a/Documentation/admin-guide/hw-vuln/vmscape.rst
+++ b/Documentation/admin-guide/hw-vuln/vmscape.rst
@@ -86,6 +86,10 @@ The possible values in this file are:
    run a potentially malicious guest and issues an IBPB before the first exit
    to userspace after VM-exit.
 
+ * 'Mitigation: Clear BHB before exit to userspace':
+
+   As above, conditional BHB clearing mitigation is enabled.
+
  * 'Mitigation: IBPB on VMEXIT':
 
    IBPB is issued on every VM-exit. This occurs when other mitigations like
@@ -108,3 +112,7 @@ The mitigation can be controlled via the ``vmscape=`` command line parameter:
 
    Force vulnerability detection and mitigation even on processors that are
    not known to be affected.
+
+ * ``vmscape=on``:
+
+   Choose the mitigation based on the VMSCAPE variant the CPU is affected by.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6c42061ca20e581b5192b66c6f25aba38d4f8ff8..4b4711ced5e187495476b5365cd7b3df81db893b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -8104,9 +8104,11 @@
 
			off    - disable the mitigation
			ibpb   - use Indirect Branch Prediction Barrier
-				 (IBPB) mitigation (default)
+				 (IBPB) mitigation
			force  - force vulnerability detection even on
				 unaffected processors
+			on     - (default) automatically select IBPB
+				 or BHB clear mitigation based on CPU
 
	vsyscall=	[X86-64,EARLY]
			Controls the behavior of vsyscalls (i.e. calls to
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 4091a776e37aaed67ca93b0a0cd23cc25dbc33d4..3d547c3eab4e3290de3eee8e89f21587fee34931 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -499,6 +499,7 @@
 #define X86_FEATURE_IBPB_EXIT_TO_USER	(21*32+14) /* Use IBPB on exit-to-userspace, see VMSCAPE bug */
 #define X86_FEATURE_ABMC		(21*32+15) /* Assignable Bandwidth Monitoring Counters */
 #define X86_FEATURE_MSR_IMM		(21*32+16) /* MSR immediate form instructions */
+#define X86_FEATURE_CLEAR_BHB_EXIT_TO_USER (21*32+17) /* Clear branch history on exit-to-userspace, see VMSCAPE bug */
 
 /*
  * BUG word(s)
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index ce3eb6d5fdf9f2dba59b7bad24afbfafc8c36918..b7b9af1b641385b8283edf2449578ff65e5bd6df 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -94,11 +94,13 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
	 */
 	choose_random_kstack_offset(rdtsc());
 
-	/* Avoid unnecessary reads of 'x86_ibpb_exit_to_user' */
-	if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) &&
-	    this_cpu_read(x86_ibpb_exit_to_user)) {
-		indirect_branch_prediction_barrier();
-		this_cpu_write(x86_ibpb_exit_to_user, false);
+	if (unlikely(this_cpu_read(x86_pred_flush_pending))) {
+		if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
+			indirect_branch_prediction_barrier();
+		else if (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_EXIT_TO_USER))
+			clear_bhb_long_loop();
+
+		this_cpu_write(x86_pred_flush_pending, false);
 	}
 }
 #define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 49707e563bdf71bdd05d3827f10dd2b8ac6bca2c..00730cc22c2e7115f6dbb38a1ed8d10383ada5c0 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -534,7 +534,7 @@ void alternative_msr_write(unsigned int msr, u64 val, unsigned int feature)
			  : "memory");
 }
 
-DECLARE_PER_CPU(bool, x86_ibpb_exit_to_user);
+DECLARE_PER_CPU(bool, x86_pred_flush_pending);
 
 static inline void indirect_branch_prediction_barrier(void)
 {
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 6a526ae1fe9933229947db5b7676a18328fe2204..02fd37bf4e6d77494c72806775f4415a27652206 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -109,12 +109,11 @@ DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
 EXPORT_PER_CPU_SYMBOL_GPL(x86_spec_ctrl_current);
 
 /*
- * Set when the CPU has run a potentially malicious guest. An IBPB will
- * be needed to before running userspace. That IBPB will flush the branch
- * predictor content.
+ * Set when the CPU has run a potentially malicious guest. Indicates that a
+ * branch predictor flush is needed before running userspace.
  */
-DEFINE_PER_CPU(bool, x86_ibpb_exit_to_user);
-EXPORT_PER_CPU_SYMBOL_GPL(x86_ibpb_exit_to_user);
+DEFINE_PER_CPU(bool, x86_pred_flush_pending);
+EXPORT_PER_CPU_SYMBOL_GPL(x86_pred_flush_pending);
 
 u64 x86_pred_cmd __ro_after_init = PRED_CMD_IBPB;
 
@@ -3202,13 +3201,15 @@ enum vmscape_mitigations {
 	VMSCAPE_MITIGATION_AUTO,
 	VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER,
 	VMSCAPE_MITIGATION_IBPB_ON_VMEXIT,
+	VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER,
 };
 
 static const char * const vmscape_strings[] = {
-	[VMSCAPE_MITIGATION_NONE]		= "Vulnerable",
+	[VMSCAPE_MITIGATION_NONE]			= "Vulnerable",
 	/* [VMSCAPE_MITIGATION_AUTO] */
-	[VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER]	= "Mitigation: IBPB before exit to userspace",
-	[VMSCAPE_MITIGATION_IBPB_ON_VMEXIT]	= "Mitigation: IBPB on VMEXIT",
+	[VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER]		= "Mitigation: IBPB before exit to userspace",
+	[VMSCAPE_MITIGATION_IBPB_ON_VMEXIT]		= "Mitigation: IBPB on VMEXIT",
+	[VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER]	= "Mitigation: Clear BHB before exit to userspace",
 };
 
 static enum vmscape_mitigations vmscape_mitigation __ro_after_init =
@@ -3226,6 +3227,8 @@ static int __init vmscape_parse_cmdline(char *str)
 	} else if (!strcmp(str, "force")) {
 		setup_force_cpu_bug(X86_BUG_VMSCAPE);
 		vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
+	} else if (!strcmp(str, "on")) {
+		vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
 	} else {
 		pr_err("Ignoring unknown vmscape=%s option.\n", str);
 	}
@@ -3236,18 +3239,35 @@ early_param("vmscape", vmscape_parse_cmdline);
 
 static void __init vmscape_select_mitigation(void)
 {
-	if (!boot_cpu_has_bug(X86_BUG_VMSCAPE) ||
-	    !boot_cpu_has(X86_FEATURE_IBPB)) {
+	if (!boot_cpu_has_bug(X86_BUG_VMSCAPE)) {
 		vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
 		return;
 	}
 
-	if (vmscape_mitigation == VMSCAPE_MITIGATION_AUTO) {
-		if (should_mitigate_vuln(X86_BUG_VMSCAPE))
-			vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
-		else
-			vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+	if (vmscape_mitigation == VMSCAPE_MITIGATION_AUTO &&
+	    !should_mitigate_vuln(X86_BUG_VMSCAPE))
+		vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
+
+	if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER &&
+	    !boot_cpu_has(X86_FEATURE_IBPB)) {
+		pr_err("IBPB not supported, switching to AUTO select\n");
+		vmscape_mitigation = VMSCAPE_MITIGATION_AUTO;
 	}
+
+	if (vmscape_mitigation != VMSCAPE_MITIGATION_AUTO)
+		return;
+
+	/*
+	 * CPUs with BHI_CTRL (ADL and newer) can avoid the IBPB and use the
+	 * BHB clear sequence. These CPUs are only vulnerable to the BHI
+	 * variant of the VMSCAPE attack and do not require an IBPB flush.
+	 */
+	if (boot_cpu_has(X86_FEATURE_BHI_CTRL))
+		vmscape_mitigation = VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER;
+	else if (boot_cpu_has(X86_FEATURE_IBPB))
+		vmscape_mitigation = VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER;
+	else
+		vmscape_mitigation = VMSCAPE_MITIGATION_NONE;
 }
 
 static void __init vmscape_update_mitigation(void)
@@ -3266,6 +3286,8 @@ static void __init vmscape_apply_mitigation(void)
 {
 	if (vmscape_mitigation == VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER)
 		setup_force_cpu_cap(X86_FEATURE_IBPB_EXIT_TO_USER);
+	else if (vmscape_mitigation == VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER)
+		setup_force_cpu_cap(X86_FEATURE_CLEAR_BHB_EXIT_TO_USER);
 }
 
 #undef pr_fmt
@@ -3357,6 +3379,7 @@ void cpu_bugs_smt_update(void)
 		break;
 	case VMSCAPE_MITIGATION_IBPB_ON_VMEXIT:
 	case VMSCAPE_MITIGATION_IBPB_EXIT_TO_USER:
+	case VMSCAPE_MITIGATION_BHB_CLEAR_EXIT_TO_USER:
		/*
		 * Hypervisors can be attacked across-threads, warn for SMT when
		 * STIBP is not already enabled system-wide.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 42ecd093bb4c8ecfae2523b52b85779ca1e56bb5..57d26dbb43e9115880e92e448e4c018e03dca063 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11397,8 +11397,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
	 * set for the CPU that actually ran the guest, and not the CPU that it
	 * may migrate to.
	 */
-	if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
-		this_cpu_write(x86_ibpb_exit_to_user, true);
+	if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER) ||
+	    cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_EXIT_TO_USER))
+		this_cpu_write(x86_pred_flush_pending, true);
 
	/*
	 * Consume any pending interrupts, including the possible source of
-- 
2.34.1
From nobody Sat Feb 7 21:48:00 2026
Date: Wed, 15 Oct 2025 18:52:26 -0700
From: Pawan Gupta
To: x86@kernel.org, "H. Peter Anvin", Josh Poimboeuf, David Kaplan, Sean Christopherson, Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Asit Mallick, Tao Zhang
Subject: [PATCH v2 3/3] x86/vmscape: Remove LFENCE from BHB clearing long loop
Message-ID: <20251015-vmscape-bhb-v2-3-91cbdd9c3a96@linux.intel.com>
In-Reply-To: <20251015-vmscape-bhb-v2-0-91cbdd9c3a96@linux.intel.com>

The long loop is used to clear the branch history when switching from a
guest to host userspace. The LFENCE barrier is not required in this
case, as the ring transition itself acts as a barrier.

Move the prologue, LFENCE and epilogue out of the __CLEAR_BHB_LOOP macro
to allow skipping the LFENCE in the long loop variant. Rename the long
loop function to clear_bhb_long_loop_no_barrier() to reflect the change.

Signed-off-by: Pawan Gupta
---
 arch/x86/entry/entry_64.S            | 32 ++++++++++++++++++++------------
 arch/x86/include/asm/entry-common.h  |  2 +-
 arch/x86/include/asm/nospec-branch.h |  4 ++--
 3 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index f5f62af080d8ec6fe81e4dbe78ce44d08e62aa59..bb456a3c652e97f3a6fe72866b6dee04f59ccc98 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1525,10 +1525,6 @@ SYM_CODE_END(rewind_stack_and_make_dead)
 * Target Selection, rather than taking the slowpath via its_return_thunk.
 */
 .macro __CLEAR_BHB_LOOP outer_loop_count:req, inner_loop_count:req
-	ANNOTATE_NOENDBR
-	push %rbp
-	mov %rsp, %rbp
-
 	movl $\outer_loop_count, %ecx
 	ANNOTATE_INTRA_FUNCTION_CALL
 	call 1f
@@ -1560,10 +1556,7 @@ SYM_CODE_END(rewind_stack_and_make_dead)
 	jnz 1b
 .Lret2_\@:
 	RET
-5:	lfence
-
-	pop %rbp
-	RET
+5:
 .endm
 
 /*
@@ -1573,7 +1566,15 @@ SYM_CODE_END(rewind_stack_and_make_dead)
 * setting BHI_DIS_S for the guests.
 */
 SYM_FUNC_START(clear_bhb_loop)
+	ANNOTATE_NOENDBR
+	push %rbp
+	mov %rsp, %rbp
+
 	__CLEAR_BHB_LOOP 5, 5
+
+	lfence
+	pop %rbp
+	RET
 SYM_FUNC_END(clear_bhb_loop)
 EXPORT_SYMBOL_GPL(clear_bhb_loop)
 STACK_FRAME_NON_STANDARD(clear_bhb_loop)
@@ -1584,8 +1585,15 @@ STACK_FRAME_NON_STANDARD(clear_bhb_loop)
 * protects the kernel, but to mitigate the guest influence on the host
 * userspace either IBPB or this sequence should be used. See VMSCAPE bug.
 */
-SYM_FUNC_START(clear_bhb_long_loop)
+SYM_FUNC_START(clear_bhb_long_loop_no_barrier)
+	ANNOTATE_NOENDBR
+	push %rbp
+	mov %rsp, %rbp
+
 	__CLEAR_BHB_LOOP 12, 7
-SYM_FUNC_END(clear_bhb_long_loop)
-EXPORT_SYMBOL_GPL(clear_bhb_long_loop)
-STACK_FRAME_NON_STANDARD(clear_bhb_long_loop)
+
+	pop %rbp
+	RET
+SYM_FUNC_END(clear_bhb_long_loop_no_barrier)
+EXPORT_SYMBOL_GPL(clear_bhb_long_loop_no_barrier)
+STACK_FRAME_NON_STANDARD(clear_bhb_long_loop_no_barrier)
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index b7b9af1b641385b8283edf2449578ff65e5bd6df..c70454bdd0e3f544dedf582ad6f7f62e2833704c 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -98,7 +98,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 		if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
 			indirect_branch_prediction_barrier();
 		else if (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_EXIT_TO_USER))
-			clear_bhb_long_loop();
+			clear_bhb_long_loop_no_barrier();
 
 		this_cpu_write(x86_pred_flush_pending, false);
 	}
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 00730cc22c2e7115f6dbb38a1ed8d10383ada5c0..3bcf9f180c21d468f17fa9c1210cba84a541e6ea 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -388,9 +388,9 @@ extern void write_ibpb(void);
 
 #ifdef CONFIG_X86_64
 extern void clear_bhb_loop(void);
-extern void clear_bhb_long_loop(void);
+extern void clear_bhb_long_loop_no_barrier(void);
 #else
-static inline void clear_bhb_long_loop(void) {}
+static inline void clear_bhb_long_loop_no_barrier(void) {}
 #endif
 
 extern void (*x86_return_thunk)(void);
-- 
2.34.1