The long loop is used to clear the branch history when switching from a
guest to host userspace. The LFENCE barrier is not required in this case,
as the ring transition itself acts as a barrier.
Move the prologue, LFENCE and epilogue out of the __CLEAR_BHB_LOOP macro
to allow skipping the LFENCE in the long loop variant. Rename the long
loop function to clear_bhb_long_loop_no_barrier() to reflect the change.
Acked-by: David Kaplan <david.kaplan@amd.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
---
arch/x86/entry/entry_64.S | 32 ++++++++++++++++++++------------
arch/x86/include/asm/entry-common.h | 2 +-
arch/x86/include/asm/nospec-branch.h | 4 ++--
3 files changed, 23 insertions(+), 15 deletions(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index f5f62af080d8ec6fe81e4dbe78ce44d08e62aa59..bb456a3c652e97f3a6fe72866b6dee04f59ccc98 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1525,10 +1525,6 @@ SYM_CODE_END(rewind_stack_and_make_dead)
* Target Selection, rather than taking the slowpath via its_return_thunk.
*/
.macro __CLEAR_BHB_LOOP outer_loop_count:req, inner_loop_count:req
- ANNOTATE_NOENDBR
- push %rbp
- mov %rsp, %rbp
-
movl $\outer_loop_count, %ecx
ANNOTATE_INTRA_FUNCTION_CALL
call 1f
@@ -1560,10 +1556,7 @@ SYM_CODE_END(rewind_stack_and_make_dead)
jnz 1b
.Lret2_\@:
RET
-5: lfence
-
- pop %rbp
- RET
+5:
.endm
/*
@@ -1573,7 +1566,15 @@ SYM_CODE_END(rewind_stack_and_make_dead)
* setting BHI_DIS_S for the guests.
*/
SYM_FUNC_START(clear_bhb_loop)
+ ANNOTATE_NOENDBR
+ push %rbp
+ mov %rsp, %rbp
+
__CLEAR_BHB_LOOP 5, 5
+
+ lfence
+ pop %rbp
+ RET
SYM_FUNC_END(clear_bhb_loop)
EXPORT_SYMBOL_GPL(clear_bhb_loop)
STACK_FRAME_NON_STANDARD(clear_bhb_loop)
@@ -1584,8 +1585,15 @@ STACK_FRAME_NON_STANDARD(clear_bhb_loop)
* protects the kernel, but to mitigate the guest influence on the host
* userspace either IBPB or this sequence should be used. See VMSCAPE bug.
*/
-SYM_FUNC_START(clear_bhb_long_loop)
+SYM_FUNC_START(clear_bhb_long_loop_no_barrier)
+ ANNOTATE_NOENDBR
+ push %rbp
+ mov %rsp, %rbp
+
__CLEAR_BHB_LOOP 12, 7
-SYM_FUNC_END(clear_bhb_long_loop)
-EXPORT_SYMBOL_GPL(clear_bhb_long_loop)
-STACK_FRAME_NON_STANDARD(clear_bhb_long_loop)
+
+ pop %rbp
+ RET
+SYM_FUNC_END(clear_bhb_long_loop_no_barrier)
+EXPORT_SYMBOL_GPL(clear_bhb_long_loop_no_barrier)
+STACK_FRAME_NON_STANDARD(clear_bhb_long_loop_no_barrier)
diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index b629e85c33aa7387042cce60040b8a493e3e6d46..eb2b7303a9c1fc5976388c2a6a3fb7914b553239 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -98,7 +98,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
if (cpu_feature_enabled(X86_FEATURE_IBPB_EXIT_TO_USER))
indirect_branch_prediction_barrier();
if (cpu_feature_enabled(X86_FEATURE_CLEAR_BHB_EXIT_TO_USER))
- clear_bhb_long_loop();
+ clear_bhb_long_loop_no_barrier();
this_cpu_write(x86_predictor_flush_exit_to_user, false);
}
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 745394be734f3c2b5640c9aef10156fe1d02636b..7f479aaa21313e484e7a0fded0b8b417feb8e2d0 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -388,9 +388,9 @@ extern void write_ibpb(void);
#ifdef CONFIG_X86_64
extern void clear_bhb_loop(void);
-extern void clear_bhb_long_loop(void);
+extern void clear_bhb_long_loop_no_barrier(void);
#else
-static inline void clear_bhb_long_loop(void) {}
+static inline void clear_bhb_long_loop_no_barrier(void) {}
#endif
extern void (*x86_return_thunk)(void);
--
2.34.1
On 10/27/25 16:43, Pawan Gupta wrote:
> The long loop is used to clear the branch history when switching from a
> guest to host userspace. The LFENCE barrier is not required in this case,
> as the ring transition itself acts as a barrier.
>
> Move the prologue, LFENCE and epilogue out of the __CLEAR_BHB_LOOP macro
> to allow skipping the LFENCE in the long loop variant. Rename the long
> loop function to clear_bhb_long_loop_no_barrier() to reflect the change.
Too. Much. Assembly.
Is there a reason we can't do more of this in C? Can we have _one_
assembly function, please? One that takes the loop counts? No macros, no
duplicated functions. Just one:
	void __clear_bhb_loop(int inner, int outer);

Then we have sensible code that looks like this:

	void clear_bhb_loop()
	{
		__clear_bhb_loop(inner, outer);
		lfence();
	}

	void clear_bhb_loop_nofence()
	{
		__clear_bhb_loop(inner, outer);
	}
We don't need a short and a long *version*. We just have one function
(or pair of functions) that gets called and works everywhere.
Actually, if you just used global variables and called the assembly one:

	extern void clear_bhb_loop_nofence();

then the other implementation would just be:

	void clear_bhb_loop()
	{
		clear_bhb_loop_nofence();
		lfence();
	}
Then we have *ONE* assembly function instead of four.
Right? What am I missing?
Does the LFENCE *need* to be before that last pop and RET?
On Mon, Nov 03, 2025 at 12:45:35PM -0800, Dave Hansen wrote:
> On 10/27/25 16:43, Pawan Gupta wrote:
> > The long loop is used to clear the branch history when switching from a
> > guest to host userspace. The LFENCE barrier is not required in this case,
> > as the ring transition itself acts as a barrier.
> >
> > Move the prologue, LFENCE and epilogue out of the __CLEAR_BHB_LOOP macro
> > to allow skipping the LFENCE in the long loop variant. Rename the long
> > loop function to clear_bhb_long_loop_no_barrier() to reflect the change.
>
> Too. Much. Assembly.
>
> Is there a reason we can't do more of this in C?
Apart from VMSCAPE, BHB clearing is also required when entering the kernel
from system calls. And one of the safety requirements is to absolutely not
execute any indirect call/jmp unless we have cleared the BHB. In a C
implementation we cannot guarantee that the compiler won't generate
indirect branches before the BHB clearing can be done.
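
To make that concern concrete, a hedged sketch (illustrative only, not
proposed code; the wrapper name is made up):

	/* A C wrapper for the entry path would look harmless: */
	static void clear_bhb_on_entry(void)
	{
		clear_bhb_loop();	/* the asm clearing sequence */
	}

	/*
	 * ...but C offers no way to forbid the compiler, or build-time
	 * instrumentation (ftrace thunks, sanitizer hooks), from emitting
	 * an indirect call/jmp ahead of clear_bhb_loop(). The asm entry
	 * macro rules that out by construction.
	 */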
> Can we have _one_ assembly function, please? One that takes the loop
> counts? No macros, no duplicated functions. Just one:
This seems possible for all the C callers. ASM callers should stick to asm
versions of BHB clearing to guarantee the compiler did not do anything
funky that would break the mitigation.
> 	void __clear_bhb_loop(int inner, int outer);
>
> Then we have sensible code that looks like this:
>
> 	void clear_bhb_loop()
> 	{
> 		__clear_bhb_loop(inner, outer);
> 		lfence();
> 	}
>
> 	void clear_bhb_loop_nofence()
> 	{
> 		__clear_bhb_loop(inner, outer);
> 	}
>
> We don't need a short and a long *version*. We just have one function
> (or pair of functions) that gets called and works everywhere.
>
> Actually, if you just used global variables and called the assembly one:
>
> 	extern void clear_bhb_loop_nofence();
>
> then the other implementation would just be:
>
> 	void clear_bhb_loop()
> 	{
> 		clear_bhb_loop_nofence();
> 		lfence();
> 	}
>
> Then we have *ONE* assembly function instead of four.
>
> Right? What am I missing?
Overall, these look to be good improvements to me. The only concern is
making sure that we don't inadvertently call the C version from places that
strictly require no indirect branches before BHB clearing.
> Does the LFENCE *need* to be before that last pop and RET?
At syscall entry, VMexit and BPF (for the native BHI mitigation), it does
not matter whether the LFENCE is before or after the last RET, if we can
guarantee that there will be no indirect call/jmp before the LFENCE. The C
version may not be able to provide this guarantee.
For exit-to-userspace (for VMSCAPE), a C implementation is perfectly fine
since the goal is to protect userspace.
To summarize, only one of the BHB clear callsites can safely use the C
version, while the others need to continue using the assembly version. I do
not anticipate more such callsites that would be okay with indirect
branches before BHB clearing.
I am open to suggestions on making the code more readable while ensuring
safety.
On 11/4/25 14:01, Pawan Gupta wrote:
> On Mon, Nov 03, 2025 at 12:45:35PM -0800, Dave Hansen wrote:
...
>> Too. Much. Assembly.
>>
>> Is there a reason we can't do more of this in C?
>
> Apart from VMSCAPE, BHB clearing is also required when entering the kernel
> from system calls. And one of the safety requirements is to absolutely not
> execute any indirect call/jmp unless we have cleared the BHB. In a C
> implementation we cannot guarantee that the compiler won't generate
> indirect branches before the BHB clearing can be done.
That's a good reason, and I did forget about the CLEAR_BRANCH_HISTORY
route to get into this code.
But my main aversion was to having so many different functions with
different names to do different things that are also exported to the world.
For instance, if we need an LFENCE in the entry code, we could do this:
.macro CLEAR_BRANCH_HISTORY
	ALTERNATIVE "", "call clear_bhb_loop; lfence",\
		    X86_FEATURE_CLEAR_BHB_LOOP
.endm
Instead of having a LFENCE variant of clear_bhb_loop().
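
For C callers, the same split might look like this (a hedged sketch; the
fenced helper name is hypothetical, barrier_nospec() is the kernel's
existing LFENCE wrapper):

	extern void clear_bhb_loop_nofence(void);

	/* Hypothetical C-side analogue of the macro above. */
	static __always_inline void clear_bhb_loop_fenced(void)
	{
		clear_bhb_loop_nofence();	/* the single asm routine */
		barrier_nospec();		/* LFENCE on affected CPUs */
	}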
>> Can we have _one_ assembly function, please? One that takes the loop
>> counts? No macros, no duplicated functions. Just one:
>
> This seems possible for all the C callers. ASM callers should stick to asm
> versions of BHB clearing to guarantee the compiler did not do anything
> funky that would break the mitigation.
ASM callers can pass arguments to functions too. ;)
Sure, the syscall entry path might not be the *best* place in the world
to do that because it'll add even more noops.
It does make me wonder if we want to deal with this more holistically
somehow:
	/* clobbers %rax, make sure it is after saving the syscall nr */
	IBRS_ENTER
	UNTRAIN_RET
	CLEAR_BRANCH_HISTORY
especially if we're creating lots and lots of variants of functions to
keep the ALTERNATIVE noop padding short.
On Tue, Nov 04, 2025 at 02:35:11PM -0800, Dave Hansen wrote:
> On 11/4/25 14:01, Pawan Gupta wrote:
> > On Mon, Nov 03, 2025 at 12:45:35PM -0800, Dave Hansen wrote:
> ...
> >> Too. Much. Assembly.
> >>
> >> Is there a reason we can't do more of this in C?
> >
> > Apart from VMSCAPE, BHB clearing is also required when entering the kernel
> > from system calls. And one of the safety requirements is to absolutely not
> > execute any indirect call/jmp unless we have cleared the BHB. In a C
> > implementation we cannot guarantee that the compiler won't generate
> > indirect branches before the BHB clearing can be done.
>
> That's a good reason, and I did forget about the CLEAR_BRANCH_HISTORY
> route to get into this code.
>
> But my main aversion was to having so many different functions with
> different names to do different things that are also exported to the world.
>
> For instance, if we need an LFENCE in the entry code, we could do this:
>
> .macro CLEAR_BRANCH_HISTORY
> 	ALTERNATIVE "", "call clear_bhb_loop; lfence",\
> 		    X86_FEATURE_CLEAR_BHB_LOOP
> .endm
>
> Instead of having a LFENCE variant of clear_bhb_loop().

This makes perfect sense. I will do that.

> >> Can we have _one_ assembly function, please? One that takes the loop
> >> counts? No macros, no duplicated functions. Just one:
> >
> > This seems possible for all the C callers. ASM callers should stick to asm
> > versions of BHB clearing to guarantee the compiler did not do anything
> > funky that would break the mitigation.
>
> ASM callers can pass arguments to functions too. ;)

Oh, my comment was more from the safety perspective of compiler-induced
code.

> Sure, the syscall entry path might not be the *best* place in the world
> to do that because it'll add even more noops.

Right.

> It does make me wonder if we want to deal with this more holistically
> somehow:
>
> 	/* clobbers %rax, make sure it is after saving the syscall nr */
> 	IBRS_ENTER
> 	UNTRAIN_RET
> 	CLEAR_BRANCH_HISTORY
>
> especially if we're creating lots and lots of variants of functions to
> keep the ALTERNATIVE noop padding short.

Hmm, mitigations that are mutually exclusive can certainly be grouped
together in an ALTERNATIVE_N block. It also has the potential to quickly
become messy. But certainly worth exploring.
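
As a rough illustration of that grouping idea (hypothetical; ALTERNATIVE_2
is the existing kernel macro, and this particular pairing, modeled on the
VMSCAPE "either IBPB or the clearing sequence" rule above, is made up for
the example):

	.macro EXIT_TO_USER_MITIGATIONS
		/*
		 * Mutually exclusive mitigations share one alternatives
		 * site, so the unpatched path pays only one run of NOPs.
		 */
		ALTERNATIVE_2 "",					\
			"call write_ibpb", X86_FEATURE_IBPB_EXIT_TO_USER, \
			"call clear_bhb_loop; lfence", X86_FEATURE_CLEAR_BHB_LOOP
	.endm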