Hi!

Second version of the SRSO fixes/cleanup. I've redone some, reordered most and
left out the interface bits entirely for now, although I do strongly feel the
extra interface is superfluous (and ugly).

This is based on top of current tip/x86/urgent 833fd800bf56.

The one open technical issue I have with the mitigation is the alignment of
the RET inside srso_safe_ret(). The details given for retbleed stated that RET
should be on a 64byte boundary, which is not the case here.

I'll go prod at bringing the rest of the patches forward after I stare at some
other email.

On Mon, Aug 14, 2023 at 01:44:26PM +0200, Peter Zijlstra wrote:
> The one open technical issue I have with the mitigation is the alignment of
> the RET inside srso_safe_ret(). The details given for retbleed stated that RET
> should be on a 64byte boundary, which is not the case here.
I have written this in the hope of making it clearer:
/*
* Some generic notes on the untraining sequences:
*
* They are interchangeable when it comes to flushing potentially wrong
* RET predictions from the BTB.
*
* The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
* Retbleed sequence because the return sequence done there
* (srso_safe_ret()) is longer and the return sequence must fully nest
* (end before) the untraining sequence. Therefore, the untraining
* sequence must overlap the return sequence.
*
* Regarding alignment - the instructions which need to be untrained,
* must all start at a cacheline boundary for Zen1/2 generations. That
* is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
* srso_untrain_ret() must both be placed at the beginning of
* a cacheline.
*/
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, Aug 14, 2023 at 06:44:47PM +0200, Borislav Petkov wrote:
> On Mon, Aug 14, 2023 at 01:44:26PM +0200, Peter Zijlstra wrote:
> > The one open technical issue I have with the mitigation is the alignment of
> > the RET inside srso_safe_ret(). The details given for retbleed stated that RET
> > should be on a 64byte boundary, which is not the case here.
>
> I have written this in the hope of making it clearer:
>
> /*
> * Some generic notes on the untraining sequences:
> *
> * They are interchangeable when it comes to flushing potentially wrong
> * RET predictions from the BTB.
> *
> * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
> * Retbleed sequence because the return sequence done there
> * (srso_safe_ret()) is longer and the return sequence must fully nest
> * (end before) the untraining sequence. Therefore, the untraining
> * sequence must overlap the return sequence.
> *
> * Regarding alignment - the instructions which need to be untrained,
> * must all start at a cacheline boundary for Zen1/2 generations. That
> * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
> * srso_untrain_ret() must both be placed at the beginning of
> * a cacheline.
> */

It's a good comment, but RET in srso_safe_ret() is still misaligned.
Don't we need something like so?

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 9bc19deacad1..373ac128a30a 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -251,13 +251,14 @@ __EXPORT_THUNK(retbleed_untrain_ret)
  * thus a "safe" one to use.
  */
 	.align 64
-	.skip 64 - (srso_safe_ret - srso_untrain_ret), 0xcc
+	.skip 64 - (.Lsrso_ret - srso_untrain_ret), 0xcc
 SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	.byte 0x48, 0xb8
 SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
 	lea 8(%_ASM_SP), %_ASM_SP
+.Lsrso_ret:
 	ret
 	int3
 	int3

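The arithmetic behind the `.skip 64 - (label - start), 0xcc` line in the diff above can be sketched in C. This is only a model of the assembler directives, not kernel code: the helper names `start_after_skip()` and `cacheline_offset()` are made up for illustration, and the label offsets used in the note below (2 for srso_safe_ret, 7 for .Lsrso_ret) assume a 64-bit build in which the LEA encodes to five bytes.

```c
#include <assert.h>
#include <stdint.h>

#define CACHELINE 64u

/*
 * Model of:
 *         .align 64
 *         .skip 64 - (label - start), 0xcc
 *     start:
 *         ...
 *     label:
 *
 * `base` is the 64-byte-aligned address produced by .align and
 * `label_off` is (label - start) in bytes. Returns the address at which
 * `start` ends up after the padding.
 */
static uint64_t start_after_skip(uint64_t base, unsigned int label_off)
{
	assert(base % CACHELINE == 0);
	assert(label_off <= CACHELINE);
	return base + (CACHELINE - label_off);
}

/* Offset of an address within its cacheline (0 means "at the start"). */
static unsigned int cacheline_offset(uint64_t addr)
{
	return (unsigned int)(addr % CACHELINE);
}
```

Under those assumptions, padding against srso_safe_ret (offset 2) puts srso_untrain_ret 62 bytes into the line, so the LEA starts the next cacheline and the RET sits 5 bytes into it. Padding against .Lsrso_ret (offset 7) instead puts the RET at the start of a cacheline and leaves srso_safe_ret at offset 59 of the previous one.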
On Mon, Aug 14, 2023 at 12:51:55PM -0700, Josh Poimboeuf wrote:
> On Mon, Aug 14, 2023 at 06:44:47PM +0200, Borislav Petkov wrote:
> > On Mon, Aug 14, 2023 at 01:44:26PM +0200, Peter Zijlstra wrote:
> > > The one open technical issue I have with the mitigation is the alignment of
> > > the RET inside srso_safe_ret(). The details given for retbleed stated that RET
> > > should be on a 64byte boundary, which is not the case here.
> >
> > I have written this in the hope of making it clearer:
> >
> > /*
> > * Some generic notes on the untraining sequences:
> > *
> > * They are interchangeable when it comes to flushing potentially wrong
> > * RET predictions from the BTB.
> > *
> > * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
> > * Retbleed sequence because the return sequence done there
> > * (srso_safe_ret()) is longer and the return sequence must fully nest
> > * (end before) the untraining sequence. Therefore, the untraining
> > * sequence must overlap the return sequence.
> > *
> > * Regarding alignment - the instructions which need to be untrained,
> > * must all start at a cacheline boundary for Zen1/2 generations. That
> > * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
> > * srso_untrain_ret() must both be placed at the beginning of
> > * a cacheline.
> > */
>
> It's a good comment, but RET in srso_safe_ret() is still misaligned.
> Don't we need something like so?

Scratch that, I guess I misread the confusingly worded comment:

  "both the ret in zen_untrain_ret() and srso_safe_ret()..."

to mean the RET in each function.

How about:

  "both the RET in zen_untrain_ret() and the LEA in srso_untrain_ret()"

?

--
Josh

On Mon, Aug 14, 2023 at 01:01:28PM -0700, Josh Poimboeuf wrote:
> How about:
>
> "both the RET in zen_untrain_ret() and the LEA in srso_untrain_ret()"
>
> ?
Yeah, or the "instruction sequences starting at srso_safe_ret and
retbleed_return_thunk" (that's what it's called now) "must start at
a cacheline boundary."
Because the LEA used to be an ADD, but that changed, so saying "the
instruction sequences" just works.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Date: Mon, 14 Aug 2023 21:29:50 +0200
The goal is to eventually have proper documentation about all this.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 915c4fe17718..e59c46581bbb 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -183,6 +183,25 @@ SYM_CODE_START(srso_alias_return_thunk)
 	ud2
 SYM_CODE_END(srso_alias_return_thunk)
 
+/*
+ * Some generic notes on the untraining sequences:
+ *
+ * They are interchangeable when it comes to flushing potentially wrong
+ * RET predictions from the BTB.
+ *
+ * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
+ * Retbleed sequence because the return sequence done there
+ * (srso_safe_ret()) is longer and the return sequence must fully nest
+ * (end before) the untraining sequence. Therefore, the untraining
+ * sequence must fully overlap the return sequence.
+ *
+ * Regarding alignment - the instructions which need to be untrained,
+ * must all start at a cacheline boundary for Zen1/2 generations. That
+ * is, instruction sequences starting at srso_safe_ret() and
+ * the respective instruction sequences at retbleed_return_thunk()
+ * must start at a cacheline boundary.
+ */
+
 /*
  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
  * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
--
2.42.0.rc0.25.ga82fb66fed25
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
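The "untraining sequence must fully overlap the return sequence" point in the comment above can be illustrated with the raw bytes. The following is a rough sketch, assuming the standard 64-bit encodings (`48 b8` is the start of MOVABS $imm64, %rax, and `lea 8(%rsp), %rsp` is five bytes); the array and helper names are hypothetical and nothing here is taken from the kernel sources beyond the instruction sequence quoted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes from srso_untrain_ret onward, assuming a 64-bit build. */
static const unsigned char srso_bytes[] = {
	0x48, 0xb8,                   /* .byte 0x48, 0xb8: start of MOVABS $imm64, %rax */
	0x48, 0x8d, 0x64, 0x24, 0x08, /* srso_safe_ret: lea 8(%rsp), %rsp */
	0xc3,                         /* ret */
	0xcc,                         /* int3 */
	0xcc,                         /* int3 */
};

#define SRSO_SAFE_RET_OFF 2u  /* offset of srso_safe_ret within srso_bytes */
#define MOVABS_LEN        10u /* 2 opcode bytes + 8 immediate bytes */

/*
 * Length of the return sequence: everything up to and including the RET.
 * A naive byte scan is enough here because 0xc3 does not occur in the
 * LEA encoding; this is not a general instruction decoder.
 */
static size_t return_seq_len(const unsigned char *p, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (p[i] == 0xc3)
			return i + 1;
	return 0; /* no RET found */
}

/* Nonzero if the return sequence ends inside the MOVABS encoding. */
static int return_seq_nests_in_movabs(void)
{
	size_t len = return_seq_len(srso_bytes + SRSO_SAFE_RET_OFF,
				    sizeof(srso_bytes) - SRSO_SAFE_RET_OFF);

	assert(len != 0);
	return SRSO_SAFE_RET_OFF + len <= MOVABS_LEN;
}
```

Decoded from srso_untrain_ret, the eight bytes after `48 b8` (LEA, RET and the two INT3s) are simply the MOVABS immediate; decoded from srso_safe_ret, the same bytes are the return sequence, which ends before the MOVABS does.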
On 15.08.23 at 17:26, Borislav Petkov wrote:
> From: "Borislav Petkov (AMD)" <bp@alien8.de>
> Date: Mon, 14 Aug 2023 21:29:50 +0200
>
> The goal is to eventually have proper documentation about all this.
>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
>
> diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
> index 915c4fe17718..e59c46581bbb 100644
> --- a/arch/x86/lib/retpoline.S
> +++ b/arch/x86/lib/retpoline.S
> @@ -183,6 +183,25 @@ SYM_CODE_START(srso_alias_return_thunk)
> ud2
> SYM_CODE_END(srso_alias_return_thunk)
>
> +/*
> + * Some generic notes on the untraining sequences:
> + *
> + * They are interchangeable when it comes to flushing potentially wrong
> + * RET predictions from the BTB.
> + *
> + * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
> + * Retbleed sequence because the return sequence done there
> + * (srso_safe_ret()) is longer and the return sequence must fully nest
> + * (end before) the untraining sequence. Therefore, the untraining
> + * sequence must fully overlap the return sequence.
> + *
> + * Regarding alignment - the instructions which need to be untrained,
> + * must all start at a cacheline boundary for Zen1/2 generations. That
> + * is, instruction sequences starting at srso_safe_ret() and
> + * the respective instruction sequences at retbleed_return_thunk()
> + * must start at a cacheline boundary.
> + */
Are there any salient generic details about Zen 3/4?
> +
> /*
> * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
> * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
On Mon, Aug 14, 2023 at 12:51:53PM -0700, Josh Poimboeuf wrote:
> > * Regarding alignment - the instructions which need to be untrained,
> > * must all start at a cacheline boundary for Zen1/2 generations. That
> > * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
> > * srso_untrain_ret() must both be placed at the beginning of
> > * a cacheline.
> > */
>
> It's a good comment, but RET in srso_safe_ret() is still misaligned.
> Don't we need something like so?
Well, I guess that comment is still not good enough. It is not the RET that
must be cacheline-aligned but the function return sequences.
IOW, we need this:
<--- cacheline begin
SYM_INNER_LABEL(retbleed_return_thunk, SYM_L_GLOBAL)
	ret
	int3

and

<--- cacheline begin
SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
	lea 8(%_ASM_SP), %_ASM_SP
	ret
	int3

I'll improve on it before I apply it.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
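The requirement stated above, that both return sequences begin a cacheline, boils down to a mask check on the two symbol addresses. A minimal sketch, assuming 64-byte cachelines; `at_cacheline_begin()` and `layout_ok()` are hypothetical helpers for illustration, and any addresses fed to them would have to come from the actual build (for example from the symbol table).

```c
#include <assert.h>
#include <stdint.h>

#define CACHELINE_SIZE 64ULL
#define CACHELINE_MASK (CACHELINE_SIZE - 1)

/* The mask trick below only works for power-of-two line sizes. */
static_assert((CACHELINE_SIZE & (CACHELINE_SIZE - 1)) == 0,
	      "cacheline size must be a power of two");

/* Nonzero if addr is the first byte of a cacheline. */
static int at_cacheline_begin(uint64_t addr)
{
	return (addr & CACHELINE_MASK) == 0;
}

/*
 * Both return sequences, the one starting at retbleed_return_thunk and
 * the one starting at srso_safe_ret, must begin a cacheline.
 */
static int layout_ok(uint64_t retbleed_return_thunk, uint64_t srso_safe_ret)
{
	return at_cacheline_begin(retbleed_return_thunk) &&
	       at_cacheline_begin(srso_safe_ret);
}
```

Note that the check is on the labels where the return sequences start, not on the RET instruction itself: for srso_safe_ret the first instruction is the LEA, so the RET that follows is deliberately not at a cacheline boundary.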