[PATCH v2 00/11] Fix up SRSO stuff

Peter Zijlstra posted 11 patches 2 years, 4 months ago
Posted by Peter Zijlstra 2 years, 4 months ago
Hi!

Second version of the SRSO fixes/cleanup.

I've redone some, reordered most, and left out the interface bits entirely for
now, although I do strongly feel the extra interface is superfluous (and ugly).

This is based on top of current tip/x86/urgent 833fd800bf56.

The one open technical issue I have with the mitigation is the alignment of
the RET inside srso_safe_ret(). The details given for retbleed stated that RET
should be on a 64-byte boundary, which is not the case here.

I'll go prod at bringing the rest of the patches forward after I stare at some
other email.
Re: [PATCH v2 00/11] Fix up SRSO stuff
Posted by Borislav Petkov 2 years, 4 months ago
On Mon, Aug 14, 2023 at 01:44:26PM +0200, Peter Zijlstra wrote:
> The one open technical issue I have with the mitigation is the alignment of
> the RET inside srso_safe_ret(). The details given for retbleed stated that RET
> should be on a 64-byte boundary, which is not the case here.

I have written this in the hope of making it clearer:

/*
 * Some generic notes on the untraining sequences:
 *
 * They are interchangeable when it comes to flushing potentially wrong
 * RET predictions from the BTB.
 *
 * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
 * Retbleed sequence because the return sequence done there
 * (srso_safe_ret()) is longer and the return sequence must fully nest
 * (end before) the untraining sequence. Therefore, the untraining
 * sequence must overlap the return sequence.
 *
 * Regarding alignment - the instructions which need to be untrained,
 * must all start at a cacheline boundary for Zen1/2 generations. That
 * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
 * srso_untrain_ret() must both be placed at the beginning of
 * a cacheline.
 */

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
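
As a rough illustration of the "fully nest" wording in the comment above, the following Python sketch models the bytes that start at srso_untrain_ret. It assumes 64-bit encodings (48 b8 as the start of a MOVABS into %rax, 48 8d 64 24 08 for lea 8(%rsp), %rsp) and the instruction order lea, ret, int3, int3 that retpoline.S places after the `.byte 0x48, 0xb8` prefix. The helper name return_sequence_end is made up for this sketch and is not kernel code.

```python
# Byte-level model of the overlapping sequences (64-bit encodings assumed).
MOVABS_PREFIX = bytes([0x48, 0xB8])          # REX.W + opcode of "movabs $imm64, %rax"
LEA = bytes([0x48, 0x8D, 0x64, 0x24, 0x08])  # lea 8(%rsp), %rsp
RET = bytes([0xC3])
INT3 = bytes([0xCC])

# Bytes starting at srso_untrain_ret; srso_safe_ret is the label after the prefix.
code = MOVABS_PREFIX + LEA + RET + INT3 + INT3

SAFE_RET_OFFSET = len(MOVABS_PREFIX)
MOVABS_LENGTH = len(MOVABS_PREFIX) + 8       # prefix plus the 64-bit immediate


def return_sequence_end(code, start):
    """Return the offset one past the RET reached when decoding from `start`."""
    assert code[start:start + len(LEA)] == LEA
    ret_at = start + len(LEA)
    assert code[ret_at] == RET[0]
    return ret_at + 1


ret_end = return_sequence_end(code, SAFE_RET_OFFSET)
print("movabs spans bytes 0 ..", MOVABS_LENGTH - 1)
print("lea+ret spans bytes", SAFE_RET_OFFSET, "..", ret_end - 1)
print("nested:", 0 < SAFE_RET_OFFSET and ret_end <= MOVABS_LENGTH)
```

Decoded from the first byte, the ten bytes form a single MOVABS whose immediate is the lea/ret/int3/int3 bytes. Decoded from the third byte, the same bytes are the return sequence, which ends inside the MOVABS.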
Re: [PATCH v2 00/11] Fix up SRSO stuff
Posted by Josh Poimboeuf 2 years, 4 months ago
On Mon, Aug 14, 2023 at 06:44:47PM +0200, Borislav Petkov wrote:
> On Mon, Aug 14, 2023 at 01:44:26PM +0200, Peter Zijlstra wrote:
> > The one open technical issue I have with the mitigation is the alignment of
> > the RET inside srso_safe_ret(). The details given for retbleed stated that RET
> > should be on a 64-byte boundary, which is not the case here.
> 
> I have written this in the hope of making it clearer:
> 
> /*
>  * Some generic notes on the untraining sequences:
>  *
>  * They are interchangeable when it comes to flushing potentially wrong
>  * RET predictions from the BTB.
>  *
>  * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
>  * Retbleed sequence because the return sequence done there
>  * (srso_safe_ret()) is longer and the return sequence must fully nest
>  * (end before) the untraining sequence. Therefore, the untraining
>  * sequence must overlap the return sequence.
>  *
>  * Regarding alignment - the instructions which need to be untrained,
>  * must all start at a cacheline boundary for Zen1/2 generations. That
>  * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
>  * srso_untrain_ret() must both be placed at the beginning of
>  * a cacheline.
>  */

It's a good comment, but RET in srso_safe_ret() is still misaligned.
Don't we need something like so?

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 9bc19deacad1..373ac128a30a 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -251,13 +251,14 @@ __EXPORT_THUNK(retbleed_untrain_ret)
  * thus a "safe" one to use.
  */
 	.align 64
-	.skip 64 - (srso_safe_ret - srso_untrain_ret), 0xcc
+	.skip 64 - (.Lsrso_ret - srso_untrain_ret), 0xcc
 SYM_START(srso_untrain_ret, SYM_L_GLOBAL, SYM_A_NONE)
 	ANNOTATE_NOENDBR
 	.byte 0x48, 0xb8
 
 SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
 	lea 8(%_ASM_SP), %_ASM_SP
+.Lsrso_ret:
 	ret
 	int3
 	int3
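
The arithmetic behind the two `.skip` expressions in the diff above can be modelled in a few lines of Python. This is only a sketch: it assumes the `.byte 0x48, 0xb8` prefix is 2 bytes and that lea 8(%rsp), %rsp encodes to 5 bytes in 64-bit mode, and the function name layout is invented for the example. Offsets are counted from the `.align 64` line.

```python
CACHELINE = 64
PREFIX_LEN = 2   # .byte 0x48, 0xb8
LEA_LEN = 5      # lea 8(%rsp), %rsp in 64-bit mode (assumed encoding)


def layout(label_distance):
    """Offsets from the `.align 64` line when using `.skip 64 - label_distance`."""
    untrain = CACHELINE - label_distance   # srso_untrain_ret
    safe_ret = untrain + PREFIX_LEN        # srso_safe_ret
    ret = safe_ret + LEA_LEN               # the RET instruction
    return {"srso_untrain_ret": untrain, "srso_safe_ret": safe_ret, "ret": ret}


current = layout(PREFIX_LEN)              # 64 - (srso_safe_ret - srso_untrain_ret)
proposed = layout(PREFIX_LEN + LEA_LEN)   # 64 - (.Lsrso_ret - srso_untrain_ret)

for name, offsets in (("current", current), ("proposed", proposed)):
    aligned = [sym for sym, off in offsets.items() if off % CACHELINE == 0]
    print(name, offsets, "cacheline-aligned:", aligned)
```

Under these assumptions the existing expression puts srso_safe_ret (the LEA) at offset 64 and the RET at offset 69, while the proposed one puts the RET at offset 64 and srso_safe_ret at offset 59.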
Re: [PATCH v2 00/11] Fix up SRSO stuff
Posted by Josh Poimboeuf 2 years, 4 months ago
On Mon, Aug 14, 2023 at 12:51:55PM -0700, Josh Poimboeuf wrote:
> On Mon, Aug 14, 2023 at 06:44:47PM +0200, Borislav Petkov wrote:
> > On Mon, Aug 14, 2023 at 01:44:26PM +0200, Peter Zijlstra wrote:
> > > The one open technical issue I have with the mitigation is the alignment of
> > > the RET inside srso_safe_ret(). The details given for retbleed stated that RET
> > > should be on a 64-byte boundary, which is not the case here.
> > 
> > I have written this in the hope of making it clearer:
> > 
> > /*
> >  * Some generic notes on the untraining sequences:
> >  *
> >  * They are interchangeable when it comes to flushing potentially wrong
> >  * RET predictions from the BTB.
> >  *
> >  * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
> >  * Retbleed sequence because the return sequence done there
> >  * (srso_safe_ret()) is longer and the return sequence must fully nest
> >  * (end before) the untraining sequence. Therefore, the untraining
> >  * sequence must overlap the return sequence.
> >  *
> >  * Regarding alignment - the instructions which need to be untrained,
> >  * must all start at a cacheline boundary for Zen1/2 generations. That
> >  * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
> >  * srso_untrain_ret() must both be placed at the beginning of
> >  * a cacheline.
> >  */
> 
> It's a good comment, but RET in srso_safe_ret() is still misaligned.
> Don't we need something like so?

Scratch that, I guess I misread the confusingly worded comment:

  "both the ret in zen_untrain_ret() and srso_safe_ret()..."

to mean the RET in each function.

How about:

  "both the RET in zen_untrain_ret() and the LEA in srso_untrain_ret()"

?

-- 
Josh
Re: [PATCH v2 00/11] Fix up SRSO stuff
Posted by Borislav Petkov 2 years, 4 months ago
On Mon, Aug 14, 2023 at 01:01:28PM -0700, Josh Poimboeuf wrote:
> How about:
> 
>   "both the RET in zen_untrain_ret() and the LEA in srso_untrain_ret()"
> 
> ?

Yeah, or the "instruction sequences starting at srso_safe_ret and
retbleed_return_thunk" (that's what it's called now) "must start at
a cacheline boundary."

The LEA used to be an ADD, but that changed, so saying "the instruction
sequences" works regardless of the exact instruction.

Thx.

[PATCH] x86/srso: Explain the untraining sequences a bit more
Posted by Borislav Petkov 2 years, 4 months ago
From: "Borislav Petkov (AMD)" <bp@alien8.de>
Date: Mon, 14 Aug 2023 21:29:50 +0200

The goal is to eventually have proper documentation about all this.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>

diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
index 915c4fe17718..e59c46581bbb 100644
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -183,6 +183,25 @@ SYM_CODE_START(srso_alias_return_thunk)
 	ud2
 SYM_CODE_END(srso_alias_return_thunk)
 
+/*
+ * Some generic notes on the untraining sequences:
+ *
+ * They are interchangeable when it comes to flushing potentially wrong
+ * RET predictions from the BTB.
+ *
+ * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
+ * Retbleed sequence because the return sequence done there
+ * (srso_safe_ret()) is longer and the return sequence must fully nest
+ * (end before) the untraining sequence. Therefore, the untraining
+ * sequence must fully overlap the return sequence.
+ *
+ * Regarding alignment - the instructions which need to be untrained,
+ * must all start at a cacheline boundary for Zen1/2 generations. That
+ * is, instruction sequences starting at srso_safe_ret() and
+ * the respective instruction sequences at retbleed_return_thunk()
+ * must start at a cacheline boundary.
+ */
+
 /*
  * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
  * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
-- 
2.42.0.rc0.25.ga82fb66fed25

Re: [PATCH] x86/srso: Explain the untraining sequences a bit more
Posted by Nikolay Borisov 2 years, 4 months ago

On 15.08.23 17:26, Borislav Petkov wrote:
> From: "Borislav Petkov (AMD)" <bp@alien8.de>
> Date: Mon, 14 Aug 2023 21:29:50 +0200
> 
> The goal is to eventually have proper documentation about all this.
> 
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> 
> diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
> index 915c4fe17718..e59c46581bbb 100644
> --- a/arch/x86/lib/retpoline.S
> +++ b/arch/x86/lib/retpoline.S
> @@ -183,6 +183,25 @@ SYM_CODE_START(srso_alias_return_thunk)
>   	ud2
>   SYM_CODE_END(srso_alias_return_thunk)
>   
> +/*
> + * Some generic notes on the untraining sequences:
> + *
> + * They are interchangeable when it comes to flushing potentially wrong
> + * RET predictions from the BTB.
> + *
> + * The SRSO Zen1/2 (MOVABS) untraining sequence is longer than the
> + * Retbleed sequence because the return sequence done there
> + * (srso_safe_ret()) is longer and the return sequence must fully nest
> + * (end before) the untraining sequence. Therefore, the untraining
> + * sequence must fully overlap the return sequence.
> + *
> + * Regarding alignment - the instructions which need to be untrained,
> + * must all start at a cacheline boundary for Zen1/2 generations. That
> + * is, instruction sequences starting at srso_safe_ret() and
> + * the respective instruction sequences at retbleed_return_thunk()
> + * must start at a cacheline boundary.
> + */

Are there any salient generic details about Zen 3/4?
> +
>   /*
>    * Safety details here pertain to the AMD Zen{1,2} microarchitecture:
>    * 1) The RET at retbleed_return_thunk must be on a 64 byte boundary, for
Re: [PATCH v2 00/11] Fix up SRSO stuff
Posted by Borislav Petkov 2 years, 4 months ago
On Mon, Aug 14, 2023 at 12:51:53PM -0700, Josh Poimboeuf wrote:
> >  * Regarding alignment - the instructions which need to be untrained,
> >  * must all start at a cacheline boundary for Zen1/2 generations. That
> >  * is, both the ret in zen_untrain_ret() and srso_safe_ret() in the
> >  * srso_untrain_ret() must both be placed at the beginning of
> >  * a cacheline.
> >  */
> 
> It's a good comment, but RET in srso_safe_ret() is still misaligned.
> Don't we need something like so?

Well, I guess that comment is still not good enough. It is not the RET that
must be cacheline-aligned but the function return sequences.

IOW, we need this:

<--- cacheline begin
SYM_INNER_LABEL(retbleed_return_thunk, SYM_L_GLOBAL)
        ret
        int3


and

<--- cacheline begin
SYM_INNER_LABEL(srso_safe_ret, SYM_L_GLOBAL)
        lea 8(%_ASM_SP), %_ASM_SP
        ret
        int3

I'll improve on it before I apply it.

Thx.
