[v2] Add support for the ERAPS feature

[RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Amit Shah 1 year, 2 months ago

From: Amit Shah <amit.shah@amd.com>

AMD CPUs do not fall back to the BTB when the RSB underflows for RET
address speculation.  AMD CPUs have not needed to stuff the RSB for
underflow conditions.

The RSB poisoning case is addressed by RSB filling - clean up the FIXME
comment about it.

Signed-off-by: Amit Shah <amit.shah@amd.com>
---
 arch/x86/kernel/cpu/bugs.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..0aa629b5537d 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1828,9 +1828,6 @@ static void __init spectre_v2_select_mitigation(void)
 	 *    speculated return targets may come from the branch predictor,
 	 *    which could have a user-poisoned BTB or BHB entry.
 	 *
-	 *    AMD has it even worse: *all* returns are speculated from the BTB,
-	 *    regardless of the state of the RSB.
-	 *
 	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack
 	 *    scenario is mitigated by the IBRS branch prediction isolation
 	 *    properties, so the RSB buffer filling wouldn't be necessary to
@@ -1852,8 +1849,6 @@ static void __init spectre_v2_select_mitigation(void)
 	 *
 	 * So to mitigate all cases, unconditionally fill RSB on context
 	 * switches.
-	 *
-	 * FIXME: Is this pointless for retbleed-affected AMD?
 	 */
 	setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
 	pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
-- 
2.47.0

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Josh Poimboeuf 1 year, 2 months ago

On Mon, Nov 11, 2024 at 05:39:11PM +0100, Amit Shah wrote:
> From: Amit Shah <amit.shah@amd.com>
> 
> AMD CPUs do not fall back to the BTB when the RSB underflows for RET
> address speculation.  AMD CPUs have not needed to stuff the RSB for
> underflow conditions.
> 
> The RSB poisoning case is addressed by RSB filling - clean up the FIXME
> comment about it.

I'm thinking the comments need more clarification in light of BTC and
SRSO.

This:

> -	 *    AMD has it even worse: *all* returns are speculated from the BTB,
> -	 *    regardless of the state of the RSB.

is still true (mostly: "all" should be "some"), though it doesn't belong
in the "RSB underflow" section.

Also, RSB stuffing doesn't only mitigate RET: it also mitigates any other
instruction which happens to be predicted as a RET.  Which is presumably
why it's still needed even when SRSO is enabled.

Something like below?

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..e95d3aa14259 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1828,9 +1828,6 @@ static void __init spectre_v2_select_mitigation(void)
 	 *    speculated return targets may come from the branch predictor,
 	 *    which could have a user-poisoned BTB or BHB entry.
 	 *
-	 *    AMD has it even worse: *all* returns are speculated from the BTB,
-	 *    regardless of the state of the RSB.
-	 *
 	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack
 	 *    scenario is mitigated by the IBRS branch prediction isolation
 	 *    properties, so the RSB buffer filling wouldn't be necessary to
@@ -1850,10 +1847,22 @@ static void __init spectre_v2_select_mitigation(void)
 	 *    The "user -> user" scenario, also known as SpectreBHB, requires
 	 *    RSB clearing.
 	 *
+	 *    AMD Branch Type Confusion (aka "AMD retbleed") adds some
+	 *    additional wrinkles:
+	 *
+	 *      - A RET can be mispredicted as a direct or indirect branch,
+	 *        causing the CPU to speculatively branch to a BTB target, in
+	 *        which case the RSB filling obviously doesn't help.  That case
+	 *        is mitigated by removing all the RETs (SRSO mitigation).
+	 *
+	 *      - The RSB is not only used for architectural RET instructions,
+	 *        it may also be used for other instructions which happen to
+	 *        get mispredicted as RETs.  Therefore RSB filling is still
+	 *        needed even when the RETs have all been removed by the SRSO
+	 *        mitigation.
+	 *
 	 * So to mitigate all cases, unconditionally fill RSB on context
 	 * switches.
-	 *
-	 * FIXME: Is this pointless for retbleed-affected AMD?
 	 */
 	setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
 	pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Andrew Cooper 1 year, 2 months ago

On 11/11/2024 7:33 pm, Josh Poimboeuf wrote:
> On Mon, Nov 11, 2024 at 05:39:11PM +0100, Amit Shah wrote:
>> From: Amit Shah <amit.shah@amd.com>
>>
>> AMD CPUs do not fall back to the BTB when the RSB underflows for RET
>> address speculation.  AMD CPUs have not needed to stuff the RSB for
>> underflow conditions.
>>
>> The RSB poisoning case is addressed by RSB filling - clean up the FIXME
>> comment about it.
> I'm thinking the comments need more clarification in light of BTC and
> SRSO.
>
> This:
>
>> -	 *    AMD has it even worse: *all* returns are speculated from the BTB,
>> -	 *    regardless of the state of the RSB.
> is still true (mostly: "all" should be "some"), though it doesn't belong
> in the "RSB underflow" section.
>
> Also the RSB stuffing not only mitigates RET, it mitigates any other
> instruction which happens to be predicted as a RET.  Which is presumably
> why it's still needed even when SRSO is enabled.
>
> Something like below?
>
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 47a01d4028f6..e95d3aa14259 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1828,9 +1828,6 @@ static void __init spectre_v2_select_mitigation(void)
>  	 *    speculated return targets may come from the branch predictor,
>  	 *    which could have a user-poisoned BTB or BHB entry.
>  	 *
> -	 *    AMD has it even worse: *all* returns are speculated from the BTB,
> -	 *    regardless of the state of the RSB.
> -	 *
>  	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack
>  	 *    scenario is mitigated by the IBRS branch prediction isolation
>  	 *    properties, so the RSB buffer filling wouldn't be necessary to
> @@ -1850,10 +1847,22 @@ static void __init spectre_v2_select_mitigation(void)
>  	 *    The "user -> user" scenario, also known as SpectreBHB, requires
>  	 *    RSB clearing.
>  	 *
> +	 *    AMD Branch Type Confusion (aka "AMD retbleed") adds some
> +	 *    additional wrinkles:
> +	 *
> +	 *      - A RET can be mispredicted as a direct or indirect branch,
> +	 *        causing the CPU to speculatively branch to a BTB target, in
> +	 *        which case the RSB filling obviously doesn't help.  That case
> +	 *        is mitigated by removing all the RETs (SRSO mitigation).
> +	 *
> +	 *      - The RSB is not only used for architectural RET instructions,
> +	 *        it may also be used for other instructions which happen to
> +	 *        get mispredicted as RETs.  Therefore RSB filling is still
> +	 *        needed even when the RETs have all been removed by the SRSO
> +	 *        mitigation.

This is my take.  On AMD CPUs, there are two unrelated issues to take
into account:

1) SRSO

Affects anything which doesn't enumerate SRSO_NO, which is all parts to
date including Zen5.

SRSO ends up overflowing the RAS with arbitrary BTB targets, such that a
subsequent genuine RET follows a prediction which never came from a real
CALL instruction.

Mitigations for SRSO are either safe-ret, or IBPB-on-entry.  Parts
without IBPB_RET using IBPB-on-entry need to manually flush the RAS.

Importantly, SMEP does not protect you against SRSO across the
user->kernel boundary, because the bad RAS entries are arbitrary.  New
in Zen5 is the SRSO_U/S_NO bit which says this case can't occur any
more.  So on Zen5, you can in principle get away without a RAS flush on
entry.

2) BTC

Affects anything which doesn't enumerate BTC_NO, which is Zen2 and older
(Fam17h for AMD, Fam18h for Hygon).

Attacker can forge any branch type prediction, and the most dangerous
one is RET-mispredicted-as-INDIRECT.  This causes a genuine RET
instruction to follow a prediction that was believed to be an indirect
branch.

All CPUs which suffer BTC also suffer SRSO, so while jmp2ret is a
mitigation for BTC, its utility became 0 when SRSO was discovered.
(Which is a shame, because it's equal parts beautiful and terrifying.)
Mitigations for BTC are therefore safe-ret or IBPB-on-entry.

Flushing the RAS has no effect on BTC, because the whole problem with
BTC is that the prediction comes from the "wrong" predictor, but you
need to do it for other reasons.
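
The two cases above can be condensed into a small decision helper.  This is
only a sketch in plain userspace C, not kernel code: the struct, enum, and
function names are hypothetical, and the flags merely mirror the enumeration
bits named above (SRSO_NO, BTC_NO, IBPB_RET).  The helper covers just the
"IBPB-on-entry without IBPB_RET" rule and ignores SRSO_U/S_NO.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the enumeration bits discussed above. */
struct cpu_bits {
	bool srso_no;	/* part enumerates SRSO_NO */
	bool btc_no;	/* part enumerates BTC_NO */
	bool ibpb_ret;	/* part enumerates IBPB_RET */
};

/* The two mitigation families named for both SRSO and BTC. */
enum entry_mitigation { MIT_SAFE_RET, MIT_IBPB_ON_ENTRY };

/* A part is treated as affected unless it enumerates the *_NO bit. */
static bool affected_by_srso(const struct cpu_bits *c)
{
	return !c->srso_no;
}

static bool affected_by_btc(const struct cpu_bits *c)
{
	return !c->btc_no;
}

/*
 * Parts using IBPB-on-entry without IBPB_RET need a manual RAS flush.
 * Nothing is claimed here about flushes needed for other reasons.
 */
static bool needs_manual_ras_flush(const struct cpu_bits *c,
				   enum entry_mitigation m)
{
	/* Sanity check: every BTC-affected part is also SRSO-affected. */
	assert(!affected_by_btc(c) || affected_by_srso(c));

	return m == MIT_IBPB_ON_ENTRY && !c->ibpb_ret;
}
```

A caller would fill in struct cpu_bits from whatever enumeration data it has
and ask needs_manual_ras_flush() once per chosen mitigation.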

~Andrew

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Josh Poimboeuf 1 year, 2 months ago

On Tue, Nov 12, 2024 at 12:29:28AM +0000, Andrew Cooper wrote:
> This is my take.  On AMD CPUs, there are two unrelated issues to take
> into account:
> 
> 1) SRSO
> 
> Affects anything which doesn't enumerate SRSO_NO, which is all parts to
> date including Zen5.
> 
> SRSO ends up overflowing the RAS with arbitrary BTB targets, such that a
> subsequent genuine RET follows a prediction which never came from a real
> CALL instruction.
> 
> Mitigations for SRSO are either safe-ret, or IBPB-on-entry.  Parts
> without IBPB_RET using IBPB-on-entry need to manually flush the RAS.
> 
> Importantly, SMEP does not protect you against SRSO across the
> user->kernel boundary, because the bad RAS entries are arbitrary.  New
> in Zen5 is the SRSO_U/S_NO bit which says this case can't occur any
> more.  So on Zen5, you can in principle get away without a RAS flush on
> entry.

Updated to mention SRSO:

	/*
	 * In general there are two types of RSB attacks:
	 *
	 * 1) RSB underflow ("Intel Retbleed")
	 *
	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
	 *    speculated return targets may come from the branch predictor,
	 *    which could have a user-poisoned BTB or BHB entry.
	 *
	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack is
	 *    mitigated by the IBRS branch prediction isolation properties, so
	 *    the RSB buffer filling wouldn't be necessary to protect against
	 *    this type of attack.
	 *
	 *    The "user -> user" attack is mitigated by RSB filling on context
	 *    switch.
	 *
	 *    The "guest -> host" attack is mitigated by IBRS or eIBRS.
	 *
	 * 2) Poisoned RSB entry
	 *
	 *    If the 'next' in-kernel return stack is shorter than 'prev',
	 *    'next' could be tricked into speculating with a user-poisoned RSB
	 *    entry.  Poisoned RSB entries can also be created by Branch Type
	 *    Confusion ("AMD retbleed") or SRSO.
	 *
	 *    The "user -> kernel" attack is mitigated by SMEP and eIBRS.  AMD
	 *    without SRSO_NO also needs the SRSO mitigation.
	 *
	 *    The "user -> user" attack, also known as SpectreBHB, requires RSB
	 *    clearing.
	 *
	 *    The "guest -> host" attack is mitigated by either eIBRS (not
	 *    IBRS!) or RSB clearing on vmexit.  Note that eIBRS
	 *    implementations with X86_BUG_EIBRS_PBRSB still need "lite" RSB
	 *    clearing which retires a single CALL before the first RET.
	 */


---8<---

From: Josh Poimboeuf <jpoimboe@kernel.org>
Subject: [PATCH] x86/bugs: Update insanely long comment about RSB attacks

The long comment above the setting of X86_FEATURE_RSB_CTXSW is a bit
confusing.  It starts out being about context switching specifically,
but then goes on to describe "user -> kernel" mitigations, which aren't
necessarily limited to context switches.

Clarify that it's about *all* RSB attacks and their mitigations.

For consistency, add the "guest -> host" mitigations as well.  Then the
comment above spectre_v2_determine_rsb_fill_type_at_vmexit() can be
removed and the overall line count is reduced.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
---
 arch/x86/kernel/cpu/bugs.c | 60 +++++++++++++-------------------------
 1 file changed, 20 insertions(+), 40 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..3dd1e504d706 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1581,26 +1581,6 @@ static void __init spec_ctrl_disable_kernel_rrsba(void)
 
 static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_mitigation mode)
 {
-	/*
-	 * Similar to context switches, there are two types of RSB attacks
-	 * after VM exit:
-	 *
-	 * 1) RSB underflow
-	 *
-	 * 2) Poisoned RSB entry
-	 *
-	 * When retpoline is enabled, both are mitigated by filling/clearing
-	 * the RSB.
-	 *
-	 * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
-	 * prediction isolation protections, RSB still needs to be cleared
-	 * because of #2.  Note that SMEP provides no protection here, unlike
-	 * user-space-poisoned RSB entries.
-	 *
-	 * eIBRS should protect against RSB poisoning, but if the EIBRS_PBRSB
-	 * bug is present then a LITE version of RSB protection is required,
-	 * just a single call needs to retire before a RET is executed.
-	 */
 	switch (mode) {
 	case SPECTRE_V2_NONE:
 		return;
@@ -1818,43 +1798,43 @@ static void __init spectre_v2_select_mitigation(void)
 	pr_info("%s\n", spectre_v2_strings[mode]);
 
 	/*
-	 * If Spectre v2 protection has been enabled, fill the RSB during a
-	 * context switch.  In general there are two types of RSB attacks
-	 * across context switches, for which the CALLs/RETs may be unbalanced.
+	 * In general there are two types of RSB attacks:
 	 *
-	 * 1) RSB underflow
+	 * 1) RSB underflow ("Intel Retbleed")
 	 *
 	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
 	 *    speculated return targets may come from the branch predictor,
 	 *    which could have a user-poisoned BTB or BHB entry.
 	 *
-	 *    AMD has it even worse: *all* returns are speculated from the BTB,
-	 *    regardless of the state of the RSB.
+	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack is
+	 *    mitigated by the IBRS branch prediction isolation properties, so
+	 *    the RSB buffer filling wouldn't be necessary to protect against
+	 *    this type of attack.
 	 *
-	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack
-	 *    scenario is mitigated by the IBRS branch prediction isolation
-	 *    properties, so the RSB buffer filling wouldn't be necessary to
-	 *    protect against this type of attack.
+	 *    The "user -> user" attack is mitigated by RSB filling on context
+	 *    switch.
 	 *
-	 *    The "user -> user" attack scenario is mitigated by RSB filling.
+	 *    The "guest -> host" attack is mitigated by IBRS or eIBRS.
 	 *
 	 * 2) Poisoned RSB entry
 	 *
 	 *    If the 'next' in-kernel return stack is shorter than 'prev',
 	 *    'next' could be tricked into speculating with a user-poisoned RSB
-	 *    entry.
+	 *    entry.  Poisoned RSB entries can also be created by Branch Type
+	 *    Confusion ("AMD retbleed") or SRSO.
 	 *
-	 *    The "user -> kernel" attack scenario is mitigated by SMEP and
-	 *    eIBRS.
+	 *    The "user -> kernel" attack is mitigated by SMEP and eIBRS.  AMD
+	 *    without SRSO_NO also needs the SRSO mitigation.
 	 *
-	 *    The "user -> user" scenario, also known as SpectreBHB, requires
-	 *    RSB clearing.
+	 *    The "user -> user" attack, also known as SpectreBHB, requires RSB
+	 *    clearing.
 	 *
-	 * So to mitigate all cases, unconditionally fill RSB on context
-	 * switches.
-	 *
-	 * FIXME: Is this pointless for retbleed-affected AMD?
+	 *    The "guest -> host" attack is mitigated by either eIBRS (not
+	 *    IBRS!) or RSB clearing on vmexit.  Note that eIBRS
+	 *    implementations with X86_BUG_EIBRS_PBRSB still need "lite" RSB
+	 *    clearing which retires a single CALL before the first RET.
 	 */
+
 	setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
 	pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
 
-- 
2.47.0

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Pawan Gupta 1 year, 2 months ago

On Mon, Nov 11, 2024 at 05:46:44PM -0800, Josh Poimboeuf wrote:
> +	 * 1) RSB underflow ("Intel Retbleed")
>  	 *
>  	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
>  	 *    speculated return targets may come from the branch predictor,
>  	 *    which could have a user-poisoned BTB or BHB entry.
>  	 *
> -	 *    AMD has it even worse: *all* returns are speculated from the BTB,
> -	 *    regardless of the state of the RSB.
> +	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack is
> +	 *    mitigated by the IBRS branch prediction isolation properties, so
> +	 *    the RSB buffer filling wouldn't be necessary to protect against
> +	 *    this type of attack.
>  	 *
> -	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack
> -	 *    scenario is mitigated by the IBRS branch prediction isolation
> -	 *    properties, so the RSB buffer filling wouldn't be necessary to
> -	 *    protect against this type of attack.
> +	 *    The "user -> user" attack is mitigated by RSB filling on context
> +	 *    switch.

user->user SpectreRSB is also mitigated by IBPB, so RSB filling is
unnecessary when IBPB is issued. Also, when an application has not opted in
to IBPB at context switch, spectre-v2 is not mitigated for that app, so
filling the RSB is only a half measure in that case.

Is RSB filling really serving any purpose for userspace?

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Josh Poimboeuf 1 year, 2 months ago

On Tue, Nov 12, 2024 at 01:43:48PM -0800, Pawan Gupta wrote:
> On Mon, Nov 11, 2024 at 05:46:44PM -0800, Josh Poimboeuf wrote:
> > +	 * 1) RSB underflow ("Intel Retbleed")
> >  	 *
> >  	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
> >  	 *    speculated return targets may come from the branch predictor,
> >  	 *    which could have a user-poisoned BTB or BHB entry.
> >  	 *
> > -	 *    AMD has it even worse: *all* returns are speculated from the BTB,
> > -	 *    regardless of the state of the RSB.
> > +	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack is
> > +	 *    mitigated by the IBRS branch prediction isolation properties, so
> > +	 *    the RSB buffer filling wouldn't be necessary to protect against
> > +	 *    this type of attack.
> >  	 *
> > -	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack
> > -	 *    scenario is mitigated by the IBRS branch prediction isolation
> > -	 *    properties, so the RSB buffer filling wouldn't be necessary to
> > -	 *    protect against this type of attack.
> > +	 *    The "user -> user" attack is mitigated by RSB filling on context
> > +	 *    switch.
> 
> user->user SpectreRSB is also mitigated by IBPB, so RSB filling is
> unnecessary when IBPB is issued. Also, when an application has not opted in
> for IBPB at context switch, spectre-v2 for that app is not mitigated,
> filling RSB is only a half measure in that case.
> 
> Is RSB filling really serving any purpose for userspace?

Indeed...

If we don't need to flush RSB for user->user, we'd only need to worry
about protecting the kernel.  Something like so?

  - eIBRS+!PBRSB:	no flush
  - eIBRS+PBRSB:	lite flush
  - everything else:	full flush

i.e., same logic as spectre_v2_determine_rsb_fill_type_at_vmexit(), but
also for context switches.
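
As a rough illustration of the three cases above (a standalone sketch, not
the kernel implementation; the enum and function names are made up for this
example):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three outcomes listed above. */
enum rsb_flush { RSB_FLUSH_NONE, RSB_FLUSH_LITE, RSB_FLUSH_FULL };

/*
 * eibrs:     eIBRS is the selected Spectre v2 mitigation
 * pbrsb_bug: the CPU has the PBRSB erratum (X86_BUG_EIBRS_PBRSB)
 */
static enum rsb_flush pick_rsb_flush(bool eibrs, bool pbrsb_bug)
{
	if (eibrs)
		return pbrsb_bug ? RSB_FLUSH_LITE : RSB_FLUSH_NONE;

	/* everything else */
	return RSB_FLUSH_FULL;
}

/* Human-readable label, e.g. for a log message. */
static const char *rsb_flush_name(enum rsb_flush f)
{
	switch (f) {
	case RSB_FLUSH_NONE:
		return "no flush";
	case RSB_FLUSH_LITE:
		return "lite flush";
	case RSB_FLUSH_FULL:
		return "full flush";
	}
	assert(!"unknown rsb_flush value");
	return "unknown";
}
```

Note that pbrsb_bug is ignored when eIBRS is not in use, since the full
flush already covers that case.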

-- 
Josh

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Pawan Gupta 1 year, 2 months ago

On Wed, Nov 13, 2024 at 12:21:05PM -0800, Josh Poimboeuf wrote:
> On Tue, Nov 12, 2024 at 01:43:48PM -0800, Pawan Gupta wrote:
> > On Mon, Nov 11, 2024 at 05:46:44PM -0800, Josh Poimboeuf wrote:
> > > +	 * 1) RSB underflow ("Intel Retbleed")
> > >  	 *
> > >  	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
> > >  	 *    speculated return targets may come from the branch predictor,
> > >  	 *    which could have a user-poisoned BTB or BHB entry.
> > >  	 *
> > > -	 *    AMD has it even worse: *all* returns are speculated from the BTB,
> > > -	 *    regardless of the state of the RSB.
> > > +	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack is
> > > +	 *    mitigated by the IBRS branch prediction isolation properties, so
> > > +	 *    the RSB buffer filling wouldn't be necessary to protect against
> > > +	 *    this type of attack.
> > >  	 *
> > > -	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack
> > > -	 *    scenario is mitigated by the IBRS branch prediction isolation
> > > -	 *    properties, so the RSB buffer filling wouldn't be necessary to
> > > -	 *    protect against this type of attack.
> > > +	 *    The "user -> user" attack is mitigated by RSB filling on context
> > > +	 *    switch.
> > 
> > user->user SpectreRSB is also mitigated by IBPB, so RSB filling is
> > unnecessary when IBPB is issued. Also, when an application has not opted in
> > for IBPB at context switch, spectre-v2 for that app is not mitigated,
> > filling RSB is only a half measure in that case.
> > 
> > Is RSB filling really serving any purpose for userspace?
> 
> Indeed...
> 
> If we don't need to flush RSB for user->user, we'd only need to worry
> about protecting the kernel.  Something like so?
> 
>   - eIBRS+!PBRSB:	no flush
>   - eIBRS+PBRSB:	lite flush

Yes for VMexit, but not at kernel entry. PBRSB requires an unbalanced RET,
and it is only a problem until the first retired CALL. At VMexit we do have
unbalanced RET but not at kernel entry.

>   - everything else:	full flush

> i.e., same logic as spectre_v2_determine_rsb_fill_type_at_vmexit(), but
> also for context switches.

Yes, assuming you mean user->kernel switch, and not process context switch.

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Josh Poimboeuf 1 year, 2 months ago

On Wed, Nov 13, 2024 at 05:55:05PM -0800, Pawan Gupta wrote:
> > > user->user SpectreRSB is also mitigated by IBPB, so RSB filling is
> > > unnecessary when IBPB is issued. Also, when an application has not opted in
> > > for IBPB at context switch, spectre-v2 for that app is not mitigated,
> > > filling RSB is only a half measure in that case.
> > > 
> > > Is RSB filling really serving any purpose for userspace?
> > 
> > Indeed...
> > 
> > If we don't need to flush RSB for user->user, we'd only need to worry
> > about protecting the kernel.  Something like so?
> > 
> >   - eIBRS+!PBRSB:	no flush
> >   - eIBRS+PBRSB:	lite flush
> 
> Yes for VMexit, but not at kernel entry. PBRSB requires an unbalanced RET,
> and it is only a problem until the first retired CALL. At VMexit we do have
> unbalanced RET but not at kernel entry.
> 
> >   - everything else:	full flush
> 
> > i.e., same logic as spectre_v2_determine_rsb_fill_type_at_vmexit(), but
> > also for context switches.
> 
> Yes, assuming you mean user->kernel switch, and not process context switch.

Actually I did mean context switch.  AFAIK we don't need to flush RSB at
kernel entry.

If user->user RSB is already mitigated by IBPB, then at context switch
we only have to worry about user->kernel.  e.g., if 'next' has more (in
kernel) RETs than 'prev' had (in kernel) CALLs, the user could trigger
RSB underflow or corruption inside the kernel after the context switch.

Doesn't eIBRS already protect against that?

For PBRSB, I guess we don't need to worry about that since there would
be at least one kernel CALL before context switch.
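
To make the unbalanced CALL/RET scenario above concrete, here is a toy model
of the RSB as a bounded stack.  It is purely illustrative: the depth of 16,
the task ids, and all names are assumptions for this sketch, and real
predictors behave differently in detail.

```c
#include <assert.h>
#include <stddef.h>

#define RSB_DEPTH	16	/* hypothetical depth for the model */
#define RSB_BENIGN	(-1)	/* entry written by RSB filling */

enum rsb_result { RSB_OK, RSB_BENIGN_HIT, RSB_STALE, RSB_UNDERFLOW };

struct rsb_model {
	int owner[RSB_DEPTH];	/* id of the task whose CALL pushed the entry */
	size_t count;		/* number of valid entries */
};

/* A CALL pushes an entry; when full, the oldest entry is lost. */
static void rsb_call(struct rsb_model *r, int task)
{
	assert(r != NULL);
	if (r->count == RSB_DEPTH) {
		for (size_t i = 1; i < RSB_DEPTH; i++)
			r->owner[i - 1] = r->owner[i];
		r->count--;
	}
	r->owner[r->count++] = task;
}

/* A RET pops an entry and reports where its prediction came from. */
static enum rsb_result rsb_ret(struct rsb_model *r, int task)
{
	int pushed_by;

	assert(r != NULL);
	if (r->count == 0)
		return RSB_UNDERFLOW;	/* predictor may fall back elsewhere */

	pushed_by = r->owner[--r->count];
	if (pushed_by == RSB_BENIGN)
		return RSB_BENIGN_HIT;
	return pushed_by == task ? RSB_OK : RSB_STALE;
}

/* RSB filling: overwrite every entry with a benign one. */
static void rsb_fill(struct rsb_model *r)
{
	assert(r != NULL);
	for (size_t i = 0; i < RSB_DEPTH; i++)
		r->owner[i] = RSB_BENIGN;
	r->count = RSB_DEPTH;
}
```

In this model, if 'prev' made two CALLs and 'next' then executes three RETs,
the first two RETs consume entries left by 'prev' (RSB_STALE) and the third
underflows.  Calling rsb_fill() at the switch turns all of them into
RSB_BENIGN_HIT instead.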

-- 
Josh

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Pawan Gupta 1 year, 2 months ago

On Wed, Nov 13, 2024 at 06:31:41PM -0800, Josh Poimboeuf wrote:
> On Wed, Nov 13, 2024 at 05:55:05PM -0800, Pawan Gupta wrote:
> > > > user->user SpectreRSB is also mitigated by IBPB, so RSB filling is
> > > > unnecessary when IBPB is issued. Also, when an application has not opted in
> > > > for IBPB at context switch, spectre-v2 for that app is not mitigated,
> > > > filling RSB is only a half measure in that case.
> > > > 
> > > > Is RSB filling really serving any purpose for userspace?
> > > 
> > > Indeed...
> > > 
> > > If we don't need to flush RSB for user->user, we'd only need to worry
> > > about protecting the kernel.  Something like so?
> > > 
> > >   - eIBRS+!PBRSB:	no flush
> > >   - eIBRS+PBRSB:	lite flush
> > 
> > Yes for VMexit, but not at kernel entry. PBRSB requires an unbalanced RET,
> > and it is only a problem until the first retired CALL. At VMexit we do have
> > unbalanced RET but not at kernel entry.
> > 
> > >   - everything else:	full flush
> > 
> > > i.e., same logic as spectre_v2_determine_rsb_fill_type_at_vmexit(), but
> > > also for context switches.
> > 
> > Yes, assuming you mean user->kernel switch, and not process context switch.
> 
> Actually I did mean context switch.  AFAIK we don't need to flush RSB at
> kernel entry.
> 
> If user->user RSB is already mitigated by IBPB, then at context switch
> we only have to worry about user->kernel.  e.g., if 'next' has more (in
> > kernel) RETs than 'prev' had (in kernel) CALLs, the user could trigger
> RSB underflow or corruption inside the kernel after the context switch.

Yes, this condition can cause RSB underflow, but that is not enough. More
importantly an attacker also needs to control the target of RET.

> Doesn't eIBRS already protect against that?

Yes, eIBRS does protect against that, because the alternate predictor (TA)
is isolated by eIBRS from user influence.

> For PBRSB, I guess we don't need to worry about that since there would
> be at least one kernel CALL before context switch.

Right. So the case where we need RSB filling at context switch is
retpoline+CDT mitigation.

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Josh Poimboeuf 1 year, 2 months ago

On Thu, Nov 14, 2024 at 12:01:16AM -0800, Pawan Gupta wrote:
> > For PBRSB, I guess we don't need to worry about that since there would
> > be at least one kernel CALL before context switch.
> 
> Right. So the case where we need RSB filling at context switch is
> retpoline+CDT mitigation.

According to the docs, classic IBRS also needs RSB filling at context
switch to protect against corrupt RSB entries (as opposed to RSB
underflow).


Something like so...


diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..7b9c0a21e478 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1579,27 +1579,44 @@ static void __init spec_ctrl_disable_kernel_rrsba(void)
 	rrsba_disabled = true;
 }
 
-static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_mitigation mode)
+static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
 {
 	/*
-	 * Similar to context switches, there are two types of RSB attacks
-	 * after VM exit:
+	 * In general there are two types of RSB attacks:
 	 *
-	 * 1) RSB underflow
+	 * 1) RSB underflow ("Intel Retbleed")
+	 *
+	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
+	 *    speculated return targets may come from the branch predictor,
+	 *    which could have a user-poisoned BTB or BHB entry.
+	 *
+	 *    user->user attacks are mitigated by IBPB on context switch.
+	 *
+	 *    user->kernel attacks via context switch are mitigated by IBRS,
+	 *    eIBRS, or RSB filling.
+	 *
+	 *    user->kernel attacks via kernel entry are mitigated by IBRS,
+	 *    eIBRS, or call depth tracking.
+	 *
+	 *    On VMEXIT, guest->host attacks are mitigated by IBRS, eIBRS, or
+	 *    RSB filling.
 	 *
 	 * 2) Poisoned RSB entry
 	 *
-	 * When retpoline is enabled, both are mitigated by filling/clearing
-	 * the RSB.
+	 *    On a context switch, the previous task can poison RSB entries
+	 *    used by the next task, controlling its speculative return
+	 *    targets.  Poisoned RSB entries can also be created by "AMD
+	 *    Retbleed" or SRSO.
 	 *
-	 * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
-	 * prediction isolation protections, RSB still needs to be cleared
-	 * because of #2.  Note that SMEP provides no protection here, unlike
-	 * user-space-poisoned RSB entries.
+	 *    user->user attacks are mitigated by IBPB on context switch.
 	 *
-	 * eIBRS should protect against RSB poisoning, but if the EIBRS_PBRSB
-	 * bug is present then a LITE version of RSB protection is required,
-	 * just a single call needs to retire before a RET is executed.
+	 *    user->kernel attacks via context switch are prevented by
+	 *    SMEP+eIBRS+SRSO mitigations, or RSB clearing.
+	 *
+	 *    guest->host attacks are mitigated by eIBRS or RSB clearing on
+	 *    VMEXIT.  eIBRS implementations with X86_BUG_EIBRS_PBRSB still
+	 *    need "lite" RSB filling which retires a CALL before the first
+	 *    RET.
 	 */
 	switch (mode) {
 	case SPECTRE_V2_NONE:
@@ -1608,8 +1625,8 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
 	case SPECTRE_V2_EIBRS_LFENCE:
 	case SPECTRE_V2_EIBRS:
 		if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
-			setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT_LITE);
 			pr_info("Spectre v2 / PBRSB-eIBRS: Retire a single CALL on VMEXIT\n");
+			setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT_LITE);
 		}
 		return;
 
@@ -1617,12 +1634,13 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
 	case SPECTRE_V2_RETPOLINE:
 	case SPECTRE_V2_LFENCE:
 	case SPECTRE_V2_IBRS:
+		pr_info("Spectre v2 / SpectreRSB : Filling RSB on context switch and VMEXIT\n");
+		setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
 		setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT);
-		pr_info("Spectre v2 / SpectreRSB : Filling RSB on VMEXIT\n");
 		return;
 	}
 
-	pr_warn_once("Unknown Spectre v2 mode, disabling RSB mitigation at VM exit");
+	pr_warn_once("Unknown Spectre v2 mode, disabling RSB mitigation\n");
 	dump_stack();
 }
 
@@ -1817,48 +1835,7 @@ static void __init spectre_v2_select_mitigation(void)
 	spectre_v2_enabled = mode;
 	pr_info("%s\n", spectre_v2_strings[mode]);
 
-	/*
-	 * If Spectre v2 protection has been enabled, fill the RSB during a
-	 * context switch.  In general there are two types of RSB attacks
-	 * across context switches, for which the CALLs/RETs may be unbalanced.
-	 *
-	 * 1) RSB underflow
-	 *
-	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
-	 *    speculated return targets may come from the branch predictor,
-	 *    which could have a user-poisoned BTB or BHB entry.
-	 *
-	 *    AMD has it even worse: *all* returns are speculated from the BTB,
-	 *    regardless of the state of the RSB.
-	 *
-	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack
-	 *    scenario is mitigated by the IBRS branch prediction isolation
-	 *    properties, so the RSB buffer filling wouldn't be necessary to
-	 *    protect against this type of attack.
-	 *
-	 *    The "user -> user" attack scenario is mitigated by RSB filling.
-	 *
-	 * 2) Poisoned RSB entry
-	 *
-	 *    If the 'next' in-kernel return stack is shorter than 'prev',
-	 *    'next' could be tricked into speculating with a user-poisoned RSB
-	 *    entry.
-	 *
-	 *    The "user -> kernel" attack scenario is mitigated by SMEP and
-	 *    eIBRS.
-	 *
-	 *    The "user -> user" scenario, also known as SpectreBHB, requires
-	 *    RSB clearing.
-	 *
-	 * So to mitigate all cases, unconditionally fill RSB on context
-	 * switches.
-	 *
-	 * FIXME: Is this pointless for retbleed-affected AMD?
-	 */
-	setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
-	pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
-
-	spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
+	spectre_v2_mitigate_rsb(mode);
 
 	/*
 	 * Retpoline protects the kernel, but doesn't protect firmware.  IBRS

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Pawan Gupta 1 year, 2 months ago

On Thu, Nov 14, 2024 at 09:48:36PM -0800, Josh Poimboeuf wrote:
> According to the docs, classic IBRS also needs RSB filling at context
> switch to protect against corrupt RSB entries (as opposed to RSB
> underflow).

Correct.

> Something like so...
> 
> 
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 47a01d4028f6..7b9c0a21e478 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1579,27 +1579,44 @@ static void __init spec_ctrl_disable_kernel_rrsba(void)
>  	rrsba_disabled = true;
>  }
>  
> -static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_mitigation mode)
> +static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
>  {
>  	/*
> -	 * Similar to context switches, there are two types of RSB attacks
> -	 * after VM exit:
> +	 * In general there are two types of RSB attacks:
>  	 *
> -	 * 1) RSB underflow
> +	 * 1) RSB underflow ("Intel Retbleed")
> +	 *
> +	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
> +	 *    speculated return targets may come from the branch predictor,
> +	 *    which could have a user-poisoned BTB or BHB entry.
> +	 *
> +	 *    user->user attacks are mitigated by IBPB on context switch.
> +	 *
> +	 *    user->kernel attacks via context switch are mitigated by IBRS,
> +	 *    eIBRS, or RSB filling.
> +	 *
> +	 *    user->kernel attacks via kernel entry are mitigated by IBRS,
> +	 *    eIBRS, or call depth tracking.
> +	 *
> +	 *    On VMEXIT, guest->host attacks are mitigated by IBRS, eIBRS, or
> +	 *    RSB filling.
>  	 *
>  	 * 2) Poisoned RSB entry
>  	 *
> -	 * When retpoline is enabled, both are mitigated by filling/clearing
> -	 * the RSB.
> +	 *    On a context switch, the previous task can poison RSB entries
> +	 *    used by the next task, controlling its speculative return
> +	 *    targets.  Poisoned RSB entries can also be created by "AMD
> +	 *    Retbleed" or SRSO.
>  	 *
> -	 * When IBRS is enabled, while #1 would be mitigated by the IBRS branch
> -	 * prediction isolation protections, RSB still needs to be cleared
> -	 * because of #2.  Note that SMEP provides no protection here, unlike
> -	 * user-space-poisoned RSB entries.
> +	 *    user->user attacks are mitigated by IBPB on context switch.
>  	 *
> -	 * eIBRS should protect against RSB poisoning, but if the EIBRS_PBRSB
> -	 * bug is present then a LITE version of RSB protection is required,
> -	 * just a single call needs to retire before a RET is executed.
> +	 *    user->kernel attacks via context switch are prevented by
> +	 *    SMEP+eIBRS+SRSO mitigations, or RSB clearing.
> +	 *
> +	 *    guest->host attacks are mitigated by eIBRS or RSB clearing on
> +	 *    VMEXIT.  eIBRS implementations with X86_BUG_EIBRS_PBRSB still
> +	 *    need "lite" RSB filling which retires a CALL before the first
> +	 *    RET.
>  	 */
>  	switch (mode) {
>  	case SPECTRE_V2_NONE:
> @@ -1608,8 +1625,8 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
>  	case SPECTRE_V2_EIBRS_LFENCE:
>  	case SPECTRE_V2_EIBRS:
>  		if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
> -			setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT_LITE);
>  			pr_info("Spectre v2 / PBRSB-eIBRS: Retire a single CALL on VMEXIT\n");
> +			setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT_LITE);
>  		}
>  		return;
>  
> @@ -1617,12 +1634,13 @@ static void __init spectre_v2_determine_rsb_fill_type_at_vmexit(enum spectre_v2_
>  	case SPECTRE_V2_RETPOLINE:
>  	case SPECTRE_V2_LFENCE:
>  	case SPECTRE_V2_IBRS:
> +		pr_info("Spectre v2 / SpectreRSB : Filling RSB on context switch and VMEXIT\n");
> +		setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
>  		setup_force_cpu_cap(X86_FEATURE_RSB_VMEXIT);
> -		pr_info("Spectre v2 / SpectreRSB : Filling RSB on VMEXIT\n");
>  		return;
>  	}
>  
> -	pr_warn_once("Unknown Spectre v2 mode, disabling RSB mitigation at VM exit");
> +	pr_warn_once("Unknown Spectre v2 mode, disabling RSB mitigation\n");
>  	dump_stack();
>  }
>  
> @@ -1817,48 +1835,7 @@ static void __init spectre_v2_select_mitigation(void)
>  	spectre_v2_enabled = mode;
>  	pr_info("%s\n", spectre_v2_strings[mode]);
>  
> -	/*
> -	 * If Spectre v2 protection has been enabled, fill the RSB during a
> -	 * context switch.  In general there are two types of RSB attacks
> -	 * across context switches, for which the CALLs/RETs may be unbalanced.
> -	 *
> -	 * 1) RSB underflow
> -	 *
> -	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
> -	 *    speculated return targets may come from the branch predictor,
> -	 *    which could have a user-poisoned BTB or BHB entry.
> -	 *
> -	 *    AMD has it even worse: *all* returns are speculated from the BTB,
> -	 *    regardless of the state of the RSB.
> -	 *
> -	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack
> -	 *    scenario is mitigated by the IBRS branch prediction isolation
> -	 *    properties, so the RSB buffer filling wouldn't be necessary to
> -	 *    protect against this type of attack.
> -	 *
> -	 *    The "user -> user" attack scenario is mitigated by RSB filling.
> -	 *
> -	 * 2) Poisoned RSB entry
> -	 *
> -	 *    If the 'next' in-kernel return stack is shorter than 'prev',
> -	 *    'next' could be tricked into speculating with a user-poisoned RSB
> -	 *    entry.
> -	 *
> -	 *    The "user -> kernel" attack scenario is mitigated by SMEP and
> -	 *    eIBRS.
> -	 *
> -	 *    The "user -> user" scenario, also known as SpectreBHB, requires
> -	 *    RSB clearing.
> -	 *
> -	 * So to mitigate all cases, unconditionally fill RSB on context
> -	 * switches.
> -	 *
> -	 * FIXME: Is this pointless for retbleed-affected AMD?
> -	 */
> -	setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
> -	pr_info("Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch\n");
> -
> -	spectre_v2_determine_rsb_fill_type_at_vmexit(mode);
> +	spectre_v2_mitigate_rsb(mode);
>  
>  	/*
>  	 * Retpoline protects the kernel, but doesn't protect firmware.  IBRS

This LGTM.

I think SPECTRE_V2_EIBRS_RETPOLINE is placed in the wrong leg, it
doesn't need RSB filling on context switch, and only needs VMEXIT_LITE.
Does below change on top of your patch look okay?

---
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 7b9c0a21e478..d3b9a0d7a2b5 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1622,6 +1622,7 @@ static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
 	case SPECTRE_V2_NONE:
 		return;
 
+	case SPECTRE_V2_EIBRS_RETPOLINE:
 	case SPECTRE_V2_EIBRS_LFENCE:
 	case SPECTRE_V2_EIBRS:
 		if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
@@ -1630,7 +1631,6 @@ static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
 		}
 		return;
 
-	case SPECTRE_V2_EIBRS_RETPOLINE:
 	case SPECTRE_V2_RETPOLINE:
 	case SPECTRE_V2_LFENCE:
 	case SPECTRE_V2_IBRS:
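
For readers following the thread, here is a minimal userspace sketch of how
the switch would behave with the change above applied on top of Josh's patch.
It is a model under stated assumptions, not the kernel code: struct rsb_caps,
model_mitigate_rsb() and the has_pbrsb_bug parameter are hypothetical
stand-ins for setup_force_cpu_cap() and boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB),
and the pr_info()/pr_warn_once() calls are left out.

```c
#include <stdbool.h>

/* Enumerator names follow the diff; the values are arbitrary in this model. */
enum spectre_v2_mitigation {
	SPECTRE_V2_NONE,
	SPECTRE_V2_RETPOLINE,
	SPECTRE_V2_LFENCE,
	SPECTRE_V2_EIBRS,
	SPECTRE_V2_EIBRS_RETPOLINE,
	SPECTRE_V2_EIBRS_LFENCE,
	SPECTRE_V2_IBRS,
};

/* Hypothetical stand-in for the feature bits the kernel would force on. */
struct rsb_caps {
	bool rsb_ctxsw;		/* models X86_FEATURE_RSB_CTXSW */
	bool rsb_vmexit;	/* models X86_FEATURE_RSB_VMEXIT */
	bool rsb_vmexit_lite;	/* models X86_FEATURE_RSB_VMEXIT_LITE */
};

/*
 * has_pbrsb_bug is an assumption of this sketch: it replaces the
 * boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB) check from the diff.
 */
struct rsb_caps model_mitigate_rsb(enum spectre_v2_mitigation mode,
				   bool has_pbrsb_bug)
{
	struct rsb_caps caps = { false, false, false };

	switch (mode) {
	case SPECTRE_V2_NONE:
		break;

	/* All eIBRS modes: no full RSB fill, at most the "lite" variant. */
	case SPECTRE_V2_EIBRS_RETPOLINE:
	case SPECTRE_V2_EIBRS_LFENCE:
	case SPECTRE_V2_EIBRS:
		if (has_pbrsb_bug)
			caps.rsb_vmexit_lite = true;
		break;

	/* Everything else: fill on context switch and on VMEXIT. */
	case SPECTRE_V2_RETPOLINE:
	case SPECTRE_V2_LFENCE:
	case SPECTRE_V2_IBRS:
		caps.rsb_ctxsw = true;
		caps.rsb_vmexit = true;
		break;
	}

	return caps;
}
```

In this model, model_mitigate_rsb(SPECTRE_V2_EIBRS_RETPOLINE, true) sets only
rsb_vmexit_lite, whereas before the move the same mode would have taken the
full-fill leg.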

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Josh Poimboeuf 1 year, 2 months ago

On Fri, Nov 15, 2024 at 09:50:47AM -0800, Pawan Gupta wrote:
> This LGTM.
> 
> I think SPECTRE_V2_EIBRS_RETPOLINE is placed in the wrong leg, it
> doesn't need RSB filling on context switch, and only needs VMEXIT_LITE.
> Does below change on top of your patch look okay?

Yeah, I was wondering about that too.  Since it changes existing
VMEXIT_LITE behavior I'll make it a separate patch.  And I'll probably
do the comment changes in a separate patch as well.

> ---
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index 7b9c0a21e478..d3b9a0d7a2b5 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -1622,6 +1622,7 @@ static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
>  	case SPECTRE_V2_NONE:
>  		return;
>  
> +	case SPECTRE_V2_EIBRS_RETPOLINE:
>  	case SPECTRE_V2_EIBRS_LFENCE:
>  	case SPECTRE_V2_EIBRS:
>  		if (boot_cpu_has_bug(X86_BUG_EIBRS_PBRSB)) {
> @@ -1630,7 +1631,6 @@ static void __init spectre_v2_mitigate_rsb(enum spectre_v2_mitigation mode)
>  		}
>  		return;
>  
> -	case SPECTRE_V2_EIBRS_RETPOLINE:
>  	case SPECTRE_V2_RETPOLINE:
>  	case SPECTRE_V2_LFENCE:
>  	case SPECTRE_V2_IBRS:

-- 
Josh

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Shah, Amit 1 year, 2 months ago

On Fri, 2024-11-15 at 10:10 -0800, Josh Poimboeuf wrote:
> On Fri, Nov 15, 2024 at 09:50:47AM -0800, Pawan Gupta wrote:
> > This LGTM.
> > 
> > I think SPECTRE_V2_EIBRS_RETPOLINE is placed in the wrong leg, it
> > doesn't need RSB filling on context switch, and only needs VMEXIT_LITE.
> > Does below change on top of your patch look okay?
> 
> Yeah, I was wondering about that too.  Since it changes existing
> VMEXIT_LITE behavior I'll make it a separate patch.  And I'll probably
> do the comment changes in a separate patch as well.

So all of that looks good to me as well.  I think a standalone series
makes sense - maybe even for 6.13.  I'll base my patches on top of
yours.

Thanks!

		Amit

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Shah, Amit 1 year, 2 months ago

On Mon, 2024-11-11 at 17:46 -0800, Josh Poimboeuf wrote:
> On Tue, Nov 12, 2024 at 12:29:28AM +0000, Andrew Cooper wrote:
> > This is my take.  On AMD CPUs, there are two unrelated issues to take
> > into account:
> > 
> > 1) SRSO
> > 
> > Affects anything which doesn't enumerate SRSO_NO, which is all parts to
> > date including Zen5.
> > 
> > SRSO ends up overflowing the RAS with arbitrary BTB targets, such that a
> > subsequent genuine RET follows a prediction which never came from a real
> > CALL instruction.
> > 
> > Mitigations for SRSO are either safe-ret, or IBPB-on-entry.  Parts
> > without IBPB_RET using IBPB-on-entry need to manually flush the RAS.
> > 
> > Importantly, SMEP does not protect you against SRSO across the
> > user->kernel boundary, because the bad RAS entries are arbitrary.  New
> > in Zen5 is the SRSO_U/S_NO bit which says this case can't occur any
> > more.  So on Zen5, you can in principle get away without a RAS flush on
> > entry.
> 
> Updated to mention SRSO:
> 
> 	/*
> 	 * In general there are two types of RSB attacks:
> 	 *
> 	 * 1) RSB underflow ("Intel Retbleed")
> 	 *
> 	 *    Some Intel parts have "bottomless RSB".  When the RSB is empty,
> 	 *    speculated return targets may come from the branch predictor,
> 	 *    which could have a user-poisoned BTB or BHB entry.
> 	 *
> 	 *    When IBRS or eIBRS is enabled, the "user -> kernel" attack is
> 	 *    mitigated by the IBRS branch prediction isolation properties, so
> 	 *    the RSB buffer filling wouldn't be necessary to protect against
> 	 *    this type of attack.
> 	 *
> 	 *    The "user -> user" attack is mitigated by RSB filling on context
> 	 *    switch.
> 	 *
> 	 *    The "guest -> host" attack is mitigated by IBRS or eIBRS.
> 	 *
> 	 * 2) Poisoned RSB entry
> 	 *
> 	 *    If the 'next' in-kernel return stack is shorter than 'prev',
> 	 *    'next' could be tricked into speculating with a user-poisoned RSB
> 	 *    entry.  Poisoned RSB entries can also be created by Branch Type
> 	 *    Confusion ("AMD retbleed") or SRSO.
> 	 *
> 	 *    The "user -> kernel" attack is mitigated by SMEP and eIBRS.  AMD
> 	 *    without SRSO_NO also needs the SRSO mitigation.
> 	 *
> 	 *    The "user -> user" attack, also known as SpectreBHB, requires RSB
> 	 *    clearing.
> 	 *
> 	 *    The "guest -> host" attack is mitigated by either eIBRS (not
> 	 *    IBRS!) or RSB clearing on vmexit.  Note that eIBRS
> 	 *    implementations with X86_BUG_EIBRS_PBRSB still need "lite" RSB
> 	 *    clearing which retires a single CALL before the first RET.
> 	 */

Thanks, Josh and Andrew.  This reads well to me.  In the context of
ERAPS, I'll end up adding a couple more sentences there as well.

		Amit
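
As a reading aid, Andrew's SRSO points quoted above can be condensed into a
small decision function. This is a deliberately simplified sketch covering
only the user->kernel entry case he describes; struct srso_cpu_bits,
enum srso_mitigation and needs_manual_ras_flush_on_entry() are hypothetical
names made up for illustration and do not exist in the kernel in this form.

```c
#include <stdbool.h>

/* Hypothetical model of the enumeration bits named in the discussion. */
struct srso_cpu_bits {
	bool srso_no;		/* SRSO_NO: part is not affected by SRSO */
	bool srso_us_no;	/* SRSO_U/S_NO: no SRSO across user->kernel */
	bool ibpb_ret;		/* IBPB_RET enumerated */
};

/* The two SRSO mitigations mentioned in the quoted text. */
enum srso_mitigation {
	SRSO_SAFE_RET,
	SRSO_IBPB_ON_ENTRY,
};

/*
 * Returns true when, per the quoted description, software would have to
 * flush the RAS by hand on kernel entry.  Assumption of this sketch:
 * SRSO_NO or SRSO_U/S_NO means the user->kernel case cannot occur, so no
 * flush is needed regardless of the chosen mitigation.
 */
bool needs_manual_ras_flush_on_entry(const struct srso_cpu_bits *cpu,
				     enum srso_mitigation mitigation)
{
	if (cpu->srso_no || cpu->srso_us_no)
		return false;

	/* IBPB-on-entry without IBPB_RET needs a manual RAS flush. */
	return mitigation == SRSO_IBPB_ON_ENTRY && !cpu->ibpb_ret;
}
```

The sketch says nothing about the guest->host direction or about safe-ret
internals; it only restates the three sentences from the quoted mail.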

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Borislav Petkov 1 year, 2 months ago

On Mon, Nov 11, 2024 at 05:46:44PM -0800, Josh Poimboeuf wrote:
> Subject: [PATCH] x86/bugs: Update insanely long comment about RSB attacks

Why don't you stick this insanely long comment in
Documentation/admin-guide/hw-vuln/ while at it?

Its place is hardly in the code. You can point to it from the code tho...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Josh Poimboeuf 1 year, 2 months ago

On Tue, Nov 12, 2024 at 12:58:11PM +0100, Borislav Petkov wrote:
> On Mon, Nov 11, 2024 at 05:46:44PM -0800, Josh Poimboeuf wrote:
> > Subject: [PATCH] x86/bugs: Update insanely long comment about RSB attacks
> 
> Why don't you stick this insanely long comment in
> Documentation/admin-guide/hw-vuln/ while at it?
> 
> Its place is hardly in the code. You can point to it from the code tho...

There are a lot of subtle details to this $#!tstorm, and IMO we probably
wouldn't be having these discussions in the first place if the comment
lived in the docs, as most people seem to ignore them...

-- 
Josh

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Borislav Petkov 1 year, 2 months ago

On Wed, Nov 13, 2024 at 01:24:40PM -0800, Josh Poimboeuf wrote:
> There are a lot of subtle details to this $#!tstorm, and IMO we probably
> wouldn't be having these discussions in the first place if the comment
> lived in the docs, as most people seem to ignore them...

That's why I'm saying point to the docs from the code. You can't have a big
fat comment in the code about this but everything else in the hw-vuln docs.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Josh Poimboeuf 1 year, 2 months ago

On Wed, Nov 13, 2024 at 10:37:24PM +0100, Borislav Petkov wrote:
> On Wed, Nov 13, 2024 at 01:24:40PM -0800, Josh Poimboeuf wrote:
> > There are a lot of subtle details to this $#!tstorm, and IMO we probably
> > wouldn't be having these discussions in the first place if the comment
> > lived in the docs, as most people seem to ignore them...
> 
> That's why I'm saying point to the docs from the code. You can't have a big
> fat comment in the code about this but everything else in the hw-vuln docs.

But those docs are user facing, describing the "what" for each
vulnerability individually.  They're basically historical documents
which don't evolve over time unless we tweak an interface or add a new
mitigation.

This comment relates to the "why" for the code itself (and its poor
confused developers), taking all the RSB-related vulnerabilities into
account.

-- 
Josh

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Borislav Petkov 1 year, 2 months ago

On Wed, Nov 13, 2024 at 04:43:58PM -0800, Josh Poimboeuf wrote:
> This comment relates to the "why" for the code itself (and its poor
> confused developers), taking all the RSB-related vulnerabilities into
> account.

So use Documentation/arch/x86/.

This is exactly the reason why we need more "why" documentation - because
every time we have to swap the whole bugs.c horror back in, we're poor confused
developers. And we have the "why" spread out across commit messages and other
folklore which means every time we have to change stuff, the git archeology
starts. :-\ "err, do you remember why we're doing this?!" And so on
conversations on IRC.

So having an implementation document explaining clearly why we did things is
long overdue.

But it's fine - I can move it later when the dust settles here.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Ingo Molnar 1 year, 2 months ago

* Borislav Petkov <bp@alien8.de> wrote:

> On Wed, Nov 13, 2024 at 04:43:58PM -0800, Josh Poimboeuf wrote:
> > This comment relates to the "why" for the code itself (and its poor
> > confused developers), taking all the RSB-related vulnerabilities into
> > account.
> 
> So use Documentation/arch/x86/.
> 
> This is exactly the reason why we need more "why" documentation - because
> every time we have to swap the whole bugs.c horror back in, we're poor confused
> developers. And we have the "why" spread out across commit messages and other
> folklore which means every time we have to change stuff, the git archeology
> starts. :-\ "err, do you remember why we're doing this?!" And so on
> conversations on IRC.
> 
> So having an implementation document explaining clearly why we did things is
> long overdue.
> 
> But it's fine - I can move it later when the dust settles here.

I think in-line documentation is better in this case: the primary defense
against mistakes and misunderstandings is in the source code itself.

And "it's too long" is an argument *against* moving it out into some obscure
place 99% of developers aren't even aware of...

Thanks,

	Ingo

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Borislav Petkov 1 year, 2 months ago

On Thu, Nov 14, 2024 at 10:47:23AM +0100, Ingo Molnar wrote:
> I think in-line documentation is better in this case: the primary defense
> against mistakes and misunderstandings is in the source code itself.
> 
> And "it's too long" is an argument *against* moving it out into some obscure
> place 99% of developers aren't even aware of...

You mean developers can't even read?

	/* 
	 * See Documentation/arch/x86/ for details on this mitigation
	 * implementation.
	 */

And if we want to expand the "why" and do proper documentation on the
implementation decisions of each mitigation, we still keep it there in the
code?

Or we do one part in Documentation/ and another part in the code?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [RFC PATCH v2 1/3] x86: cpu/bugs: update SpectreRSB comments for AMD

Posted by Dave Hansen 1 year, 2 months ago

On 11/11/24 08:39, Amit Shah wrote:
> From: Amit Shah <amit.shah@amd.com>
> 
> AMD CPUs do not fall back to the BTB when the RSB underflows for RET
> address speculation.  AMD CPUs have not needed to stuff the RSB for
> underflow conditions.
> 
> The RSB poisoning case is addressed by RSB filling - clean up the FIXME
> comment about it.

This amounts to "Josh was wrong" in commit 9756bba284. Before moving
forward with this, it would be great to get his ack on this to make sure
you two are on the same page.