[PATCH v3] x86/amd: Address AMD erratum #1485

Alejandro Vallejo posted 1 patch 6 months, 3 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20231013153801.188324-1-alejandro.vallejo@cloud.com
xen/arch/x86/cpu/amd.c               | 23 +++++++++++++++++++++++
xen/arch/x86/include/asm/amd.h       |  5 +++++
xen/arch/x86/include/asm/msr-index.h |  1 +
3 files changed, 29 insertions(+)
[PATCH v3] x86/amd: Address AMD erratum #1485
Posted by Alejandro Vallejo 6 months, 3 weeks ago
Fix adapted off Linux's mailing list:
  https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Release-acked-by: Henry Wang <Henry.Wang@arm.com>
---
v3:
  * Factored MSR shuffling into a separate function and moved the call to
    the end of init_amd() rather than having it in the middle
  * Style changes
  * Added comment about the unavoidable wrmsr race
  * Avoid wrmsrl() if the chickenbit already set
  * Removed the "ull" suffix as it's pointless with a local variable and it
    violates a MISRA 7.3.

Pipeline pending:
    https://gitlab.com/xen-project/people/agvallejo/xen/-/pipelines/1036125869
---
 xen/arch/x86/cpu/amd.c               | 23 +++++++++++++++++++++++
 xen/arch/x86/include/asm/amd.h       |  5 +++++
 xen/arch/x86/include/asm/msr-index.h |  1 +
 3 files changed, 29 insertions(+)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 4f27187f92..cb77c23f1a 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
 	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
 }
 
+static void amd_check_erratum_1485(void)
+{
+	uint64_t val, chickenbit = (1 << 5);
+
+	if (cpu_has_hypervisor || boot_cpu_data.x86 != 0x19 || !is_zen4_uarch())
+		return;
+
+	rdmsrl(MSR_AMD64_BP_CFG, val);
+
+	if (val & chickenbit)
+		return;
+
+	/*
+	 * BP_CFG is a core-scoped MSR. There's a benign race on this write
+	 * on the case where 2 threads perform the previous check at the
+	 * same time before the chickenbit is set. It's benign because the
+	 * value being written is the same on both.
+	 */
+	wrmsrl(MSR_AMD64_BP_CFG, val | chickenbit);
+
+}
+
 static void cf_check init_amd(struct cpuinfo_x86 *c)
 {
 	u32 l, h;
@@ -1271,6 +1293,7 @@ static void cf_check init_amd(struct cpuinfo_x86 *c)
 		disable_c1_ramping();
 
 	amd_check_zenbleed();
+	amd_check_erratum_1485();
 
 	if (zen2_c6_disabled)
 		zen2_disable_c6(NULL);
diff --git a/xen/arch/x86/include/asm/amd.h b/xen/arch/x86/include/asm/amd.h
index d862cb7972..0700827561 100644
--- a/xen/arch/x86/include/asm/amd.h
+++ b/xen/arch/x86/include/asm/amd.h
@@ -145,11 +145,16 @@
  * Hygon (Fam18h) but without simple model number rules.  Instead, use STIBP
  * as a heuristic that distinguishes the two.
  *
+ * For Zen3 and Zen4 (Fam19h) the heuristic is the presence of AutoIBRS, as
+ * it's Zen4-specific.
+ *
  * The caller is required to perform the appropriate vendor/family checks
  * first.
  */
 #define is_zen1_uarch() (!boot_cpu_has(X86_FEATURE_AMD_STIBP))
 #define is_zen2_uarch()   boot_cpu_has(X86_FEATURE_AMD_STIBP)
+#define is_zen3_uarch() (!boot_cpu_has(X86_FEATURE_AUTO_IBRS))
+#define is_zen4_uarch()   boot_cpu_has(X86_FEATURE_AUTO_IBRS)
 
 struct cpuinfo_x86;
 int cpu_has_amd_erratum(const struct cpuinfo_x86 *, int, ...);
diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index 11ffed543a..7b3490bfb1 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -403,6 +403,7 @@
 #define MSR_AMD64_DE_CFG		0xc0011029
 #define AMD64_DE_CFG_LFENCE_SERIALISE	(_AC(1, ULL) << 1)
 #define MSR_AMD64_EX_CFG		0xc001102c
+#define MSR_AMD64_BP_CFG		0xc001102e
 #define MSR_AMD64_DE_CFG2		0xc00110e3
 
 #define MSR_AMD64_DR0_ADDRESS_MASK	0xc0011027
-- 
2.34.1
Re: [PATCH v3] x86/amd: Address AMD erratum #1485
Posted by Jan Beulich 6 months, 3 weeks ago
On 13.10.2023 17:38, Alejandro Vallejo wrote:
> Fix adapted off Linux's mailing list:
>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u

Why reference the bug report when there's a proper commit (f454b18e07f5) now?
Plus in any event a short summary of the erratum would help if put right here
(without needing to look up any documents or follow any links).

> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>  }
>  
> +static void amd_check_erratum_1485(void)
> +{
> +	uint64_t val, chickenbit = (1 << 5);

Linux gives the bit a name. Any reason you don't?

Everything else lgtm.

Jan
Re: [PATCH v3] x86/amd: Address AMD erratum #1485
Posted by Andrew Cooper 6 months, 3 weeks ago
On 17/10/2023 8:44 am, Jan Beulich wrote:
> On 13.10.2023 17:38, Alejandro Vallejo wrote:
>> Fix adapted off Linux's mailing list:
>>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u
> Why reference the bug report when there's a proper commit (f454b18e07f5) now?
> Plus in any event a short summary of the erratum would help if put right here
> (without needing to look up any documents or follow any links).

That is not public information yet.  The erratum number alone is the
best we can do at this juncture.
>> --- a/xen/arch/x86/cpu/amd.c
>> +++ b/xen/arch/x86/cpu/amd.c
>> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
>>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>>  }
>>  
>> +static void amd_check_erratum_1485(void)
>> +{
>> +	uint64_t val, chickenbit = (1 << 5);
> Linux gives the bit a name. Any reason you don't?

There are multiple different names depending on where you look, and none
are particularly relevant here.

~Andrew
Re: [PATCH v3] x86/amd: Address AMD erratum #1485
Posted by Roger Pau Monné 6 months, 3 weeks ago
On Tue, Oct 17, 2023 at 08:50:45AM +0100, Andrew Cooper wrote:
> On 17/10/2023 8:44 am, Jan Beulich wrote:
> > On 13.10.2023 17:38, Alejandro Vallejo wrote:
> >> Fix adapted off Linux's mailing list:
> >>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u
> > Why reference the bug report when there's a proper commit (f454b18e07f5) now?
> > Plus in any event a short summary of the erratum would help if put right here
> > (without needing to look up any documents or follow any links).
> 
> That is not public information yet.  The erratum number alone is the
> best we can do at this juncture.
> >> --- a/xen/arch/x86/cpu/amd.c
> >> +++ b/xen/arch/x86/cpu/amd.c
> >> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
> >>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
> >>  }
> >>  
> >> +static void amd_check_erratum_1485(void)
> >> +{
> >> +	uint64_t val, chickenbit = (1 << 5);
> > Linux gives the bit a name. Any reason you don't?
> 
> There are multiple different names depending on where you look, and none
> are particularly relevant here.

Could we make chickenbit const static?

I would also use ULL just to be on the safe side, because we then copy
this for a different bit and it explodes.

(not strong requirements, but if a resend is needed it would be nice
to have).

Thanks, Roger.

Re: [PATCH v3] x86/amd: Address AMD erratum #1485
Posted by Andrew Cooper 6 months, 3 weeks ago
On 17/10/2023 10:13 am, Roger Pau Monné wrote:
> On Tue, Oct 17, 2023 at 08:50:45AM +0100, Andrew Cooper wrote:
>> On 17/10/2023 8:44 am, Jan Beulich wrote:
>>> On 13.10.2023 17:38, Alejandro Vallejo wrote:
>>>> Fix adapted off Linux's mailing list:
>>>>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u
>>> Why reference the bug report when there's a proper commit (f454b18e07f5) now?
>>> Plus in any event a short summary of the erratum would help if put right here
>>> (without needing to look up any documents or follow any links).
>> That is not public information yet.  The erratum number alone is the
>> best we can do at this juncture.
>>>> --- a/xen/arch/x86/cpu/amd.c
>>>> +++ b/xen/arch/x86/cpu/amd.c
>>>> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
>>>>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>>>>  }
>>>>  
>>>> +static void amd_check_erratum_1485(void)
>>>> +{
>>>> +	uint64_t val, chickenbit = (1 << 5);
>>> Linux gives the bit a name. Any reason you don't?
>> There are multiple different names depending on where you look, and none
>> are particularly relevant here.
> Could we make chickenbit const static?

Why would we want to force something that's optimised to an instruction
immediate into a .data variable?

~Andrew

Re: [PATCH v3] x86/amd: Address AMD erratum #1485
Posted by Jan Beulich 6 months, 3 weeks ago
On 17.10.2023 11:13, Roger Pau Monné wrote:
> On Tue, Oct 17, 2023 at 08:50:45AM +0100, Andrew Cooper wrote:
>> On 17/10/2023 8:44 am, Jan Beulich wrote:
>>> On 13.10.2023 17:38, Alejandro Vallejo wrote:
>>>> Fix adapted off Linux's mailing list:
>>>>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u
>>> Why reference the bug report when there's a proper commit (f454b18e07f5) now?
>>> Plus in any event a short summary of the erratum would help if put right here
>>> (without needing to look up any documents or follow any links).
>>
>> That is not public information yet.  The erratum number alone is the
>> best we can do at this juncture.
>>>> --- a/xen/arch/x86/cpu/amd.c
>>>> +++ b/xen/arch/x86/cpu/amd.c
>>>> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
>>>>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>>>>  }
>>>>  
>>>> +static void amd_check_erratum_1485(void)
>>>> +{
>>>> +	uint64_t val, chickenbit = (1 << 5);
>>> Linux gives the bit a name. Any reason you don't?
>>
>> There are multiple different names depending on where you look, and none
>> are particularly relevant here.
> 
> Could we make chickenbit const static?
> 
> I would also use ULL just to be on the safe side, because we then copy
> this for a different bit and it explodes.

I guess the way it is resembles what we already have in amd_check_zenbleed().
Also it's not clear to me why besides "const" you also ask for "static".

Jan

Re: [PATCH v3] x86/amd: Address AMD erratum #1485
Posted by Roger Pau Monné 6 months, 3 weeks ago
On Tue, Oct 17, 2023 at 11:21:47AM +0200, Jan Beulich wrote:
> On 17.10.2023 11:13, Roger Pau Monné wrote:
> > On Tue, Oct 17, 2023 at 08:50:45AM +0100, Andrew Cooper wrote:
> >> On 17/10/2023 8:44 am, Jan Beulich wrote:
> >>> On 13.10.2023 17:38, Alejandro Vallejo wrote:
> >>>> Fix adapted off Linux's mailing list:
> >>>>   https://lore.kernel.org/lkml/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com/T/#u
> >>> Why reference the bug report when there's a proper commit (f454b18e07f5) now?
> >>> Plus in any event a short summary of the erratum would help if put right here
> >>> (without needing to look up any documents or follow any links).
> >>
> >> That is not public information yet.  The erratum number alone is the
> >> best we can do at this juncture.
> >>>> --- a/xen/arch/x86/cpu/amd.c
> >>>> +++ b/xen/arch/x86/cpu/amd.c
> >>>> @@ -1004,6 +1004,28 @@ static void cf_check zen2_disable_c6(void *arg)
> >>>>  	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
> >>>>  }
> >>>>  
> >>>> +static void amd_check_erratum_1485(void)
> >>>> +{
> >>>> +	uint64_t val, chickenbit = (1 << 5);
> >>> Linux gives the bit a name. Any reason you don't?
> >>
> >> There are multiple different names depending on where you look, and none
> >> are particularly relevant here.
> > 
> > Could we make chickenbit const static?
> > 
> > I would also use ULL just to be on the safe side, because we then copy
> > this for a different bit and it explodes.
> 
> I guess the way it is resembles what we already have in amd_check_zenbleed().
> Also it's not clear to me why besides "const" you also ask for "static".

Yes, makes no sense to put in .rodata, sorry, just const.

Roger.