[PATCH v3] amd: disable C6 after 1000 days on Zen2

Roger Pau Monne posted 1 patch 9 months, 1 week ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20230728144729.17446-1-roger.pau@citrix.com
xen/arch/x86/cpu/amd.c               | 74 ++++++++++++++++++++++++++++
xen/arch/x86/include/asm/msr-index.h |  2 +
xen/include/xen/time.h               |  1 +
3 files changed, 77 insertions(+)
[PATCH v3] amd: disable C6 after 1000 days on Zen2
Posted by Roger Pau Monne 9 months, 1 week ago
As specified in Erratum 1474:

"A core will fail to exit CC6 after about 1044 days after the last
system reset. The time of failure may vary depending on the spread
spectrum and REFCLK frequency."

Detect when running on AMD Zen2 and set up a timer to prevent entering
C6 after 1000 days of uptime.  Take into account the TSC value at boot
in order to account for any time elapsed before Xen has been booted.
Worst case we end up disabling C6 before strictly necessary, but that
would still be safe, and it's better than not taking the TSC value
into account and hanging.

Disable C6 by updating the MSR listed in the revision guide; this
avoids applying workarounds in the CPU idle drivers, as the processor
won't be allowed to enter C6 by the hardware itself.

Print a message once C6 is disabled in order to let the user know.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
The current Revision Guide for Fam17h model 60-6Fh (Lucienne and
Renoir) hasn't been updated to reflect the MSR workaround, but the PPR
for those models lists the MSR and the bits as having the expected
meaning, so I assume it's safe to apply the same workaround there.
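
A minimal sketch of the bit manipulation the workaround performs, written
as standalone C rather than Xen code.  The bit positions (6, 14, 22) are
the ones used in zen2_disable_c6() in the patch; the helper name
clear_cc6en() and the macro names are made up for illustration.  The real
code does the read-modify-write with rdmsrl()/wrmsrl() on
MSR_AMD_CSTATE_CFG, which cannot be exercised from user space.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions of CCR{0,1,2}_CC6EN as used in the patch. */
#define CCR0_CC6EN (UINT64_C(1) << 6)
#define CCR1_CC6EN (UINT64_C(1) << 14)
#define CCR2_CC6EN (UINT64_C(1) << 22)

/*
 * Hypothetical helper: given the current MSR value, return the value to
 * write back so that none of the three CC6 enable bits remains set.
 */
static uint64_t clear_cc6en(uint64_t cstate_cfg)
{
    const uint64_t mask = ~(CCR0_CC6EN | CCR1_CC6EN | CCR2_CC6EN);
    uint64_t updated = cstate_cfg & mask;

    /* Sanity check: all three enable bits are clear in the result. */
    assert((updated & ~mask) == 0);

    return updated;
}
```

All other bits of the register are passed through unchanged, which is why
the patch reads the MSR first instead of writing a constant.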

By all accounts this seems to affect all Zen2 models, and hence the
workaround should be the same.  It might also affect Hygon, although I
think Hygon is strictly limited to Zen1.

Instead of the while loop around get_cpu_maps() we could re-schedule
the timer to NOW() + 1s, but that seems more complex.
---
Changes since v2:
 - Add zen2 prefix to added functions and variables.
 - Check for Fam17h and STIBP for Zen2.
 - Prevent CPU hotplug while engaging in disabling C6.
 - Don't use _safe msr access variants.
 - Define the MSR bits inside of zen2_disable_c6().

Changes since v1:
 - Apply the workaround listed by AMD: toggle some MSR bits.
 - Do not apply the workaround if virtualized.
 - Check for STIBP feature instead of listing specific models.
 - Implement the DAYS macro based on SECONDS.
---
 xen/arch/x86/cpu/amd.c               | 74 ++++++++++++++++++++++++++++
 xen/arch/x86/include/asm/msr-index.h |  2 +
 xen/include/xen/time.h               |  1 +
 3 files changed, 77 insertions(+)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index 3ed06f670491..0358a610605c 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -1,8 +1,10 @@
+#include <xen/cpu.h>
 #include <xen/init.h>
 #include <xen/bitops.h>
 #include <xen/mm.h>
 #include <xen/param.h>
 #include <xen/smp.h>
+#include <xen/softirq.h>
 #include <xen/pci.h>
 #include <xen/sched.h>
 #include <xen/warning.h>
@@ -52,6 +54,8 @@ bool __read_mostly amd_acpi_c1e_quirk;
 bool __ro_after_init amd_legacy_ssbd;
 bool __initdata amd_virt_spec_ctrl;
 
+static bool __read_mostly zen2_c6_disabled;
+
 static inline int rdmsr_amd_safe(unsigned int msr, unsigned int *lo,
 				 unsigned int *hi)
 {
@@ -972,6 +976,32 @@ void amd_check_zenbleed(void)
 		       val & chickenbit ? "chickenbit" : "microcode");
 }
 
+static void cf_check zen2_disable_c6(void *arg)
+{
+	/* Disable C6 by clearing the CCR{0,1,2}_CC6EN bits. */
+	const uint64_t mask = ~((1ul << 6) | (1ul << 14) | (1ul << 22));
+	uint64_t val;
+
+	if (!zen2_c6_disabled) {
+		printk(XENLOG_WARNING
+    "Disabling C6 after 1000 days apparent uptime due to AMD errata 1474\n");
+		zen2_c6_disabled = true;
+		/*
+		 * Prevent CPU hotplug so that started CPUs will either see
+		 * zen2_c6_disabled set, or will be handled by
+		 * smp_call_function().
+		 */
+		while (!get_cpu_maps())
+			process_pending_softirqs();
+		smp_call_function(zen2_disable_c6, NULL, 0);
+		put_cpu_maps();
+	}
+
+	/* Update the MSR to disable C6, done on all threads. */
+	rdmsrl(MSR_AMD_CSTATE_CFG, val);
+	wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
+}
+
 static void cf_check init_amd(struct cpuinfo_x86 *c)
 {
 	u32 l, h;
@@ -1240,6 +1270,9 @@ static void cf_check init_amd(struct cpuinfo_x86 *c)
 
 	amd_check_zenbleed();
 
+	if (zen2_c6_disabled)
+		zen2_disable_c6(NULL);
+
 	check_syscfg_dram_mod_en();
 
 	amd_log_freq(c);
@@ -1249,3 +1282,44 @@ const struct cpu_dev amd_cpu_dev = {
 	.c_early_init	= early_init_amd,
 	.c_init		= init_amd,
 };
+
+static int __init cf_check zen2_c6_errata_check(void)
+{
+	/*
+	 * Errata #1474: A Core May Hang After About 1044 Days
+	 * Set up a timer to disable C6 after 1000 days uptime.
+	 */
+	s_time_t delta;
+
+	/*
+	 * Zen1 vs Zen2 isn't a simple model number comparison, so use STIBP as
+	 * a heuristic to separate the two uarches in Fam17h.
+	 */
+	if (cpu_has_hypervisor || boot_cpu_data.x86 != 0x17 ||
+	    !boot_cpu_has(X86_FEATURE_AMD_STIBP))
+		return 0;
+
+	/*
+	 * Deduct current TSC value, this would be relevant if kexec'ed for
+	 * example.  Might not be accurate, but worst case we end up disabling
+	 * C6 before strictly required, which would still be safe.
+	 *
+	 * NB: all affected models (Zen2) have invariant TSC and TSC adjust
+	 * MSR, so early_time_init() will have already cleared any TSC offset.
+	 */
+	delta = DAYS(1000) - tsc_ticks2ns(rdtsc());
+	if (delta > 0) {
+		static struct timer errata_c6;
+
+		init_timer(&errata_c6, zen2_disable_c6, NULL, 0);
+		set_timer(&errata_c6, NOW() + delta);
+	} else
+		zen2_disable_c6(NULL);
+
+	return 0;
+}
+/*
+ * Must be executed after early_time_init() for tsc_ticks2ns() to have been
+ * calibrated.  That prevents us doing the check in init_amd().
+ */
+presmp_initcall(zen2_c6_errata_check);
diff --git a/xen/arch/x86/include/asm/msr-index.h b/xen/arch/x86/include/asm/msr-index.h
index 2382fc8e1181..4d41c171d291 100644
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -211,6 +211,8 @@
 
 #define MSR_VIRT_SPEC_CTRL                  0xc001011f /* Layout matches MSR_SPEC_CTRL */
 
+#define MSR_AMD_CSTATE_CFG                  0xc0010296
+
 /*
  * Legacy MSR constants in need of cleanup.  No new MSRs below this comment.
  */
diff --git a/xen/include/xen/time.h b/xen/include/xen/time.h
index b7427460dd13..9ceaec541f4d 100644
--- a/xen/include/xen/time.h
+++ b/xen/include/xen/time.h
@@ -53,6 +53,7 @@ struct tm wallclock_time(uint64_t *ns);
 
 #define SYSTEM_TIME_HZ  1000000000ULL
 #define NOW()           ((s_time_t)get_s_time())
+#define DAYS(_d)        SECONDS((_d) * 86400ULL)
 #define SECONDS(_s)     ((s_time_t)((_s)  * 1000000000ULL))
 #define MILLISECS(_ms)  ((s_time_t)((_ms) * 1000000ULL))
 #define MICROSECS(_us)  ((s_time_t)((_us) * 1000ULL))
-- 
2.41.0
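
The deadline arithmetic in zen2_c6_errata_check() can be sketched in
standalone C as follows.  DAYS() and SECONDS() mirror the macros in the
patch.  ticks_to_ns() is a hypothetical stand-in for tsc_ticks2ns() that
assumes a fixed TSC frequency given in kHz (Xen uses a calibrated scale
instead), and c6_disable_delta() is an illustrative name, not a function
from the patch.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s_time_t;   /* signed nanoseconds */

#define SECONDS(_s) ((s_time_t)((_s) * 1000000000ULL))
#define DAYS(_d)    SECONDS((_d) * 86400ULL)

/*
 * Hypothetical stand-in for tsc_ticks2ns(): convert TSC ticks to
 * nanoseconds for a TSC running at a fixed tsc_khz.  Only meant for
 * frequencies in the GHz range; very low frequencies could overflow.
 */
static s_time_t ticks_to_ns(uint64_t ticks, uint64_t tsc_khz)
{
    assert(tsc_khz != 0);

    /* ticks / kHz gives milliseconds; handle the remainder separately. */
    return (s_time_t)((ticks / tsc_khz) * 1000000ULL +
                      (ticks % tsc_khz) * 1000000ULL / tsc_khz);
}

/*
 * Nanoseconds left until the 1000-day mark.  A result <= 0 means the
 * mark has already passed and C6 should be disabled straight away.
 */
static s_time_t c6_disable_delta(uint64_t tsc, uint64_t tsc_khz)
{
    return DAYS(1000) - ticks_to_ns(tsc, tsc_khz);
}
```

DAYS(1000) is 8.64e16 ns, which fits comfortably in a signed 64-bit
value, so the subtraction can go negative without wrapping when the TSC
already reports more than 1000 days.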


Re: [PATCH v3] amd: disable C6 after 1000 days on Zen2
Posted by Jan Beulich 9 months ago
On 28.07.2023 16:47, Roger Pau Monne wrote:
> As specified in Erratum 1474:
> 
> "A core will fail to exit CC6 after about 1044 days after the last
> system reset. The time of failure may vary depending on the spread
> spectrum and REFCLK frequency."
> 
> Detect when running on AMD Zen2 and set up a timer to prevent entering
> C6 after 1000 days of uptime.  Take into account the TSC value at boot
> in order to account for any time elapsed before Xen has been booted.
> Worst case we end up disabling C6 before strictly necessary, but that
> would still be safe, and it's better than not taking the TSC value
> into account and hanging.
> 
> Disable C6 by updating the MSR listed in the revision guide; this
> avoids applying workarounds in the CPU idle drivers, as the processor
> won't be allowed to enter C6 by the hardware itself.
> 
> Print a message once C6 is disabled in order to let the user know.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with two remarks:

> @@ -1249,3 +1282,44 @@ const struct cpu_dev amd_cpu_dev = {
>  	.c_early_init	= early_init_amd,
>  	.c_init		= init_amd,
>  };
> +
> +static int __init cf_check zen2_c6_errata_check(void)
> +{
> +	/*
> +	 * Errata #1474: A Core May Hang After About 1044 Days
> +	 * Set up a timer to disable C6 after 1000 days uptime.
> +	 */
> +	s_time_t delta;
> +
> +	/*
> +	 * Zen1 vs Zen2 isn't a simple model number comparison, so use STIBP as
> +	 * a heuristic to separate the two uarches in Fam17h.
> +	 */
> +	if (cpu_has_hypervisor || boot_cpu_data.x86 != 0x17 ||
> +	    !boot_cpu_has(X86_FEATURE_AMD_STIBP))
> +		return 0;
> +
> +	/*
> +	 * Deduct current TSC value, this would be relevant if kexec'ed for
> +	 * example.  Might not be accurate, but worst case we end up disabling
> +	 * C6 before strictly required, which would still be safe.

I'm not really convinced of this being the worst case. The TSC can be
written, and hence it could also have been altered in a way that makes
things look as if the system hadn't been running for a long time when
really it has been. But I'm okay to leave that special case aside right
now.

> +	 * NB: all affected models (Zen2) have invariant TSC and TSC adjust
> +	 * MSR, so early_time_init() will have already cleared any TSC offset.
> +	 */
> +	delta = DAYS(1000) - tsc_ticks2ns(rdtsc());
> +	if (delta > 0) {
> +		static struct timer errata_c6;
> +
> +		init_timer(&errata_c6, zen2_disable_c6, NULL, 0);
> +		set_timer(&errata_c6, NOW() + delta);
> +	} else
> +		zen2_disable_c6(NULL);

Strictly speaking you don't need the if/else here, since timers set
in the past will simply have their handlers executed right away (and
if that wasn't the case, there would be a race here).

Jan
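
The second remark (a timer set in the past runs its handler right away,
so the if/else is redundant) can be illustrated with a toy model.
Everything here is hypothetical: struct toy_timer, toy_set_timer() and
the simulated clock are not Xen's timer API, and the toy fires
synchronously for simplicity, whereas how Xen actually dispatches expired
timers is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t s_time_t;

/* Toy one-shot timer; a hypothetical model, not Xen's struct timer. */
struct toy_timer {
    void (*fn)(void *);
    void *data;
    s_time_t expires;
    bool armed;
};

static s_time_t toy_now;      /* simulated clock, in ns */
static int c6_disable_calls;  /* number of times the handler ran */

/* Stand-in for the C6-disabling handler: only counts invocations. */
static void toy_disable_c6(void *arg)
{
    (void)arg;
    c6_disable_calls++;
}

/* Run the handler if the timer is armed and its deadline has passed. */
static void toy_run_timer(struct toy_timer *t)
{
    if (t->armed && t->expires <= toy_now) {
        t->armed = false;
        t->fn(t->data);
    }
}

/* Arm the timer; a deadline at or before "now" fires on the spot. */
static void toy_set_timer(struct toy_timer *t, s_time_t expires)
{
    assert(t->fn != NULL);

    t->expires = expires;
    t->armed = true;
    toy_run_timer(t);
}
```

With this behaviour, a caller could always arm the timer at "now + delta"
and rely on a non-positive delta being handled like the else branch in
the patch.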