[PATCH v3 5/5] x86/mwait-idle: make SPR C1 and C1E be independent

Jan Beulich posted 5 patches 3 years, 5 months ago
Posted by Jan Beulich 3 years, 5 months ago
From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

This patch partially reverts the changes made by the following commit:

da0e58c038e6 intel_idle: add 'preferred_cstates' module argument

As that commit describes, on early Sapphire Rapids Xeon platforms the C1 and
C1E states were mutually exclusive, so that users could only have either C1 and
C6, or C1E and C6.

However, Intel firmware engineers managed to remove this limitation and make C1
and C1E completely independent, just like on previous Xeon platforms.

Therefore, this patch:
 * Removes commentary describing the old, and now non-existing SPR C1E
   limitation.
 * Marks SPR C1E as available by default.
 * Removes the 'preferred_cstates' parameter handling for SPR. Both C1 and
   C1E will be available regardless of 'preferred_cstates' value.

We expect that all SPR systems are shipping with new firmware, which includes
the C1/C1E improvement.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 1548fac47a11
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: New.
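
Illustration only, not part of the patch: a minimal, self-contained sketch of
the behavioural change described above. All names here (sketch_state,
SKETCH_FLAG_DISABLED, SKETCH_BIT, sketch_old_select, sketch_new_select) are
hypothetical stand-ins, not the Xen or Linux definitions. The sketch assumes,
as in the hunk below, that table entry 0 is C1, entry 1 is C1E, and that mask
bits 1 and 2 refer to C1 and C1E respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for CPUIDLE_FLAG_DISABLED and BIT(n, U). */
#define SKETCH_FLAG_DISABLED (1u << 0)
#define SKETCH_BIT(n)        (1u << (n))

/* Reduced model of one idle-state table entry. */
struct sketch_state {
    const char *name;
    unsigned int flags;
};

static bool sketch_enabled(const struct sketch_state *s)
{
    return !(s->flags & SKETCH_FLAG_DISABLED);
}

/*
 * Behaviour before this patch: C1 is enabled and C1E disabled by default.
 * C1E replaces C1 only if the mask prefers C1E (bit 2) and not C1 (bit 1).
 */
static void sketch_old_select(struct sketch_state *tbl, size_t n,
                              unsigned int preferred_mask)
{
    assert(n >= 2);

    /* Default: C1 on, C1E off. */
    tbl[0].flags &= ~SKETCH_FLAG_DISABLED;
    tbl[1].flags |= SKETCH_FLAG_DISABLED;

    if ((preferred_mask & SKETCH_BIT(2)) &&
        !(preferred_mask & SKETCH_BIT(1))) {
        /* Disable C1 and enable C1E. */
        tbl[0].flags |= SKETCH_FLAG_DISABLED;
        tbl[1].flags &= ~SKETCH_FLAG_DISABLED;
    }
}

/*
 * Behaviour after this patch: both states are enabled and the preference
 * mask no longer influences the C1/C1E choice on SPR.
 */
static void sketch_new_select(struct sketch_state *tbl, size_t n,
                              unsigned int preferred_mask)
{
    assert(n >= 2);
    (void)preferred_mask; /* deliberately ignored */

    tbl[0].flags &= ~SKETCH_FLAG_DISABLED;
    tbl[1].flags &= ~SKETCH_FLAG_DISABLED;
}
```

In this model, calling sketch_old_select() with only bit 2 set leaves C1E as
the sole shallow state, whereas sketch_new_select() leaves both C1 and C1E
enabled for any mask value.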

--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -689,16 +689,6 @@ static struct cpuidle_state __read_mostl
 	{}
 };
 
-/*
- * On Sapphire Rapids Xeon C1 has to be disabled if C1E is enabled, and vice
- * versa. On SPR C1E is enabled only if "C1E promotion" bit is set in
- * MSR_IA32_POWER_CTL. But in this case there effectively no C1, because C1
- * requests are promoted to C1E. If the "C1E promotion" bit is cleared, then
- * both C1 and C1E requests end up with C1, so there is effectively no C1E.
- *
- * By default we enable C1 and disable C1E by marking it with
- * 'CPUIDLE_FLAG_DISABLED'.
- */
 static struct cpuidle_state __read_mostly spr_cstates[] = {
 	{
 		.name = "C1",
@@ -708,7 +698,7 @@ static struct cpuidle_state __read_mostl
 	},
 	{
 		.name = "C1E",
-		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_DISABLED,
+		.flags = MWAIT2flg(0x01),
 		.exit_latency = 2,
 		.target_residency = 4,
 	},
@@ -1401,17 +1391,6 @@ static void __init spr_idle_state_table_
 {
 	uint64_t msr;
 
-	/* Check if user prefers C1E over C1. */
-	if ((preferred_states_mask & BIT(2, U)) &&
-	    !(preferred_states_mask & BIT(1, U))) {
-		/* Disable C1 and enable C1E. */
-		spr_cstates[0].flags |= CPUIDLE_FLAG_DISABLED;
-		spr_cstates[1].flags &= ~CPUIDLE_FLAG_DISABLED;
-
-		/* Request enabling C1E using the "C1E promotion" bit. */
-		idle_cpu_spr.c1e_promotion = C1E_PROMOTION_ENABLE;
-	}
-
 	/*
 	 * By default, the C6 state assumes the worst-case scenario of package
 	 * C6. However, if PC6 is disabled, we update the numbers to match
Re: [PATCH v3 5/5] x86/mwait-idle: make SPR C1 and C1E be independent
Posted by Roger Pau Monné 3 years, 3 months ago
On Thu, Aug 18, 2022 at 03:05:19PM +0200, Jan Beulich wrote:
> From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> 
> This patch partially reverts the changes made by the following commit:
> 
> da0e58c038e6 intel_idle: add 'preferred_cstates' module argument
> 
> As that commit describes, on early Sapphire Rapids Xeon platforms the C1 and
> C1E states were mutually exclusive, so that users could only have either C1 and
> C6, or C1E and C6.
> 
> However, Intel firmware engineers managed to remove this limitation and make C1
> and C1E completely independent, just like on previous Xeon platforms.
> 
> Therefore, this patch:
>  * Removes commentary describing the old, and now non-existing SPR C1E
>    limitation.
>  * Marks SPR C1E as available by default.
>  * Removes the 'preferred_cstates' parameter handling for SPR. Both C1 and
>    C1E will be available regardless of 'preferred_cstates' value.
> 
> We expect that all SPR systems are shipping with new firmware, which includes
> the C1/C1E improvement.
> 
> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 1548fac47a11
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

I guess we need to be careful of running this on pre-production
hardware then?

Thanks, Roger.

Re: [PATCH v3 5/5] x86/mwait-idle: make SPR C1 and C1E be independent
Posted by Jan Beulich 3 years, 3 months ago
On 13.10.2022 14:05, Roger Pau Monné wrote:
> On Thu, Aug 18, 2022 at 03:05:19PM +0200, Jan Beulich wrote:
>> From: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>>
>> This patch partially reverts the changes made by the following commit:
>>
>> da0e58c038e6 intel_idle: add 'preferred_cstates' module argument
>>
>> As that commit describes, on early Sapphire Rapids Xeon platforms the C1 and
>> C1E states were mutually exclusive, so that users could only have either C1 and
>> C6, or C1E and C6.
>>
>> However, Intel firmware engineers managed to remove this limitation and make C1
>> and C1E completely independent, just like on previous Xeon platforms.
>>
>> Therefore, this patch:
>>  * Removes commentary describing the old, and now non-existing SPR C1E
>>    limitation.
>>  * Marks SPR C1E as available by default.
>>  * Removes the 'preferred_cstates' parameter handling for SPR. Both C1 and
>>    C1E will be available regardless of 'preferred_cstates' value.
>>
>> We expect that all SPR systems are shipping with new firmware, which includes
>> the C1/C1E improvement.
>>
>> Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 1548fac47a11
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

> I guess we need to be careful of running this on pre-production
> hardware then?

Well, power savings may not be as expected there, but beyond that I don't
think there would be much of an observable effect.

Jan