Re: [PATCH V3] acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter

Rafael J. Wysocki posted 1 patch 1 year, 7 months ago
drivers/cpufreq/intel_pstate.c |   16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
Re: [PATCH V3] acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
Posted by Rafael J. Wysocki 1 year, 7 months ago
On Wednesday, June 19, 2024 7:09:35 PM CEST Rafael J. Wysocki wrote:
> On Wed, Jun 19, 2024 at 6:33 AM Aaron Rainbolt <arainbolt@kfocus.org> wrote:
> >
> > acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
> >
> > The _OSC is supposed to contain a bit indicating whether the hardware
> > supports CPPC v2 or not. This bit is not always set, causing CPPC v2 to
> > be considered absent. This results in severe single-core performance
> > issues with the EEVDF scheduler on heterogenous-core Intel processors.
> 
> While some things can be affected by this, I don't immediately see a
> connection between CPPC v2, Intel hybrid processors and EEVDF.
> 
> In particular, why would EEVDF alone be affected?
> 
> Care to explain this?

And the reason why I am asking is because I think that you really need
something like this (untested beyond compilation):

---
 drivers/cpufreq/intel_pstate.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -355,16 +355,16 @@ static void intel_pstate_set_itmt_prio(i
 	int ret;
 
 	ret = cppc_get_perf_caps(cpu, &cppc_perf);
-	if (ret)
-		return;
-
 	/*
-	 * On some systems with overclocking enabled, CPPC.highest_perf is hardcoded to 0xff.
-	 * In this case we can't use CPPC.highest_perf to enable ITMT.
-	 * In this case we can look at MSR_HWP_CAPABILITIES bits [8:0] to decide.
+	 * If CPPC is not available, fall back to MSR_HWP_CAPABILITIES bits [8:0].
+	 *
+	 * Also, on some systems with overclocking enabled, CPPC.highest_perf is
+	 * hardcoded to 0xff, so CPPC.highest_perf cannot be used to enable ITMT.
+	 * Fall back to MSR_HWP_CAPABILITIES then too.
 	 */
-	if (cppc_perf.highest_perf == CPPC_MAX_PERF)
-		cppc_perf.highest_perf = HWP_HIGHEST_PERF(READ_ONCE(all_cpu_data[cpu]->hwp_cap_cached));
+	if (ret || cppc_perf.highest_perf == CPPC_MAX_PERF)
+		cppc_perf.highest_perf =
+			HWP_HIGHEST_PERF(READ_ONCE(all_cpu_data[cpu]->hwp_cap_cached));
 
 	/*
 	 * The priorities can be set regardless of whether or not
Re: [PATCH V3] acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
Posted by Aaron Rainbolt 1 year, 7 months ago
On Wed, Jun 19, 2024 at 07:30:55PM +0200, Rafael J. Wysocki wrote:
> On Wednesday, June 19, 2024 7:09:35 PM CEST Rafael J. Wysocki wrote:
> > On Wed, Jun 19, 2024 at 6:33 AM Aaron Rainbolt <arainbolt@kfocus.org> wrote:
> > >
> > > acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
> > >
> > > The _OSC is supposed to contain a bit indicating whether the hardware
> > > supports CPPC v2 or not. This bit is not always set, causing CPPC v2 to
> > > be considered absent. This results in severe single-core performance
> > > issues with the EEVDF scheduler on heterogenous-core Intel processors.
> > 
> > While some things can be affected by this, I don't immediately see a
> > connection between CPPC v2, Intel hybrid processors and EEVDF.
> > 
> > In particular, why would EEVDF alone be affected?
> > 
> > Care to explain this?
> 
> And the reason why I am asking is because I think that you really need
> something like this (untested beyond compilation):

Alright, our team has compiled and tested the patch.

Results were mixed - with my patch, both CPPC and ITMT were enabled. This
patch appears to enable only ITMT. (To be specific, running
"find /proc /sys | grep '(cppc|itmt)'" shows only ITMT enabled under
/proc, and no CPPC directories under /sys.) This causes a performance hit
in benchmarking - using my patch with 'ignore_osc_cppc_bit', we were
getting Geekbench 5 scores of at least 1907 single-core, and 10500
multi-core. With this patch, we only are getting approximately 1700
single-core, and less than 9000 multi-core. (With an entirely unpatched
kernel, we were getting less than 1000 single-core, and about 10000
multi-core.)

Ultimately this is an upgrade over unpatched performance, but a
downgrade from the previous patch. It seems having CPPC and ITMT
available at the same time made things work noticeably faster.

Is there some way that can get both CPPC and ITMT to work with an approach
like this? Our hardware does support both, it just has an incorrectly set
bit in _OSC.

> ---
>  drivers/cpufreq/intel_pstate.c |   16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -355,16 +355,16 @@ static void intel_pstate_set_itmt_prio(i
>  	int ret;
>  
>  	ret = cppc_get_perf_caps(cpu, &cppc_perf);
> -	if (ret)
> -		return;
> -
>  	/*
> -	 * On some systems with overclocking enabled, CPPC.highest_perf is hardcoded to 0xff.
> -	 * In this case we can't use CPPC.highest_perf to enable ITMT.
> -	 * In this case we can look at MSR_HWP_CAPABILITIES bits [8:0] to decide.
> +	 * If CPPC is not available, fall back to MSR_HWP_CAPABILITIES bits [8:0].
> +	 *
> +	 * Also, on some systems with overclocking enabled, CPPC.highest_perf is
> +	 * hardcoded to 0xff, so CPPC.highest_perf cannot be used to enable ITMT.
> +	 * Fall back to MSR_HWP_CAPABILITIES then too.
>  	 */
> -	if (cppc_perf.highest_perf == CPPC_MAX_PERF)
> -		cppc_perf.highest_perf = HWP_HIGHEST_PERF(READ_ONCE(all_cpu_data[cpu]->hwp_cap_cached));
> +	if (ret || cppc_perf.highest_perf == CPPC_MAX_PERF)
> +		cppc_perf.highest_perf =
> +			HWP_HIGHEST_PERF(READ_ONCE(all_cpu_data[cpu]->hwp_cap_cached));
>  
>  	/*
>  	 * The priorities can be set regardless of whether or not
> 
> 
> 
Re: [PATCH V3] acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
Posted by Aaron Rainbolt 1 year, 7 months ago
On Wed, Jun 19, 2024 at 04:39:39PM -0500, Aaron Rainbolt wrote:
> On Wed, Jun 19, 2024 at 07:30:55PM +0200, Rafael J. Wysocki wrote:
> > On Wednesday, June 19, 2024 7:09:35 PM CEST Rafael J. Wysocki wrote:
> > > On Wed, Jun 19, 2024 at 6:33 AM Aaron Rainbolt <arainbolt@kfocus.org> wrote:
> > > >
> > > > acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
> > > >
> > > > The _OSC is supposed to contain a bit indicating whether the hardware
> > > > supports CPPC v2 or not. This bit is not always set, causing CPPC v2 to
> > > > be considered absent. This results in severe single-core performance
> > > > issues with the EEVDF scheduler on heterogenous-core Intel processors.
> > > 
> > > While some things can be affected by this, I don't immediately see a
> > > connection between CPPC v2, Intel hybrid processors and EEVDF.
> > > 
> > > In particular, why would EEVDF alone be affected?
> > > 
> > > Care to explain this?
> > 
> > And the reason why I am asking is because I think that you really need
> > something like this (untested beyond compilation):
> 
> Alright, our team has compiled and tested the patch.
> 
> Results were mixed - with my patch, both CPPC and ITMT were enabled. This
> patch appears to enable only ITMT. (To be specific, running
> "find /proc /sys | grep '(cppc|itmt)'" shows only ITMT enabled under
> /proc, and no CPPC directories under /sys.) This causes a performance hit
> in benchmarking - using my patch with 'ignore_osc_cppc_bit', we were
> getting Geekbench 5 scores of at least 1907 single-core, and 10500
> multi-core. With this patch, we only are getting approximately 1700
> single-core, and less than 9000 multi-core. (With an entirely unpatched
> kernel, we were getting less than 1000 single-core, and about 10000
> multi-core.)
> 
> Ultimately this is an upgrade over unpatched performance, but a
> downgrade from the previous patch. It seems having CPPC and ITMT
> available at the same time made things work noticeably faster.
> 
> Is there some way that can get both CPPC and ITMT to work with an approach
> like this? Our hardware does support both, it just has an incorrectly set
> bit in _OSC.

I was premature in sending this - we reverted back to the previous patch
and are now getting similar scores to the new patch, meaning the benchmark
results may have had nothing to do with the new patch. We'll test further
and report back once we have conclusive results.
 
> > ---
> >  drivers/cpufreq/intel_pstate.c |   16 ++++++++--------
> >  1 file changed, 8 insertions(+), 8 deletions(-)
> > 
> > Index: linux-pm/drivers/cpufreq/intel_pstate.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> > +++ linux-pm/drivers/cpufreq/intel_pstate.c
> > @@ -355,16 +355,16 @@ static void intel_pstate_set_itmt_prio(i
> >  	int ret;
> >  
> >  	ret = cppc_get_perf_caps(cpu, &cppc_perf);
> > -	if (ret)
> > -		return;
> > -
> >  	/*
> > -	 * On some systems with overclocking enabled, CPPC.highest_perf is hardcoded to 0xff.
> > -	 * In this case we can't use CPPC.highest_perf to enable ITMT.
> > -	 * In this case we can look at MSR_HWP_CAPABILITIES bits [8:0] to decide.
> > +	 * If CPPC is not available, fall back to MSR_HWP_CAPABILITIES bits [8:0].
> > +	 *
> > +	 * Also, on some systems with overclocking enabled, CPPC.highest_perf is
> > +	 * hardcoded to 0xff, so CPPC.highest_perf cannot be used to enable ITMT.
> > +	 * Fall back to MSR_HWP_CAPABILITIES then too.
> >  	 */
> > -	if (cppc_perf.highest_perf == CPPC_MAX_PERF)
> > -		cppc_perf.highest_perf = HWP_HIGHEST_PERF(READ_ONCE(all_cpu_data[cpu]->hwp_cap_cached));
> > +	if (ret || cppc_perf.highest_perf == CPPC_MAX_PERF)
> > +		cppc_perf.highest_perf =
> > +			HWP_HIGHEST_PERF(READ_ONCE(all_cpu_data[cpu]->hwp_cap_cached));
> >  
> >  	/*
> >  	 * The priorities can be set regardless of whether or not
> > 
> > 
> > 
Re: [PATCH V3] acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
Posted by Aaron Rainbolt 1 year, 7 months ago
OK, we have done thorough benchmarking of the two patches. In summary,
they both seem to provide exactly the same performance improvements.
My initial worry that Rafael's patch didn't deliver the same performance
improvements was unfounded.

The following are the single-core and multi-core scores from running
Geekbench 5 multiple times on a Carbon Systems Iridium 16 system. The
first batch of tests was done with an Ubuntu kernel built with with my V3
proposed patch, while the second batch was done with a kernel build with
Rafael's proposed patch.

Links to the Geekbench 5 reports can be shared if needed.

_OSC CPPC bit ignore patch (written by Aaron Rainbolt):
Kernel parameter 'ignore_osc_cppc_bit' set in
'/etc/default/grub.d/kfocus.cfg'.
'/sys/devices/system/cpu/cpu*/acpi_cppc' and
'/proc/sys/kernel/sched_itmt_enabled' both present

| Run | Single | Multi  |
| --- | ------ | ------ |
|  01 |   1874 |  10475 |
|  02 |   1831 |  10132 |
|  03 |   1858 |  10348 |
|  04 |   1848 |  10370 |
|  05 |   1831 |  10413 |
| --- | ------ | ------ |
| AVG |   1848 |  10348 |


intel_pstate CPPC override patch (written by Rafael Wysocki):
No special kernel parameters set.
'/sys/devices/system/cpu/cpu*/acpi_cppc' ABSENT,
'/proc/sys/kernel/sched_itmt_enabled' present

| Run | Single | Multi  |
| --- | ------ | ------ |
|  01 |   1820 |  10310 |
|  02 |   1870 |  10303 |
|  03 |   1867 |  10420 |
|  04 |   1844 |  10283 |
|  05 |   1835 |  10451 |
| --- | ------ | ------ |
| AVG |   1847 |  10353 |
Re: [PATCH V3] acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
Posted by Rafael J. Wysocki 1 year, 7 months ago
On Thu, Jun 20, 2024 at 3:05 AM Aaron Rainbolt <arainbolt@kfocus.org> wrote:
>
> OK, we have done thorough benchmarking of the two patches. In summary,
> they both seem to provide exactly the same performance improvements.
> My initial worry that Rafael's patch didn't deliver the same performance
> improvements was unfounded.

Good to know, thanks!

> The following are the single-core and multi-core scores from running
> Geekbench 5 multiple times on a Carbon Systems Iridium 16 system. The
> first batch of tests was done with an Ubuntu kernel built with with my V3
> proposed patch, while the second batch was done with a kernel build with
> Rafael's proposed patch.
>
> Links to the Geekbench 5 reports can be shared if needed.
>
> _OSC CPPC bit ignore patch (written by Aaron Rainbolt):
> Kernel parameter 'ignore_osc_cppc_bit' set in
> '/etc/default/grub.d/kfocus.cfg'.
> '/sys/devices/system/cpu/cpu*/acpi_cppc' and
> '/proc/sys/kernel/sched_itmt_enabled' both present
>
> | Run | Single | Multi  |
> | --- | ------ | ------ |
> |  01 |   1874 |  10475 |
> |  02 |   1831 |  10132 |
> |  03 |   1858 |  10348 |
> |  04 |   1848 |  10370 |
> |  05 |   1831 |  10413 |
> | --- | ------ | ------ |
> | AVG |   1848 |  10348 |
>
>
> intel_pstate CPPC override patch (written by Rafael Wysocki):
> No special kernel parameters set.
> '/sys/devices/system/cpu/cpu*/acpi_cppc' ABSENT,
> '/proc/sys/kernel/sched_itmt_enabled' present
>
> | Run | Single | Multi  |
> | --- | ------ | ------ |
> |  01 |   1820 |  10310 |
> |  02 |   1870 |  10303 |
> |  03 |   1867 |  10420 |
> |  04 |   1844 |  10283 |
> |  05 |   1835 |  10451 |
> | --- | ------ | ------ |
> | AVG |   1847 |  10353 |

The problem with ignoring what the platform firmware is telling (or
not telling) the OS through ACPI is that only it knows the reason why
it is doing that.

It may be by mistake, but it also may be on purpose and it is hard to say.

However, intel_pstate already knows that HWP is enabled on the
processor, so it can be used directly regardless of whether or not
CPPC is enabled.  That is more appropriate and does not require users
to modify their kernel command line.

I'll add a changelog to the patch and submit it properly.

Thanks!
Re: [PATCH V3] acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
Posted by Aaron Rainbolt 1 year, 7 months ago
On Wed, Jun 19, 2024 at 07:30:55PM +0200, Rafael J. Wysocki wrote:
> On Wednesday, June 19, 2024 7:09:35 PM CEST Rafael J. Wysocki wrote:
> > On Wed, Jun 19, 2024 at 6:33 AM Aaron Rainbolt <arainbolt@kfocus.org> wrote:
> > >
> > > acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
> > >
> > > The _OSC is supposed to contain a bit indicating whether the hardware
> > > supports CPPC v2 or not. This bit is not always set, causing CPPC v2 to
> > > be considered absent. This results in severe single-core performance
> > > issues with the EEVDF scheduler on heterogenous-core Intel processors.
> > 
> > While some things can be affected by this, I don't immediately see a
> > connection between CPPC v2, Intel hybrid processors and EEVDF.
> > 
> > In particular, why would EEVDF alone be affected?
> > 
> > Care to explain this?
> 
> And the reason why I am asking is because I think that you really need
> something like this (untested beyond compilation):
> 
> ---
>  drivers/cpufreq/intel_pstate.c |   16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -355,16 +355,16 @@ static void intel_pstate_set_itmt_prio(i
>  	int ret;
>  
>  	ret = cppc_get_perf_caps(cpu, &cppc_perf);
> -	if (ret)
> -		return;
> -
>  	/*
> -	 * On some systems with overclocking enabled, CPPC.highest_perf is hardcoded to 0xff.
> -	 * In this case we can't use CPPC.highest_perf to enable ITMT.
> -	 * In this case we can look at MSR_HWP_CAPABILITIES bits [8:0] to decide.
> +	 * If CPPC is not available, fall back to MSR_HWP_CAPABILITIES bits [8:0].
> +	 *
> +	 * Also, on some systems with overclocking enabled, CPPC.highest_perf is
> +	 * hardcoded to 0xff, so CPPC.highest_perf cannot be used to enable ITMT.
> +	 * Fall back to MSR_HWP_CAPABILITIES then too.
>  	 */
> -	if (cppc_perf.highest_perf == CPPC_MAX_PERF)
> -		cppc_perf.highest_perf = HWP_HIGHEST_PERF(READ_ONCE(all_cpu_data[cpu]->hwp_cap_cached));
> +	if (ret || cppc_perf.highest_perf == CPPC_MAX_PERF)
> +		cppc_perf.highest_perf =
> +			HWP_HIGHEST_PERF(READ_ONCE(all_cpu_data[cpu]->hwp_cap_cached));
>  
>  	/*
>  	 * The priorities can be set regardless of whether or not
> 
> 
> 

Gah. I can't read apparently. That patch may very well work because I
just realized the "if (ret) return;" means to return if ret is NOT 0. I
had it confused with "return if ret is 0".

That patch looks like it may very well work, and better than what I had
because it doesn't require manually setting a kernel parameter. I'll apply
it and test it. (That may take me a bit, I don't have access to the
hardware with the problem, only my boss does, but I should be able to get
it done before the end of today.)
Re: [PATCH V3] acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
Posted by Aaron Rainbolt 1 year, 7 months ago
On Wed, Jun 19, 2024 at 07:30:55PM +0200, Rafael J. Wysocki wrote:
> On Wednesday, June 19, 2024 7:09:35 PM CEST Rafael J. Wysocki wrote:
> > On Wed, Jun 19, 2024 at 6:33 AM Aaron Rainbolt <arainbolt@kfocus.org> wrote:
> > >
> > > acpi: Allow ignoring _OSC CPPC v2 bit via kernel parameter
> > >
> > > The _OSC is supposed to contain a bit indicating whether the hardware
> > > supports CPPC v2 or not. This bit is not always set, causing CPPC v2 to
> > > be considered absent. This results in severe single-core performance
> > > issues with the EEVDF scheduler on heterogenous-core Intel processors.
> > 
> > While some things can be affected by this, I don't immediately see a
> > connection between CPPC v2, Intel hybrid processors and EEVDF.
> > 
> > In particular, why would EEVDF alone be affected?
> > 
> > Care to explain this?
> 
> And the reason why I am asking is because I think that you really need
> something like this (untested beyond compilation):
> 
> ---
>  drivers/cpufreq/intel_pstate.c |   16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -355,16 +355,16 @@ static void intel_pstate_set_itmt_prio(i
>  	int ret;
>  
>  	ret = cppc_get_perf_caps(cpu, &cppc_perf);
> -	if (ret)
> -		return;
> -
>  	/*
> -	 * On some systems with overclocking enabled, CPPC.highest_perf is hardcoded to 0xff.
> -	 * In this case we can't use CPPC.highest_perf to enable ITMT.
> -	 * In this case we can look at MSR_HWP_CAPABILITIES bits [8:0] to decide.
> +	 * If CPPC is not available, fall back to MSR_HWP_CAPABILITIES bits [8:0].
> +	 *
> +	 * Also, on some systems with overclocking enabled, CPPC.highest_perf is
> +	 * hardcoded to 0xff, so CPPC.highest_perf cannot be used to enable ITMT.
> +	 * Fall back to MSR_HWP_CAPABILITIES then too.
>  	 */
> -	if (cppc_perf.highest_perf == CPPC_MAX_PERF)
> -		cppc_perf.highest_perf = HWP_HIGHEST_PERF(READ_ONCE(all_cpu_data[cpu]->hwp_cap_cached));
> +	if (ret || cppc_perf.highest_perf == CPPC_MAX_PERF)
> +		cppc_perf.highest_perf =
> +			HWP_HIGHEST_PERF(READ_ONCE(all_cpu_data[cpu]->hwp_cap_cached));
>  
>  	/*
>  	 * The priorities can be set regardless of whether or not
> 
> 
> 

That could work, maybe? It modifies the code path that is being hit, but
our systems are not overclocked, and when the patch suggested earlier is
in place things "just work" when the right parameter is set, so I don't
*think* cppc_perf.highest_perf will be equal to CPPC_MAX_PERF. (I know
from https://github.com/StarLabsLtd/firmware/issues/143 that 'ret' will be
0, since cppc_get_perf_caps will error out with "ACPI CPPC: No CPC
descriptor for CPU:X", so that code path cannot be triggered in our case
unless cppc_perf.highest_pert == CPPC_MAX_PERF.)