[RFT][PATCH v1 8/8] cpufreq: intel_pstate: EAS: Increase cost for CPUs using L3 cache

Rafael J. Wysocki posted 1 patch 8 months ago
Posted by Rafael J. Wysocki 8 months ago
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

On some hybrid platforms some efficient CPUs (E-cores) are not connected
to the L3 cache, but there are no other differences between them and the
other E-cores that use L3.  In that case, it is generally more efficient
to run "light" workloads on the E-cores that do not use L3 and allow all
of the cores using L3, including P-cores, to go into idle states.

For this reason, slightly increase the cost for all CPUs sharing the L3
cache to make EAS prefer CPUs that do not use it to the other CPUs with
the same perf-to-frequency scaling factor (if any).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/intel_pstate.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -979,6 +979,7 @@
 			   unsigned long *cost)
 {
 	struct pstate_data *pstate = &all_cpu_data[dev->id]->pstate;
+	struct cpu_cacheinfo *cacheinfo = get_cpu_cacheinfo(dev->id);
 
 	/*
 	 * The smaller the perf-to-frequency scaling factor, the larger the IPC
@@ -991,6 +992,13 @@
 	 * of the same type in different "utilization bins" is different.
 	 */
 	*cost = div_u64(100ULL * INTEL_PSTATE_CORE_SCALING, pstate->scaling) + freq;
+	/*
+	 * Inrease the cost slightly for CPUs able to access L3 to avoid litting
+	 * it up too eagerly in case some other CPUs of the same type cannot
+	 * access it.
+	 */
+	if (cacheinfo->num_levels >= 3)
+		(*cost)++;
 
 	return 0;
 }
Re: [RFT][PATCH v1 8/8] cpufreq: intel_pstate: EAS: Increase cost for CPUs using L3 cache
Posted by Christian Loehle 7 months, 3 weeks ago
On 4/16/25 19:12, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> On some hybrid platforms some efficient CPUs (E-cores) are not connected
> to the L3 cache, but there are no other differences between them and the
> other E-cores that use L3.  In that case, it is generally more efficient
> to run "light" workloads on the E-cores that do not use L3 and allow all
> of the cores using L3, including P-cores, to go into idle states.
> 
> For this reason, slightly increase the cost for all CPUs sharing the L3
> cache to make EAS prefer CPUs that do not use it to the other CPUs with
> the same perf-to-frequency scaling factor (if any).
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/intel_pstate.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -979,6 +979,7 @@
>  			   unsigned long *cost)
>  {
>  	struct pstate_data *pstate = &all_cpu_data[dev->id]->pstate;
> +	struct cpu_cacheinfo *cacheinfo = get_cpu_cacheinfo(dev->id);
>  
>  	/*
>  	 * The smaller the perf-to-frequency scaling factor, the larger the IPC
> @@ -991,6 +992,13 @@
>  	 * of the same type in different "utilization bins" is different.
>  	 */
>  	*cost = div_u64(100ULL * INTEL_PSTATE_CORE_SCALING, pstate->scaling) + freq;
> +	/*
> +	 * Inrease the cost slightly for CPUs able to access L3 to avoid litting

s/Inrease/Increase
and I guess s/litting/littering

> +	 * it up too eagerly in case some other CPUs of the same type cannot
> +	 * access it.
> +	 */
> +	if (cacheinfo->num_levels >= 3)
> +		(*cost)++;

This makes cost(OPP1) of the SoC Tile e-core as expensive as cost(OPP0) of a
normal e-core.
Is that the intended behaviour?
Re: [RFT][PATCH v1 8/8] cpufreq: intel_pstate: EAS: Increase cost for CPUs using L3 cache
Posted by Rafael J. Wysocki 7 months, 3 weeks ago
On Fri, Apr 25, 2025 at 11:32 PM Christian Loehle
<christian.loehle@arm.com> wrote:
>
> On 4/16/25 19:12, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > On some hybrid platforms some efficient CPUs (E-cores) are not connected
> > to the L3 cache, but there are no other differences between them and the
> > other E-cores that use L3.  In that case, it is generally more efficient
> > to run "light" workloads on the E-cores that do not use L3 and allow all
> > of the cores using L3, including P-cores, to go into idle states.
> >
> > For this reason, slightly increase the cost for all CPUs sharing the L3
> > cache to make EAS prefer CPUs that do not use it to the other CPUs with
> > the same perf-to-frequency scaling factor (if any).
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/cpufreq/intel_pstate.c |    8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > --- a/drivers/cpufreq/intel_pstate.c
> > +++ b/drivers/cpufreq/intel_pstate.c
> > @@ -979,6 +979,7 @@
> >                          unsigned long *cost)
> >  {
> >       struct pstate_data *pstate = &all_cpu_data[dev->id]->pstate;
> > +     struct cpu_cacheinfo *cacheinfo = get_cpu_cacheinfo(dev->id);
> >
> >       /*
> >        * The smaller the perf-to-frequency scaling factor, the larger the IPC
> > @@ -991,6 +992,13 @@
> >        * of the same type in different "utilization bins" is different.
> >        */
> >       *cost = div_u64(100ULL * INTEL_PSTATE_CORE_SCALING, pstate->scaling) + freq;
> > +     /*
> > +      * Inrease the cost slightly for CPUs able to access L3 to avoid litting
>
> s/Inrease/Increase
> and I guess s/litting/littering
>
> > +      * it up too eagerly in case some other CPUs of the same type cannot
> > +      * access it.
> > +      */
> > +     if (cacheinfo->num_levels >= 3)

This check actually doesn't work on Intel processors; I have a
replacement patch for this one.

> > +             (*cost)++;
>
> This makes cost(OPP1) of the SoC Tile e-core as expensive as cost(OPP0) of a
> normal e-core.

If "a normal Ecore" is one using L3, then yes.

> Is that the intended behaviour?

Yes, it is.  I wanted the Ecores on L3 to appear somewhat more
expensive, but not too much.

It looks like *cost += 2 would work better, though.
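
To make the cost arithmetic discussed in this thread concrete, here is a small user-space sketch of the formula from the patch. It is not the kernel code. The helper name model_cost(), the l3_bump parameter, and the CORE_SCALING value of 100000 are assumptions made for illustration, as is the idea that consecutive performance levels carry consecutive freq values (for example 2 and 3).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for INTEL_PSTATE_CORE_SCALING; the value is illustrative. */
#define CORE_SCALING 100000ULL

/*
 * Hypothetical user-space model of the cost computation in the patch:
 *
 *   cost = 100 * CORE_SCALING / scaling + freq (+ bump for CPUs using L3)
 *
 * scaling: perf-to-frequency scaling factor of the CPU
 * freq:    the value the energy model passes in for a performance level
 * l3_bump: extra cost for CPUs that use L3 (0 for CPUs that do not)
 */
static unsigned long model_cost(uint64_t scaling, unsigned long freq,
				unsigned long l3_bump)
{
	assert(scaling != 0);	/* avoid dividing by zero in this sketch */

	return (unsigned long)(100ULL * CORE_SCALING / scaling) + freq + l3_bump;
}
```

With two E-core types that share the same scaling factor, model_cost(CORE_SCALING, 2, 1) for the first level of an E-core on L3 equals model_cost(CORE_SCALING, 3, 0) for the second level of an E-core without L3, which is the collision pointed out in the review. With a bump of 2, the same L3 E-core level lines up with freq value 4 of the non-L3 E-core instead, still under the assumption that freq values are consecutive integers.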