[PATCH v1] cpufreq: intel_pstate: Always use HWP_DESIRED_PERF in passive mode

Rafael J. Wysocki posted 1 patch 3 months, 3 weeks ago
Posted by Rafael J. Wysocki 3 months, 3 weeks ago
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

In the passive mode, intel_cpufreq_update_pstate() sets HWP_MIN_PERF in
accordance with the target frequency to ensure delivering adequate
performance, but it sets HWP_DESIRED_PERF to 0, so the processor has no
indication that the desired performance level is actually equal to the
floor one.  This may cause it to choose a performance point way above
the desired level.

Moreover, this is inconsistent with intel_cpufreq_adjust_perf() which
actually sets HWP_DESIRED_PERF in accordance with the target performance
value.

Address this by adjusting intel_cpufreq_update_pstate() to pass
target_pstate as both the minimum and the desired performance levels
to intel_cpufreq_hwp_update().

Fixes: a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpufreq/intel_pstate.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -3249,8 +3249,8 @@
 		int max_pstate = policy->strict_target ?
 					target_pstate : cpu->max_perf_ratio;
 
-		intel_cpufreq_hwp_update(cpu, target_pstate, max_pstate, 0,
-					 fast_switch);
+		intel_cpufreq_hwp_update(cpu, target_pstate, max_pstate,
+					 target_pstate, fast_switch);
 	} else if (target_pstate != old_pstate) {
 		intel_cpufreq_perf_ctl_update(cpu, target_pstate, fast_switch);
 	}
Re: [PATCH v1] cpufreq: intel_pstate: Always use HWP_DESIRED_PERF in passive mode
Posted by Shashank Balaji 2 months, 3 weeks ago
Hi Rafael,

On Mon, Jun 16, 2025 at 08:19:19PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> In the passive mode, intel_cpufreq_update_pstate() sets HWP_MIN_PERF in
> accordance with the target frequency to ensure delivering adequate
> performance, but it sets HWP_DESIRED_PERF to 0, so the processor has no
> indication that the desired performance level is actually equal to the
> floor one.  This may cause it to choose a performance point way above
> the desired level.
> 
> Moreover, this is inconsistent with intel_cpufreq_adjust_perf() which
> actually sets HWP_DESIRED_PERF in accordance with the target performance
> value.
> 
> Address this by adjusting intel_cpufreq_update_pstate() to pass
> target_pstate as both the minimum and the desired performance levels
> to intel_cpufreq_hwp_update().
> 
> Fixes: a365ab6b9dfb ("cpufreq: intel_pstate: Implement the ->adjust_perf() callback")
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/intel_pstate.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -3249,8 +3249,8 @@
>  		int max_pstate = policy->strict_target ?
>  					target_pstate : cpu->max_perf_ratio;
>  
> -		intel_cpufreq_hwp_update(cpu, target_pstate, max_pstate, 0,
> -					 fast_switch);
> +		intel_cpufreq_hwp_update(cpu, target_pstate, max_pstate,
> +					 target_pstate, fast_switch);
>  	} else if (target_pstate != old_pstate) {
>  		intel_cpufreq_perf_ctl_update(cpu, target_pstate, fast_switch);
>  	}

The patch looks good to me. In fact, I'm surprised this is not how it has always
been :)

Here are two tests that look at the power consumption and performance implications.
They were run on an Intel Core Ultra 5 135H machine with 6.16-rc5 defconfig +
CONFIG_CPU_FREQ_GOV_POWERSAVE + CONFIG_CPU_FREQ_GOV_CONSERVATIVE.

Both the idle power consumption test and the CPU stressor power consumption test
were run with the powersave, performance, conservative, and ondemand governors
on all the cores.

We don't expect any change with the powersave and performance governors because
they have strict_target set. So in their case, the change is:

	min_perf     = target        min_perf     = target
	desired_perf = 0        =>   desired_perf = target
	max_perf     = target        max_perf     = target

So, that change does nothing: min_perf and max_perf already pin the processor to
the target level. We only expect to see a change with the conservative and
ondemand governors, which is confirmed by the test results.

In summary, this patch lowers idle power consumption with conservative and
ondemand governors by 9%. There are no significant energy or duration changes
with any of the governors for the stress-ng cpu stressor.

1. Idle power consumption

Monitor average power usage every minute for six minutes:

turbostat --Summary --quiet --show PkgWatt --interval 60 --num_iterations 6

+--------------+-------------------+-------+---------+
|              |     Average power (W)     |         |
+   Governor   +-------------------+-------+ Change  +
|              | Before            | After |         |
+--------------+-------------------+-------+---------+
| Powersave    | 7                 | 7.1   | ~0%     |
| Performance  | 11.85             | 11.85 | ~0%     |
| Conservative | 8.1               | 7.35  | -9%     |
| Ondemand     | 7.55              | 6.85  | -9%     |
+--------------+-------------------+-------+---------+

2. CPU stressor's power consumption

Run stress-ng's matrixprod cpu stressor on each of the cores for 5 million bogo
ops (fixed workload), with a cpu load of 50%, so that there's some leeway for
frequency tuning by the governor. At the default 100% cpu load, the frequency
would just shoot up to the maximum.

turbostat --quiet --Summary --Joules --show Pkg_J \
	stress-ng --cpu $(nproc) --cpu-ops 5000000 --cpu-load 50 \
		  --cpu-method matrixprod --metrics-brief

+--------------+------------+--------------+------------+--------------+--------+-----------+
|              |          Before           |           After           |      Change        |
+   Governor   +------------+--------------+------------+--------------+--------+-----------+
|              | Energy (J) | Duration (s) | Energy (J) | Duration (s) | Energy | Duration  |
+--------------+------------+--------------+------------+--------------+--------+-----------+
| Powersave    | 10680      | 773          | 10691      | 776          | 0%     | 0%        |
| Performance  | 11753      | 409          | 11723      | 405          | 0%     | -1%       |
| Conservative | 11815      | 409          | 11922      | 414          | 1%     | 1%        |
| Ondemand     | 11803      | 408          | 11814      | 409          | 0%     | 0%        |
+--------------+------------+--------------+------------+--------------+--------+-----------+

Tested-by: Shashank Balaji <shashank.mahadasyam@sony.com>

Thanks,
Shashank