[PATCH] cpufreq/amd-pstate: Fix MinPerf MSR value for performance policy

Juan Martinez posted 1 patch 1 month ago
Posted by Juan Martinez 1 month ago
When the CPU frequency policy is set to CPUFREQ_POLICY_PERFORMANCE
(which occurs when EPP hint is set to "performance"), the driver
incorrectly sets the MinPerf field in CPPC request MSR to nominal_perf
instead of lowest_nonlinear_perf.

According to the AMD architectural programmer's manual volume 2 [1],
in section "17.6.4.1 CPPC_CAPABILITY_1", lowest_nonlinear_perf represents
the most energy efficient performance level (in terms of performance per
watt). The MinPerf field should be set to this value even in performance
mode to maintain proper power/performance characteristics.

This fixes a regression introduced by commit 0c411b39e4f4c ("amd-pstate: Set
min_perf to nominal_perf for active mode performance gov"), which correctly
identified that highest_perf was too high but chose nominal_perf as an
intermediate value instead of lowest_nonlinear_perf.

The fix changes amd_pstate_update_min_max_limit() to use lowest_nonlinear_perf
instead of nominal_perf when the policy is CPUFREQ_POLICY_PERFORMANCE.

[1] https://docs.amd.com/v/u/en-US/24593_3.43
    AMD64 Architecture Programmer's Manual Volume 2: System Programming
    Section 17.6.4.1 CPPC_CAPABILITY_1
    (Referenced in commit 5d9a354cf839a)

Fixes: 0c411b39e4f4c ("amd-pstate: Set min_perf to nominal_perf for active mode performance gov")
Tested-by: Kaushik Reddy S <kaushik.reddys@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Juan Martinez <juan.martinez@amd.com>
---
 drivers/cpufreq/amd-pstate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index e4f1933dd7d47..de0bb5b325502 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -634,8 +634,8 @@ static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
 	WRITE_ONCE(cpudata->max_limit_freq, policy->max);
 
 	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
-		perf.min_limit_perf = min(perf.nominal_perf, perf.max_limit_perf);
-		WRITE_ONCE(cpudata->min_limit_freq, min(cpudata->nominal_freq, cpudata->max_limit_freq));
+		perf.min_limit_perf = min(perf.lowest_nonlinear_perf, perf.max_limit_perf);
+		WRITE_ONCE(cpudata->min_limit_freq, min(cpudata->lowest_nonlinear_freq, cpudata->max_limit_freq));
 	} else {
 		perf.min_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->min);
 		WRITE_ONCE(cpudata->min_limit_freq, policy->min);
-- 
2.34.1
Re: [PATCH] cpufreq/amd-pstate: Fix MinPerf MSR value for performance policy
Posted by Mario Limonciello 1 month ago
On 1/7/26 3:19 PM, Juan Martinez wrote:
> When the CPU frequency policy is set to CPUFREQ_POLICY_PERFORMANCE
> (which occurs when EPP hint is set to "performance"), the driver
> incorrectly sets the MinPerf field in CPPC request MSR to nominal_perf
> instead of lowest_nonlinear_perf.
> 
> According to the AMD architectural programmer's manual volume 2 [1],
> in section "17.6.4.1 CPPC_CAPABILITY_1", lowest_nonlinear_perf represents
> the most energy efficient performance level (in terms of performance per
> watt). The MinPerf field should be set to this value even in performance
> mode to maintain proper power/performance characteristics.
> 
> This fixes a regression introduced by commit 0c411b39e4f4c ("amd-pstate: Set
> min_perf to nominal_perf for active mode performance gov"), which correctly
> identified that highest_perf was too high but chose nominal_perf as an
> intermediate value instead of lowest_nonlinear_perf.
> 
> The fix changes amd_pstate_update_min_max_limit() to use lowest_nonlinear_perf
> instead of nominal_perf when the policy is CPUFREQ_POLICY_PERFORMANCE.
> 
> [1] https://docs.amd.com/v/u/en-US/24593_3.43
>      AMD64 Architecture Programmer's Manual Volume 2: System Programming
>      Section 17.6.4.1 CPPC_CAPABILITY_1
>      (Referenced in commit 5d9a354cf839a)
> 
> Fixes: 0c411b39e4f4c ("amd-pstate: Set min_perf to nominal_perf for active mode performance gov")
> Tested-by: Kaushik Reddy S <kaushik.reddys@amd.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Juan Martinez <juan.martinez@amd.com>

I think this change is reasonable, but I'd like to get Gautham's 
comments as the original author of 0c411b39e4f4c.

> ---
>   drivers/cpufreq/amd-pstate.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index e4f1933dd7d47..de0bb5b325502 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -634,8 +634,8 @@ static void amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
>   	WRITE_ONCE(cpudata->max_limit_freq, policy->max);
>   
>   	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
> -		perf.min_limit_perf = min(perf.nominal_perf, perf.max_limit_perf);
> -		WRITE_ONCE(cpudata->min_limit_freq, min(cpudata->nominal_freq, cpudata->max_limit_freq));
> +		perf.min_limit_perf = min(perf.lowest_nonlinear_perf, perf.max_limit_perf);
> +		WRITE_ONCE(cpudata->min_limit_freq, min(cpudata->lowest_nonlinear_freq, cpudata->max_limit_freq));
>   	} else {
>   		perf.min_limit_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->min);
>   		WRITE_ONCE(cpudata->min_limit_freq, policy->min);
Re: [PATCH] cpufreq/amd-pstate: Fix MinPerf MSR value for performance policy
Posted by Gautham R. Shenoy 3 weeks, 1 day ago
Hello Juan, Mario,

On Wed, Jan 07, 2026 at 03:24:31PM -0600, Mario Limonciello wrote:
> On 1/7/26 3:19 PM, Juan Martinez wrote:
> > When the CPU frequency policy is set to CPUFREQ_POLICY_PERFORMANCE
> > (which occurs when EPP hint is set to "performance"), the driver
> > incorrectly sets the MinPerf field in CPPC request MSR to nominal_perf
> > instead of lowest_nonlinear_perf.
> > 
> > According to the AMD architectural programmer's manual volume 2 [1],
> > in section "17.6.4.1 CPPC_CAPABILITY_1", lowest_nonlinear_perf represents
> > the most energy efficient performance level (in terms of performance per
> > watt). The MinPerf field should be set to this value even in performance
> > mode to maintain proper power/performance characteristics.
> > 
> > This fixes a regression introduced by commit 0c411b39e4f4c ("amd-pstate: Set
> > min_perf to nominal_perf for active mode performance gov"), which correctly
> > identified that highest_perf was too high but chose nominal_perf as an
> > intermediate value instead of lowest_nonlinear_perf.
> > 
> > The fix changes amd_pstate_update_min_max_limit() to use lowest_nonlinear_perf
> > instead of nominal_perf when the policy is CPUFREQ_POLICY_PERFORMANCE.
> > 
> > [1] https://docs.amd.com/v/u/en-US/24593_3.43
> >      AMD64 Architecture Programmer's Manual Volume 2: System Programming
> >      Section 17.6.4.1 CPPC_CAPABILITY_1
> >      (Referenced in commit 5d9a354cf839a)
> > 
> > Fixes: 0c411b39e4f4c ("amd-pstate: Set min_perf to nominal_perf for active mode performance gov")
> > Tested-by: Kaushik Reddy S <kaushik.reddys@amd.com>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Juan Martinez <juan.martinez@amd.com>
> 
> I think this change is reasonable, but I'd like to get Gautham's comments as
> the original author of 0c411b39e4f4c.

The active mode performance governor was intended to run the cores at
the highest possible frequency at all times. Originally min_perf was
set to max_perf, but we observed frequency throttling in
TDP-constrained environments, as mentioned in commit 0c411b39e4f4c
("amd-pstate: Set min_perf to nominal_perf for active mode performance
gov"). As a result, min_perf was lowered to nominal_perf so that the
frequency doesn't drop below nominal as long as the power/thermal
constraints allow it.

This is the behaviour that is desired by customers. So unless you are
observing a performance regression when the min_perf is set to
nominal_perf, I would like to retain this behaviour.

When the governor is switched to "powersave", the min_perf is lowered
to "lowest_nonlinear_perf" by default to match the description in
section 17.6.4.1 CPPC_CAPABILITY_1 of the APM volume 2.

That said, I think we should document in the code why min_perf is set
to nominal_perf when cpudata->policy == CPUFREQ_POLICY_PERFORMANCE.

Juan, do you want to give it a try?
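
As a starting point, here is a minimal standalone sketch of how such a
comment might read next to the selection logic. It is not the driver
code: the sketch_* names, the struct layout, and the requested_min_perf
parameter are hypothetical stand-ins so the snippet compiles outside the
kernel, and the wording of the comment is only a suggestion drawn from
the explanation above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the kernel definitions. */
enum sketch_policy { SKETCH_POLICY_PERFORMANCE, SKETCH_POLICY_POWERSAVE };

struct sketch_perf {
	unsigned int lowest_nonlinear_perf; /* most energy-efficient level */
	unsigned int nominal_perf;          /* nominal performance level */
	unsigned int max_limit_perf;        /* ceiling derived from policy->max */
};

static unsigned int sketch_min_u32(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Return the MinPerf floor for the given policy. requested_min_perf is the
 * floor already converted from policy->min by the caller (an assumption made
 * to keep the sketch free of frequency-to-perf conversion).
 */
static unsigned int sketch_min_limit_perf(const struct sketch_perf *p,
					  enum sketch_policy policy,
					  unsigned int requested_min_perf)
{
	assert(p != NULL);

	if (policy == SKETCH_POLICY_PERFORMANCE) {
		/*
		 * The performance policy aims to keep cores at the highest
		 * possible frequency. A floor of max_perf caused throttling
		 * in TDP-constrained environments (see commit 0c411b39e4f4c),
		 * so the floor is nominal_perf: the frequency should not drop
		 * below nominal as long as power/thermal constraints allow.
		 * The floor is clamped so it never exceeds the ceiling.
		 */
		return sketch_min_u32(p->nominal_perf, p->max_limit_perf);
	}

	/* Other policies honour the requested floor as given. */
	return requested_min_perf;
}
```

With illustrative values (lowest_nonlinear_perf = 60, nominal_perf = 100,
max_limit_perf = 160), the performance policy yields a floor of 100,
while a powersave caller passing lowest_nonlinear_perf gets 60.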

-- 
Thanks and Regards
gautham.