cpufreq: governor: Apply limits with target_freq instead of policy->cur

[PATCH 2/2] cpufreq: governor: Apply limits with requested_freq or next_freq

Posted by Lifeng Zheng 1 month, 3 weeks ago

For the conservative, ondemand and schedutil governors,
cpufreq_policy_apply_limits() is called in .limits(). This function updates
the target because the limits (policy->max and policy->min) may have changed.
However, it uses policy->cur as the reference for the target frequency.
This can cause problems because the value of policy->cur is influenced
by a variety of factors.

For example, for some reason, the platform may determine the final
frequency by dividing the frequency requested by the OS, and this is
reflected in policy->cur. If cpufreq_policy_apply_limits() is then
called, policy->cur is out of limits, so policy->min is used
as the new target. This lowers the real frequency unnecessarily.
The conservative and ondemand governors use requested_freq and
the schedutil governor uses next_freq to represent the target frequency. It is
more reasonable to use them in cpufreq_policy_apply_limits().

At the same time, use policy->cur as the initial value of next_freq in
the schedutil governor's start() callback.

Signed-off-by: Lifeng Zheng <zhenglifeng1@huawei.com>
---
 drivers/cpufreq/cpufreq_governor.c | 2 +-
 include/linux/cpufreq.h            | 7 ++++---
 kernel/sched/cpufreq_schedutil.c   | 4 ++--
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 7ec38407230f..dade45f7e57c 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -573,7 +573,7 @@ void cpufreq_dbs_governor_limits(struct cpufreq_policy *policy)
 		goto out;
 
 	mutex_lock(&policy_dbs->update_mutex);
-	cpufreq_policy_apply_limits(policy);
+	cpufreq_policy_apply_limits(policy, policy_dbs->requested_freq);
 	gov_update_sample_delay(policy_dbs, 0);
 	mutex_unlock(&policy_dbs->update_mutex);
 
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 0465d1e6f72a..4d7341ef3645 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -657,12 +657,13 @@ static inline bool sugov_is_governor(struct cpufreq_policy *policy)
 }
 #endif
 
-static inline void cpufreq_policy_apply_limits(struct cpufreq_policy *policy)
+static inline void cpufreq_policy_apply_limits(struct cpufreq_policy *policy,
+					       unsigned int target_freq)
 {
-	if (policy->max < policy->cur)
+	if (policy->max < target_freq)
 		__cpufreq_driver_target(policy, policy->max,
 					CPUFREQ_RELATION_HE);
-	else if (policy->min > policy->cur)
+	else if (policy->min > target_freq)
 		__cpufreq_driver_target(policy, policy->min,
 					CPUFREQ_RELATION_LE);
 }
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 0ab5f9d4bc59..8d239fe3afa8 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -848,7 +848,7 @@ static int sugov_start(struct cpufreq_policy *policy)
 
 	sg_policy->freq_update_delay_ns	= sg_policy->tunables->rate_limit_us * NSEC_PER_USEC;
 	sg_policy->last_freq_update_time	= 0;
-	sg_policy->next_freq			= 0;
+	sg_policy->next_freq			= policy->cur;
 	sg_policy->work_in_progress		= false;
 	sg_policy->limits_changed		= false;
 	sg_policy->cached_raw_freq		= 0;
@@ -895,7 +895,7 @@ static void sugov_limits(struct cpufreq_policy *policy)
 
 	if (!policy->fast_switch_enabled) {
 		mutex_lock(&sg_policy->work_lock);
-		cpufreq_policy_apply_limits(policy);
+		cpufreq_policy_apply_limits(policy, sg_policy->next_freq);
 		mutex_unlock(&sg_policy->work_lock);
 	}
 
-- 
2.33.0

Re: [PATCH 2/2] cpufreq: governor: Apply limits with requested_freq or next_freq

Posted by Viresh Kumar 4 weeks ago

On 10-02-26, 19:54, Lifeng Zheng wrote:
> For conservative, ondemand and schedutil governor,
> cpufreq_policy_apply_limits() is called in .limits(). This function updates
> the target because the limits (policy->max and policy->min) may be changed.
> However, it uses policy->cur as the reference for the target frequency.
> This may cause some problems because the value of policy->cur is influenced
> by a variety of factors.
> 
> For example, for some reason, the platform determines a final
> frequency divided from the frequency distributed by the OS, and this is
> reflected in policy->cur. After that, cpufreq_policy_apply_limits() is
> called and because policy->cur is out of limit, policy->min will be used
> as the new target.

Yes, but policy->min should then be higher than the current frequency. The algorithm
has derived policy->cur to be a reasonable frequency, and we are making a decision
based on that, which looks absolutely fine. We can fix the algorithm
(governors) so that they choose the right frequency, but this logic looks okay,
I guess.

> This caused the real frequency lower but it's
> unnecessary.

Lower than what? It is still higher than the last configured frequency.

-- 
viresh

Re: [PATCH 2/2] cpufreq: governor: Apply limits with requested_freq or next_freq

Posted by zhenglifeng (A) 3 weeks, 6 days ago

On 3/5/2026 2:21 PM, Viresh Kumar wrote:
> On 10-02-26, 19:54, Lifeng Zheng wrote:
>> For conservative, ondemand and schedutil governor,
>> cpufreq_policy_apply_limits() is called in .limits(). This function updates
>> the target because the limits (policy->max and policy->min) may be changed.
>> However, it uses policy->cur as the reference for the target frequency.
>> This may cause some problems because the value of policy->cur is influenced
>> by a variety of factors.
>>
>> For example, for some reason, the platform determines a final
>> frequency divided from the frequency distributed by the OS, and this is
>> reflected in policy->cur. After that, cpufreq_policy_apply_limits() is
>> called and because policy->cur is out of limit, policy->min will be used
>> as the new target.
> 
> Yes, but policy->min should be higher than current frequency then. The algorithm
> has derived policy->cur to be a reasonable frequency, and we are taking decision
> based on that, which looks absolutely fine. We can fix the algorithm
> (governors), so they choose the right frequency, but this logic looks to be okay
> I guess.
> 
>> This caused the real frequency lower but it's
>> unnecessary.
> 
> Lower than what ? It is still higher than the last configured frequency.

Hi Viresh,

Please take a look at the example in the cover letter. On our
platform, under certain special circumstances, the final frequency
is obtained by dividing the frequency requested by the OS. Therefore, when the
frequency requested by the OS decreases, the final frequency decreases
as well. Furthermore, since requested_freq remains at the highest
frequency, the conservative governor will not update the frequency again
until the utilization drops. As a result, the frequency issued by
the OS stays at the lowest frequency even after the platform's frequency
division is reverted.

>