[PATCH v6 1/4] cpufreq: Remove per-CPU QoS constraint

Pierre Gondois posted 4 patches 2 weeks, 6 days ago
There is a newer version of this series
[PATCH v6 1/4] cpufreq: Remove per-CPU QoS constraint
Posted by Pierre Gondois 2 weeks, 6 days ago
policy->max_freq_req represents the maximum allowed frequency as
requested by the policyX/scaling_max_freq sysfs file. This request
applies to all CPUs of the policy. It is not possible to request
a per-CPU maximum frequency.

Thus, the interaction between the policy boost and scaling_max_freq
settings should be handled by adding a boost specific QoS constraint.
This will be handled in the following patches.

This patch reverts of:
commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")

Note:
I was unable to reproduce the issue initially reported.
So far it seems that was fixed in:
commit 121baab7b88e ("cpufreq: Force sync policy boost with
global boost on sysfs update")
Another point is that the initial patch states that the error
comes a max_freq_req request is repeated, but calls to
freq_qos_update_request() would return 0 and all calls are
checked against negative values.

Reviewed-by: Lifeng Zheng <zhenglifeng1@huawei.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
---
 drivers/cpufreq/cpufreq.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 4472bb1ec83c7..db414c052658b 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1481,10 +1481,6 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
 
 		blocking_notifier_call_chain(&cpufreq_policy_notifier_list,
 				CPUFREQ_CREATE_POLICY, policy);
-	} else {
-		ret = freq_qos_update_request(policy->max_freq_req, policy->max);
-		if (ret < 0)
-			goto out_destroy_policy;
 	}
 
 	if (cpufreq_driver->get && has_target()) {
-- 
2.43.0
Re: [PATCH v6 1/4] cpufreq: Remove per-CPU QoS constraint
Posted by Viresh Kumar 2 weeks, 5 days ago
On 17-03-26, 11:17, Pierre Gondois wrote:
> policy->max_freq_req represents the maximum allowed frequency as
> requested by the policyX/scaling_max_freq sysfs file. This request
> applies to all CPUs of the policy. It is not possible to request
> a per-CPU maximum frequency.
> 
> Thus, the interaction between the policy boost and scaling_max_freq
> settings should be handled by adding a boost specific QoS constraint.
> This will be handled in the following patches.

I don't think the above is required anymore. This patch is removing stale code
now which isn't useful anymore. It has nothing to do with a boost specific QOS
constraint.

And it would be better to know for sure why this isn't required anymore and
which patch exactly fixed this issue.

-- 
viresh
Re: [PATCH v6 1/4] cpufreq: Remove per-CPU QoS constraint
Posted by Pierre Gondois 2 weeks, 4 days ago
On 3/18/26 12:13, Viresh Kumar wrote:
> On 17-03-26, 11:17, Pierre Gondois wrote:
>> policy->max_freq_req represents the maximum allowed frequency as
>> requested by the policyX/scaling_max_freq sysfs file. This request
>> applies to all CPUs of the policy. It is not possible to request
>> a per-CPU maximum frequency.
>>
>> Thus, the interaction between the policy boost and scaling_max_freq
>> settings should be handled by adding a boost specific QoS constraint.
>> This will be handled in the following patches.
> I don't think the above is required anymore. This patch is removing stale code
> now which isn't useful anymore. It has nothing to do with a boost specific QOS
> constraint.
Yes ok
> And it would be better to know for sure why this isn't required anymore and
> which patch exactly fixed this issue.
>
On a kernel based on 1608f0230510~, and replicating the
process described in the commit message of

commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")

I could not see any issue regarding the values of:

- policy1/cpuinfo_max_freq
- policy1/scaling_max_freq

The following sequence however had an issue:

1. echo 0 > /sys/devices/system/cpu/cpu1/online
2. echo 1 > /sys/devices/system/cpu/cpufreq/boost
3. echo 1 > /sys/devices/system/cpu/cpu1/online

as after 1.:

cpufreq_boost_trigger_state()
\-for_each_active_policy()

doesn't enable boost for inactive policies. This leads to
CPU1 having the non-boosted frequency as its max freq.

The above sequence is fixed by:

commit a153c6049ab8 ("cpufreq: Introduce a more
generic way to set default per-policy boost flag")

---

@Lifeng, should I check something else than the value of:

- policy1/cpuinfo_max_freq
- policy1/scaling_max_freq

in order to reproduce the issue fixed by:

commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
a CPU")

?


Re: [PATCH v6 1/4] cpufreq: Remove per-CPU QoS constraint
Posted by zhenglifeng (A) 2 weeks, 3 days ago
On 3/19/2026 5:30 PM, Pierre Gondois wrote:
> 
> On 3/18/26 12:13, Viresh Kumar wrote:
>> On 17-03-26, 11:17, Pierre Gondois wrote:
>>> policy->max_freq_req represents the maximum allowed frequency as
>>> requested by the policyX/scaling_max_freq sysfs file. This request
>>> applies to all CPUs of the policy. It is not possible to request
>>> a per-CPU maximum frequency.
>>>
>>> Thus, the interaction between the policy boost and scaling_max_freq
>>> settings should be handled by adding a boost specific QoS constraint.
>>> This will be handled in the following patches.
>> I don't think the above is required anymore. This patch is removing stale code
>> now which isn't useful anymore. It has nothing to do with a boost specific QOS
>> constraint.
> Yes ok
>> And it would be better to know for sure why this isn't required anymore and
>> which patch exactly fixed this issue.
>>
> On a kernel based on 1608f0230510~, and replicating the
> process described in the commit message of
> 
> commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
> a CPU")
> 
> I could not see any issue regarding the values of:
> 
> - policy1/cpuinfo_max_freq
> - policy1/scaling_max_freq
> 
> The following sequence however had an issue:
> 
> 1. echo 0 > /sys/devices/system/cpu/cpu1/online
> 2. echo 1 > /sys/devices/system/cpu/cpufreq/boost
> 3. echo 1 > /sys/devices/system/cpu/cpu1/online
> 
> as after 1.:
> 
> cpufreq_boost_trigger_state()
> \-for_each_active_policy()
> 
> doesn't enable boost for inactive policies. This leads to
> CPU1 having the non-boosted frequency as its max freq.
> 
> The above sequence is fixed by:
> 
> commit a153c6049ab8 ("cpufreq: Introduce a more
> generic way to set default per-policy boost flag")

Yes, I think this commit fixed the issue but I didn't realize it before so
I sent them both. 😂

> 
> ---
> 
> @Lifeng, should I check something else than the value of:
> 
> - policy1/cpuinfo_max_freq
> - policy1/scaling_max_freq
> 
> in order to reproduce the issue fixed by:
> 
> commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
> a CPU")
> 
> ?
> 
> 
> 

Re: [PATCH v6 1/4] cpufreq: Remove per-CPU QoS constraint
Posted by Viresh Kumar 2 weeks, 3 days ago
On 19-03-26, 10:30, Pierre Gondois wrote:
> 
> On 3/18/26 12:13, Viresh Kumar wrote:
> > On 17-03-26, 11:17, Pierre Gondois wrote:
> > > policy->max_freq_req represents the maximum allowed frequency as
> > > requested by the policyX/scaling_max_freq sysfs file. This request
> > > applies to all CPUs of the policy. It is not possible to request
> > > a per-CPU maximum frequency.
> > > 
> > > Thus, the interaction between the policy boost and scaling_max_freq
> > > settings should be handled by adding a boost specific QoS constraint.
> > > This will be handled in the following patches.
> > I don't think the above is required anymore. This patch is removing stale code
> > now which isn't useful anymore. It has nothing to do with a boost specific QOS
> > constraint.
> Yes ok
> > And it would be better to know for sure why this isn't required anymore and
> > which patch exactly fixed this issue.
> > 
> On a kernel based on 1608f0230510~, and replicating the
> process described in the commit message of
> 
> commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
> a CPU")
> 
> I could not see any issue regarding the values of:
> 
> - policy1/cpuinfo_max_freq
> - policy1/scaling_max_freq

The commit message (of 1608f0230510) is confusing. The issue was discussed
properly in the following thread.

https://lore.kernel.org/all/20250120082723.am7rxujmdvzz4eky@vireshk-i7/

The problem is that policy->max and policy->cpuinfo_max_freq are incorrect after
the sequence mentioned in the commit, while max_freq_req is correct.

I think another commit has fixed that (incorrectly and unintentionally):
commit 6db0f533d320 ("cpufreq: preserve freq_table_sorted across suspend/hibernate")

@@ -1421,9 +1421,12 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
                 * If there is a problem with its frequency table, take it
                 * offline and drop it.
                 */
-               ret = cpufreq_table_validate_and_sort(policy);
-               if (ret)
-                       goto out_offline_policy;
+               if (policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_ASCENDING &&
+                   policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_DESCENDING) {
+                       ret = cpufreq_table_validate_and_sort(policy);
+                       if (ret)
+                               goto out_offline_policy;
+               }

This skipped calling cpufreq_table_validate_and_sort() completely on online and
so max/cpuinfo_max_freq, max_freq_req are all in sync.

That change should be fixed with:

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 277884d91913..1f794524a1d9 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1427,12 +1427,9 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
                 * If there is a problem with its frequency table, take it
                 * offline and drop it.
                 */
-               if (policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_ASCENDING &&
-                   policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_DESCENDING) {
-                       ret = cpufreq_table_validate_and_sort(policy);
-                       if (ret)
-                               goto out_offline_policy;
-               }
+               ret = cpufreq_table_validate_and_sort(policy);
+               if (ret)
+                       goto out_offline_policy;

                /* related_cpus should at least include policy->cpus. */
                cpumask_copy(policy->related_cpus, policy->cpus);
diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c
index 7f251daf03ce..5b364d8da4f9 100644
--- a/drivers/cpufreq/freq_table.c
+++ b/drivers/cpufreq/freq_table.c
@@ -360,6 +360,10 @@ int cpufreq_table_validate_and_sort(struct cpufreq_policy *policy)
        if (policy_has_boost_freq(policy))
                policy->boost_supported = true;

+       if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING ||
+           policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_DESCENDING)
+               return 0;
+
        return set_freq_table_sorted(policy);
 }

-- 
viresh
Re: [PATCH v6 1/4] cpufreq: Remove per-CPU QoS constraint
Posted by Pierre Gondois 1 week, 6 days ago
Hello Viresh,

On 3/20/26 10:18, Viresh Kumar wrote:
> On 19-03-26, 10:30, Pierre Gondois wrote:
>> On 3/18/26 12:13, Viresh Kumar wrote:
>>> On 17-03-26, 11:17, Pierre Gondois wrote:
>>>> policy->max_freq_req represents the maximum allowed frequency as
>>>> requested by the policyX/scaling_max_freq sysfs file. This request
>>>> applies to all CPUs of the policy. It is not possible to request
>>>> a per-CPU maximum frequency.
>>>>
>>>> Thus, the interaction between the policy boost and scaling_max_freq
>>>> settings should be handled by adding a boost specific QoS constraint.
>>>> This will be handled in the following patches.
>>> I don't think the above is required anymore. This patch is removing stale code
>>> now which isn't useful anymore. It has nothing to do with a boost specific QOS
>>> constraint.
>> Yes ok
>>> And it would be better to know for sure why this isn't required anymore and
>>> which patch exactly fixed this issue.
>>>
>> On a kernel based on 1608f0230510~, and replicating the
>> process described in the commit message of
>>
>> commit 1608f0230510 ("cpufreq: Fix re-boost issue after hotplugging
>> a CPU")
>>
>> I could not see any issue regarding the values of:
>>
>> - policy1/cpuinfo_max_freq
>> - policy1/scaling_max_freq
> The commit message (of 1608f0230510) is confusing. The issue was discussed
> properly in the following thread.
>
> https://lore.kernel.org/all/20250120082723.am7rxujmdvzz4eky@vireshk-i7/
>
> The problem is that policy->max and policy->cpuinfo_max_freq are incorrect after
> the sequence mentioned in the commit, while max_freq_req is correct.

I experimented a bit more and it seems the following happens:

1. boost all CPUs: echo 1 > /sys/devices/system/cpu/cpufreq/boost
2. offline one CPU: echo 0 > /sys/devices/system/cpu/cpuX/online
3. deboost all CPUs: echo 0 > /sys/devices/system/cpu/cpufreq/boost

cpufreq_boost_trigger_state()
\-for_each_active_policy()
   \-cpufreq_driver->set_boost()
doesn't act on the policy where there are no more online CPUs,
so the max/cpuinfo.max/max_freq_req is left to the actual
boost freq.

4. online CPUX: echo 1 > /sys/devices/system/cpu/cpuX/online

cpufreq_online()
\-cpufreq_driver->init()
   \-cppc_cpufreq_cpu_init()
There:
- policy->max
- policy->cpuinfo.max_freq
are set to the maximal non-boost freq., which is the correct value.

However, max_freq_req is left to the boosted frequency, so this
is effectively an incorrect state.
Also in cpufreq_set_policy(), policy->max is set to:
   min(max_freq_req, cpuinfo.max_freq)
(cf. verify() cb), so the incorrect state of max_freq_req is not
visible.

5. boost all CPUs again: echo 1 > /sys/devices/system/cpu/cpufreq/boost

As the max_freq_req value and the new boost value are equal,
cpufreq_notifier_max() won't be called, which means that if the
CPU needed to raise its freq., it won't be notified until another
event trigger a re-evaluation of the freq. selection.

To observe that, I had to:
- use the performance governor to be sure to select the max. available
   freq.
- observe scaling_cur_freq to see the last requested freq. for the CPU

So IMO the issue was actually fixed by:
dd016f379ebc ("cpufreq: Introduce a more generic way to set default
per-policy boost flag")
which sets the correct max_freq_req value when putting back an
inactive policy.


>
> I think another commit has fixed that (incorrectly and unintentionally):
> commit 6db0f533d320 ("cpufreq: preserve freq_table_sorted across suspend/hibernate")
>
> @@ -1421,9 +1421,12 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>                   * If there is a problem with its frequency table, take it
>                   * offline and drop it.
>                   */
> -               ret = cpufreq_table_validate_and_sort(policy);
> -               if (ret)
> -                       goto out_offline_policy;
> +               if (policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_ASCENDING &&
> +                   policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_DESCENDING) {
> +                       ret = cpufreq_table_validate_and_sort(policy);
> +                       if (ret)
> +                               goto out_offline_policy;
> +               }
>
> This skipped calling cpufreq_table_validate_and_sort() completely on online and
> so max/cpuinfo_max_freq, max_freq_req are all in sync.
>
> That change should be fixed with:
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 277884d91913..1f794524a1d9 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1427,12 +1427,9 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>                   * If there is a problem with its frequency table, take it
>                   * offline and drop it.
>                   */
> -               if (policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_ASCENDING &&
> -                   policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_DESCENDING) {
> -                       ret = cpufreq_table_validate_and_sort(policy);
> -                       if (ret)
> -                               goto out_offline_policy;
> -               }
> +               ret = cpufreq_table_validate_and_sort(policy);
> +               if (ret)
> +                       goto out_offline_policy;
>
>                  /* related_cpus should at least include policy->cpus. */
>                  cpumask_copy(policy->related_cpus, policy->cpus);
> diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c
> index 7f251daf03ce..5b364d8da4f9 100644
> --- a/drivers/cpufreq/freq_table.c
> +++ b/drivers/cpufreq/freq_table.c
> @@ -360,6 +360,10 @@ int cpufreq_table_validate_and_sort(struct cpufreq_policy *policy)
>          if (policy_has_boost_freq(policy))
>                  policy->boost_supported = true;
>
> +       if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING ||
> +           policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_DESCENDING)
> +               return 0;
> +
>          return set_freq_table_sorted(policy);
>   }
>
Re: [PATCH v6 1/4] cpufreq: Remove per-CPU QoS constraint
Posted by Viresh Kumar 1 week, 5 days ago
On 24-03-26, 12:39, Pierre Gondois wrote:
> I experimented a bit more and it seems the following happens:
> 
> 1. boost all CPUs: echo 1 > /sys/devices/system/cpu/cpufreq/boost
> 2. offline one CPU: echo 0 > /sys/devices/system/cpu/cpuX/online
> 3. deboost all CPUs: echo 0 > /sys/devices/system/cpu/cpufreq/boost
> 
> cpufreq_boost_trigger_state()
> \-for_each_active_policy()
>   \-cpufreq_driver->set_boost()
> doesn't act on the policy where there are no more online CPUs,
> so the max/cpuinfo.max/max_freq_req is left to the actual
> boost freq.

Right.

> 4. online CPUX: echo 1 > /sys/devices/system/cpu/cpuX/online
> 
> cpufreq_online()
> \-cpufreq_driver->init()
>   \-cppc_cpufreq_cpu_init()
> There:
> - policy->max
> - policy->cpuinfo.max_freq
> are set to the maximal non-boost freq., which is the correct value.

Right, but that is true only for CPPC. Not all drivers would set max freq from
there and depend on cpufreq_table_validate_and_sort() to fix this. Which was
broken earlier, but with my patch it should be fixed now.

So yes, we will have max/cpuinfo.max_freq set at this point (with my patch
applied) for all drivers.

> However, max_freq_req is left to the boosted frequency, so this
> is effectively an incorrect state.

Right.

> Also in cpufreq_set_policy(), policy->max is set to:
>   min(max_freq_req, cpuinfo.max_freq)
> (cf. verify() cb), so the incorrect state of max_freq_req is not
> visible.
> 
> 5. boost all CPUs again: echo 1 > /sys/devices/system/cpu/cpufreq/boost
> 
> As the max_freq_req value and the new boost value are equal,
> cpufreq_notifier_max() won't be called, which means that if the
> CPU needed to raise its freq., it won't be notified until another
> event trigger a re-evaluation of the freq. selection.

Right and max/cpuinfo.max_freq will be incorrect here. This was the issue the
commit 1608f0230510 was trying to fix.

> To observe that, I had to:
> - use the performance governor to be sure to select the max. available
>   freq.
> - observe scaling_cur_freq to see the last requested freq. for the CPU
> 
> So IMO the issue was actually fixed by:
> dd016f379ebc ("cpufreq: Introduce a more generic way to set default
> per-policy boost flag")
> which sets the correct max_freq_req value when putting back an
> inactive policy.

Right. This patch fixes the issue I think. Your patch is fine I guess, just
mention this reason and we are good to go.

-- 
viresh
Re: [PATCH v6 1/4] cpufreq: Remove per-CPU QoS constraint
Posted by Viresh Kumar 2 weeks, 3 days ago
On 20-03-26, 14:48, Viresh Kumar wrote:
> That change should be fixed with:
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 277884d91913..1f794524a1d9 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1427,12 +1427,9 @@ static int cpufreq_policy_online(struct cpufreq_policy *policy,
>                  * If there is a problem with its frequency table, take it
>                  * offline and drop it.
>                  */
> -               if (policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_ASCENDING &&
> -                   policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_DESCENDING) {
> -                       ret = cpufreq_table_validate_and_sort(policy);
> -                       if (ret)
> -                               goto out_offline_policy;
> -               }
> +               ret = cpufreq_table_validate_and_sort(policy);
> +               if (ret)
> +                       goto out_offline_policy;
> 
>                 /* related_cpus should at least include policy->cpus. */
>                 cpumask_copy(policy->related_cpus, policy->cpus);
> diff --git a/drivers/cpufreq/freq_table.c b/drivers/cpufreq/freq_table.c
> index 7f251daf03ce..5b364d8da4f9 100644
> --- a/drivers/cpufreq/freq_table.c
> +++ b/drivers/cpufreq/freq_table.c
> @@ -360,6 +360,10 @@ int cpufreq_table_validate_and_sort(struct cpufreq_policy *policy)
>         if (policy_has_boost_freq(policy))
>                 policy->boost_supported = true;
> 
> +       if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING ||
> +           policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_DESCENDING)
> +               return 0;
> +
>         return set_freq_table_sorted(policy);
>  }

Posted as:

https://lore.kernel.org/all/65ba5c45749267c82e8a87af3dc788b37a0b3f48.1773998611.git.viresh.kumar@linaro.org/

-- 
viresh