[PATCH v3 0/2] governor: Fix races and stale baseline on prev_cpu_nice

Zhongqiu Han posted 2 patches 1 month, 3 weeks ago
drivers/cpufreq/cpufreq_governor.c | 42 ++++++++++++++++++++----------
1 file changed, 28 insertions(+), 14 deletions(-)
[PATCH v3 0/2] governor: Fix races and stale baseline on prev_cpu_nice
Posted by Zhongqiu Han 1 month, 3 weeks ago
Patch 1 fixes two baseline-consistency issues in the DBS governor
framework.

gov_update_cpu_data() writes prev_cpu_idle and prev_cpu_nice while
holding only attr_set->update_lock, whereas dbs_update() reads and
writes the same fields while holding only policy_dbs->update_mutex.
Because these are independent locks, the two paths are not mutually
exclusive.  The fix acquires policy_dbs->update_mutex inside
gov_update_cpu_data() for each policy.

Additionally, cpufreq_dbs_governor_start() reads io_is_busy before
acquiring policy_dbs->update_mutex.  A concurrent io_is_busy_store()
can update io_is_busy and call gov_update_cpu_data(), which writes
prev_cpu_idle with the new value under the mutex.
cpufreq_dbs_governor_start() then overwrites prev_cpu_idle with the
stale value.  The fix reads io_is_busy inside the mutex.

Patch 2 fixes a stale-baseline race on prev_cpu_nice when
ignore_nice_load is enabled via sysfs.  There is a race window
between setting ignore_nice_load and gov_update_cpu_data() acquiring
update_mutex: a concurrent dbs_update() can see the new ignore_nice_load
value but read a stale prev_cpu_nice, producing an inflated idle_time
for that sample.

The fix unconditionally advances prev_cpu_nice on every dbs_update()
call.  With prev_cpu_nice always reflecting the most recent sample,
enabling ignore_nice_load can never produce a stale-baseline spike.
To keep prev_cpu_nice handling consistent, gov_update_cpu_data()
resets prev_cpu_nice unconditionally alongside prev_cpu_idle, and
cpufreq_dbs_governor_start() initializes prev_cpu_nice unconditionally.

v3:
- Update linux-next base
- Fix two additional issues found during sashiko.dev review:
    - Patch 1: move io_busy read inside policy_dbs->update_mutex in
      cpufreq_dbs_governor_start() to close a TOCTOU window against
      concurrent io_is_busy_store().
    - Patch 2: restore unconditional prev_cpu_nice reset in
      gov_update_cpu_data() to keep both baselines consistent when
      io_is_busy changes with ignore_nice_load enabled; update comment.
    - Patch 2: keep gov_update_cpu_data() in ignore_nice_load_store() to
      prevent a stale-baseline spike in the governor-start window (when
      ignore_nice_load is set before the first dbs_update() call); only
      cpufreq_governor.c is modified.  Update commit message and add one
      more fixes tag '326c86deaed54a ("[CPUFREQ] Remove unneeded locks").'
    - Update the commit messages.
- Link to v2: https://lore.kernel.org/all/20260409111407.9775-1-zhongqiu.han@oss.qualcomm.com/
- Sashiko v2 comments: https://sashiko.dev/#/patchset/20260409111407.9775-1-zhongqiu.han%40oss.qualcomm.com

v2:
- Update linux-next base
- Based on v1 review, patch 1 is updated to add the missing
  protection around cpufreq_dbs_governor_start(), and patch 2/2 is added.
- Link to v1: https://lore.kernel.org/all/20260406110113.3475920-1-zhongqiu.han@oss.qualcomm.com/

Zhongqiu Han (2):
  cpufreq: governor: Fix data races on per-CPU idle/nice baselines
  cpufreq: governor: Fix stale prev_cpu_nice spike when enabling
    ignore_nice_load

 drivers/cpufreq/cpufreq_governor.c | 42 ++++++++++++++++++++----------
 1 file changed, 28 insertions(+), 14 deletions(-)


base-commit: c7275b05bc428c7373d97aa2da02d3a7fa6b9f66
-- 
2.43.0
Re: [PATCH v3 0/2] governor: Fix races and stale baseline on prev_cpu_nice
Posted by Rafael J. Wysocki 3 weeks, 3 days ago
On Sun, Apr 19, 2026 at 3:27 PM Zhongqiu Han
<zhongqiu.han@oss.qualcomm.com> wrote:
>
> Patch 1 fixes two baseline-consistency issues in the DBS governor
> framework.
>
> gov_update_cpu_data() writes prev_cpu_idle and prev_cpu_nice while
> holding only attr_set->update_lock, whereas dbs_update() reads and
> writes the same fields while holding only policy_dbs->update_mutex.
> Because these are independent locks, the two paths are not mutually
> exclusive.  The fix acquires policy_dbs->update_mutex inside
> gov_update_cpu_data() for each policy.
>
> Additionally, cpufreq_dbs_governor_start() reads io_is_busy before
> acquiring policy_dbs->update_mutex.  A concurrent io_is_busy_store()
> can update io_is_busy and call gov_update_cpu_data(), which writes
> prev_cpu_idle with the new value under the mutex.
> cpufreq_dbs_governor_start() then overwrites prev_cpu_idle with the
> stale value.  The fix reads io_is_busy inside the mutex.
>
> Patch 2 fixes a stale-baseline race on prev_cpu_nice when
> ignore_nice_load is enabled via sysfs.  There is a race window
> between setting ignore_nice_load and gov_update_cpu_data() acquiring
> update_mutex: a concurrent dbs_update() can see the new ignore_nice_load
> value but read a stale prev_cpu_nice, producing an inflated idle_time
> for that sample.
>
> The fix unconditionally advances prev_cpu_nice on every dbs_update()
> call.  With prev_cpu_nice always reflecting the most recent sample,
> enabling ignore_nice_load can never produce a stale-baseline spike.
> To keep prev_cpu_nice handling consistent, gov_update_cpu_data()
> resets prev_cpu_nice unconditionally alongside prev_cpu_idle, and
> cpufreq_dbs_governor_start() initializes prev_cpu_nice unconditionally.
>
> v3:
> - Update linux-next base
> - Fix two additional issues found during sashiko.dev review:
>     - Patch 1: move io_busy read inside policy_dbs->update_mutex in
>       cpufreq_dbs_governor_start() to close a TOCTOU window against
>       concurrent io_is_busy_store().
>     - Patch 2: restore unconditional prev_cpu_nice reset in
>       gov_update_cpu_data() to keep both baselines consistent when
>       io_is_busy changes with ignore_nice_load enabled; update comment.
>     - Patch 2: keep gov_update_cpu_data() in ignore_nice_load_store() to
>       prevent a stale-baseline spike in the governor-start window (when
>       ignore_nice_load is set before the first dbs_update() call); only
>       cpufreq_governor.c is modified.  Update commit message and add one
>       more fixes tag '326c86deaed54a ("[CPUFREQ] Remove unneeded locks").'
>     - Update the commit messages.
> - Link to v2: https://lore.kernel.org/all/20260409111407.9775-1-zhongqiu.han@oss.qualcomm.com/
> - Sashiko v2 comments: https://sashiko.dev/#/patchset/20260409111407.9775-1-zhongqiu.han%40oss.qualcomm.com

So sashiko.dev still has some comments on the first patch in v3:

https://sashiko.dev/#/patchset/20260419132655.3800673-1-zhongqiu.han%40oss.qualcomm.com

Can you please have a look at that and let me know what you think?

> v2:
> - Update linux-next base
> - Based on v1 review, patch 1 is updated to add the missing
>   protection around cpufreq_dbs_governor_start(), and patch 2/2 is added.
> - Link to v1: https://lore.kernel.org/all/20260406110113.3475920-1-zhongqiu.han@oss.qualcomm.com/
>
> Zhongqiu Han (2):
>   cpufreq: governor: Fix data races on per-CPU idle/nice baselines
>   cpufreq: governor: Fix stale prev_cpu_nice spike when enabling
>     ignore_nice_load
>
>  drivers/cpufreq/cpufreq_governor.c | 42 ++++++++++++++++++++----------
>  1 file changed, 28 insertions(+), 14 deletions(-)
>
>
> base-commit: c7275b05bc428c7373d97aa2da02d3a7fa6b9f66
> --
> 2.43.0
>
>
Re: [PATCH v3 0/2] governor: Fix races and stale baseline on prev_cpu_nice
Posted by Zhongqiu Han 3 weeks, 2 days ago
On 5/22/2026 11:14 PM, Rafael J. Wysocki wrote:
> On Sun, Apr 19, 2026 at 3:27 PM Zhongqiu Han
> <zhongqiu.han@oss.qualcomm.com> wrote:
>>
>> Patch 1 fixes two baseline-consistency issues in the DBS governor
>> framework.
>>
>> gov_update_cpu_data() writes prev_cpu_idle and prev_cpu_nice while
>> holding only attr_set->update_lock, whereas dbs_update() reads and
>> writes the same fields while holding only policy_dbs->update_mutex.
>> Because these are independent locks, the two paths are not mutually
>> exclusive.  The fix acquires policy_dbs->update_mutex inside
>> gov_update_cpu_data() for each policy.
>>
>> Additionally, cpufreq_dbs_governor_start() reads io_is_busy before
>> acquiring policy_dbs->update_mutex.  A concurrent io_is_busy_store()
>> can update io_is_busy and call gov_update_cpu_data(), which writes
>> prev_cpu_idle with the new value under the mutex.
>> cpufreq_dbs_governor_start() then overwrites prev_cpu_idle with the
>> stale value.  The fix reads io_is_busy inside the mutex.
>>
>> Patch 2 fixes a stale-baseline race on prev_cpu_nice when
>> ignore_nice_load is enabled via sysfs.  There is a race window
>> between setting ignore_nice_load and gov_update_cpu_data() acquiring
>> update_mutex: a concurrent dbs_update() can see the new ignore_nice_load
>> value but read a stale prev_cpu_nice, producing an inflated idle_time
>> for that sample.
>>
>> The fix unconditionally advances prev_cpu_nice on every dbs_update()
>> call.  With prev_cpu_nice always reflecting the most recent sample,
>> enabling ignore_nice_load can never produce a stale-baseline spike.
>> To keep prev_cpu_nice handling consistent, gov_update_cpu_data()
>> resets prev_cpu_nice unconditionally alongside prev_cpu_idle, and
>> cpufreq_dbs_governor_start() initializes prev_cpu_nice unconditionally.
>>
>> v3:
>> - Update linux-next base
>> - Fix two additional issues found during sashiko.dev review:
>>      - Patch 1: move io_busy read inside policy_dbs->update_mutex in
>>        cpufreq_dbs_governor_start() to close a TOCTOU window against
>>        concurrent io_is_busy_store().
>>      - Patch 2: restore unconditional prev_cpu_nice reset in
>>        gov_update_cpu_data() to keep both baselines consistent when
>>        io_is_busy changes with ignore_nice_load enabled; update comment.
>>      - Patch 2: keep gov_update_cpu_data() in ignore_nice_load_store() to
>>        prevent a stale-baseline spike in the governor-start window (when
>>        ignore_nice_load is set before the first dbs_update() call); only
>>        cpufreq_governor.c is modified.  Update commit message and add one
>>        more fixes tag '326c86deaed54a ("[CPUFREQ] Remove unneeded locks").'
>>      - Update the commit messages.
>> - Link to v2: https://lore.kernel.org/all/20260409111407.9775-1-zhongqiu.han@oss.qualcomm.com/
>> - Sashiko v2 comments: https://sashiko.dev/#/patchset/20260409111407.9775-1-zhongqiu.han%40oss.qualcomm.com
> 
> So sashiko.dev still has some comments on the first patch in v3:
> 
> https://sashiko.dev/#/patchset/20260419132655.3800673-1-zhongqiu.han%40oss.qualcomm.com
> 
> Can you please have a look at that and let me know what you think?
> 


Hi Rafael,
Thanks for pointing this out. I've looked at both comments:

Comment 1 (io_is_busy transition window):

The race is real but pre-existing. After Patch 1 the only remaining
window is the gap between the io_is_busy write and gov_update_cpu_data()
acquiring the lock:

CPU0 (io_is_busy_store):    CPU1 (dbs_update):
io_is_busy = 1              mutex_lock()
/* waiting for mutex */     io_busy = 1  /* sees new value */
                             cur_idle = get_cpu_idle_time(io_busy=1)
                             /* prev_cpu_idle still set with io_busy=0 */
                             /* idle_time may be clamped to 0 */
                             /* load inflated for this sample */
                             mutex_unlock()
mutex_lock()
prev_cpu_idle = ../* reset */
mutex_unlock()

The mismatch between the two metrics causes a one-sample anomaly in
either direction: 0→1 tends to inflate the load estimate (frequency
pushed upward), while 1→0 tends to deflate it. The AI's "drives to
minimum" conclusion is wrong for the 0→1 case.

This window is inherent to the design. One possible fix is to add a per
policy io_is_busy copy inside policy_dbs_info and update it under
update_mutex in gov_update_cpu_data(), so dbs_update() always reads a
value consistent with the current prev_cpu_idle baseline. The impact
should be a single-sample frequency spike that self-corrects.


Comment 2 (sampling_rate_store vs. gov_set_update_util):

Also pre-existing. The race:

   cpufreq_dbs_governor_start():     sampling_rate_store():
     mutex_unlock()                    mutex_lock()
     gov->start()                      sample_delay_ns = 0
     gov_set_update_util()             mutex_unlock()
       sample_delay_ns = rate  <--->

If needed, e possible approach would be to move both gov->start() and
gov_set_update_util() inside the mutex would work.


Perhaps both can be evaluated and handled either in this patch or as a
separate follow-up.


>> v2:
>> - Update linux-next base
>> - Based on v1 review, patch 1 is updated to add the missing
>>    protection around cpufreq_dbs_governor_start(), and patch 2/2 is added.
>> - Link to v1: https://lore.kernel.org/all/20260406110113.3475920-1-zhongqiu.han@oss.qualcomm.com/
>>
>> Zhongqiu Han (2):
>>    cpufreq: governor: Fix data races on per-CPU idle/nice baselines
>>    cpufreq: governor: Fix stale prev_cpu_nice spike when enabling
>>      ignore_nice_load
>>
>>   drivers/cpufreq/cpufreq_governor.c | 42 ++++++++++++++++++++----------
>>   1 file changed, 28 insertions(+), 14 deletions(-)
>>
>>
>> base-commit: c7275b05bc428c7373d97aa2da02d3a7fa6b9f66
>> --
>> 2.43.0
>>
>>


-- 
Thx and BRs,
Zhongqiu Han
Re: [PATCH v3 0/2] governor: Fix races and stale baseline on prev_cpu_nice
Posted by Rafael J. Wysocki 2 weeks, 6 days ago
On Sun, May 24, 2026 at 7:57 AM Zhongqiu Han
<zhongqiu.han@oss.qualcomm.com> wrote:
>
> On 5/22/2026 11:14 PM, Rafael J. Wysocki wrote:
> > On Sun, Apr 19, 2026 at 3:27 PM Zhongqiu Han
> > <zhongqiu.han@oss.qualcomm.com> wrote:
> >>
> >> Patch 1 fixes two baseline-consistency issues in the DBS governor
> >> framework.
> >>
> >> gov_update_cpu_data() writes prev_cpu_idle and prev_cpu_nice while
> >> holding only attr_set->update_lock, whereas dbs_update() reads and
> >> writes the same fields while holding only policy_dbs->update_mutex.
> >> Because these are independent locks, the two paths are not mutually
> >> exclusive.  The fix acquires policy_dbs->update_mutex inside
> >> gov_update_cpu_data() for each policy.
> >>
> >> Additionally, cpufreq_dbs_governor_start() reads io_is_busy before
> >> acquiring policy_dbs->update_mutex.  A concurrent io_is_busy_store()
> >> can update io_is_busy and call gov_update_cpu_data(), which writes
> >> prev_cpu_idle with the new value under the mutex.
> >> cpufreq_dbs_governor_start() then overwrites prev_cpu_idle with the
> >> stale value.  The fix reads io_is_busy inside the mutex.
> >>
> >> Patch 2 fixes a stale-baseline race on prev_cpu_nice when
> >> ignore_nice_load is enabled via sysfs.  There is a race window
> >> between setting ignore_nice_load and gov_update_cpu_data() acquiring
> >> update_mutex: a concurrent dbs_update() can see the new ignore_nice_load
> >> value but read a stale prev_cpu_nice, producing an inflated idle_time
> >> for that sample.
> >>
> >> The fix unconditionally advances prev_cpu_nice on every dbs_update()
> >> call.  With prev_cpu_nice always reflecting the most recent sample,
> >> enabling ignore_nice_load can never produce a stale-baseline spike.
> >> To keep prev_cpu_nice handling consistent, gov_update_cpu_data()
> >> resets prev_cpu_nice unconditionally alongside prev_cpu_idle, and
> >> cpufreq_dbs_governor_start() initializes prev_cpu_nice unconditionally.
> >>
> >> v3:
> >> - Update linux-next base
> >> - Fix two additional issues found during sashiko.dev review:
> >>      - Patch 1: move io_busy read inside policy_dbs->update_mutex in
> >>        cpufreq_dbs_governor_start() to close a TOCTOU window against
> >>        concurrent io_is_busy_store().
> >>      - Patch 2: restore unconditional prev_cpu_nice reset in
> >>        gov_update_cpu_data() to keep both baselines consistent when
> >>        io_is_busy changes with ignore_nice_load enabled; update comment.
> >>      - Patch 2: keep gov_update_cpu_data() in ignore_nice_load_store() to
> >>        prevent a stale-baseline spike in the governor-start window (when
> >>        ignore_nice_load is set before the first dbs_update() call); only
> >>        cpufreq_governor.c is modified.  Update commit message and add one
> >>        more fixes tag '326c86deaed54a ("[CPUFREQ] Remove unneeded locks").'
> >>      - Update the commit messages.
> >> - Link to v2: https://lore.kernel.org/all/20260409111407.9775-1-zhongqiu.han@oss.qualcomm.com/
> >> - Sashiko v2 comments: https://sashiko.dev/#/patchset/20260409111407.9775-1-zhongqiu.han%40oss.qualcomm.com
> >
> > So sashiko.dev still has some comments on the first patch in v3:
> >
> > https://sashiko.dev/#/patchset/20260419132655.3800673-1-zhongqiu.han%40oss.qualcomm.com
> >
> > Can you please have a look at that and let me know what you think?
> >
>
>
> Hi Rafael,
> Thanks for pointing this out. I've looked at both comments:
>
> Comment 1 (io_is_busy transition window):
>
> The race is real but pre-existing. After Patch 1 the only remaining
> window is the gap between the io_is_busy write and gov_update_cpu_data()
> acquiring the lock:
>
> CPU0 (io_is_busy_store):    CPU1 (dbs_update):
> io_is_busy = 1              mutex_lock()
> /* waiting for mutex */     io_busy = 1  /* sees new value */
>                              cur_idle = get_cpu_idle_time(io_busy=1)
>                              /* prev_cpu_idle still set with io_busy=0 */
>                              /* idle_time may be clamped to 0 */
>                              /* load inflated for this sample */
>                              mutex_unlock()
> mutex_lock()
> prev_cpu_idle = ../* reset */
> mutex_unlock()
>
> The mismatch between the two metrics causes a one-sample anomaly in
> either direction: 0→1 tends to inflate the load estimate (frequency
> pushed upward), while 1→0 tends to deflate it. The AI's "drives to
> minimum" conclusion is wrong for the 0→1 case.
>
> This window is inherent to the design. One possible fix is to add a per
> policy io_is_busy copy inside policy_dbs_info and update it under
> update_mutex in gov_update_cpu_data(), so dbs_update() always reads a
> value consistent with the current prev_cpu_idle baseline. The impact
> should be a single-sample frequency spike that self-corrects.
>
>
> Comment 2 (sampling_rate_store vs. gov_set_update_util):
>
> Also pre-existing. The race:
>
>    cpufreq_dbs_governor_start():     sampling_rate_store():
>      mutex_unlock()                    mutex_lock()
>      gov->start()                      sample_delay_ns = 0
>      gov_set_update_util()             mutex_unlock()
>        sample_delay_ns = rate  <--->
>
> If needed, e possible approach would be to move both gov->start() and
> gov_set_update_util() inside the mutex would work.

Yeah.

So I've applied both patches in the series as 7.2 material.

Please feel free to send a follow-up patch making the change mentioned above.

Thanks!
Re: [PATCH v3 0/2] governor: Fix races and stale baseline on prev_cpu_nice
Posted by Zhongqiu Han 2 weeks, 5 days ago
On 5/27/2026 1:09 AM, Rafael J. Wysocki wrote:
> On Sun, May 24, 2026 at 7:57 AM Zhongqiu Han
> <zhongqiu.han@oss.qualcomm.com> wrote:
>>
>> On 5/22/2026 11:14 PM, Rafael J. Wysocki wrote:
>>> On Sun, Apr 19, 2026 at 3:27 PM Zhongqiu Han
>>> <zhongqiu.han@oss.qualcomm.com> wrote:
>>>>
>>>> Patch 1 fixes two baseline-consistency issues in the DBS governor
>>>> framework.
>>>>
>>>> gov_update_cpu_data() writes prev_cpu_idle and prev_cpu_nice while
>>>> holding only attr_set->update_lock, whereas dbs_update() reads and
>>>> writes the same fields while holding only policy_dbs->update_mutex.
>>>> Because these are independent locks, the two paths are not mutually
>>>> exclusive.  The fix acquires policy_dbs->update_mutex inside
>>>> gov_update_cpu_data() for each policy.
>>>>
>>>> Additionally, cpufreq_dbs_governor_start() reads io_is_busy before
>>>> acquiring policy_dbs->update_mutex.  A concurrent io_is_busy_store()
>>>> can update io_is_busy and call gov_update_cpu_data(), which writes
>>>> prev_cpu_idle with the new value under the mutex.
>>>> cpufreq_dbs_governor_start() then overwrites prev_cpu_idle with the
>>>> stale value.  The fix reads io_is_busy inside the mutex.
>>>>
>>>> Patch 2 fixes a stale-baseline race on prev_cpu_nice when
>>>> ignore_nice_load is enabled via sysfs.  There is a race window
>>>> between setting ignore_nice_load and gov_update_cpu_data() acquiring
>>>> update_mutex: a concurrent dbs_update() can see the new ignore_nice_load
>>>> value but read a stale prev_cpu_nice, producing an inflated idle_time
>>>> for that sample.
>>>>
>>>> The fix unconditionally advances prev_cpu_nice on every dbs_update()
>>>> call.  With prev_cpu_nice always reflecting the most recent sample,
>>>> enabling ignore_nice_load can never produce a stale-baseline spike.
>>>> To keep prev_cpu_nice handling consistent, gov_update_cpu_data()
>>>> resets prev_cpu_nice unconditionally alongside prev_cpu_idle, and
>>>> cpufreq_dbs_governor_start() initializes prev_cpu_nice unconditionally.
>>>>
>>>> v3:
>>>> - Update linux-next base
>>>> - Fix two additional issues found during sashiko.dev review:
>>>>       - Patch 1: move io_busy read inside policy_dbs->update_mutex in
>>>>         cpufreq_dbs_governor_start() to close a TOCTOU window against
>>>>         concurrent io_is_busy_store().
>>>>       - Patch 2: restore unconditional prev_cpu_nice reset in
>>>>         gov_update_cpu_data() to keep both baselines consistent when
>>>>         io_is_busy changes with ignore_nice_load enabled; update comment.
>>>>       - Patch 2: keep gov_update_cpu_data() in ignore_nice_load_store() to
>>>>         prevent a stale-baseline spike in the governor-start window (when
>>>>         ignore_nice_load is set before the first dbs_update() call); only
>>>>         cpufreq_governor.c is modified.  Update commit message and add one
>>>>         more fixes tag '326c86deaed54a ("[CPUFREQ] Remove unneeded locks").'
>>>>       - Update the commit messages.
>>>> - Link to v2: https://lore.kernel.org/all/20260409111407.9775-1-zhongqiu.han@oss.qualcomm.com/
>>>> - Sashiko v2 comments: https://sashiko.dev/#/patchset/20260409111407.9775-1-zhongqiu.han%40oss.qualcomm.com
>>>
>>> So sashiko.dev still has some comments on the first patch in v3:
>>>
>>> https://sashiko.dev/#/patchset/20260419132655.3800673-1-zhongqiu.han%40oss.qualcomm.com
>>>
>>> Can you please have a look at that and let me know what you think?
>>>
>>
>>
>> Hi Rafael,
>> Thanks for pointing this out. I've looked at both comments:
>>
>> Comment 1 (io_is_busy transition window):
>>
>> The race is real but pre-existing. After Patch 1 the only remaining
>> window is the gap between the io_is_busy write and gov_update_cpu_data()
>> acquiring the lock:
>>
>> CPU0 (io_is_busy_store):    CPU1 (dbs_update):
>> io_is_busy = 1              mutex_lock()
>> /* waiting for mutex */     io_busy = 1  /* sees new value */
>>                               cur_idle = get_cpu_idle_time(io_busy=1)
>>                               /* prev_cpu_idle still set with io_busy=0 */
>>                               /* idle_time may be clamped to 0 */
>>                               /* load inflated for this sample */
>>                               mutex_unlock()
>> mutex_lock()
>> prev_cpu_idle = ../* reset */
>> mutex_unlock()
>>
>> The mismatch between the two metrics causes a one-sample anomaly in
>> either direction: 0→1 tends to inflate the load estimate (frequency
>> pushed upward), while 1→0 tends to deflate it. The AI's "drives to
>> minimum" conclusion is wrong for the 0→1 case.
>>
>> This window is inherent to the design. One possible fix is to add a per
>> policy io_is_busy copy inside policy_dbs_info and update it under
>> update_mutex in gov_update_cpu_data(), so dbs_update() always reads a
>> value consistent with the current prev_cpu_idle baseline. The impact
>> should be a single-sample frequency spike that self-corrects.
>>
>>
>> Comment 2 (sampling_rate_store vs. gov_set_update_util):
>>
>> Also pre-existing. The race:
>>
>>     cpufreq_dbs_governor_start():     sampling_rate_store():
>>       mutex_unlock()                    mutex_lock()
>>       gov->start()                      sample_delay_ns = 0
>>       gov_set_update_util()             mutex_unlock()
>>         sample_delay_ns = rate  <--->
>>
>> If needed, e possible approach would be to move both gov->start() and
>> gov_set_update_util() inside the mutex would work.
> 
> Yeah.
> 
> So I've applied both patches in the series as 7.2 material.
> 
> Please feel free to send a follow-up patch making the change mentioned above.
> 
> Thanks!


Thanks Rafael,

Sure, I'll look into that and follow up with a patch accordingly.


-- 
Thx and BRs,
Zhongqiu Han