This WARN_DOUBLE_CLOCK warning is triggered during cpu offline.
------------[ cut here ]------------
rq->clock_update_flags & RQCF_UPDATED
WARNING: CPU: 0 PID: 3323 at kernel/sched/core.c:741
update_rq_clock+0xaf/0x180
Call Trace:
<TASK>
unthrottle_cfs_rq+0x4b/0x300
rq_offline_fair+0x89/0x90
set_rq_offline.part.118+0x28/0x60
rq_attach_root+0xc4/0xd0
cpu_attach_domain+0x3dc/0x7f0
partition_sched_domains_locked+0x2a5/0x3c0
rebuild_sched_domains_locked+0x477/0x830
rebuild_sched_domains+0x1b/0x30
cpuset_hotplug_workfn+0x2ca/0xc90
? balance_push+0x56/0xf0
? _raw_spin_unlock+0x15/0x30
? finish_task_switch+0x98/0x2f0
? __switch_to+0x291/0x410
? __schedule+0x65e/0x1310
process_one_work+0x1bc/0x3d0
worker_thread+0x4c/0x380
? preempt_count_add+0x92/0xa0
? rescuer_thread+0x310/0x310
kthread+0xe6/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x1f/0x30
The rq clock has been updated before the set_rq_offline()
function runs, so we don't need to call update_rq_clock() in
unthrottle_offline_cfs_rqs().
We only need to call rq_clock_start_loop_update() before the
loop starts and rq_clock_stop_loop_update() after the loop
to avoid this warning.
Suggested-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
---
kernel/sched/fair.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index af9604f4b135..9e961e0ec971 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6124,6 +6124,13 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
lockdep_assert_rq_held(rq);
+ /*
+ * The rq clock has already been updated before the
+ * set_rq_offline() runs, so we should skip updating
+ * the rq clock again in unthrottle_cfs_rq().
+ */
+ rq_clock_start_loop_update(rq);
+
rcu_read_lock();
list_for_each_entry_rcu(tg, &task_groups, list) {
struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
@@ -6146,6 +6153,8 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
unthrottle_cfs_rq(cfs_rq);
}
rcu_read_unlock();
+
+ rq_clock_stop_loop_update(rq);
}
#else /* CONFIG_CFS_BANDWIDTH */
--
2.37.0 (Apple Git-136)
Hao Jia <jiahao.os@bytedance.com> writes: > This WARN_DOUBLE_CLOCK warning is triggered during cpu offline. > ------------[ cut here ]------------ > rq->clock_update_flags & RQCF_UPDATED > WARNING: CPU: 0 PID: 3323 at kernel/sched/core.c:741 > update_rq_clock+0xaf/0x180 > Call Trace: > <TASK> > unthrottle_cfs_rq+0x4b/0x300 > rq_offline_fair+0x89/0x90 > set_rq_offline.part.118+0x28/0x60 > rq_attach_root+0xc4/0xd0 > cpu_attach_domain+0x3dc/0x7f0 > partition_sched_domains_locked+0x2a5/0x3c0 > rebuild_sched_domains_locked+0x477/0x830 > rebuild_sched_domains+0x1b/0x30 > cpuset_hotplug_workfn+0x2ca/0xc90 > ? balance_push+0x56/0xf0 > ? _raw_spin_unlock+0x15/0x30 > ? finish_task_switch+0x98/0x2f0 > ? __switch_to+0x291/0x410 > ? __schedule+0x65e/0x1310 > process_one_work+0x1bc/0x3d0 > worker_thread+0x4c/0x380 > ? preempt_count_add+0x92/0xa0 > ? rescuer_thread+0x310/0x310 > kthread+0xe6/0x110 > ? kthread_complete_and_exit+0x20/0x20 > ret_from_fork+0x1f/0x30 > > The rq clock has been updated before the set_rq_offline() > function runs, so we don't need to call update_rq_clock() in > unthrottle_offline_cfs_rqs(). I don't think we do in the path from rq_attach_root (though that's easy enough to fix, of course).
On 2023/6/9 Benjamin Segall wrote:
> Hao Jia <jiahao.os@bytedance.com> writes:
>
>> This WARN_DOUBLE_CLOCK warning is triggered during cpu offline.
>> ------------[ cut here ]------------
>> rq->clock_update_flags & RQCF_UPDATED
>> WARNING: CPU: 0 PID: 3323 at kernel/sched/core.c:741
>> update_rq_clock+0xaf/0x180
>> Call Trace:
>> <TASK>
>> unthrottle_cfs_rq+0x4b/0x300
>> rq_offline_fair+0x89/0x90
>> set_rq_offline.part.118+0x28/0x60
>> rq_attach_root+0xc4/0xd0
>> cpu_attach_domain+0x3dc/0x7f0
>> partition_sched_domains_locked+0x2a5/0x3c0
>> rebuild_sched_domains_locked+0x477/0x830
>> rebuild_sched_domains+0x1b/0x30
>> cpuset_hotplug_workfn+0x2ca/0xc90
>> ? balance_push+0x56/0xf0
>> ? _raw_spin_unlock+0x15/0x30
>> ? finish_task_switch+0x98/0x2f0
>> ? __switch_to+0x291/0x410
>> ? __schedule+0x65e/0x1310
>> process_one_work+0x1bc/0x3d0
>> worker_thread+0x4c/0x380
>> ? preempt_count_add+0x92/0xa0
>> ? rescuer_thread+0x310/0x310
>> kthread+0xe6/0x110
>> ? kthread_complete_and_exit+0x20/0x20
>> ret_from_fork+0x1f/0x30
>>
>> The rq clock has been updated before the set_rq_offline()
>> function runs, so we don't need to call update_rq_clock() in
>> unthrottle_offline_cfs_rqs().
>
> I don't think we do in the path from rq_attach_root (though that's easy
> enough to fix, of course).
>
Thanks for your review.
Now our fix method is that after applying patch1, we update the rq clock
before set_rq_offline(). Then use rq_clock_{start, stop}_loop_update to
avoid updating the rq clock multiple times in unthrottle_cfs_rq().
Do you have any better suggestions?
Thanks,
Hao
Hao Jia <jiahao.os@bytedance.com> writes:
> On 2023/6/9 Benjamin Segall wrote:
>> Hao Jia <jiahao.os@bytedance.com> writes:
>>
>>> This WARN_DOUBLE_CLOCK warning is triggered during cpu offline.
>>> ------------[ cut here ]------------
>>> rq->clock_update_flags & RQCF_UPDATED
>>> WARNING: CPU: 0 PID: 3323 at kernel/sched/core.c:741
>>> update_rq_clock+0xaf/0x180
>>> Call Trace:
>>> <TASK>
>>> unthrottle_cfs_rq+0x4b/0x300
>>> rq_offline_fair+0x89/0x90
>>> set_rq_offline.part.118+0x28/0x60
>>> rq_attach_root+0xc4/0xd0
>>> cpu_attach_domain+0x3dc/0x7f0
>>> partition_sched_domains_locked+0x2a5/0x3c0
>>> rebuild_sched_domains_locked+0x477/0x830
>>> rebuild_sched_domains+0x1b/0x30
>>> cpuset_hotplug_workfn+0x2ca/0xc90
>>> ? balance_push+0x56/0xf0
>>> ? _raw_spin_unlock+0x15/0x30
>>> ? finish_task_switch+0x98/0x2f0
>>> ? __switch_to+0x291/0x410
>>> ? __schedule+0x65e/0x1310
>>> process_one_work+0x1bc/0x3d0
>>> worker_thread+0x4c/0x380
>>> ? preempt_count_add+0x92/0xa0
>>> ? rescuer_thread+0x310/0x310
>>> kthread+0xe6/0x110
>>> ? kthread_complete_and_exit+0x20/0x20
>>> ret_from_fork+0x1f/0x30
>>>
>>> The rq clock has been updated before the set_rq_offline()
>>> function runs, so we don't need to call update_rq_clock() in
>>> unthrottle_offline_cfs_rqs().
>> I don't think we do in the path from rq_attach_root (though that's easy
>> enough to fix, of course).
>>
>
> Thanks for your review.
>
> Now our fix method is that after applying patch1, we update the rq clock before
> set_rq_offline(). Then use rq_clock_{start, stop}_loop_update to avoid updating
> the rq clock multiple times in unthrottle_cfs_rq().
>
> Do you have any better suggestions?
>
> Thanks,
> Hao
Yeah, the obvious fixes are to either add an update_rq_clock in
rq_attach_root as you suggest, or put it in set_rq_offline instead of
the callers.
On 2023/6/10 Benjamin Segall wrote:
> Hao Jia <jiahao.os@bytedance.com> writes:
>
>> On 2023/6/9 Benjamin Segall wrote:
>>> Hao Jia <jiahao.os@bytedance.com> writes:
>>>
>>>> This WARN_DOUBLE_CLOCK warning is triggered during cpu offline.
>>>> ------------[ cut here ]------------
>>>> rq->clock_update_flags & RQCF_UPDATED
>>>> WARNING: CPU: 0 PID: 3323 at kernel/sched/core.c:741
>>>> update_rq_clock+0xaf/0x180
>>>> Call Trace:
>>>> <TASK>
>>>> unthrottle_cfs_rq+0x4b/0x300
>>>> rq_offline_fair+0x89/0x90
>>>> set_rq_offline.part.118+0x28/0x60
>>>> rq_attach_root+0xc4/0xd0
>>>> cpu_attach_domain+0x3dc/0x7f0
>>>> partition_sched_domains_locked+0x2a5/0x3c0
>>>> rebuild_sched_domains_locked+0x477/0x830
>>>> rebuild_sched_domains+0x1b/0x30
>>>> cpuset_hotplug_workfn+0x2ca/0xc90
>>>> ? balance_push+0x56/0xf0
>>>> ? _raw_spin_unlock+0x15/0x30
>>>> ? finish_task_switch+0x98/0x2f0
>>>> ? __switch_to+0x291/0x410
>>>> ? __schedule+0x65e/0x1310
>>>> process_one_work+0x1bc/0x3d0
>>>> worker_thread+0x4c/0x380
>>>> ? preempt_count_add+0x92/0xa0
>>>> ? rescuer_thread+0x310/0x310
>>>> kthread+0xe6/0x110
>>>> ? kthread_complete_and_exit+0x20/0x20
>>>> ret_from_fork+0x1f/0x30
>>>>
>>>> The rq clock has been updated before the set_rq_offline()
>>>> function runs, so we don't need to call update_rq_clock() in
>>>> unthrottle_offline_cfs_rqs().
>>> I don't think we do in the path from rq_attach_root (though that's easy
>>> enough to fix, of course).
>>>
>>
>> Thanks for your review.
>>
>> Now our fix method is that after applying patch1, we update the rq clock before
>> set_rq_offline(). Then use rq_clock_{start, stop}_loop_update to avoid updating
>> the rq clock multiple times in unthrottle_cfs_rq().
>>
>> Do you have any better suggestions?
>>
>> Thanks,
>> Hao
>
> Yeah, the obvious fixes are to either add an update_rq_clock in
> rq_attach_root as you suggest, or put it in set_rq_offline instead of
> the callers.
Thanks for your suggestion. I will do it in the next version.
Thanks,
Hao
© 2016 - 2026 Red Hat, Inc.