When running the following command:
    while true; do
        stress-ng --cyclic 30 --timeout 30s --minimize --quiet
    done
a warning is eventually triggered:
WARNING: CPU: 43 PID: 2848 at kernel/sched/deadline.c:794
setup_new_dl_entity+0x13e/0x180
...
Call Trace:
<TASK>
? show_trace_log_lvl+0x1c4/0x2df
? enqueue_dl_entity+0x631/0x6e0
? setup_new_dl_entity+0x13e/0x180
? __warn+0x7e/0xd0
? report_bug+0x11a/0x1a0
? handle_bug+0x3c/0x70
? exc_invalid_op+0x14/0x70
? asm_exc_invalid_op+0x16/0x20
enqueue_dl_entity+0x631/0x6e0
enqueue_task_dl+0x7d/0x120
__do_set_cpus_allowed+0xe3/0x280
__set_cpus_allowed_ptr_locked+0x140/0x1d0
__set_cpus_allowed_ptr+0x54/0xa0
migrate_enable+0x7e/0x150
rt_spin_unlock+0x1c/0x90
group_send_sig_info+0xf7/0x1a0
? kill_pid_info+0x1f/0x1d0
kill_pid_info+0x78/0x1d0
kill_proc_info+0x5b/0x110
__x64_sys_kill+0x93/0xc0
do_syscall_64+0x5c/0xf0
entry_SYSCALL_64_after_hwframe+0x6e/0x76
RIP: 0033:0x7f0dab31f92b
This warning occurs because __do_set_cpus_allowed() dequeues the task and
enqueues it again with the ENQUEUE_RESTORE flag set. If the task is
boosted and its deadline is already in the past, enqueue_dl_entity()
calls setup_new_dl_entity(), which triggers the warning. A boosted task
already had its parameters set by rt_mutex_setprio(), so a new call to
setup_new_dl_entity() is unnecessary, hence the WARN_ON call.
Check if we are requeueing a boosted task and avoid calling
setup_new_dl_entity if that's the case.
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Fixes: 2279f540ea7d ("sched/deadline: Fix priority inheritance with multiple scheduling classes")
---
The initial idea for this fix was to introduce another ENQUEUE flag,
ENQUEUE_SET_CPUS_ALLOWED. When this flag was set, the deadline scheduler
would bypass the call to setup_new_dl_entity, regardless of whether the
task was boosted. However, this idea was abandoned due to the presence
of other DEQUEUE_SAVE/ENQUEUE_RESTORE pairs in the code. Ultimately, a
simpler approach was chosen, which achieves the same practical effects
without the need to create an additional flag for enqueue_task.
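
For illustration only, the simpler approach can be modeled in user space
as below. This is a sketch, not kernel code: struct model_dl_entity, the
MODEL_* flag values and the model_* helpers are hypothetical stand-ins
for struct sched_dl_entity, the ENQUEUE_* flags, is_dl_boosted() and
dl_time_before(). Only the two branches visible in the hunk are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the kernel's real ENQUEUE_* values differ. */
#define MODEL_ENQUEUE_REPLENISH 0x1
#define MODEL_ENQUEUE_RESTORE   0x2

/* Hypothetical, stripped-down stand-in for struct sched_dl_entity. */
struct model_dl_entity {
	uint64_t deadline;	/* absolute deadline, in ns */
	bool boosted;		/* true while the task is priority-boosted */
};

/* What the modeled enqueue path decides to do with the entity. */
enum model_action { MODEL_NONE, MODEL_REPLENISH, MODEL_SETUP_NEW };

/* Wraparound-safe "a is before b" comparison on 64-bit timestamps. */
static bool model_time_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/*
 * Model of the patched branch: on a RESTORE enqueue, parameters are set
 * up again only if the entity is NOT boosted and its deadline has passed.
 */
static enum model_action
model_enqueue_action(const struct model_dl_entity *se, int flags, uint64_t now)
{
	assert(se != NULL);

	if (flags & MODEL_ENQUEUE_REPLENISH)
		return MODEL_REPLENISH;

	if ((flags & MODEL_ENQUEUE_RESTORE) &&
	    !se->boosted &&
	    model_time_before(se->deadline, now))
		return MODEL_SETUP_NEW;

	return MODEL_NONE;
}
```

In this model, a boosted entity with an expired deadline yields MODEL_NONE
on a RESTORE enqueue, while the same entity without the boost yields
MODEL_SETUP_NEW, which is the distinction the one-line check introduces.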
---
kernel/sched/deadline.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index f59e5c19d944..312e8fa7ce94 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1753,6 +1753,7 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se, int flags)
 	} else if (flags & ENQUEUE_REPLENISH) {
 		replenish_dl_entity(dl_se);
 	} else if ((flags & ENQUEUE_RESTORE) &&
+		   !is_dl_boosted(dl_se) &&
 		   dl_time_before(dl_se->deadline, rq_clock(rq_of_dl_se(dl_se)))) {
 		setup_new_dl_entity(dl_se);
 	}
--
2.45.2
Hi Wander,
On 22/07/24 10:29, Wander Lairson Costa wrote:
> Fixes: 2279f540ea7d ("sched/deadline: Fix priority inheritance with multiple scheduling classes")
Your fix makes sense to me. I only wonder, however, whether it
actually fixes 295d6d5e37360 ("sched/deadline: Fix switching to
-deadline") rather than the change you reference above?
Thanks,
Juri
On Tue, Jul 23, 2024 at 5:50 AM Juri Lelli <juri.lelli@redhat.com> wrote:
>
> Hi Wander,
>
> On 22/07/24 10:29, Wander Lairson Costa wrote:
> > Fixes: 2279f540ea7d ("sched/deadline: Fix priority inheritance with multiple scheduling classes")
>
> Your fix makes sense to me. I only wonder, however, whether it
> actually fixes 295d6d5e37360 ("sched/deadline: Fix switching to
> -deadline") rather than the change you reference above?
>
That makes more sense, thanks.