[PATCH 1/3] sched/deadline: Fix warning in migrate_enable for boosted tasks

Wander Lairson Costa posted 3 patches 1 year, 6 months ago
There is a newer version of this series
Posted by Wander Lairson Costa 1 year, 6 months ago
When running the following command:

while true; do
    stress-ng --cyclic 30 --timeout 30s --minimize --quiet
done

a warning is eventually triggered:

WARNING: CPU: 43 PID: 2848 at kernel/sched/deadline.c:794
setup_new_dl_entity+0x13e/0x180
...
Call Trace:
 <TASK>
 ? show_trace_log_lvl+0x1c4/0x2df
 ? enqueue_dl_entity+0x631/0x6e0
 ? setup_new_dl_entity+0x13e/0x180
 ? __warn+0x7e/0xd0
 ? report_bug+0x11a/0x1a0
 ? handle_bug+0x3c/0x70
 ? exc_invalid_op+0x14/0x70
 ? asm_exc_invalid_op+0x16/0x20
 enqueue_dl_entity+0x631/0x6e0
 enqueue_task_dl+0x7d/0x120
 __do_set_cpus_allowed+0xe3/0x280
 __set_cpus_allowed_ptr_locked+0x140/0x1d0
 __set_cpus_allowed_ptr+0x54/0xa0
 migrate_enable+0x7e/0x150
 rt_spin_unlock+0x1c/0x90
 group_send_sig_info+0xf7/0x1a0
 ? kill_pid_info+0x1f/0x1d0
 kill_pid_info+0x78/0x1d0
 kill_proc_info+0x5b/0x110
 __x64_sys_kill+0x93/0xc0
 do_syscall_64+0x5c/0xf0
 entry_SYSCALL_64_after_hwframe+0x6e/0x76
 RIP: 0033:0x7f0dab31f92b

This warning occurs because __do_set_cpus_allowed dequeues and then
re-enqueues the task with the ENQUEUE_RESTORE flag set. If the task is
boosted and its deadline has already passed, enqueue_dl_entity calls
setup_new_dl_entity, which triggers the warning. A boosted task already
had its parameters set by rt_mutex_setprio, so a new call to
setup_new_dl_entity is unnecessary, hence the WARN_ON call.

Check if we are requeueing a boosted task and avoid calling
setup_new_dl_entity if that's the case.

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Fixes: 2279f540ea7d ("sched/deadline: Fix priority inheritance with multiple scheduling classes")

---

The initial idea for this fix was to introduce another ENQUEUE flag,
ENQUEUE_SET_CPUS_ALLOWED. When this flag was set, the deadline scheduler
would bypass the call to setup_new_dl_entity, regardless of whether the
task was boosted. However, this idea was abandoned due to the presence
of other DEQUEUE_SAVE/ENQUEUE_RESTORE pairs in the code. Ultimately, a
simpler approach was chosen, which achieves the same practical effects
without the need to create an additional flag for enqueue_task.
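
The branch selection that this fix changes can be sketched in user space
as follows. This is a minimal model, not kernel code: struct
dl_entity_model, enum enqueue_action, time_before() and
enqueue_action_for() are hypothetical names made up for illustration,
the flag values are placeholders, and the boosted field stands in for
is_dl_boosted(). Only the two branches visible in the hunk are modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag values for this sketch only. */
#define ENQUEUE_RESTORE   0x02
#define ENQUEUE_REPLENISH 0x20

/* Hypothetical, simplified stand-in for struct sched_dl_entity. */
struct dl_entity_model {
	uint64_t deadline;	/* absolute deadline */
	bool boosted;		/* stand-in for is_dl_boosted(dl_se) */
};

/* What the modeled enqueue path would decide to do. */
enum enqueue_action {
	ACTION_NONE,
	ACTION_REPLENISH,	/* replenish_dl_entity() would run */
	ACTION_SETUP_NEW,	/* setup_new_dl_entity() would run */
};

/* Wraparound-safe "a is earlier than b" comparison. */
static bool time_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/*
 * Model of the patched condition: on ENQUEUE_RESTORE, an entity whose
 * deadline is already in the past gets fresh parameters only when it
 * is not boosted.
 */
static enum enqueue_action
enqueue_action_for(const struct dl_entity_model *se, int flags, uint64_t now)
{
	if (flags & ENQUEUE_REPLENISH)
		return ACTION_REPLENISH;

	if ((flags & ENQUEUE_RESTORE) &&
	    !se->boosted &&
	    time_before(se->deadline, now))
		return ACTION_SETUP_NEW;

	return ACTION_NONE;
}
```

In this model, an ENQUEUE_RESTORE enqueue with an expired deadline
selects ACTION_SETUP_NEW for a non-boosted entity and ACTION_NONE for a
boosted one, which is the case that previously reached the WARN_ON.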

Signed-off-by: Wander Lairson Costa <wander@redhat.com>
---
 kernel/sched/deadline.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index f59e5c19d944..312e8fa7ce94 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1753,6 +1753,7 @@ enqueue_dl_entity(struct sched_dl_entity *dl_se, int flags)
 	} else if (flags & ENQUEUE_REPLENISH) {
 		replenish_dl_entity(dl_se);
 	} else if ((flags & ENQUEUE_RESTORE) &&
+		   !is_dl_boosted(dl_se) &&
 		   dl_time_before(dl_se->deadline, rq_clock(rq_of_dl_se(dl_se)))) {
 		setup_new_dl_entity(dl_se);
 	}
-- 
2.45.2
Re: [PATCH 1/3] sched/deadline: Fix warning in migrate_enable for boosted tasks
Posted by Juri Lelli 1 year, 6 months ago
Hi Wander,

On 22/07/24 10:29, Wander Lairson Costa wrote:
> Signed-off-by: Wander Lairson Costa <wander@redhat.com>
> Fixes: 2279f540ea7d ("sched/deadline: Fix priority inheritance with multiple scheduling classes")

Your fix makes sense to me. I only wonder, however, whether it
actually fixes 295d6d5e37360 ("sched/deadline: Fix switching to
-deadline") rather than the change you reference above?

Thanks,
Juri
Re: [PATCH 1/3] sched/deadline: Fix warning in migrate_enable for boosted tasks
Posted by Wander Lairson Costa 1 year, 6 months ago
On Tue, Jul 23, 2024 at 5:50 AM Juri Lelli <juri.lelli@redhat.com> wrote:
>
> Hi Wander,
>
> On 22/07/24 10:29, Wander Lairson Costa wrote:
> > Signed-off-by: Wander Lairson Costa <wander@redhat.com>
> > Fixes: 2279f540ea7d ("sched/deadline: Fix priority inheritance with multiple scheduling classes")
>
> Your fix makes sense to me. I only wonder, however, whether it
> actually fixes 295d6d5e37360 ("sched/deadline: Fix switching to
> -deadline") rather than the change you reference above?
>

That makes more sense, thanks.

> Thanks,
> Juri
>