[PATCH 2/4] selftests/sched_ext: enq_immed: fix IMMED reenqueue livelock

zhidao su posted 4 patches 6 days, 22 hours ago
When the IMMED slow path fires, ops.enqueue() is called with SCX_ENQ_REENQ
set and SCX_TASK_REENQ_IMMED as the reenqueue reason.  The original code
fell through to the normal enqueue path, which re-inserted the task into
SCX_DSQ_LOCAL_ON|0 (CPU 0's local DSQ) with SCX_ENQ_IMMED still present.

This immediately re-triggered the slow path, which called ops.enqueue()
again, creating a cycle that never terminates on its own.  After
SCX_REENQ_LOCAL_MAX_REPEAT (256) iterations, the kernel aborts with BUG_ON.

Additionally, the dispatch handler consumed from SCX_DSQ_GLOBAL even
though no tasks were ever placed there, so it was a no-op.

Fix by redirecting reenqueued tasks to SCX_DSQ_GLOBAL so they escape
the CPU 0 local DSQ orbit, breaking the cycle.  Remove the
scx_bpf_dsq_move_to_local(SCX_DSQ_GLOBAL, 0) call from dispatch: the
core consumes SCX_DSQ_GLOBAL on its own before invoking ops.dispatch(),
so the explicit call is redundant.

Fixes: c50dcf533149 ("selftests/sched_ext: Add tests for SCX_ENQ_IMMED and scx_bpf_dsq_reenq()")
Signed-off-by: zhidao su <suzhidao@xiaomi.com>
---
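Note for reviewers: the cycle described in the commit message can be
illustrated with a small user-space model.  This is only a sketch under
stated assumptions, not kernel or BPF code: every model_* and MODEL_*
name is hypothetical, CPU 0 is assumed to be permanently busy so an IMMED
insert into its local DSQ always bounces back, and the 256 limit is taken
from the commit message.

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
#define MODEL_MAX_REPEAT 256	/* the repeat limit quoted in the commit message */

enum model_dsq { MODEL_DSQ_LOCAL_CPU0, MODEL_DSQ_GLOBAL };

/* An enqueue policy picks a target DSQ, knowing whether this is a reenqueue. */
typedef enum model_dsq (*model_enqueue_fn)(bool is_reenq);

/* Old behaviour: reenqueued tasks fall through and are pinned to CPU 0 again. */
static enum model_dsq model_enqueue_old(bool is_reenq)
{
	(void)is_reenq;
	return MODEL_DSQ_LOCAL_CPU0;
}

/* Fixed behaviour: reenqueued tasks are redirected to the global DSQ. */
static enum model_dsq model_enqueue_fixed(bool is_reenq)
{
	return is_reenq ? MODEL_DSQ_GLOBAL : MODEL_DSQ_LOCAL_CPU0;
}

/*
 * Return the number of reenqueues needed before the task leaves the
 * CPU 0 local DSQ, or -1 if the repeat limit is reached first.
 */
static int model_run(model_enqueue_fn enqueue)
{
	int nr_reenq = 0;
	bool is_reenq = false;

	for (;;) {
		if (enqueue(is_reenq) == MODEL_DSQ_GLOBAL)
			return nr_reenq;	/* task escaped the cycle */
		if (nr_reenq == MODEL_MAX_REPEAT)
			return -1;		/* limit hit, the kernel would abort here */
		/* Assumed: the IMMED insert bounces, triggering another reenqueue. */
		nr_reenq++;
		is_reenq = true;
	}
}
```

In this model, model_run(model_enqueue_old) hits the limit and returns -1,
while model_run(model_enqueue_fixed) returns after a single reenqueue.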
 tools/testing/selftests/sched_ext/enq_immed.bpf.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/sched_ext/enq_immed.bpf.c b/tools/testing/selftests/sched_ext/enq_immed.bpf.c
index 805dd0256218..a006f07334ef 100644
--- a/tools/testing/selftests/sched_ext/enq_immed.bpf.c
+++ b/tools/testing/selftests/sched_ext/enq_immed.bpf.c
@@ -35,6 +35,15 @@ void BPF_STRUCT_OPS(enq_immed_enqueue, struct task_struct *p, u64 enq_flags)
 
 		if (reason == SCX_TASK_REENQ_IMMED)
 			__sync_fetch_and_add(&nr_immed_reenq, 1);
+
+		/*
+		 * The slow path re-enqueues IMMED tasks that couldn't run
+		 * immediately.  Avoid re-pinning them to CPU 0's local DSQ,
+		 * which would trigger another slow-path cycle (livelock).
+		 * Send them to the global DSQ instead.
+		 */
+		scx_bpf_dsq_insert(p, SCX_DSQ_GLOBAL, SCX_SLICE_DFL, enq_flags);
+		return;
 	}
 
 	if (p->tgid == (pid_t)test_tgid)
@@ -47,7 +56,6 @@ void BPF_STRUCT_OPS(enq_immed_enqueue, struct task_struct *p, u64 enq_flags)
 
 void BPF_STRUCT_OPS(enq_immed_dispatch, s32 cpu, struct task_struct *prev)
 {
-	scx_bpf_dsq_move_to_local(SCX_DSQ_GLOBAL, 0);
 }
 
 void BPF_STRUCT_OPS(enq_immed_exit, struct scx_exit_info *ei)
-- 
2.43.0