From: zhidao su <suzhidao@xiaomi.com>
find_proxy_task() follows the blocked_on chain with:
for (p = donor; task_is_blocked(p); p = owner)
The existing WARN_ON(owner == p) only detects immediate self-loops
(a task waiting on a mutex it already owns). It does not detect
multi-task cycles: if tasks A and B form a cycle where A waits on
B's mutex and B waits on A's mutex, the chain traversal loops forever
between A and B, hanging the CPU indefinitely while holding rq->lock.
The scenario is real under proxy execution (PE): mutex-blocked tasks are
kept on the runqueue (try_to_block_task() with should_block=false), so
both A and B remain selectable by pick_next_task(). When A is selected
as donor,
find_proxy_task() follows A->mutex_B->owner=B->mutex_A->owner=A->...
with no termination condition for cycles.
rt-mutex guards against the same problem with max_lock_depth (default
1024), printing a warning and returning -EDEADLK when the chain is too
deep.
Add a chain_depth counter with MAX_PROXY_CHAIN_DEPTH=64. When exceeded,
emit WARN_ONCE and call proxy_resched_idle() to schedule idle briefly,
consistent with how other unresolvable states are handled in the
function (e.g., owner migrating, curr_in_chain bailouts). This keeps
the kernel healthy without spinning; the deadlock resolution is the
caller's problem.
Tested with a built-in boot-param test (pe_cycle_test) that creates two
kthreads on CPU 0 each holding one kernel mutex while trying to acquire
the other, forming an A->B->A deadlock cycle.
With this fix:
[ 111.758150] sched/pe: proxy chain depth exceeded 64, possible deadlock cycle involving pid 120
[ 111.758150] WARNING: CPU: 0 PID: 119 at kernel/sched/core.c:7339 __schedule+0x1e6e/0x1e80
...
[ 112.694277] pe_cycle_test: still alive after 1s (CPU not hung)
Without this fix, the NMI watchdog (nmi_watchdog=1, watchdog_thresh=15)
fires a hard LOCKUP on CPU 0 with RIP in do_raw_spin_lock, called from
__schedule, confirming the CPU spins inside find_proxy_task() holding
rq->lock with no forward progress:
[ 109.951781] watchdog: CPU0: Watchdog detected hard LOCKUP on cpu 0
[ 109.951781] RIP: 0010:do_raw_spin_lock+0x3e/0xb0
[ 109.951781] Call Trace:
[ 109.951781] __schedule+0x11e7/0x1e10
[ 109.951781] schedule_preempt_disabled+0x18/0x30
[ 109.951781] __mutex_lock+0x6f0/0xac0
[ 109.951781] pe_test_thread_a+0x9c/0xe0
Fixes: 7de9d4f94638 ("sched: Start blocked_on chain processing in find_proxy_task()")
Signed-off-by: zhidao su <suzhidao@xiaomi.com>
---
kernel/sched/core.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3f3425c6b2f2..bafb59432f7f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7310,6 +7310,17 @@ DEFINE_LOCK_GUARD_1(blocked_on_lock, struct blocked_on_lock,
* Returns the task that is going to be used as execution context (the one
* that is actually going to be run on cpu_of(rq)).
*/
+/*
+ * Limit proxy chain traversal depth to avoid infinite loops in pathological
+ * cases (e.g., A waits for B's mutex while B waits for A's mutex). The
+ * existing WARN_ON(owner == p) only catches immediate self-loops; multi-task
+ * cycles like A->B->A are not detected without a depth counter.
+ *
+ * rt-mutex uses a similar guard (max_lock_depth = 1024). We use a smaller
+ * limit since proxy chains are expected to be short in practice.
+ */
+#define MAX_PROXY_CHAIN_DEPTH 64
+
static struct task_struct *
find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
__must_hold(__rq_lockp(rq))
@@ -7318,11 +7329,17 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
struct task_struct *owner = NULL;
bool curr_in_chain = false;
int this_cpu = cpu_of(rq);
+ int chain_depth = 0;
struct task_struct *p;
int owner_cpu;
/* Follow blocked_on chain. */
for (p = donor; task_is_blocked(p); p = owner) {
+ if (++chain_depth > MAX_PROXY_CHAIN_DEPTH) {
+ WARN_ONCE(1, "sched/pe: proxy chain depth exceeded %d, possible deadlock cycle involving pid %d\n",
+ MAX_PROXY_CHAIN_DEPTH, p->pid);
+ return proxy_resched_idle(rq);
+ }
/* copy the entire blocked_on structure */
raw_spin_lock(&p->blocked_lock);
bo = p->blocked_on;
--
2.43.0