From: soolaugust@gmail.com
To: John Stultz
Cc: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org, zhidao su
Subject: [PATCH] sched/proxy_exec: Limit find_proxy_task() chain depth to prevent CPU hang
Date: Tue, 14 Apr 2026 13:36:25 +0800
Message-ID: <20260414053625.3582936-1-soolaugust@gmail.com>

From: zhidao su

find_proxy_task() follows the blocked_on chain with:

	for (p = donor; task_is_blocked(p); p = owner)

The existing WARN_ON(owner == p) only detects immediate self-loops (a task
waiting on a mutex it already owns). It does not detect multi-task cycles:
if tasks A and B form a cycle where A waits on B's mutex and B waits on A's
mutex, the chain traversal loops forever between A and B, hanging the CPU
indefinitely while holding rq->lock.

The scenario is real under PE: mutex-blocked tasks are kept on the
runqueue (try_to_block_task() with should_block=false), so both A and B
remain selectable by pick_next_task(). When A is selected as donor,
find_proxy_task() follows A->mutex_B->owner=B->mutex_A->owner=A->... with
no termination condition for cycles.

rt-mutex guards against the same problem with max_lock_depth (default
1024), printing a warning and returning -EDEADLK when the chain is too
deep.

Add a chain_depth counter with MAX_PROXY_CHAIN_DEPTH=64.
When the limit is exceeded, emit WARN_ONCE and call proxy_resched_idle()
to schedule idle briefly, consistent with how other unresolvable states are
handled in the function (e.g., owner migrating, curr_in_chain bailouts).
This stops the CPU from spinning with rq->lock held; resolving the deadlock
itself remains the caller's problem.

Tested with a built-in boot-param test (pe_cycle_test) that creates two
kthreads on CPU 0, each holding one kernel mutex while trying to acquire
the other, forming an A->B->A deadlock cycle. With this fix:

  [ 111.758150] sched/pe: proxy chain depth exceeded 64, possible deadlock cycle involving pid 120
  [ 111.758150] WARNING: CPU: 0 PID: 119 at kernel/sched/core.c:7339 __schedule+0x1e6e/0x1e80
  ...
  [ 112.694277] pe_cycle_test: still alive after 1s (CPU not hung)

Without this fix, the NMI watchdog (nmi_watchdog=1, watchdog_thresh=15)
reports a hard LOCKUP on CPU 0 with RIP in do_raw_spin_lock, called from
__schedule, confirming that the CPU spins inside find_proxy_task() holding
rq->lock with no forward progress:

  [ 109.951781] watchdog: CPU0: Watchdog detected hard LOCKUP on cpu 0
  [ 109.951781] RIP: 0010:do_raw_spin_lock+0x3e/0xb0
  [ 109.951781] Call Trace:
  [ 109.951781] __schedule+0x11e7/0x1e10
  [ 109.951781] schedule_preempt_disabled+0x18/0x30
  [ 109.951781] __mutex_lock+0x6f0/0xac0
  [ 109.951781] pe_test_thread_a+0x9c/0xe0

Fixes: 7de9d4f94638 ("sched: Start blocked_on chain processing in find_proxy_task()")
Signed-off-by: zhidao su
---
 kernel/sched/core.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3f3425c6b2f2..bafb59432f7f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7310,6 +7310,17 @@ DEFINE_LOCK_GUARD_1(blocked_on_lock, struct blocked_on_lock,
  * Returns the task that is going to be used as execution context (the one
  * that is actually going to be run on cpu_of(rq)).
  */
+/*
+ * Limit proxy chain traversal depth to avoid infinite loops in pathological
+ * cases (e.g., A waits for B's mutex while B waits for A's mutex). The
+ * existing WARN_ON(owner == p) only catches immediate self-loops; multi-task
+ * cycles like A->B->A are not detected without a depth counter.
+ *
+ * rt-mutex uses a similar guard (max_lock_depth = 1024). We use a smaller
+ * limit since proxy chains are expected to be short in practice.
+ */
+#define MAX_PROXY_CHAIN_DEPTH 64
+
 static struct task_struct *
 find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 	__must_hold(__rq_lockp(rq))
@@ -7318,11 +7329,17 @@ find_proxy_task(struct rq *rq, struct task_struct *donor, struct rq_flags *rf)
 	struct task_struct *owner = NULL;
 	bool curr_in_chain = false;
 	int this_cpu = cpu_of(rq);
+	int chain_depth = 0;
 	struct task_struct *p;
 	int owner_cpu;
 
 	/* Follow blocked_on chain. */
 	for (p = donor; task_is_blocked(p); p = owner) {
+		if (++chain_depth > MAX_PROXY_CHAIN_DEPTH) {
+			WARN_ONCE(1, "sched/pe: proxy chain depth exceeded %d, possible deadlock cycle involving pid %d\n",
+				  MAX_PROXY_CHAIN_DEPTH, p->pid);
+			return proxy_resched_idle(rq);
+		}
 		/* copy the entire blocked_on structure */
 		raw_spin_lock(&p->blocked_lock);
 		bo = p->blocked_on;
-- 
2.43.0
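
For readers who want to try the depth-limit idea outside the kernel, here
is a minimal userspace sketch of a bounded chain walk. It is not the kernel
code: struct toy_task, blocked_on_owner, toy_find_runnable(),
TOY_MAX_CHAIN_DEPTH and the two helper functions are hypothetical names made
up for illustration, and the sketch ignores locking, migration and
everything else find_proxy_task() has to deal with. It only shows that a
counter of the same shape (++depth > limit, checked at the top of the loop)
terminates on an A->B->A cycle while leaving short finite chains untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
#define TOY_MAX_CHAIN_DEPTH 64

struct toy_task {
	int pid;
	/* Owner of the mutex this task waits on, or NULL if not blocked. */
	struct toy_task *blocked_on_owner;
};

/*
 * Walk the owner chain starting at donor. Returns the first task that is
 * not blocked, or NULL once more than TOY_MAX_CHAIN_DEPTH blocked tasks
 * have been visited (treated as a possible cycle).
 */
static struct toy_task *toy_find_runnable(struct toy_task *donor)
{
	struct toy_task *p;
	int depth = 0;

	assert(donor != NULL);

	for (p = donor; p->blocked_on_owner != NULL; p = p->blocked_on_owner) {
		if (++depth > TOY_MAX_CHAIN_DEPTH)
			return NULL;
	}
	return p;
}

/* Builds an A->B->A cycle; returns 1 if the walk bailed out, 0 otherwise. */
static int toy_cycle_is_caught(void)
{
	struct toy_task a = { .pid = 1, .blocked_on_owner = NULL };
	struct toy_task b = { .pid = 2, .blocked_on_owner = &a };

	a.blocked_on_owner = &b;
	return toy_find_runnable(&a) == NULL;
}

/* Builds a finite chain A->B->C with C runnable; returns the pid reached. */
static int toy_chain_end_pid(void)
{
	struct toy_task c = { .pid = 3, .blocked_on_owner = NULL };
	struct toy_task b = { .pid = 2, .blocked_on_owner = &c };
	struct toy_task a = { .pid = 1, .blocked_on_owner = &b };

	return toy_find_runnable(&a)->pid;
}
```

Calling toy_cycle_is_caught() and toy_chain_end_pid() from a small main()
exercises both paths. A fixed depth limit cannot distinguish a true cycle
from a legitimately long chain, which is why the sketch, like the patch,
only reports a possible cycle.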