From nobody Thu Apr 9 21:54:10 2026 Received: from stravinsky.debian.org (stravinsky.debian.org [82.195.75.108]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 30E5619CC0C for ; Thu, 5 Mar 2026 16:16:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=82.195.75.108 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772727362; cv=none; b=aiEf9isNnspRrDytIIVSu7PURX3PHl5nczuwQ9qgLrFc8noIFqZ5LIkH5B2nLX6/LNUvL8C8bBJ8yr+NIenKSLZKiXf+5dII4Fct4rhFbI1kOSrnLeKnqMyzMpOP7v62dyou7j8UX6rFXDuZH/Wkn6pG3NV2TMNDd0Z/sWFWS54= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772727362; c=relaxed/simple; bh=ayZa6eUX42z6X1IZOjyqGaQSXE2kotFlsEPGUri5Ogw=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=IIQGEcdvUU1YkBhI2KAw8+PppAN3Rxu3Nqh+QpMuQS5UU2sB2xQMjs4grkwxclHFsILRJHTFc0g5kfkBjppZYQM4bxynf6YMag4eJCYg3TRAabG9d0nVy3grGkE7EiTT1B4lDPj5JRRO7t4AFpfeVHTewNi0iMlR5qm9xUH09IM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org; spf=none smtp.mailfrom=debian.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b=sVKDQJkZ; arc=none smtp.client-ip=82.195.75.108 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=debian.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b="sVKDQJkZ" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debian.org; s=smtpauto.stravinsky; h=X-Debian-User:Cc:To:In-Reply-To:References: Message-Id:Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date: From:Reply-To:Content-ID:Content-Description; 
bh=/a+GWLIdcwkHBuwol8H36LzeFXVyS6jDGhiuvAeHeQU=; b=sVKDQJkZ6MaWDuUWoXeAC0C6W1 ZRWAtxO/eM8x2cWUi9o/MfbvBYxYSI3EOkimXiM6jeekYUR8TSU3DE50CgKccrbQLOBVSrpCUONez w8km0GIPVeZrNtwvgk9vUaHHQQFq8poxPU3+A0v6sdu/JzA8BnNamTjYRn+S2XgrSbcYBI/TSAo0O Og8yU/dHs9wCZAMFX9OVgi8y3H4gPAL43H4wXRrmUfX0kr3ECNPfs1uutKswj5AkY0+jYyL7N4MSI M8l7EYGnDx7saj1yyQ1WZ/Kebl+jPm26SweH0DvYlioKJaKY5dD7K7AbHqSkGTbqqLs/AhlOuOina 1KRYLy4g==; Received: from authenticated user by stravinsky.debian.org with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94.2) (envelope-from ) id 1vyBMj-00Gqq8-Cs; Thu, 05 Mar 2026 16:15:57 +0000 From: Breno Leitao Date: Thu, 05 Mar 2026 08:15:37 -0800 Subject: [PATCH v2 1/5] workqueue: Use POOL_BH instead of WQ_BH when checking pool flags Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260305-wqstall_start-at-v2-1-b60863ee0899@debian.org> References: <20260305-wqstall_start-at-v2-0-b60863ee0899@debian.org> In-Reply-To: <20260305-wqstall_start-at-v2-0-b60863ee0899@debian.org> To: Tejun Heo , Lai Jiangshan , Andrew Morton Cc: linux-kernel@vger.kernel.org, Omar Sandoval , Song Liu , Danielle Costantino , kasan-dev@googlegroups.com, Petr Mladek , kernel-team@meta.com, Breno Leitao X-Mailer: b4 0.15-dev-363b9 X-Developer-Signature: v=1; a=openpgp-sha256; l=1075; i=leitao@debian.org; h=from:subject:message-id; bh=ayZa6eUX42z6X1IZOjyqGaQSXE2kotFlsEPGUri5Ogw=; b=owEBbQKS/ZANAwAIATWjk5/8eHdtAcsmYgBpqawzFjrISbfyzAjsU4t6y0T8xqK9Hf+IB0mlb 0pNMWyyX1+JAjMEAAEIAB0WIQSshTmm6PRnAspKQ5s1o5Of/Hh3bQUCaamsMwAKCRA1o5Of/Hh3 bTAiD/4nJSvjlW9FdyS7S6CjWzJrDIyYotlGnLrT3S3LmxR6XhIdnb/3pTSR+x8DxabOBXpTArH FCvfGlvqv//IQ/tn1GV1mPZyrmS8h09o6R6zmAS9NsDzL3zbP6K9y3TBRVjp/nilf/wDl9lI0TQ V8c7LOR9KNyLGWUCVcBmW+1rwFC3Xl9NK7vOoGOZ6gy7ANMa+pfoEPkplv9uM4GUtBvuG+6V4fI 
4C6ZN76r3W6+RhIkdjbkonACIPn/YwkuYBH0jcLRrH9k/q7953uXuhgA8xZs5anoOX+h4hsJw56 PohrSrUlhxgi5JS5R7Bmx6D0Lvm+tZE5hhUjyIe3ZgR3ugYdo1jKkxumvb58VussbosFSIEPT3v PTTMp9Ty5p59OzhCMM6BUycK8+lbHCEQrip7yMaQUWMforoM1lsvNPY6jvwN5N38xqzo6+Z13ue QOxlRIcJUDjsEqPQdDV8HMvtabJ4YSHeU7DtRLfMurGx7IlJEQB9Cypn2c8zTSgeKmGYhjJ4lZm lqnRe6AcvsHdpUc3/9W0CXWY7JUhbzt4N6WsjVp6fBSv6JcFi9RHyFRNDy3dRMOnuSeCLN6/olm 35FDuSAAo4Z+PGcNVnfdqdS/GL8ag/jW22mTYsYDTLR9wIPTIIkT4kFYKwNOdEatKCBW3ZqC0N0 SEjfcE1f9BcyL0Q== X-Developer-Key: i=leitao@debian.org; a=openpgp; fpr=AC8539A6E8F46702CA4A439B35A3939FFC78776D X-Debian-User: leitao pr_cont_worker_id() checks pool->flags against WQ_BH, which is a workqueue-level flag (defined in workqueue.h). Pool flags use a separate namespace with POOL_* constants (defined in workqueue.c). The correct constant is POOL_BH. Both WQ_BH and POOL_BH are defined as (1 << 0) so this has no behavioral impact, but it is semantically wrong and inconsistent with every other pool-level BH check in the file. Fixes: 4cb1ef64609f ("workqueue: Implement BH workqueues to eventually repl= ace tasklets") Signed-off-by: Breno Leitao Acked-by: Song Liu --- kernel/workqueue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index aeaec79bc09c4..1e5b6cb0fbda6 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -6274,7 +6274,7 @@ static void pr_cont_worker_id(struct worker *worker) { struct worker_pool *pool =3D worker->pool; =20 - if (pool->flags & WQ_BH) + if (pool->flags & POOL_BH) pr_cont("bh%s", pool->attrs->nice =3D=3D HIGHPRI_NICE_LEVEL ? 
"-hi" : ""); else --=20 2.47.3 From nobody Thu Apr 9 21:54:10 2026 Received: from stravinsky.debian.org (stravinsky.debian.org [82.195.75.108]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EEA6F19CC0C for ; Thu, 5 Mar 2026 16:16:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=82.195.75.108 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772727366; cv=none; b=h9LBd/R29L+5qI1lLBs5Wj3fnPEU2TiYnZA8up8RexkTpZF+zqowQCZcLRCuQIImyYkZreZyBswMikg4H8HR4Aro2qGrIlk5T1nm8AkAID13aKyr6LNT1d6J1GKgDkhLJx9fxNSCzJdeOOrfhBBoy2v2x7eQuubZi16NQk5Ct8Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772727366; c=relaxed/simple; bh=6EydzvmFj8FtEPXreGAsQ9uPpRkJYUhi0mAr3wk1XNg=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=rNjsY/I5/CYgE55nOCoaALt4HR5CvNMi4z50Ip4sTvaVH+qwEcu5t36K/3bWUz8sxFziE3/cKIj9A0tbs5Ucy35HI6RWcz5w3/iKAIU8Wudm+GQSUkEXv1rvkAwTalIPitAridBpN8kyNSDN3HrTMdl9hZ5CGuUCbGMqEhiIqq4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org; spf=none smtp.mailfrom=debian.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b=ViA7B+iD; arc=none smtp.client-ip=82.195.75.108 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=debian.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b="ViA7B+iD" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debian.org; s=smtpauto.stravinsky; h=X-Debian-User:Cc:To:In-Reply-To:References: Message-Id:Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date: From:Reply-To:Content-ID:Content-Description; 
bh=VZ95mweHQ1vmkIUmkfYL7umUDQqMZc9sVmEAZmR6MeU=; b=ViA7B+iD0neS8Q3gVksWODD1fC H6Iu3mmPvRecMwF4Tmv6ZGqNMjYs5VZg3/S6TxdBpYKXbtEw7hP1XOFn6rqmVmwtk9HIueBz+JLb8 AbAyon6mbX4+9bC/GmeuaDtj4nOg8/GftJ3rAz/dXlGKFb/YkQughKSYnstw5EeOwQrKtj4zpnzLT U5hS6uDEDm4hRQOayWb0UqyLVZQKurmdE8/I6hsMnKN7SlrC8OWkYO7C/MP4iRPqPKveGOPTQnMHq 1ivP2e01OOlJAH4qsUM6FrfjCVD99FrdFtcPnU+5AUZJy58JGAoiCoOAl2RtyGEo1LCZ58TewyRlG EWnAo5iw==; Received: from authenticated user by stravinsky.debian.org with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94.2) (envelope-from ) id 1vyBMn-00GqqJ-CS; Thu, 05 Mar 2026 16:16:01 +0000 From: Breno Leitao Date: Thu, 05 Mar 2026 08:15:38 -0800 Subject: [PATCH v2 2/5] workqueue: Rename pool->watchdog_ts to pool->last_progress_ts Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260305-wqstall_start-at-v2-2-b60863ee0899@debian.org> References: <20260305-wqstall_start-at-v2-0-b60863ee0899@debian.org> In-Reply-To: <20260305-wqstall_start-at-v2-0-b60863ee0899@debian.org> To: Tejun Heo , Lai Jiangshan , Andrew Morton Cc: linux-kernel@vger.kernel.org, Omar Sandoval , Song Liu , Danielle Costantino , kasan-dev@googlegroups.com, Petr Mladek , kernel-team@meta.com, Breno Leitao X-Mailer: b4 0.15-dev-363b9 X-Developer-Signature: v=1; a=openpgp-sha256; l=3162; i=leitao@debian.org; h=from:subject:message-id; bh=6EydzvmFj8FtEPXreGAsQ9uPpRkJYUhi0mAr3wk1XNg=; b=owEBbQKS/ZANAwAIATWjk5/8eHdtAcsmYgBpqawz+HFuZiX3bVrygyQi68X7XuG2pybr6mKiT oX3kEA0dpmJAjMEAAEIAB0WIQSshTmm6PRnAspKQ5s1o5Of/Hh3bQUCaamsMwAKCRA1o5Of/Hh3 bfCDEACoGVew6HYwSbOZ+DkWskdcZiGfBoQr0laHxcHIq0agPX9g1LWjgEKN/3FQ7a1dBeI00EA UL91ZTGNg6YqYZAx//lXujrySlNEb0Ko7ZWnGlCeM9H0WsYVJgAlLIBrcPnDSPb0/V5HtbhY1cg Vja5yOXmlZSCP1ttjNX2lV6E7UXA79WTbmCfZ9gXJ1QpHq5cK9sM8D5Kjn08JdE4wofrOR9C9yO 
NCyTXPglYpPg+HuDurx0ztFC62CTLHiXtKqAxb28RKOtdM3rcxTVmPjJsFUX8xokE8Z3lAHfm/K u3LdZYHkSsJ2Si2ZN+L2+pCgA5QyUZLVXKgSRpskyDnmCJFzSYVLCyJUffBVUP6PYbGaI9fiBJ2 1v1fcqrmw2d/gx9ODlugma2rZkjl/CfM99j0Q1PsE1+bxmeh0mcEU1qqDFO3Jpc8ZCHJYV7tin7 4y1+l4KfHfSCYhM4kJjmOs9IgeopQprH8m6nWyg8fDd/w4KftudZ/uhBUuJZUMzqE9OryzNEGJE WaS3ulMLTLBDvRJI012R2NdCZuKdujposRxtglXHSX2lR5BOygws28gC+m2AEC/X9NdyAdjotTV RGgw22egYGcu+qU+emLklvtw4JW6ecVS1SugupFrPbnu5h8f6MCR5G+y+gtAtMKzWAAWB5Ojl1d uwkufkwMyEyxr6Q== X-Developer-Key: i=leitao@debian.org; a=openpgp; fpr=AC8539A6E8F46702CA4A439B35A3939FFC78776D X-Debian-User: leitao The watchdog_ts name doesn't convey what the timestamp actually tracks. This field tracks the last time a workqueue got progress. Rename it to last_progress_ts to make it clear that it records when the pool last made forward progress (started processing new work items). No functional change. Signed-off-by: Breno Leitao Acked-by: Song Liu --- kernel/workqueue.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 1e5b6cb0fbda6..687d5c55c6174 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -190,7 +190,7 @@ struct worker_pool { int id; /* I: pool ID */ unsigned int flags; /* L: flags */ =20 - unsigned long watchdog_ts; /* L: watchdog timestamp */ + unsigned long last_progress_ts; /* L: last forward progress timestamp */ bool cpu_stall; /* WD: stalled cpu bound pool */ =20 /* @@ -1697,7 +1697,7 @@ static void __pwq_activate_work(struct pool_workqueue= *pwq, WARN_ON_ONCE(!(*wdb & WORK_STRUCT_INACTIVE)); trace_workqueue_activate_work(work); if (list_empty(&pwq->pool->worklist)) - pwq->pool->watchdog_ts =3D jiffies; + pwq->pool->last_progress_ts =3D jiffies; move_linked_works(work, &pwq->pool->worklist, NULL); __clear_bit(WORK_STRUCT_INACTIVE_BIT, wdb); } @@ -2348,7 +2348,7 @@ static void __queue_work(int cpu, struct workqueue_st= ruct *wq, */ if (list_empty(&pwq->inactive_works) && pwq_tryinc_nr_active(pwq, 
false))= { if (list_empty(&pool->worklist)) - pool->watchdog_ts =3D jiffies; + pool->last_progress_ts =3D jiffies; =20 trace_workqueue_activate_work(work); insert_work(pwq, work, &pool->worklist, work_flags); @@ -3352,7 +3352,7 @@ static void process_scheduled_works(struct worker *wo= rker) while ((work =3D list_first_entry_or_null(&worker->scheduled, struct work_struct, entry))) { if (first) { - worker->pool->watchdog_ts =3D jiffies; + worker->pool->last_progress_ts =3D jiffies; first =3D false; } process_one_work(worker, work); @@ -4850,7 +4850,7 @@ static int init_worker_pool(struct worker_pool *pool) pool->cpu =3D -1; pool->node =3D NUMA_NO_NODE; pool->flags |=3D POOL_DISASSOCIATED; - pool->watchdog_ts =3D jiffies; + pool->last_progress_ts =3D jiffies; INIT_LIST_HEAD(&pool->worklist); INIT_LIST_HEAD(&pool->idle_list); hash_init(pool->busy_hash); @@ -6462,7 +6462,7 @@ static void show_one_worker_pool(struct worker_pool *= pool) =20 /* How long the first pending work is waiting for a worker. 
*/ if (!list_empty(&pool->worklist)) - hung =3D jiffies_to_msecs(jiffies - pool->watchdog_ts) / 1000; + hung =3D jiffies_to_msecs(jiffies - pool->last_progress_ts) / 1000; =20 /* * Defer printing to avoid deadlocks in console drivers that @@ -7691,7 +7691,7 @@ static void wq_watchdog_timer_fn(struct timer_list *u= nused) touched =3D READ_ONCE(per_cpu(wq_watchdog_touched_cpu, pool->cpu)); else touched =3D READ_ONCE(wq_watchdog_touched); - pool_ts =3D READ_ONCE(pool->watchdog_ts); + pool_ts =3D READ_ONCE(pool->last_progress_ts); =20 if (time_after(pool_ts, touched)) ts =3D pool_ts; --=20 2.47.3 From nobody Thu Apr 9 21:54:10 2026 Received: from stravinsky.debian.org (stravinsky.debian.org [82.195.75.108]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 924F63CF679 for ; Thu, 5 Mar 2026 16:16:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=82.195.75.108 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772727369; cv=none; b=cr6C1a5edFGHy/9JgzL5GVqRwwfJYU2PDBbwGAM3Aqpui6q9AkzAlE2pUs7q6xW5hyVlnxCfCyx+dFWM3Zc42D/r2Z3FmvB7iiptH3eyLXt/vLBaG0wUCaTvf2XQtDhiYUrfsTvhp1rKt8JGvy906KTH4XX8hXuCnJmFdSbjv3U= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772727369; c=relaxed/simple; bh=6FexgOl/YVWpjj0jIRoVNTpnAdN4iYrWi4l+UFPFdMU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=I1l0ZORnOPqNdezzRHfLZjnNyxIRmORZ6TTrFWxuhOcM/1fGQ2LRSWa/VrpcgsnlDwCXLXOFqdUt2MONezFp6VRKLJ5yUZMcMxuu37vhKzIk03uPqKr+lLMTQVJ5wNG4O3072DpYj9xssKGu/9iB9gzaUj8BGx6RvtR0m0PB6n0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org; spf=none smtp.mailfrom=debian.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b=mgME5aVF; arc=none smtp.client-ip=82.195.75.108 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=debian.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b="mgME5aVF" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debian.org; s=smtpauto.stravinsky; h=X-Debian-User:Cc:To:In-Reply-To:References: Message-Id:Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date: From:Reply-To:Content-ID:Content-Description; bh=/p5loGM8jok20LIPIaiNcteQRMdR3qdR3xO0RfQ3suo=; b=mgME5aVFrmQCyqUATCjnDUHcrO BwFBXbJL31mp67Xp5TnMKWpPbLVp4qTk3GmroY04gNmyLk7fujcn1U2tQJ86eo6klesyNYAx9yyeW Q7QvLoVM/1mC854UXsbodF/v4avgh+LeB1haxNY2F2DK6ZvwuQa+pM3CBKxB1WYgqghCNa0PfKbuo FexRwW45rI4OsJ4nKK7Nl5vCeg42ajSmC9z2kl6kZnB3sz6Lv7hbCebh9m/yb0qiL/v1EDnlS2vaj 5Y0qNdOPQDzwA7SPyQ7oNsm22ELCIb0Kz4RebXS7T5kYUB2DqwptxFPnkr1uNa/TncS/c+nVsVpOh QP2oUH8A==; Received: from authenticated user by stravinsky.debian.org with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94.2) (envelope-from ) id 1vyBMr-00Gqqh-8x; Thu, 05 Mar 2026 16:16:05 +0000 From: Breno Leitao Date: Thu, 05 Mar 2026 08:15:39 -0800 Subject: [PATCH v2 3/5] workqueue: Show in-flight work item duration in stall diagnostics Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260305-wqstall_start-at-v2-3-b60863ee0899@debian.org> References: <20260305-wqstall_start-at-v2-0-b60863ee0899@debian.org> In-Reply-To: <20260305-wqstall_start-at-v2-0-b60863ee0899@debian.org> To: Tejun Heo , Lai Jiangshan , Andrew Morton Cc: linux-kernel@vger.kernel.org, Omar Sandoval , Song Liu , Danielle Costantino , kasan-dev@googlegroups.com, Petr Mladek , kernel-team@meta.com, Breno Leitao X-Mailer: b4 0.15-dev-363b9 
X-Developer-Signature: v=1; a=openpgp-sha256; l=2206; i=leitao@debian.org; h=from:subject:message-id; bh=6FexgOl/YVWpjj0jIRoVNTpnAdN4iYrWi4l+UFPFdMU=; b=owEBbQKS/ZANAwAIATWjk5/8eHdtAcsmYgBpqaw0TAz/Z05WDXEWmt2FKYK+iZ/e261ng5cVl LBj4qKqu+aJAjMEAAEIAB0WIQSshTmm6PRnAspKQ5s1o5Of/Hh3bQUCaamsNAAKCRA1o5Of/Hh3 bXauD/9uatHOJbfvUmvQ+aZQo6ietoqjfzHWNG7lgcm5OuGbOWKexKTQLk5uDmFDm3AatM1NjSd r6qG6KO6iUCrAlQKhiMGv8cqxURUdx1WwwvaBLqIbmRVD0jDFv5nGrUN013v9lw+t2dfGUkx+8r 9FxPQXQIWKqsVF4yhVhzDnAnbtsgnrN5u7tRpxPpM7C5jVKcpBn0iFyoZsF7b+PBSN9KlkKCh/9 bMnhamRCLxjMO3xomfcfVcfjjkzdV+uslmYsgDUbVa7EMkokG55p6FU92mTH5/9PhizwEmyEk9E CRSAy53kT/XvBtVCqRnRqH+jujw4ClC/19NftRVpLFu6mkwjQKvwi3QF5U1fWBFTiDKz3jkG3oB l2pqKDibvfJbu38oSo1aABvGiMyKEVNBZgBILv3uql2P6S0h5IiwEYGJMR1VNO781kXUuM6UX4T kikn4vktRTgdoqQm0/MM9vrwcc2yUqseMbxZPlH7GfIJlLgGb9thuuGoFYybVy766Fb0KI+iMvv BE6LsOBRHTJ0wHELnH+Ne3Nj9cWb5G7sen7+MQD9xEQwCNOiZ1cKeASgMPV0xLoyCYERG+SUTGk eY/56XmW8OtuE0eu+cxvwyOIqAi1w2fDGXNP8S3ajkou5ddvi2wWk3eNM6k0V7N1CRs/tL87cpC sbC2y3cAhr+0MUQ== X-Developer-Key: i=leitao@debian.org; a=openpgp; fpr=AC8539A6E8F46702CA4A439B35A3939FFC78776D X-Debian-User: leitao When diagnosing workqueue stalls, knowing how long each in-flight work item has been executing is valuable. Add a current_start timestamp (jiffies) to struct worker, set it when a work item begins execution in process_one_work(), and print the elapsed wall-clock time in show_pwq(). Unlike current_at (which tracks CPU runtime and resets on wakeup for CPU-intensive detection), current_start is never reset because the diagnostic cares about total wall-clock time including sleeps. 
Before: in-flight: 165:stall_work_fn [wq_stall] After: in-flight: 165:stall_work_fn [wq_stall] for 100s Signed-off-by: Breno Leitao Acked-by: Song Liu --- kernel/workqueue.c | 3 +++ kernel/workqueue_internal.h | 1 + 2 files changed, 4 insertions(+) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 687d5c55c6174..56d8af13843f8 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -3204,6 +3204,7 @@ __acquires(&pool->lock) worker->current_pwq =3D pwq; if (worker->task) worker->current_at =3D worker->task->se.sum_exec_runtime; + worker->current_start =3D jiffies; work_data =3D *work_data_bits(work); worker->current_color =3D get_work_color(work_data); =20 @@ -6359,6 +6360,8 @@ static void show_pwq(struct pool_workqueue *pwq) pr_cont(" %s", comma ? "," : ""); pr_cont_worker_id(worker); pr_cont(":%ps", worker->current_func); + pr_cont(" for %us", + jiffies_to_msecs(jiffies - worker->current_start) / 1000); list_for_each_entry(work, &worker->scheduled, entry) pr_cont_work(false, work, &pcws); pr_cont_work_flush(comma, (work_func_t)-1L, &pcws); diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h index f6275944ada77..8def1ddc5a1bf 100644 --- a/kernel/workqueue_internal.h +++ b/kernel/workqueue_internal.h @@ -32,6 +32,7 @@ struct worker { work_func_t current_func; /* K: function */ struct pool_workqueue *current_pwq; /* K: pwq */ u64 current_at; /* K: runtime at start or last wakeup */ + unsigned long current_start; /* K: start time of current work item */ unsigned int current_color; /* K: color */ =20 int sleeping; /* S: is worker sleeping? 
*/ --=20 2.47.3 From nobody Thu Apr 9 21:54:10 2026 Received: from stravinsky.debian.org (stravinsky.debian.org [82.195.75.108]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F0E1B3CE4BB for ; Thu, 5 Mar 2026 16:16:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=82.195.75.108 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772727374; cv=none; b=qpUnNvaTZOtbj+KfNJJCgeZg+wxL3kbo3DXYQZTWmOGcQKVIqxDBi6dM/3Rbetr/e81FFzc8jv0xjJEyNNX+TojunkW1I19BiqtkVUXhWvY0E7lVTkiJaVaoLV/vmRQjuVoMFFxmzhCDxiKiQC/cTeEQCAPzd4Bceiic1asS820= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772727374; c=relaxed/simple; bh=LiVumjyG7bZ7XQH4sr3/408yTD7fFBw48LSnf9KFbjI=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=c4vGFa8dtRBG6VdIFWyQvcAxGSZMbrp5e1SvVFkGXIuGrmhWeQl0qb6npn2W6FBDtz/NTGjzO9ZEhiD+AnDqZJhi1E7EHo9I+8Zmi/8F9AH8n5En6FXmjZ/CWuNb4cr0Haf2q7g7CDeieB1twTTQT9Riao6I8RrxJJa7G6xKMF8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org; spf=none smtp.mailfrom=debian.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b=aapduRpW; arc=none smtp.client-ip=82.195.75.108 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=debian.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b="aapduRpW" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debian.org; s=smtpauto.stravinsky; h=X-Debian-User:Cc:To:In-Reply-To:References: Message-Id:Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date: From:Reply-To:Content-ID:Content-Description; 
bh=jrNtcuD0+LCQ0GXSBo8d49WJRSGDNH67GbmqlwT7i6o=; b=aapduRpWUUO732dfqVD1GdU+cn vR41pUy2dsLD9ysIbQlk6elahXv2lIlSQkHy8oGQb4r3n9ZJnABxsZRBDNaQ8jOKrzSKMhC2jRJaY Yo/dCi9st768hkQOjqBae2HAL/tuaPjQfcsra6K+KMqioiMbQXggFzcMErAq1XrSyegBk3UIsTZG4 6OwXo7i5IWdYkIX3hX4KEf7U9Kuhgvrl0NUzDUD6j6c0dcxjDnvjH7uiwH/yoCnYv8imCL1eHdTTk OIn51HOEtLb/0EgNsqnfVvQlheJrdFnsiw2gPj05CG0c2jr6KP5oslZ3wwQjd+1Gr+FqGZHUGqhUE BNx/Ifwg==; Received: from authenticated user by stravinsky.debian.org with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94.2) (envelope-from ) id 1vyBMv-00Gqr5-8S; Thu, 05 Mar 2026 16:16:09 +0000 From: Breno Leitao Date: Thu, 05 Mar 2026 08:15:40 -0800 Subject: [PATCH v2 4/5] workqueue: Show all busy workers in stall diagnostics Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260305-wqstall_start-at-v2-4-b60863ee0899@debian.org> References: <20260305-wqstall_start-at-v2-0-b60863ee0899@debian.org> In-Reply-To: <20260305-wqstall_start-at-v2-0-b60863ee0899@debian.org> To: Tejun Heo , Lai Jiangshan , Andrew Morton Cc: linux-kernel@vger.kernel.org, Omar Sandoval , Song Liu , Danielle Costantino , kasan-dev@googlegroups.com, Petr Mladek , kernel-team@meta.com, Breno Leitao X-Mailer: b4 0.15-dev-363b9 X-Developer-Signature: v=1; a=openpgp-sha256; l=2985; i=leitao@debian.org; h=from:subject:message-id; bh=LiVumjyG7bZ7XQH4sr3/408yTD7fFBw48LSnf9KFbjI=; b=owEBbQKS/ZANAwAIATWjk5/8eHdtAcsmYgBpqaw0+4KdZ1bZ/6RpgPjb98Fi8CGUtO1g7SggT on9De5+jP+JAjMEAAEIAB0WIQSshTmm6PRnAspKQ5s1o5Of/Hh3bQUCaamsNAAKCRA1o5Of/Hh3 bSaHD/0WfWVWoTO9dH+aT+BJlfv+wRrNM9nWl1eUAw5aSDz69XE2hBRkzbgWKF7ycmt+P5sliJx G5fdRj6tOfTf8YDTTbNudxf/qmYdZ7gAPQ4R75j7wxKEPoBA6Y23XjvF+eGwG8H1W1+qtE5hCzz 4MhqReGF5zXfvKkGWlW5rDnWlGvPJjG0U4aq0tnZQ2S7C9rz0MNL5Wt2Xf6iMMk12e1tfB6FSu9 
d6/yBGagM/9aHFUVH4Mvt5RtVZW6IgFNehEnuc08Wr+HPD+tQ1+5spzRyiG93sgEv49arg8rPlS lcCErKK5lzG+5zUcv5VNqIFePO6oDxu/xKTDGc7Jqo7m1EigED7/r4yoI2wJIsmOKIVun54KxUr BsDOC5KUXDdZsppfkmm5jUnPzwyWyG1wJW4Vp3ozZ6R0yBj7W+GqkydIW0YVjrhrLXDFVdUOTU6 dfVM6w8m/FjsYJ6LNJ3/sjqJfbhSi7IK2VVHUbMwxzaeDYhPOp9fRqSVWXVyDzpHmrBugMxhbWv m3viWlEenK47t+E5fNh0OI4WK6fzquRtvcHmEIBTl/aV+5TTqGHuySOCJ+xTXEplYyWjrBeZNNb fXEUesWeDhbCI4OsIeZ+QP34npN7TlsEaaZPzz+Onf/HsJSCji9c2mjvdDGpcL5N0jcVIrTWqDU EyUOA2gEGX/pIrw== X-Developer-Key: i=leitao@debian.org; a=openpgp; fpr=AC8539A6E8F46702CA4A439B35A3939FFC78776D X-Debian-User: leitao show_cpu_pool_hog() only prints workers whose task is currently running on the CPU (task_is_running()). This misses workers that are busy processing a work item but are sleeping or blocked =E2=80=94 for example, a worker that clears PF_WQ_WORKER and enters wait_event_idle(). Such a worker still occupies a pool slot and prevents progress, yet produces an empty backtrace section in the watchdog output. This is happening on real arm64 systems, where toggle_allocation_gate() IPIs every single CPU in the machine (which lacks NMI), causing workqueue stalls that show empty backtraces because toggle_allocation_gate() is sleeping in wait_event_idle(). Remove the task_is_running() filter so every in-flight worker in the pool's busy_hash is dumped. The busy_hash is protected by pool->lock, which is already held. Signed-off-by: Breno Leitao Acked-by: Song Liu --- kernel/workqueue.c | 28 +++++++++++++--------------- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 56d8af13843f8..09b9ad78d566c 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -7583,9 +7583,9 @@ MODULE_PARM_DESC(panic_on_stall_time, "Panic if stall= exceeds this many seconds =20 /* * Show workers that might prevent the processing of pending work items. - * The only candidates are CPU-bound workers in the running state. 
- * Pending work items should be handled by another idle worker - * in all other situations. + * A busy worker that is not running on the CPU (e.g. sleeping in + * wait_event_idle() with PF_WQ_WORKER cleared) can stall the pool just as + * effectively as a CPU-bound one, so dump every in-flight worker. */ static void show_cpu_pool_hog(struct worker_pool *pool) { @@ -7596,19 +7596,17 @@ static void show_cpu_pool_hog(struct worker_pool *p= ool) raw_spin_lock_irqsave(&pool->lock, irq_flags); =20 hash_for_each(pool->busy_hash, bkt, worker, hentry) { - if (task_is_running(worker->task)) { - /* - * Defer printing to avoid deadlocks in console - * drivers that queue work while holding locks - * also taken in their write paths. - */ - printk_deferred_enter(); + /* + * Defer printing to avoid deadlocks in console + * drivers that queue work while holding locks + * also taken in their write paths. + */ + printk_deferred_enter(); =20 - pr_info("pool %d:\n", pool->id); - sched_show_task(worker->task); + pr_info("pool %d:\n", pool->id); + sched_show_task(worker->task); =20 - printk_deferred_exit(); - } + printk_deferred_exit(); } =20 raw_spin_unlock_irqrestore(&pool->lock, irq_flags); @@ -7619,7 +7617,7 @@ static void show_cpu_pools_hogs(void) struct worker_pool *pool; int pi; =20 - pr_info("Showing backtraces of running workers in stalled CPU-bound worke= r pools:\n"); + pr_info("Showing backtraces of busy workers in stalled CPU-bound worker p= ools:\n"); =20 rcu_read_lock(); =20 --=20 2.47.3 From nobody Thu Apr 9 21:54:10 2026 Received: from stravinsky.debian.org (stravinsky.debian.org [82.195.75.108]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 997E93CE4A5 for ; Thu, 5 Mar 2026 16:16:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=82.195.75.108 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1772727378; cv=none; b=d7kAmRZIZ9SSZK62TdaUQX5vZxsMXrWRN34hsKrwc4rMpa1jBAOLtom8aNU2M3PiKYJs5CX4n/Bpm38jQDCqomYiYHafSR7uQ40iOoeQPPXVsfnXNg+V5OiMPWuA37zpyNK44Jl9mBmVlwkk/HH6TmfFK89unnKsemWixkgYLxI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1772727378; c=relaxed/simple; bh=S4vA+q+VwMb69TSVGfpJnrhNcwW7Ul9gdGBE+SKblT4=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=USwfLcd9r84urVfv2095aQOuFcK5fh3Iq1O2v3W+qDydOplq66r7XHIJABE9YftFYOfboFzH2y03NBMh8ZmKakIPL99tQJHh7WaNfCY+qBO9QW0M3Si5egJpDdKEwe6dHmZrUs5xrJ+j5hmKUz978+4ig3Og7DH3uuDp4fgI4mI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org; spf=none smtp.mailfrom=debian.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b=cQlbazqd; arc=none smtp.client-ip=82.195.75.108 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=debian.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=debian.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=debian.org header.i=@debian.org header.b="cQlbazqd" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=debian.org; s=smtpauto.stravinsky; h=X-Debian-User:Cc:To:In-Reply-To:References: Message-Id:Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date: From:Reply-To:Content-ID:Content-Description; bh=d/8arRnvKp6yW6gD1fp68y2qdqj0ZVn0huQFIoPOgBA=; b=cQlbazqdLYW3dK5kVAM/xInQRI pWNhk2Nola+OSLX+xJUMLBn4McmEAH47i1qQlLeAx5wvIynL/nGE6GjL2xkXXXn56Y9r7ibkkMHMy ligRaXDbf+BVOd2qLakHiv+N5PnJGiaBpTkAns2RhW/CeLWokjB1ZJsOqxOA2Mi/IIKQcr022ZRis PNDgnRMrhYIzvgELuevE9yS49G3ZMBSSsnBWOQcwJs7uA5Owlh3a0WA9iyV8zUars49VNonP811Y4 toQ+Wk8UKcoFTTQ9za8HvksQe8K9U4GZHAObIdIhXTxsqc+553iUKoEXgJJgC3tCM5QUjcG9JylGz 7qL9rc1Q==; Received: from authenticated user by stravinsky.debian.org with esmtpsa 
(TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.94.2) (envelope-from ) id 1vyBMz-00GqrH-4z; Thu, 05 Mar 2026 16:16:13 +0000 From: Breno Leitao Date: Thu, 05 Mar 2026 08:15:41 -0800 Subject: [PATCH v2 5/5] workqueue: Add stall detector sample module Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260305-wqstall_start-at-v2-5-b60863ee0899@debian.org> References: <20260305-wqstall_start-at-v2-0-b60863ee0899@debian.org> In-Reply-To: <20260305-wqstall_start-at-v2-0-b60863ee0899@debian.org> To: Tejun Heo , Lai Jiangshan , Andrew Morton Cc: linux-kernel@vger.kernel.org, Omar Sandoval , Song Liu , Danielle Costantino , kasan-dev@googlegroups.com, Petr Mladek , kernel-team@meta.com, Breno Leitao X-Mailer: b4 0.15-dev-363b9 X-Developer-Signature: v=1; a=openpgp-sha256; l=4422; i=leitao@debian.org; h=from:subject:message-id; bh=S4vA+q+VwMb69TSVGfpJnrhNcwW7Ul9gdGBE+SKblT4=; b=owEBbQKS/ZANAwAIATWjk5/8eHdtAcsmYgBpqaw0SFZKUq3gvc8ICSsX75MQ8b9+vYkk6okaq u/ZIg1naCmJAjMEAAEIAB0WIQSshTmm6PRnAspKQ5s1o5Of/Hh3bQUCaamsNAAKCRA1o5Of/Hh3 bQaDD/92Bw7gTZQBrb/gLXTqCDRVNvQjPb94AXKsIiL9VxSQl4UMzQwlRRaq2IryGPc0WwVu0WU MqA1tVPiXN3y52NaZa1dfrmqTXMdDgneU9SoDX/oA6kdyQA795ZqDQ0ydkG+KhbiHTCQ3vLd6dJ kS2k1rgHUy9pgTQzms+447827ZB4jTVG9JlsEa5auw33GwhOejoDZ65SXLs2w8x9DJycj3gRddj 3jis028FhGLIrfwx1Wfr4yoqvdVu+p6MmBVqvMiGJqOCERHOKf3SM9isnU6/bri1RbzrTh14Bkz 9PN05MLdmYm4KURoT0sKqvGh0vfWg5XuOdErred6fuhJ1xR48K+R7DT+qoA4hi50YhGRo4fVNog ESSMjYp4Z1mhNrhVj3sFdtfVgVhrkpwBO/saPOIcR3s3lMCrzXXA3VBA1wjma2If5sodbvg8HXa ymb1TgUo8JKBnS5eFMp6RITlZaxwD+3oJ7w4OAcJqht+RkmXjkJfsNKKXwZfkA5QIoA8gdRE9fA XgLC7F8CIrMgg/W0do6L8kGvPg1prYBZG+NBjQbum5h8bec/0Cy+wTETqyV99NQoPFhp1BYJrJI ZXwM/Eid4My9KJNBMzNBznvxd+ySTm9ozva7scZinZFMB0Kkwl17YLrC7/EWgBD3QutV2TG1qmG 1XsDxGyyJnV3cAw== X-Developer-Key: i=leitao@debian.org; a=openpgp; 
fpr=AC8539A6E8F46702CA4A439B35A3939FFC78776D X-Debian-User: leitao Add a sample module under samples/workqueue/stall_detector/ that reproduces a workqueue stall caused by PF_WQ_WORKER misuse. The module queues two work items on the same per-CPU pool, then clears PF_WQ_WORKER and sleeps in wait_event_idle(), hiding from the concurrency manager and stalling the second work item indefinitely. This is useful for testing the workqueue watchdog stall diagnostics. Signed-off-by: Breno Leitao Acked-by: Song Liu --- samples/workqueue/stall_detector/Makefile | 1 + samples/workqueue/stall_detector/wq_stall.c | 98 +++++++++++++++++++++++++= ++++ 2 files changed, 99 insertions(+) diff --git a/samples/workqueue/stall_detector/Makefile b/samples/workqueue/= stall_detector/Makefile new file mode 100644 index 0000000000000..8849e85e95bb9 --- /dev/null +++ b/samples/workqueue/stall_detector/Makefile @@ -0,0 +1 @@ +obj-m +=3D wq_stall.o diff --git a/samples/workqueue/stall_detector/wq_stall.c b/samples/workqueu= e/stall_detector/wq_stall.c new file mode 100644 index 0000000000000..6f4a497b18814 --- /dev/null +++ b/samples/workqueue/stall_detector/wq_stall.c @@ -0,0 +1,98 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * wq_stall - Test module for the workqueue stall detector. + * + * Deliberately creates a workqueue stall so the watchdog fires and + * prints diagnostic output. Useful for verifying that the stall + * detector correctly identifies stuck workers and produces useful + * backtraces. + * + * The stall is triggered by clearing PF_WQ_WORKER before sleeping, + * which hides the worker from the concurrency manager. A second + * work item queued on the same pool then sits in the worklist with + * no worker available to process it. + * + * After ~30s the workqueue watchdog fires: + * BUG: workqueue lockup - pool cpus=3DN ... + * + * Build: + * make -C M=3Dsamples/workqueue/stall_detector modules + * + * Copyright (c) 2026 Meta Platforms, Inc. and affiliates. 
+ * Copyright (c) 2026 Breno Leitao
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+static DECLARE_WAIT_QUEUE_HEAD(stall_wq_head);
+static atomic_t wake_condition =3D ATOMIC_INIT(0);
+static struct work_struct stall_work1;
+static struct work_struct stall_work2;
+
+static void stall_work2_fn(struct work_struct *work)
+{
+	pr_info("wq_stall: second work item finally ran\n");
+}
+
+static void stall_work1_fn(struct work_struct *work)
+{
+	pr_info("wq_stall: first work item running on cpu %d\n",
+		raw_smp_processor_id());
+
+	/*
+	 * Queue second item while we're still counted as running
+	 * (pool->nr_running > 0). Since schedule_work() on a per-CPU
+	 * workqueue targets raw_smp_processor_id(), item 2 lands on the
+	 * same pool. __queue_work -> kick_pool -> need_more_worker()
+	 * sees nr_running > 0 and does NOT wake a new worker.
+	 */
+	schedule_work(&stall_work2);
+
+	/*
+	 * Hide from the workqueue concurrency manager. Without
+	 * PF_WQ_WORKER, schedule() won't call wq_worker_sleeping(),
+	 * so nr_running is never decremented and no replacement
+	 * worker is created. Item 2 stays stuck in pool->worklist.
+	 */
+	current->flags &=3D ~PF_WQ_WORKER;
+
+	pr_info("wq_stall: entering wait_event_idle (PF_WQ_WORKER cleared)\n");
+	pr_info("wq_stall: expect 'BUG: workqueue lockup' in ~30-60s\n");
+	wait_event_idle(stall_wq_head, atomic_read(&wake_condition) !=3D 0);
+
+	/* Restore so process_one_work() cleanup works correctly */
+	current->flags |=3D PF_WQ_WORKER;
+	pr_info("wq_stall: woke up, PF_WQ_WORKER restored\n");
+}
+
+static int __init wq_stall_init(void)
+{
+	pr_info("wq_stall: loading\n");
+
+	INIT_WORK(&stall_work1, stall_work1_fn);
+	INIT_WORK(&stall_work2, stall_work2_fn);
+	schedule_work(&stall_work1);
+
+	return 0;
+}
+
+static void __exit wq_stall_exit(void)
+{
+	pr_info("wq_stall: unloading\n");
+	atomic_set(&wake_condition, 1);
+	wake_up(&stall_wq_head);
+	flush_work(&stall_work1);
+	flush_work(&stall_work2);
+	pr_info("wq_stall: all work flushed, module unloaded\n");
+}
+
+module_init(wq_stall_init);
+module_exit(wq_stall_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Reproduce workqueue stall caused by PF_WQ_WORKER misus=
e");
+MODULE_AUTHOR("Breno Leitao ");
--=20
2.47.3
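The stall mechanism described in the wq_stall.c comments above can be modelled in user space. What follows is a minimal sketch, not kernel code: struct toy_pool, TOY_WQ_WORKER, toy_need_more_worker(), toy_queue_work() and toy_worker_sleeps() are hypothetical names invented for illustration. The accounting is a simplified assumption drawn only from those comments: a new worker is woken when work is pending and nr_running is zero, and nr_running is decremented on sleep only for tasks that still carry the worker flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PF_WQ_WORKER; the value is illustrative only. */
#define TOY_WQ_WORKER 0x1u

/* Toy model of the two per-pool counters the sample's comments refer to. */
struct toy_pool {
	int nr_running;	/* workers currently counted as running */
	int nr_pending;	/* work items waiting in the worklist */
};

/* Assumed rule: wake another worker only if work is pending and no worker
 * is counted as running. */
static bool toy_need_more_worker(const struct toy_pool *pool)
{
	return pool->nr_pending > 0 && pool->nr_running == 0;
}

/* Queue one item. Returns true if a worker was woken and took the item. */
static bool toy_queue_work(struct toy_pool *pool)
{
	pool->nr_pending++;
	if (!toy_need_more_worker(pool))
		return false;
	pool->nr_pending--;
	pool->nr_running++;
	return true;
}

/*
 * A busy worker goes to sleep. With the worker flag set, the pool notices,
 * drops nr_running and may hand a pending item to a replacement worker.
 * With the flag cleared, the pool sees nothing and the item stays pending.
 * Returns true if a replacement worker took over a pending item.
 */
static bool toy_worker_sleeps(struct toy_pool *pool, unsigned int task_flags)
{
	assert(pool->nr_running > 0);

	if (!(task_flags & TOY_WQ_WORKER))
		return false;

	pool->nr_running--;
	if (!toy_need_more_worker(pool))
		return false;
	pool->nr_pending--;
	pool->nr_running++;
	return true;
}
```

In this model, queueing a second item while one worker is counted as running never wakes a worker. If that worker then sleeps with the flag cleared, nr_pending stays at 1 indefinitely, which is the condition the watchdog is meant to report. With the flag left set, the same sequence ends with the pending item picked up by a replacement.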