Date: Tue, 22 Aug 2023 19:06:51 +0800
From: Tio Zhang
Subject: [PATCH v2] workqueue: let WORKER_CPU_INTENSIVE be included in watchdog
Message-ID: <20230822110609.GA3702@didi-ThinkCentre-M930t-N000>
In-Reply-To: <202308142140.cf9be57a-oliver.sang@intel.com>
Mail-Followup-To: tj@kernel.org, linux-kernel@vger.kernel.org, jiangshanlai@gmail.com, zyhtheonly@gmail.com, zyhtheonly@yeah.net
X-Mailing-List: linux-kernel@vger.kernel.org

When a pool has a worker with WORKER_CPU_INTENSIVE set while its other
workers are mostly idle, pool->worklist is empty most of the time, so the
watchdog skips the pool and the intensive work item escapes its check.
As a result, the watchdog can fail to detect a WQ_CPU_INTENSIVE work item
that runs forever.

Also, after commit '616db8779b1e3f93075df691432cccc5ef3c3ba0', workers
running potentially intensive work items are automatically converted to
WORKER_CPU_INTENSIVE. This may cause the watchdog to miss every work item
that is potentially running forever.

Signed-off-by: Tio Zhang
---
 kernel/workqueue.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 02a8f402eeb5..564d96c38d4d 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -6277,13 +6277,29 @@ static void wq_watchdog_timer_fn(struct timer_list *unused)
 	if (!thresh)
 		return;
 
-	rcu_read_lock();
+	mutex_lock(&wq_pool_mutex);
 
 	for_each_pool(pool, pi) {
+		struct worker *worker;
 		unsigned long pool_ts, touched, ts;
+		bool check_intensive = false;
 
 		pool->cpu_stall = false;
-		if (list_empty(&pool->worklist))
+
+		/* Not sure whether WORKER_UNBOUND should be included
+		 * as well, since letting an unbound work item run for
+		 * more than, e.g., 30 seconds also seems unacceptable.
+		 */
+		mutex_lock(&wq_pool_attach_mutex);
+		for_each_pool_worker(worker, pool) {
+			if (worker->flags & WORKER_CPU_INTENSIVE) {
+				check_intensive = true;
+				break;
+			}
+		}
+		mutex_unlock(&wq_pool_attach_mutex);
+
+		if (list_empty(&pool->worklist) && !check_intensive)
 			continue;
 
 		/*
@@ -6320,7 +6336,7 @@ static void wq_watchdog_timer_fn(struct timer_list *unused)
 
 	}
 
-	rcu_read_unlock();
+	mutex_unlock(&wq_pool_mutex);
 
 	if (lockup_detected)
 		show_all_workqueues();
-- 
2.17.1
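
The change to the skip condition can be illustrated outside the kernel.
What follows is a minimal userspace sketch, not kernel code: the sketch_*
types and functions, the nr_pending counter standing in for
pool->worklist, and the flag value are all assumptions made for
illustration. It only contrasts the old rule (skip a pool whose worklist
is empty) with the rule proposed here (also check a pool that has a
WORKER_CPU_INTENSIVE worker), and it leaves out locking entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value for this sketch; not taken from the patch. */
#define SKETCH_WORKER_CPU_INTENSIVE (1u << 6)

/* Hypothetical stand-in for struct worker: only the flags matter here. */
struct sketch_worker {
	unsigned int flags;
};

/* Hypothetical stand-in for struct worker_pool. */
struct sketch_pool {
	size_t nr_pending;                   /* 0 models list_empty(&pool->worklist) */
	const struct sketch_worker *workers; /* workers attached to the pool */
	size_t nr_workers;
};

/* Models the for_each_pool_worker() scan added by the patch. */
static bool sketch_pool_has_intensive_worker(const struct sketch_pool *pool)
{
	for (size_t i = 0; i < pool->nr_workers; i++) {
		if (pool->workers[i].flags & SKETCH_WORKER_CPU_INTENSIVE)
			return true;
	}
	return false;
}

/* Old rule: a pool is examined only if its worklist is non-empty. */
static bool sketch_old_should_check(const struct sketch_pool *pool)
{
	assert(pool != NULL);
	return pool->nr_pending != 0;
}

/* Proposed rule: also examine a pool that has a CPU-intensive worker. */
static bool sketch_new_should_check(const struct sketch_pool *pool)
{
	assert(pool != NULL);
	return pool->nr_pending != 0 || sketch_pool_has_intensive_worker(pool);
}
```

For a pool with nothing pending and one worker flagged
SKETCH_WORKER_CPU_INTENSIVE, sketch_old_should_check() returns false and
sketch_new_should_check() returns true, which is the case described in
the changelog.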