From nobody Fri May 8 00:11:17 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92A92C433F5 for ; Mon, 16 May 2022 01:33:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239060AbiEPBdF (ORCPT ); Sun, 15 May 2022 21:33:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37936 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231853AbiEPBdC (ORCPT ); Sun, 15 May 2022 21:33:02 -0400 Received: from www262.sakura.ne.jp (www262.sakura.ne.jp [202.181.97.72]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D473C10FF4 for ; Sun, 15 May 2022 18:33:00 -0700 (PDT) Received: from fsav413.sakura.ne.jp (fsav413.sakura.ne.jp [133.242.250.112]) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTP id 24G1WmNd085346; Mon, 16 May 2022 10:32:48 +0900 (JST) (envelope-from penguin-kernel@I-love.SAKURA.ne.jp) Received: from www262.sakura.ne.jp (202.181.97.72) by fsav413.sakura.ne.jp (F-Secure/fsigk_smtp/550/fsav413.sakura.ne.jp); Mon, 16 May 2022 10:32:48 +0900 (JST) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/fsav413.sakura.ne.jp) Received: from [192.168.1.9] (M106072142033.v4.enabler.ne.jp [106.72.142.33]) (authenticated bits=0) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTPSA id 24G1WmWw085343 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NO); Mon, 16 May 2022 10:32:48 +0900 (JST) (envelope-from penguin-kernel@I-love.SAKURA.ne.jp) Message-ID: <7b2fecdb-59ae-2c54-5a5b-774ef7054d1b@I-love.SAKURA.ne.jp> Date: Mon, 16 May 2022 10:32:43 +0900 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: [PATCH v2] workqueue: Wrap flush_workqueue() using a macro Content-Language: en-US To: Tejun Heo Cc: LKML References: <6e4ed62e-888b-6e7a-c13d-67656f39ca94@I-love.SAKURA.ne.jp> <738afe71-2983-05d5-f0fc-d94efbdf7634@I-love.SAKURA.ne.jp> From: Tetsuo Handa In-Reply-To: Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Since flush operation synchronously waits for completion, flushing system-wide WQs (e.g. system_wq) might introduce possibility of deadlock due to unexpected locking dependency. Tejun Heo commented at [1] that it makes no sense at all to call flush_workqueue() on the shared WQs as the caller has no idea what it's gonna end up waiting for. Although there is flush_scheduled_work() which flushes system_wq WQ with "Think twice before calling this function! It's very easy to get into trouble if you don't take great care." warning message, syzbot found a circular locking dependency caused by flushing system_wq WQ [2]. Therefore, let's change the direction to that developers had better use their local WQs if flush_scheduled_work()/flush_workqueue(system_*_wq) is inevitable. Steps for converting system-wide WQs into local WQs are explained at [3], and a conversion to stop flushing system-wide WQs is in progress. Now we want some mechanism for preventing developers who are not aware of this conversion from again start flushing system-wide WQs. Since I found that WARN_ON() is complete but awkward approach for teaching developers about this problem, let's use BUILD_BUG_ON() for incomplete but handy approach. For completeness, we will also insert WARN_ON() into flush_workqueue() after all users stopped calling flush_scheduled_work(). Link: https://lore.kernel.org/all/YgnQGZWT%2Fn3VAITX@slm.duckdns.org/ [1] Link: https://syzkaller.appspot.com/bug?extid=3Dbde0f89deacca7c765b8 [2] Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love= .SAKURA.ne.jp [3] Signed-off-by: Tetsuo Handa --- Changes in v2: Use "void (flush_workqueue)(struct workqueue_struct *wq)" and remove "#undef flush_workqueue", suggested by Joe Perches . This patch is expected to be sent to linux.git immediately before the merge window for 5.19 closes, for we need to wait until patches for removing flush_workqueue(system_*_wq) calls reached linux.git during the merge window for 5.19. Due to this ordering dependency, the kernel build bots complain l= ike https://lkml.kernel.org/r/202205021448.6SzhD1Cz-lkp@intel.com , but it seems that simply ignoring these reports is the best choice. If it is inconvenient for you to send pull request twice from your wq.git t= ree (once for immediately after 5.18 is released, the other for immediately bef= ore the merge window for 5.19 closes), I can send pull request for this patch f= rom my tree. In that case, please give me Acked-by: or Reviewed-by: response re= garding this patch. include/linux/workqueue.h | 31 +++++++++++++++++++++++++++++++ kernel/workqueue.c | 2 +- 2 files changed, 32 insertions(+), 1 deletion(-) diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index 7fee9b6cfede..5bad503925de 100644 --- a/include/linux/workqueue.h +++ b/include/linux/workqueue.h @@ -663,4 +663,35 @@ int workqueue_offline_cpu(unsigned int cpu); void __init workqueue_init_early(void); void __init workqueue_init(void); =20 +/* + * Detect attempt to flush system-wide workqueues at compile time when pos= sible. + * See https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-lo= ve.SAKURA.ne.jp for + * reasons and steps for converting system-wide workqueues into local work= queues. + */ +#define flush_workqueue(wq) \ +({ \ + BUILD_BUG_ON_MSG(__builtin_constant_p(&(wq) =3D=3D &system_wq) && \ + &(wq) =3D=3D &system_wq, \ + "Please avoid flushing system_wq."); \ + BUILD_BUG_ON_MSG(__builtin_constant_p(&(wq) =3D=3D &system_highpri_wq) &&= \ + &(wq) =3D=3D &system_highpri_wq, \ + "Please avoid flushing system_highpri_wq."); \ + BUILD_BUG_ON_MSG(__builtin_constant_p(&(wq) =3D=3D &system_long_wq) && \ + &(wq) =3D=3D &system_long_wq, \ + "Please avoid flushing system_long_wq."); \ + BUILD_BUG_ON_MSG(__builtin_constant_p(&(wq) =3D=3D &system_unbound_wq) &&= \ + &(wq) =3D=3D &system_unbound_wq, \ + "Please avoid flushing system_unbound_wq."); \ + BUILD_BUG_ON_MSG(__builtin_constant_p(&(wq) =3D=3D &system_freezable_wq) = && \ + &(wq) =3D=3D &system_freezable_wq, \ + "Please avoid flushing system_freezable_wq."); \ + BUILD_BUG_ON_MSG(__builtin_constant_p(&(wq) =3D=3D &system_power_efficien= t_wq) && \ + &(wq) =3D=3D &system_power_efficient_wq, \ + "Please avoid flushing system_power_efficient_wq."); \ + BUILD_BUG_ON_MSG(__builtin_constant_p(&(wq) =3D=3D &system_freezable_powe= r_efficient_wq) && \ + &(wq) =3D=3D &system_freezable_power_efficient_wq, \ + "Please avoid flushing system_freezable_power_efficient_wq."); \ + flush_workqueue(wq); \ +}) + #endif diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 0d2514b4ff0d..071a4eb7fd94 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -2794,7 +2794,7 @@ static bool flush_workqueue_prep_pwqs(struct workqueu= e_struct *wq, * This function sleeps until all work items which were queued on entry * have finished execution, but it is not livelocked by new incoming ones. */ -void flush_workqueue(struct workqueue_struct *wq) +void (flush_workqueue)(struct workqueue_struct *wq) { struct wq_flusher this_flusher =3D { .list =3D LIST_HEAD_INIT(this_flusher.list), --=20 2.18.4