Hi!

Below is a summary of a discussion about the Workqueue API and cpu isolation
considerations. Details and more information are available here:

        "workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
        https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/

=== Current situation: problems ===

Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, for !WQ_UNBOUND the local CPU is selected.

This leads to different scenarios if a work item is scheduled on an isolated
CPU where the "delay" value is 0 or greater than 0:

        schedule_delayed_work(, 0);

This will be handled by __queue_work(), which will queue the work item on the
current local (isolated) CPU, while:

        schedule_delayed_work(, 1);

will move the timer to a housekeeping CPU and schedule the work there.

Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes use
of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

=== Plan and future plans ===

This patchset is the first step of the refactoring needed to address the
points above; in the long term it will also have a positive impact on cpu
isolation, moving away from per-cpu workqueues in favor of an unbound model.

These are the main steps:
1) API refactoring (that this patch is introducing)
    -   Make the system wq names clearer and more uniform, both per-cpu and
        unbound, to avoid any possible confusion about what should be used.

    -   Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND,
        introduced in this patchset and used on all the callers that are not
        currently using WQ_UNBOUND.

        WQ_UNBOUND will be removed in a future release cycle.

        Most users don't need to be per-cpu because they have no locality
        requirements; because of that, a future step will be to make
        "unbound" the default behavior.

2) Check who really needs to be per-cpu
    -   Remove the WQ_PERCPU flag when it is not strictly required.

3) Add a new API (prefer local cpu)
    -   There are users that don't require local execution, as mentioned
        above; despite that, local execution yields a performance gain.

        This new API will prefer local execution, without requiring it.

=== Introduced Changes by this patchset ===

1) [P1] add system_percpu_wq and system_dfl_wq

        system_wq is a per-CPU workqueue, but its name is not clear.
        system_unbound_wq is to be used when locality is not required.

        Because of that, system_percpu_wq and system_dfl_wq have been
        introduced in order to replace, in the future, system_wq and
        system_unbound_wq.

2) [P2] add new WQ_PERCPU flag

        This patch adds the new WQ_PERCPU flag to explicitly request per-cpu
        behavior. WQ_UNBOUND will be removed in a future release cycle.

3) [P3] Doc change about WQ_PERCPU

        Added a short section about WQ_PERCPU and a Note under WQ_UNBOUND
        mentioning that it will be removed in the future.

---
Changes in v5:
- workqueue(s) early init allocation
- Doc fixes

Changes in v4:
- Take a step back from the previous version, in order to first add the new
  wq(s) and the new flag (WQ_PERCPU), addressing all the other changes later.

Changes in v3:
- The introduction of the new wq(s) and the WQ_PERCPU flag has been moved
  into separate patches (1 for wq(s) and 1 for WQ_PERCPU).
- WQ_PERCPU is now added to all the alloc_workqueue callers in separate
  patches, addressing a few subsystems first (fs, mm, net).

Changes in v2:
- The introduction of WQ_PERCPU has been merged with the alloc_workqueue()
  patch that passes the WQ_PERCPU flag explicitly to every caller.
- (2 drivers) in the code not matched by Coccinelle; WQ_PERCPU added also
  there.
- WQ_PERCPU added to __WQ_BH_ALLOWS.
- queue_work() now prints a warning (pr_warn_once()) if a user is using the
  old wq, and redirects the wrong / old wq to the new one.
- Changes to workqueue.rst about the WQ_PERCPU flag and a Note about the
  future of WQ_UNBOUND.

Marco Crivellari (3):
  Workqueue: add system_percpu_wq and system_dfl_wq
  Workqueue: add new WQ_PERCPU flag
  [Doc] Workqueue: add WQ_PERCPU

 Documentation/core-api/workqueue.rst |  6 ++++++
 include/linux/workqueue.h            |  9 ++++++---
 kernel/workqueue.c                   | 13 +++++++++----
 3 files changed, 21 insertions(+), 7 deletions(-)

--
2.49.0
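
For readers who want the delay = 0 versus delay > 0 asymmetry from the cover
letter spelled out, here is a minimal userspace sketch in C. It is only a
model of the behavior described above, not kernel code: NR_CPUS, the
housekeeping[] table, first_housekeeping_cpu() and model_target_cpu() are
hypothetical names invented for illustration, and the rule that a timer armed
on a housekeeping CPU stays on that CPU is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical layout: CPUs 0-1 are housekeeping, CPUs 2-3 are isolated. */
static const bool housekeeping[NR_CPUS] = { true, true, false, false };

/* Stand-in for "pick some housekeeping CPU"; returns -1 if there is none. */
int first_housekeeping_cpu(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (housekeeping[cpu])
			return cpu;
	return -1;
}

/*
 * Model of where a per-cpu work item runs when it is scheduled from
 * local_cpu with a given delay, following the cover letter:
 *   delay == 0: queued directly on the local CPU, isolated or not.
 *   delay  > 0: a timer is armed first; on an isolated CPU the timer is
 *               moved to a housekeeping CPU and the work runs there.
 */
int model_target_cpu(int local_cpu, unsigned long delay)
{
	assert(local_cpu >= 0 && local_cpu < NR_CPUS);

	if (delay == 0)
		return local_cpu;

	/* Assumption: a housekeeping CPU keeps its own timer. */
	if (housekeeping[local_cpu])
		return local_cpu;

	return first_housekeeping_cpu();
}
```

In this model, model_target_cpu(2, 0) keeps the work on isolated CPU 2, while
model_target_cpu(2, 1) lands on housekeeping CPU 0. That is the inconsistency
the series wants to make explicit through WQ_PERCPU and the renamed system
workqueues.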
On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> Marco Crivellari (3):
>   Workqueue: add system_percpu_wq and system_dfl_wq
>   Workqueue: add new WQ_PERCPU flag
>   [Doc] Workqueue: add WQ_PERCPU

Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
to be separate. Maybe something to consider for future.

Thanks.

--
tejun
On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > Marco Crivellari (3):
> >   Workqueue: add system_percpu_wq and system_dfl_wq
> >   Workqueue: add new WQ_PERCPU flag
> >   [Doc] Workqueue: add WQ_PERCPU
>
> Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> to be separate. Maybe something to consider for future.

If this is for the next merge window, I guess the easiest is to wait for it
before sending patches to other subsystems to convert them?

I guess we could shortcut that with providing a branch that other subsystems
could pull from but that doesn't look convenient...

Thanks.

>
> Thanks.
>
> --
> tejun

--
Frederic Weisbecker
SUSE Labs
On Tue, Jun 17, 2025 at 03:08:30PM +0200, Frederic Weisbecker wrote:
> On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> > On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > > Marco Crivellari (3):
> > >   Workqueue: add system_percpu_wq and system_dfl_wq
> > >   Workqueue: add new WQ_PERCPU flag
> > >   [Doc] Workqueue: add WQ_PERCPU
> >
> > Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> > to be separate. Maybe something to consider for future.
>
> If this is for the next merge window, I guess the easiest is to wait for it
> before sending patches to other subsystems to convert them?
>
> I guess we could shortcut that with providing a branch that other subsystems
> could pull from but that doesn't look convenient...

Oh yeah, I said I was gonna do that and promptly forgot. I'll set up a
separate branch based on v6.15.

Thanks.

--
tejun
On Tue, Jun 17, 2025 at 08:14:48AM -1000, Tejun Heo wrote:
> On Tue, Jun 17, 2025 at 03:08:30PM +0200, Frederic Weisbecker wrote:
> > On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> > > On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > > > Marco Crivellari (3):
> > > >   Workqueue: add system_percpu_wq and system_dfl_wq
> > > >   Workqueue: add new WQ_PERCPU flag
> > > >   [Doc] Workqueue: add WQ_PERCPU
> > >
> > > Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> > > to be separate. Maybe something to consider for future.
> >
> > If this is for the next merge window, I guess the easiest is to wait for it
> > before sending patches to other subsystems to convert them?
> >
> > I guess we could shortcut that with providing a branch that other subsystems
> > could pull from but that doesn't look convenient...
>
> Oh yeah, I said I was gonna do that and promptly forgot. I'll set up a
> separate branch based on v6.15.

Okay, I folded the doc patch into the second one and applied them to the
following branch.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git WQ_PERCPU

This is v6.15 + only the two patches and should be easy to pull into any
devel branch.

Thanks.

--
tejun
Hi,

Just a quick question Tejun: when do you expect to receive the other patches?
Should I wait till the next rc1?

I just want to check the work again, but they are ready.

Thanks!

On Tue, Jun 17, 2025 at 8:54 PM Tejun Heo <tj@kernel.org> wrote:
>
> On Tue, Jun 17, 2025 at 08:14:48AM -1000, Tejun Heo wrote:
> > On Tue, Jun 17, 2025 at 03:08:30PM +0200, Frederic Weisbecker wrote:
> > > On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> > > > On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > > > > Marco Crivellari (3):
> > > > >   Workqueue: add system_percpu_wq and system_dfl_wq
> > > > >   Workqueue: add new WQ_PERCPU flag
> > > > >   [Doc] Workqueue: add WQ_PERCPU
> > > >
> > > > Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> > > > to be separate. Maybe something to consider for future.
> > >
> > > If this is for the next merge window, I guess the easiest is to wait for it
> > > before sending patches to other subsystems to convert them?
> > >
> > > I guess we could shortcut that with providing a branch that other subsystems
> > > could pull from but that doesn't look convenient...
> >
> > Oh yeah, I said I was gonna do that and promptly forgot. I'll set up a
> > separate branch based on v6.15.
>
> Okay, I folded the doc patch into the second one and applied them to the
> following branch.
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git WQ_PERCPU
>
> This is v6.15 + only the two patches and should be easy to pull into any
> devel branch.
>
> Thanks.
>
> --
> tejun

--
Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com
Hello,

On Fri, Jun 20, 2025 at 06:13:09PM +0200, Marco Crivellari wrote:
> Just a quick question Tejun: when do you expect to receive the other patches?
> Should I wait till the next rc1?
>
> I just want to check the work again, but they are ready.

So, I can route the patches through the wq tree but I shouldn't do so unless
subsystem maintainers want to do so for the specific subsystem. Waiting for
rc1 is an option but not the only one. You can send out subsystem-specific
patches to the subsystem maintainers and me cc'd with:

- Explanation on what's going on and why.

- What needs to happen if the subsystem wants to route the patch (pull the wq
  branch with the prep changes).

- Offer the option to route the changes through a wq branch.

There are no hard rules on how to do this but it's all about making logistics
understandable and easy for the involved subsystems.

Thanks.

--
tejun