Hi!
Below is a summary of a discussion about the Workqueue API and cpu isolation
considerations. Details and more information are available here:
"workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
=== Current situation: problems ===
Consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, while for !WQ_UNBOUND the local CPU is selected.
This leads to different behavior when a work item is scheduled from an
isolated CPU, depending on whether the "delay" value is 0 or greater than 0:
schedule_delayed_work(, 0);
This is handled by __queue_work(), which queues the work item on the
current local (isolated) CPU, while:
schedule_delayed_work(, 1);
will move the timer to a housekeeping CPU and schedule the work there.
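To make the two cases concrete, here is a minimal userspace sketch of the
behavior described above. It is a simplified model, not the kernel code:
the model_* names, MODEL_NR_CPUS and the "first housekeeping CPU" choice are
all hypothetical, and keeping the timer local when the caller is already on
a housekeeping CPU is an assumption of the model.

```c
#include <stdbool.h>

/*
 * Simplified userspace model (NOT kernel code) of where a per-CPU delayed
 * work item ends up running. All names are hypothetical.
 */
#define MODEL_NR_CPUS 8

struct model_system {
	/* true: housekeeping CPU, false: isolated CPU */
	bool housekeeping[MODEL_NR_CPUS];
};

/* Return the first housekeeping CPU, or -1 if there is none. */
static int model_any_housekeeping_cpu(const struct model_system *sys)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (sys->housekeeping[cpu])
			return cpu;
	return -1;
}

/*
 * CPU on which a per-CPU (!WQ_UNBOUND) delayed work item runs when it is
 * queued from @local_cpu without an explicit target CPU.
 * Precondition: 0 <= local_cpu < MODEL_NR_CPUS.
 */
static int model_delayed_work_cpu(const struct model_system *sys,
				  int local_cpu, unsigned long delay)
{
	int hk_cpu;

	/* delay == 0: queued right away on the local CPU, isolated or not. */
	if (delay == 0)
		return local_cpu;

	/* Model assumption: a housekeeping caller keeps its timer local. */
	if (sys->housekeeping[local_cpu])
		return local_cpu;

	/*
	 * delay > 0 on an isolated CPU: the timer is moved to a housekeeping
	 * CPU and the work is queued where the timer fires.
	 */
	hk_cpu = model_any_housekeeping_cpu(sys);
	return hk_cpu >= 0 ? hk_cpu : local_cpu;
}
```

With CPUs 0-1 as housekeeping and the caller on isolated CPU 3, the model
keeps a delay of 0 on CPU 3 and sends a delay of 1 to CPU 0, which is the
inconsistency being discussed.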
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
=== Plan and future plans ===
This patchset is the first step of the refactoring needed to address the
points above; in the long term it will also have a positive impact on
cpu isolation, by moving away from per-cpu workqueues in favor of an
unbound model.
These are the main steps:
1) API refactoring (that this patch is introducing)
- Make the system wq names clearer and more uniform, both per-cpu and
unbound, to avoid any possible confusion about which one should be
used.
- Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND.
It is introduced in this patchset and used on all the callers that are
not currently using WQ_UNBOUND.
WQ_UNBOUND will be removed in a future release cycle.
Most users don't need to be per-cpu because they don't have
locality requirements; for that reason, a future step will
make "unbound" the default behavior.
2) Check who really needs to be per-cpu
- Remove the WQ_PERCPU flag where it is not strictly required.
3) Add a new API (prefer local cpu)
- There are users that don't require local execution, as mentioned
above; despite that, local execution yields a performance gain.
This new API will prefer local execution, without requiring it.
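As an illustration of why the explicit flag matters for steps 1) and 2),
here is a small userspace sketch of the flag semantics before and after the
planned change of default. The model_* names and the bit values are
hypothetical and chosen only for this example; the real definitions live in
include/linux/workqueue.h. The "exactly one of the two" check is my reading
of the transition period, not something the patches enforce.

```c
#include <stdbool.h>

/* Illustrative bit values for this sketch only. */
enum model_wq_flags {
	MODEL_WQ_UNBOUND = 1 << 1,
	MODEL_WQ_PERCPU  = 1 << 8,
};

/*
 * Transition period: a caller states its choice explicitly when exactly
 * one of the two flags is set.
 */
static bool model_wq_affinity_is_explicit(unsigned int flags)
{
	bool unbound = (flags & MODEL_WQ_UNBOUND) != 0;
	bool percpu = (flags & MODEL_WQ_PERCPU) != 0;

	return unbound != percpu;
}

/* Today: the absence of WQ_UNBOUND means per-cpu. */
static bool model_wq_is_percpu_today(unsigned int flags)
{
	return (flags & MODEL_WQ_UNBOUND) == 0;
}

/* Once "unbound" is the default: only WQ_PERCPU means per-cpu. */
static bool model_wq_is_percpu_future(unsigned int flags)
{
	return (flags & MODEL_WQ_PERCPU) != 0;
}
```

A caller passing neither flag is per-cpu today but would silently become
unbound once the default changes; a caller passing WQ_PERCPU keeps the same
behavior in both models, which is why the flag is added to existing callers
first.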
=== Changes introduced by this patchset ===
1) [P1] add system_percpu_wq and system_dfl_wq
system_wq is a per-CPU workqueue, but its name does not make that clear.
system_unbound_wq is to be used when locality is not required.
Because of that, system_percpu_wq and system_dfl_wq have been
introduced to replace system_wq and system_unbound_wq in the
future.
2) [P2] add new WQ_PERCPU flag
This patch adds the new WQ_PERCPU flag to explicitly request per-cpu behavior.
WQ_UNBOUND will be removed in a future release cycle.
3) [P3] Doc change about WQ_PERCPU
Added a short section about WQ_PERCPU and a Note under WQ_UNBOUND
mentioning that it will be removed in the future.
---
Changes in v5:
- workqueue(s) early init allocation
- Doc fixes
Changes in v4:
- Take a step back from the previous version, in order to add first the new
wq(s) and the new flag (WQ_PERCPU), addressing later all the other changes.
Changes in v3:
- The introduction of the new wq(s) and the WQ_PERCPU flag have been moved
into separate patches (1 for wq(s) and 1 for WQ_PERCPU).
- WQ_PERCPU is now added to all the alloc_workqueue callers in separate patches,
addressing a few subsystems first (fs, mm, net).
Changes in v2:
- Introduction of WQ_PERCPU change has been merged with the alloc_workqueue()
patch that passes the WQ_PERCPU flag explicitly to every caller.
- 2 drivers in the code were not matched by Coccinelle; WQ_PERCPU added there too.
- WQ_PERCPU added to __WQ_BH_ALLOWS.
- queue_work() now prints a warning (pr_warn_once()) if a user is using the
old wq and redirects the wrong / old wq to the new one.
- Changes to workqueue.rst about the WQ_PERCPU flag and a Note about the
future of WQ_UNBOUND.
Marco Crivellari (3):
Workqueue: add system_percpu_wq and system_dfl_wq
Workqueue: add new WQ_PERCPU flag
[Doc] Workqueue: add WQ_PERCPU
Documentation/core-api/workqueue.rst | 6 ++++++
include/linux/workqueue.h | 9 ++++++---
kernel/workqueue.c | 13 +++++++++----
3 files changed, 21 insertions(+), 7 deletions(-)
--
2.49.0
On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> Marco Crivellari (3):
>   Workqueue: add system_percpu_wq and system_dfl_wq
>   Workqueue: add new WQ_PERCPU flag
>   [Doc] Workqueue: add WQ_PERCPU

Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
to be separate. Maybe something to consider for future.

Thanks.

--
tejun
On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > Marco Crivellari (3):
> >   Workqueue: add system_percpu_wq and system_dfl_wq
> >   Workqueue: add new WQ_PERCPU flag
> >   [Doc] Workqueue: add WQ_PERCPU
>
> Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> to be separate. Maybe something to consider for future.

If this is for the next merge window, I guess the easiest is to wait for it
before sending patches to other subsystems to convert them?

I guess we could shortcut that with providing a branch that other subsystems
could pull from but that doesn't look convenient...

Thanks.

>
> Thanks.
>
> --
> tejun

--
Frederic Weisbecker
SUSE Labs
On Tue, Jun 17, 2025 at 03:08:30PM +0200, Frederic Weisbecker wrote:
> On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> > On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > > Marco Crivellari (3):
> > >   Workqueue: add system_percpu_wq and system_dfl_wq
> > >   Workqueue: add new WQ_PERCPU flag
> > >   [Doc] Workqueue: add WQ_PERCPU
> >
> > Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> > to be separate. Maybe something to consider for future.
>
> If this is for the next merge window, I guess the easiest is to wait for it
> before sending patches to other subsystems to convert them?
>
> I guess we could shortcut that with providing a branch that other subsystems
> could pull from but that doesn't look convenient...

Oh yeah, I said I was gonna do that and promptly forgot. I'll set up a
separate branch based on v6.15.

Thanks.

--
tejun
On Tue, Jun 17, 2025 at 08:14:48AM -1000, Tejun Heo wrote:
> On Tue, Jun 17, 2025 at 03:08:30PM +0200, Frederic Weisbecker wrote:
> > On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> > > On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > > > Marco Crivellari (3):
> > > >   Workqueue: add system_percpu_wq and system_dfl_wq
> > > >   Workqueue: add new WQ_PERCPU flag
> > > >   [Doc] Workqueue: add WQ_PERCPU
> > >
> > > Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> > > to be separate. Maybe something to consider for future.
> >
> > If this is for the next merge window, I guess the easiest is to wait for it
> > before sending patches to other subsystems to convert them?
> >
> > I guess we could shortcut that with providing a branch that other subsystems
> > could pull from but that doesn't look convenient...
>
> Oh yeah, I said I was gonna do that and promptly forgot. I'll set up a
> separate branch based on v6.15.

Okay, I folded the doc patch into the second one and applied them to the
following branch.

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git WQ_PERCPU

This is v6.15 + only the two patches and should be easy to pull into any
devel branch.

Thanks.

--
tejun
Hi,

Just a quick question Tejun: when do you expect to receive the other patches?
Should I wait till the next rc1?

I just want to check the work again, but they are ready.

Thanks!

On Tue, Jun 17, 2025 at 8:54 PM Tejun Heo <tj@kernel.org> wrote:
>
> On Tue, Jun 17, 2025 at 08:14:48AM -1000, Tejun Heo wrote:
> > On Tue, Jun 17, 2025 at 03:08:30PM +0200, Frederic Weisbecker wrote:
> > > On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> > > > On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > > > > Marco Crivellari (3):
> > > > >   Workqueue: add system_percpu_wq and system_dfl_wq
> > > > >   Workqueue: add new WQ_PERCPU flag
> > > > >   [Doc] Workqueue: add WQ_PERCPU
> > > >
> > > > Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> > > > to be separate. Maybe something to consider for future.
> > >
> > > If this is for the next merge window, I guess the easiest is to wait for it
> > > before sending patches to other subsystems to convert them?
> > >
> > > I guess we could shortcut that with providing a branch that other subsystems
> > > could pull from but that doesn't look convenient...
> >
> > Oh yeah, I said I was gonna do that and promptly forgot. I'll set up a
> > separate branch based on v6.15.
>
> Okay, I folded the doc patch into the second one and applied them to the
> following branch.
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git WQ_PERCPU
>
> This is v6.15 + only the two patches and should be easy to pull into any
> devel branch.
>
> Thanks.
>
> --
> tejun

--
Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com
Hello,

On Fri, Jun 20, 2025 at 06:13:09PM +0200, Marco Crivellari wrote:
> Just a quick question Tejun: when do you expect to receive the other patches?
> Should I wait till the next rc1?
>
> I just want to check the work again, but they are ready.

So, I can route the patches through the wq tree but I shouldn't do so unless
subsystem maintainers want to do so for the specific subsystem. Waiting for
rc1 is an option but not the only one. You can send out subsystem-specific
patches to the subsystem maintainers and me cc'd with:

- Explanation on what's going on and why.

- What needs to happen if the subsystem wants to route the patch (pull the
  wq branch with the prep changes).

- Offer the option to route the changes through a wq branch.

There are no hard rules on how to do this but it's all about making
logistics understandable and easy for the involved subsystems.

Thanks.

--
tejun