[PATCH 0/2] Workqueue: fs: replace use of system_wq and add WQ_PERCPU to alloc_workqueue users

Marco Crivellari posted 2 patches 1 month, 2 weeks ago
[PATCH 0/2] Workqueue: fs: replace use of system_wq and add WQ_PERCPU to alloc_workqueue users
Posted by Marco Crivellari 1 month, 2 weeks ago
Hello!

Below is a summary of a discussion about the Workqueue API and cpu isolation
considerations. Details and more information are available here:

        "workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
        https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/

=== Current situation: problems ===

Consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, while for !WQ_UNBOUND the local CPU is selected.

This leads to different behavior when a work item is scheduled from an
isolated CPU, depending on whether the "delay" value is 0 or greater than 0:

		schedule_delayed_work(, 0);

This is handled by __queue_work(), which queues the work item on the
current local (isolated) CPU, while:

		schedule_delayed_work(, 1);

moves the timer to a housekeeping CPU and schedules the work there.
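
As a rough illustration of the two cases above, here is a userspace toy
model (not kernel code). The struct, the function names and the CPU layout
are all made up for this sketch. It only encodes the rule described in this
letter for an isolated caller; the behavior for a housekeeping caller is an
assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a fixed set of CPUs, some of them isolated (nohz_full). */
#define TOY_NR_CPUS 4

struct toy_system {
	bool isolated[TOY_NR_CPUS];	/* true = isolated, false = housekeeping */
};

/* Return the first housekeeping CPU, or -1 if every CPU is isolated. */
static int toy_first_housekeeping_cpu(const struct toy_system *sys)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (!sys->isolated[cpu])
			return cpu;
	return -1;
}

/*
 * Model of where a per-cpu delayed work item ends up when it is scheduled
 * from local_cpu:
 *   delay == 0: queued directly, so it stays on the local CPU;
 *   delay  > 0: on an isolated CPU the timer moves to a housekeeping CPU
 *               and the work is queued there.
 * Assumption of this sketch: a housekeeping caller always stays local.
 */
static int toy_target_cpu(const struct toy_system *sys, int local_cpu,
			  unsigned long delay)
{
	assert(local_cpu >= 0 && local_cpu < TOY_NR_CPUS);

	if (delay == 0 || !sys->isolated[local_cpu])
		return local_cpu;

	int hk = toy_first_housekeeping_cpu(sys);

	return hk >= 0 ? hk : local_cpu;
}
```

With CPUs 2 and 3 isolated, a call from CPU 2 with delay 0 stays on CPU 2,
while the same call with delay 1 lands on CPU 0.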

Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

=== Plan and future plans ===

This patchset is the first step of the refactoring needed to address the
points above. In the long term it will also have a positive impact on
CPU isolation, by moving away from per-cpu workqueues in favor of an
unbound model.

These are the main steps:
1)  API refactoring (that this patchset is introducing)
	-	Make the system wq names clearer and more uniform, both per-cpu
		and unbound, to avoid any confusion about which one should be
		used.

	-	Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND,
		introduced in this patchset and used on all the callers that are not
		currently using WQ_UNBOUND.

		WQ_UNBOUND will be removed in a future release cycle.

		Most users don't need to be per-cpu, because they don't have
		locality requirements. Because of that, a future step will be
		to make "unbound" the default behavior.

2)  Check who really needs to be per-cpu
	-	Remove the WQ_PERCPU flag when it is not strictly required.

3)  Add a new API (prefer local cpu)
	-	There are users that don't require local execution, as mentioned
		above; despite that, local execution yields a performance gain.

		This new API will prefer local execution, without requiring it.
		
=== Changes introduced by this patchset ===

1) [P 1] replace use of system_wq with system_percpu_wq (under fs)

		system_wq is a per-CPU workqueue, but its name does not make that clear.
		system_unbound_wq is to be used when locality is not required.

		Because of that, uses of system_wq have been replaced with
		system_percpu_wq in the fs subsystem (details in the next section).

2) [P 2] add WQ_PERCPU to alloc_workqueue() users (under fs)

		Every alloc_workqueue() caller should use exactly one of WQ_PERCPU or
		WQ_UNBOUND. This is enforced with a warning if both or neither of them
		are present.

		These patches introduce WQ_PERCPU in the fs subsystem
		(details in the next section).

		WQ_UNBOUND will be removed in a future release cycle.
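
The "exactly one of WQ_PERCPU or WQ_UNBOUND" rule can be sketched in
userspace as follows. This is not the kernel implementation: the TOY_ names
and the bit values are placeholders chosen for the sketch, and
toy_convert_flags() only mimics the manual conversion done in patch 2
(callers that did not pass WQ_UNBOUND get WQ_PERCPU).

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for this sketch, not taken from kernel headers. */
#define TOY_WQ_UNBOUND	(1u << 1)
#define TOY_WQ_PERCPU	(1u << 8)

/* True when exactly one of the two placement flags is set. */
static bool toy_wq_flags_valid(unsigned int flags)
{
	bool percpu  = (flags & TOY_WQ_PERCPU) != 0;
	bool unbound = (flags & TOY_WQ_UNBOUND) != 0;

	return percpu != unbound;
}

/*
 * Mimics the conversion applied by hand to each caller: if the old flags
 * did not ask for an unbound workqueue, mark the caller as per-cpu.
 * old_flags is expected not to contain TOY_WQ_PERCPU already together
 * with TOY_WQ_UNBOUND; the assert catches that misuse.
 */
static unsigned int toy_convert_flags(unsigned int old_flags)
{
	unsigned int flags = old_flags;

	if (!(flags & TOY_WQ_UNBOUND))
		flags |= TOY_WQ_PERCPU;

	assert(toy_wq_flags_valid(flags));
	return flags;
}
```

In this model, flags of 0 and flags with both bits set are rejected, and a
caller that passed no placement flag ends up with only the per-cpu bit.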

=== For fs Maintainers ===

If you agree with these changes, one option is to pull the preparation changes from
Tejun's wq branch [1].

As an alternative, the patches can be routed through a wq branch.

The preparation changes are described in this cover letter, under the
"main steps" section. In summary, the changes are:

- add system_percpu_wq and system_dfl_wq, for now without replacing the older wq(s)
  (system_unbound_wq and system_wq).
- add the WQ_PERCPU flag, currently without removing WQ_UNBOUND; it will be removed
  in a future release cycle.

You can find the aforementioned changes in:

("Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq")
https://lore.kernel.org/all/20250614133531.76742-1-marco.crivellari@suse.com/


- [1] git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git WQ_PERCPU

Thanks!
		
Marco Crivellari (2):
  Workqueue: fs: replace use of system_wq with system_percpu_wq
  Workqueue: fs: WQ_PERCPU added to alloc_workqueue users

 fs/afs/main.c                  |  4 ++--
 fs/aio.c                       |  2 +-
 fs/bcachefs/super.c            | 10 +++++-----
 fs/btrfs/async-thread.c        |  3 +--
 fs/btrfs/disk-io.c             |  2 +-
 fs/ceph/super.c                |  2 +-
 fs/dlm/lowcomms.c              |  2 +-
 fs/dlm/main.c                  |  2 +-
 fs/fs-writeback.c              |  4 ++--
 fs/fuse/dev.c                  |  2 +-
 fs/fuse/inode.c                |  2 +-
 fs/gfs2/main.c                 |  5 +++--
 fs/gfs2/ops_fstype.c           |  6 ++++--
 fs/nfs/namespace.c             |  2 +-
 fs/nfs/nfs4renewd.c            |  2 +-
 fs/ocfs2/dlm/dlmdomain.c       |  3 ++-
 fs/ocfs2/dlmfs/dlmfs.c         |  3 ++-
 fs/smb/client/cifsfs.c         | 16 +++++++++++-----
 fs/smb/server/ksmbd_work.c     |  2 +-
 fs/smb/server/transport_rdma.c |  3 ++-
 fs/super.c                     |  3 ++-
 fs/verity/verify.c             |  2 +-
 fs/xfs/xfs_log.c               |  3 +--
 fs/xfs/xfs_mru_cache.c         |  3 ++-
 fs/xfs/xfs_super.c             | 15 ++++++++-------
 25 files changed, 58 insertions(+), 45 deletions(-)

-- 
2.50.1
Re: [PATCH 0/2] Workqueue: fs: replace use of system_wq and add WQ_PERCPU to alloc_workqueue users
Posted by Christian Brauner 1 month, 2 weeks ago
On Fri, Aug 15, 2025 at 11:47:13AM +0200, Marco Crivellari wrote:
> If you agree with these changes, one option is to pull the preparation changes from
> Tejun's wq branch [1].

I'll take it through the vfs-6.18.workqueue branch.
Can I just pull the series from the list so we have all the lore links
and the cover letter?
Re: [PATCH 0/2] Workqueue: fs: replace use of system_wq and add WQ_PERCPU to alloc_workqueue users
Posted by Tejun Heo 1 month, 2 weeks ago
Hello,

On Tue, Aug 19, 2025 at 01:23:26PM +0200, Christian Brauner wrote:
> > If you agree with these changes, one option is to pull the preparation changes from
> > Tejun's wq branch [1].
> 
> I'll take it through the vfs-6.18.workqueue branch.
> Can I just pull the series from the list so we have all the lore links
> and the cover letter?

I believe all the prerequisites were already in -rc1 and you should be able
to apply the series without pulling anything else. Marco, please correct me
if I'm mistaken.

Thanks.

-- 
tejun
Re: [PATCH 0/2] Workqueue: fs: replace use of system_wq and add WQ_PERCPU to alloc_workqueue users
Posted by Marco Crivellari 1 month, 2 weeks ago
Hello Tejun and Christian,

> I believe all the prerequisites were already in -rc1 and you should be able
> to apply the series without pulling anything else. Marco, please correct me
> if I'm mistaken.

Yes, the prerequisites are in rc1.

But there are changes introduced in
https://lore.kernel.org/all/20250815094510.52360-4-marco.crivellari@suse.com/
("Workqueue: WQ_PERCPU added to all the remaining users") that in my
opinion should be present as well when this series and the others are
merged.

For example, there are changes in __alloc_workqueue(), and also
changes inside queue_work() (include/linux/workqueue.h) that were
added only in the "core" patches / series.

About queue_work() and the other functions, the changes are made by
("Workqueue: replace use of system_unbound_wq with system_dfl_wq") and
the related patch (about system_percpu_wq): they mostly add a
pr_warn_once() if the old wq is used, and redirect to the new one.
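
A rough userspace sketch of that "warn once and redirect" idea is below. It
is only a model of the description above, not the code from the core
patches: the enum, the function names and the warning strings are invented
for illustration, and only the old-to-new mapping comes from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for the four system workqueues named in this thread. */
enum toy_wq {
	TOY_SYSTEM_WQ,		/* old per-cpu name */
	TOY_SYSTEM_UNBOUND_WQ,	/* old unbound name */
	TOY_SYSTEM_PERCPU_WQ,	/* new per-cpu name */
	TOY_SYSTEM_DFL_WQ,	/* new unbound name */
};

/* Counts how many warnings were actually emitted. */
static int toy_warnings_printed;

/* Emit msg only the first time this particular flag is seen unset. */
static void toy_warn_once(bool *warned, const char *msg)
{
	assert(warned != NULL);

	if (*warned)
		return;
	*warned = true;
	toy_warnings_printed++;
	fprintf(stderr, "warning: %s\n", msg);
}

/* Map an old workqueue to its replacement; new ones pass through as-is. */
static enum toy_wq toy_redirect(enum toy_wq wq)
{
	static bool warned_percpu, warned_unbound;

	switch (wq) {
	case TOY_SYSTEM_WQ:
		toy_warn_once(&warned_percpu,
			      "system_wq used, redirecting to system_percpu_wq");
		return TOY_SYSTEM_PERCPU_WQ;
	case TOY_SYSTEM_UNBOUND_WQ:
		toy_warn_once(&warned_unbound,
			      "system_unbound_wq used, redirecting to system_dfl_wq");
		return TOY_SYSTEM_DFL_WQ;
	default:
		return wq;
	}
}
```

Each old name warns at most once, and callers that already use the new
names are not affected.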

Btw: in this fs branch I only have the prerequisites and this series,
and I can compile and boot.

Thank you!


-- 
Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com
Re: [PATCH 0/2] Workqueue: fs: replace use of system_wq and add WQ_PERCPU to alloc_workqueue users
Posted by Christian Brauner 1 month, 2 weeks ago
On Thu, Aug 21, 2025 at 10:40:30AM +0200, Marco Crivellari wrote:
> Hello Tejun and Christian,
> 
> > I believe all the prerequisites were already in -rc1 and you should be able
> > to apply the series without pulling anything else. Marco, please correct me
> > if I'm mistaken.
> 
> Yes the prerequisites are in rc1.
> 
> But there are changes introduced in
> https://lore.kernel.org/all/20250815094510.52360-4-marco.crivellari@suse.com/
> ("Workqueue: WQ_PERCPU added to all the remaining users") that in my
> opinion should be present as well when this series and the others are
> merged.
> 
> For example, there are changes in __alloc_workqueue(), and also
> changes inside queue_work() (include/linux/workqueue.h) that were
> added only in the "core" patches / series.
> 
> About queue_work() and the other functions, the changes are made by
> ("Workqueue: replace use of system_unbound_wq with system_dfl_wq") and
> the related patch (about system_percpu_wq): they mostly add a
> pr_warn_once() if the old wq is used, and redirect to the new one.
> 
> Btw: in this fs branch I only have the prerequisites and this series,
> and I can compile and boot.

So either you give me the fs-specific changes without any additional
workqueue changes, and I maintain a stable vfs-6.18.workqueue branch that
you can pull and base your additional changes on, or Tejun gives me a
stable branch with the preparatory workqueue changes that I pull and use
as the base for the fs-specific changes.

(Please note that vfs branches are usually pulled first thing during a
merge window.)
