[v5] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq

[PATCH v5 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq

Posted by Marco Crivellari 3 months, 3 weeks ago

Hi!

Below is a summary of a discussion about the Workqueue API and cpu isolation
considerations. Details and more information are available here:

    "workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
    https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/

=== Current situation: problems ===

Consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is set to
the housekeeping CPUs, while for !WQ_UNBOUND workqueues the local CPU is
selected.

This leads to different outcomes when a work item is scheduled on an isolated
CPU, depending on whether the "delay" value is 0 or greater than 0:

    schedule_delayed_work(, 0);

This is handled by __queue_work(), which queues the work item on the
current local (isolated) CPU, while:

    schedule_delayed_work(, 1);

moves the timer to a housekeeping CPU and schedules the work there.

Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.
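
To make the two delay cases concrete, here is a small userspace toy model
(not kernel code). The CPU layout, the helper names and the "first
housekeeping CPU" choice are all assumptions made for illustration; the
real selection is done by the timer and workqueue code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel symbols. */
#define MODEL_NR_CPUS 4

/* Assumed layout: CPUs 0-1 are housekeeping, CPUs 2-3 are isolated. */
static const bool model_housekeeping[MODEL_NR_CPUS] = {
	true, true, false, false
};

/* First housekeeping CPU, standing in for wq_unbound_cpumask. */
static int model_first_housekeeping_cpu(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_housekeeping[cpu])
			return cpu;
	return -1;
}

/*
 * Where a per-cpu delayed work item ends up in this model:
 *   delay == 0: queued directly on the submitting (local) CPU;
 *   delay  > 0: on an isolated CPU the timer moves to a housekeeping
 *               CPU and the work item is queued from there.
 * Assumption: a housekeeping CPU keeps its own timer.
 */
static int model_target_cpu(int local_cpu, unsigned long delay)
{
	assert(local_cpu >= 0 && local_cpu < MODEL_NR_CPUS);

	if (delay == 0 || model_housekeeping[local_cpu])
		return local_cpu;
	return model_first_housekeeping_cpu();
}
```

In this model, a caller on isolated CPU 2 gets CPU 2 for a delay of 0 and
CPU 0 for a delay of 1, which is the inconsistency described above.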

=== Plan and future plans ===

This patchset is the first step of the refactoring needed to address the
points above; in the long term it will also benefit cpu isolation, by
moving away from per-cpu workqueues in favor of an unbound model.

These are the main steps:
1)  API refactoring (introduced by this patchset)
    -   Make the system wq names clearer and more uniform, both per-cpu and
        unbound, to avoid any confusion about which one should be used.

    -   Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND.
        It is introduced in this patchset and is to be used by all the
        callers that are not currently using WQ_UNBOUND.

        WQ_UNBOUND will be removed in a future release cycle.

        Most users don't need to be per-cpu because they have no locality
        requirements; for this reason, a future step will make "unbound"
        the default behavior.

2)  Check who really needs to be per-cpu
    -   Remove the WQ_PERCPU flag when it is not strictly required.

3)  Add a new API (prefer local cpu)
    -   As mentioned above, there are users that don't require local
        execution; even so, local execution yields a performance gain.

        This new API will prefer local execution without requiring it.

=== Changes introduced by this patchset ===

1)  [P1] add system_percpu_wq and system_dfl_wq

    system_wq is a per-CPU workqueue, but its name does not make that clear.
    system_unbound_wq is to be used when locality is not required.

    Because of that, system_percpu_wq and system_dfl_wq have been
    introduced in order to replace system_wq and system_unbound_wq in
    the future.

2)  [P2] add new WQ_PERCPU flag

    This patch adds the new WQ_PERCPU flag, which explicitly requests a
    per-cpu workqueue. WQ_UNBOUND will be removed in a future release cycle.

3)  [P3] Doc change about WQ_PERCPU

    Added a short section about WQ_PERCPU and a Note under WQ_UNBOUND
    mentioning that it will be removed in the future.
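
As a rough illustration of the conversion these changes prepare for, the
following userspace sketch (not kernel code) models the two mechanical
rules: a caller without WQ_UNBOUND gains WQ_PERCPU, and the old system wq
names map to the new ones. The MODEL_* flag values and the helper names
are placeholders for this sketch; the real definitions are in
include/linux/workqueue.h.

```c
#include <assert.h>
#include <string.h>

/* Placeholder flag values for this sketch, not the kernel's values. */
#define MODEL_WQ_UNBOUND (1u << 0)
#define MODEL_WQ_PERCPU  (1u << 1)

/*
 * Flag rule: WQ_PERCPU is the complement of WQ_UNBOUND, so every caller
 * that does not pass WQ_UNBOUND gets WQ_PERCPU added explicitly.
 */
static unsigned int model_convert_flags(unsigned int flags)
{
	if (!(flags & MODEL_WQ_UNBOUND))
		flags |= MODEL_WQ_PERCPU;
	return flags;
}

/*
 * Name rule: system_wq -> system_percpu_wq and
 * system_unbound_wq -> system_dfl_wq; other names are left untouched.
 */
static const char *model_new_wq_name(const char *old_name)
{
	assert(old_name != NULL);

	if (strcmp(old_name, "system_wq") == 0)
		return "system_percpu_wq";
	if (strcmp(old_name, "system_unbound_wq") == 0)
		return "system_dfl_wq";
	return old_name;
}
```

Spelling out the per-cpu case on every caller is what allows "unbound" to
become the default later without silently changing existing users.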

---
Changes in v5:
-	workqueue(s) early init allocation
-	Doc fixes

Changes in v4:
-   Take a step back from the previous version, in order to first add the new
    wq(s) and the new flag (WQ_PERCPU), addressing all the other changes later.

Changes in v3:
-   The introduction of the new wq(s) and the WQ_PERCPU flag has been moved
    into separate patches (1 for wq(s) and 1 for WQ_PERCPU).
-   WQ_PERCPU is now added to all the alloc_workqueue callers in separate
    patches, addressing a few subsystems first (fs, mm, net).

Changes in v2:
-   The introduction of WQ_PERCPU has been merged with the alloc_workqueue()
    patch that passes the WQ_PERCPU flag explicitly to every caller.
-   2 drivers in the code were not matched by Coccinelle; WQ_PERCPU was added
    there as well.
-   WQ_PERCPU added to __WQ_BH_ALLOWS.
-   queue_work() now prints a warning (pr_warn_once()) if a user is using the
    old wq and redirects the wrong / old wq to the new one.
-   Changes to workqueue.rst about the WQ_PERCPU flag and a Note about the
    future of WQ_UNBOUND.


Marco Crivellari (3):
  Workqueue: add system_percpu_wq and system_dfl_wq
  Workqueue: add new WQ_PERCPU flag
  [Doc] Workqueue: add WQ_PERCPU

 Documentation/core-api/workqueue.rst |  6 ++++++
 include/linux/workqueue.h            |  9 ++++++---
 kernel/workqueue.c                   | 13 +++++++++----
 3 files changed, 21 insertions(+), 7 deletions(-)

-- 
2.49.0

Re: [PATCH v5 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq

Posted by Tejun Heo 3 months, 3 weeks ago

On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> Marco Crivellari (3):
>   Workqueue: add system_percpu_wq and system_dfl_wq
>   Workqueue: add new WQ_PERCPU flag
>   [Doc] Workqueue: add WQ_PERCPU

Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
to be separate. Maybe something to consider for future.

Thanks.

-- 
tejun

Re: [PATCH v5 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq

Posted by Frederic Weisbecker 3 months, 3 weeks ago

On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > Marco Crivellari (3):
> >   Workqueue: add system_percpu_wq and system_dfl_wq
> >   Workqueue: add new WQ_PERCPU flag
> >   [Doc] Workqueue: add WQ_PERCPU
> 
> Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> to be separate. Maybe something to consider for future.

If this is for the next merge window, I guess the easiest is to wait for it
before sending patches to other subsystems to convert them?

I guess we could shortcut that with providing a branch that other subsystems
could pull from but that doesn't look convenient...

Thanks.

> 
> Thanks.
> 
> -- 
> tejun

-- 
Frederic Weisbecker
SUSE Labs

Re: [PATCH v5 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq

Posted by Tejun Heo 3 months, 3 weeks ago

On Tue, Jun 17, 2025 at 03:08:30PM +0200, Frederic Weisbecker wrote:
> On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> > On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > > Marco Crivellari (3):
> > >   Workqueue: add system_percpu_wq and system_dfl_wq
> > >   Workqueue: add new WQ_PERCPU flag
> > >   [Doc] Workqueue: add WQ_PERCPU
> > 
> > Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> > to be separate. Maybe something to consider for future.
> 
> If this is for the next merge window, I guess the easiest is to wait for it
> before sending patches to other subsystems to convert them?
> 
> I guess we could shortcut that with providing a branch that other subsystems
> could pull from but that doesn't look convenient...

Oh yeah, I said I was gonna do that and promptly forgot. I'll set up a
separate branch based on v6.15.

Thanks.

-- 
tejun

Re: [PATCH v5 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq

Posted by Tejun Heo 3 months, 3 weeks ago

On Tue, Jun 17, 2025 at 08:14:48AM -1000, Tejun Heo wrote:
> On Tue, Jun 17, 2025 at 03:08:30PM +0200, Frederic Weisbecker wrote:
> > On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> > > On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > > > Marco Crivellari (3):
> > > >   Workqueue: add system_percpu_wq and system_dfl_wq
> > > >   Workqueue: add new WQ_PERCPU flag
> > > >   [Doc] Workqueue: add WQ_PERCPU
> > > 
> > > Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> > > to be separate. Maybe something to consider for future.
> > 
> > If this is for the next merge window, I guess the easiest is to wait for it
> > before sending patches to other subsystems to convert them?
> > 
> > I guess we could shortcut that with providing a branch that other subsystems
> > could pull from but that doesn't look convenient...
> 
> Oh yeah, I said I was gonna do that and promptly forgot. I'll set up a
> separate branch based on v6.15.

Okay, I folded the doc patch into the second one and applied them to the
following branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git WQ_PERCPU

This is v6.15 + only the two patches and should be easy to pull into any
devel branch.

Thanks.

-- 
tejun

Re: [PATCH v5 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq

Posted by Marco Crivellari 3 months, 3 weeks ago

Hi,

Just a quick question Tejun: when do you expect to receive the other patches?
Should I wait till the next rc1?

I just want to check the work again, but they are ready.

Thanks!


On Tue, Jun 17, 2025 at 8:54 PM Tejun Heo <tj@kernel.org> wrote:
>
> On Tue, Jun 17, 2025 at 08:14:48AM -1000, Tejun Heo wrote:
> > On Tue, Jun 17, 2025 at 03:08:30PM +0200, Frederic Weisbecker wrote:
> > > On Mon, Jun 16, 2025 at 08:35:32AM -1000, Tejun Heo wrote:
> > > > On Sat, Jun 14, 2025 at 03:35:28PM +0200, Marco Crivellari wrote:
> > > > > Marco Crivellari (3):
> > > > >   Workqueue: add system_percpu_wq and system_dfl_wq
> > > > >   Workqueue: add new WQ_PERCPU flag
> > > > >   [Doc] Workqueue: add WQ_PERCPU
> > > >
> > > > Applied 1-3 to wq/for-6.17. I applied as-is but the third patch didn't need
> > > > to be separate. Maybe something to consider for future.
> > >
> > > If this is for the next merge window, I guess the easiest is to wait for it
> > > before sending patches to other subsystems to convert them?
> > >
> > > I guess we could shortcut that with providing a branch that other subsystems
> > > could pull from but that doesn't look convenient...
> >
> > Oh yeah, I said I was gonna do that and promptly forgot. I'll set up a
> > separate branch based on v6.15.
>
> Okay, I folded the doc patch into the second one and applied them to the
> following branch.
>
>  git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git WQ_PERCPU
>
> This is v6.15 + only the two patches and should be easy to pull into any
> devel branch.
>
> Thanks.
>
> --
> tejun



-- 
Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com

Re: [PATCH v5 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq

Posted by Tejun Heo 3 months, 2 weeks ago

Hello,

On Fri, Jun 20, 2025 at 06:13:09PM +0200, Marco Crivellari wrote:
> Just a quick question Tejun: when do you expect to receive the other patches?
> Should I wait till the next rc1?
> 
> I just want to check the work again, but they are ready.

So, I can route the patches through the wq tree but I shouldn't do so unless
subsystem maintainers want to do so for the specific subsystem. Waiting for
rc1 is an option but not the only one. You can send out subsystem-specific
patches to the subsystem maintainers and me cc'd with:

- Explanation on what's going on and why.

- What needs to happen if the subsystem wants to route the patch (pull the
  wq branch with the prep changes).

- Offer the option to route the changes through a wq branch.

There are no hard rules on how to do this but it's all about making
logistics understandable and easy for the involved subsystems.

Thanks.

-- 
tejun