Below is a summary of a discussion about the Workqueue API and cpu isolation
considerations. Details and more information are available here:
"workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
=== Current situation: problems ===
Consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, while for !WQ_UNBOUND the local CPU is selected.
This leads to different behavior when a work item is scheduled on an isolated
CPU, depending on whether the "delay" value is 0 or greater than 0:
schedule_delayed_work(, 0);
This will be handled by __queue_work(), which will queue the work item on the
current local (isolated) CPU, while:
schedule_delayed_work(, 1);
will move the timer to a housekeeping CPU and schedule the work there.
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
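To make the delay-dependent behavior concrete, here is a toy model in plain Rust. It is not kernel code: `Target` and `select_cpu` are hypothetical names invented for illustration, and the behavior on non-isolated CPUs (work stays local) is an assumption, since the text above only describes the isolated case.

```rust
/// Toy model (not kernel code) of where a per-cpu delayed work item lands.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Target {
    /// The work is queued on the CPU that called the function.
    Local(usize),
    /// The timer, and therefore the work, ends up on a housekeeping CPU.
    Housekeeping(usize),
}

/// Hypothetical helper modelling the two cases described above.
///
/// Assumption: on a non-isolated CPU the work simply stays local.
pub fn select_cpu(
    local_cpu: usize,
    isolated: &[usize],
    housekeeping: &[usize],
    delay: u64,
) -> Target {
    let on_isolated_cpu = isolated.contains(&local_cpu);
    if delay == 0 || !on_isolated_cpu {
        // delay == 0: queued directly on the current local CPU,
        // even when that CPU is isolated.
        Target::Local(local_cpu)
    } else {
        // delay > 0 on an isolated CPU: the timer moves to a housekeeping
        // CPU and the work runs there. Fall back to the local CPU if the
        // housekeeping list is empty.
        match housekeeping.first() {
            Some(&cpu) => Target::Housekeeping(cpu),
            None => Target::Local(local_cpu),
        }
    }
}
```

With CPUs 2 and 3 isolated and CPUs 0 and 1 doing housekeeping, the same call from CPU 3 yields `Target::Local(3)` for a delay of 0 and `Target::Housekeeping(0)` for a delay of 1, which is the inconsistency the cover letter points out.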
=== Plan and future plans ===
This patchset is the first step in the refactoring needed to address the
points above; in the long term it will also have a positive impact on cpu
isolation, by moving away from per-cpu workqueues in favor of an unbound
model.
These are the main steps:
1) API refactoring (that this patch is introducing)
- Make the system wq names clearer and more uniform, both per-cpu and
unbound, to avoid any confusion about which one should be used.
- Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND,
introduced in this patchset and used on all the callers that are not
currently using WQ_UNBOUND.
WQ_UNBOUND will be removed in a future release cycle.
Most users don't need to be per-cpu because they have no locality
requirements; for this reason, a future step will be to make "unbound"
the default behavior.
2) Check who really needs to be per-cpu
- Remove the WQ_PERCPU flag when it is not strictly required.
3) Add a new API (prefer local cpu)
- Some users don't require local execution, as mentioned above; even so,
local execution yields a performance gain.
This new API will prefer local execution without requiring it.
=== Introduced Changes by this series ===
1) [P 1-2] Replace use of system_wq and system_unbound_wq
system_wq is a per-CPU workqueue, but its name does not make that clear.
system_unbound_wq is to be used when locality is not required.
Because of that, system_wq has been renamed to system_percpu_wq, and
system_unbound_wq has been renamed to system_dfl_wq.
=== For Maintainers ===
There are prerequisites for this series, already merged in the master branch.
The commits are:
128ea9f6ccfb6960293ae4212f4f97165e42222d ("workqueue: Add system_percpu_wq and
system_dfl_wq")
930c2ea566aff59e962c50b2421d5fcc3b98b8be ("workqueue: Add new WQ_PERCPU flag")
Thanks!
Marco Crivellari (2):
rust: replace use of system_unbound_wq with system_dfl_wq
rust: replace use of system_wq with system_percpu_wq
rust/kernel/workqueue.rs | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--
2.51.0
On Fri, Sep 5, 2025 at 11:12 AM Marco Crivellari
<marco.crivellari@suse.com> wrote:
>
> [...]
>
> rust/kernel/workqueue.rs | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
The functions you are changing are intended to match 1-to-1 with the
wq globals defined by the C side. Changing them so that Rust and C no
longer agree on what the wqs are called seems wrong. How about adding
new functions instead?
Alice
Hello Alice,
Thanks for your reply!
I'm not entirely sure I understood your doubt.
system_dfl_wq and system_percpu_wq are wq(s) also present in the C code.
They are part of the prerequisite already merged in:
128ea9f6ccfb6960293ae4212f4f97165e42222d
("workqueue: Add system_percpu_wq and system_dfl_wq")
Let me know if I haven't understood correctly what you meant.
Thanks!
On Fri, Sep 5, 2025 at 1:34 PM Alice Ryhl <aliceryhl@google.com> wrote:
>
> The functions you are changing are intended to match 1-to-1 with the
> wq globals defined by the C side. Changing them so that Rust and C no
> longer agrees on what the wqs are called seems wrong. How about adding
> new functions instead?
>
> Alice
--
Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com
On Fri, Sep 05, 2025 at 06:05:50PM +0200, Marco Crivellari wrote:
> Hello Alice,
>
> Thanks for your reply!
>
> I'm not entirely sure I understood your doubt.
>
> system_dfl_wq and system_percpu_wq are wq(s) also present in the C code.
> They are part of the prerequisite already merged in:
>
> 128ea9f6ccfb6960293ae4212f4f97165e42222d
> ("workqueue: Add system_percpu_wq and system_dfl_wq")
>
> Let me know if I haven't understood correctly what you meant.
>
> Thanks!
I mean that instead of:
+/// Returns the system unbound work queue (`system_dfl_wq`).
 ///
 /// Workers are not bound to any specific CPU, not concurrency managed, and all queued work items
 /// are executed immediately as long as `max_active` limit is not reached and resources are
 /// available.
 pub fn system_unbound() -> &'static Queue {
-    // SAFETY: `system_unbound_wq` is a C global, always available.
-    unsafe { Queue::from_raw(bindings::system_unbound_wq) }
+    // SAFETY: `system_dfl_wq` is a C global, always available.
+    unsafe { Queue::from_raw(bindings::system_dfl_wq) }
 }
you add a new function:
pub fn system_dfl() -> &'static Queue {
    // SAFETY: `system_dfl_wq` is a C global, always available.
    unsafe { Queue::from_raw(bindings::system_dfl_wq) }
}
and do *not* modify system_unbound().
Alice
On Sun, Sep 7, 2025 at 12:59 PM Alice Ryhl <aliceryhl@google.com> wrote:
> I mean that instead of:
>
> +/// Returns the system unbound work queue (`system_dfl_wq`).
> ///
> /// Workers are not bound to any specific CPU, not concurrency managed, and all queued work items
> /// are executed immediately as long as `max_active` limit is not reached and resources are
> /// available.
> pub fn system_unbound() -> &'static Queue {
> - // SAFETY: `system_unbound_wq` is a C global, always available.
> - unsafe { Queue::from_raw(bindings::system_unbound_wq) }
> + // SAFETY: `system_dfl_wq` is a C global, always available.
> + unsafe { Queue::from_raw(bindings::system_dfl_wq) }
> }
>
> you add a new function:
>
> pub fn system_dfl() -> &'static Queue {
> // SAFETY: `system_dfl_wq` is a C global, always available.
> unsafe { Queue::from_raw(bindings::system_dfl_wq) }
> }
>
> and do *not* modify system_unbound().
>
> Alice
Hello Alice,
Ah, perfect. Yes, this change makes sense, you're right.
I will send a v2 introducing the new functions for both patches
in this series.
Would it also make sense to change the above comment, mentioning that
system_unbound() uses a wq that will be removed in the future, so
it is better to use system_dfl() instead?
I'm thinking of something like:
+///
+/// Note: system_unbound_wq will be removed in a future release cycle. Use system_dfl_wq instead.
 pub fn system_unbound() -> &'static Queue {
     // SAFETY: `system_unbound_wq` is a C global, always available.
     unsafe { Queue::from_raw(bindings::system_unbound_wq) }
 }
+pub fn system_dfl() -> &'static Queue {
+    // SAFETY: `system_dfl_wq` is a C global, always available.
+    unsafe { Queue::from_raw(bindings::system_dfl_wq) }
+}
Sounds good?
Thanks!
--
Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com
On Mon, Sep 8, 2025 at 12:24 PM Marco Crivellari
<marco.crivellari@suse.com> wrote:
>
> Hello Alice,
>
> Ah, perfect. Yes it makes sense this change, you're right.
> I will send the v2 introducing the new functions for both the patches
> in this series.
>
> It would also make sense to also change the above comment, mentioning that
> system_unbound() uses a wq that will be removed in the future, and so
> it is better to
> use system_dfl() instead?
>
> I'm thinking to something like:
>
> +///
> +/// Note: system_unbound_wq will be removed in a future release cycle. Use system_dfl_wq instead.
> pub fn system_unbound() -> &'static Queue {
> // SAFETY: `system_unbound_wq` is a C global, always available.
> unsafe { Queue::from_raw(bindings::system_unbound_wq) }
> }
>
> +pub fn system_dfl() -> &'static Queue {
> + // SAFETY: `system_dfl_wq` is a C global, always available.
> + unsafe { Queue::from_raw(bindings::system_dfl_wq) }
> +}
>
> Sounds good?
That is reasonable, yes. Please make it a link:
/// Note: `system_unbound_wq` will be removed in a future release cycle. Use [`system_dfl`] instead.
Alice
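For readers following along outside the kernel tree, the shape agreed on in this thread can be sketched as a standalone Rust snippet. This is only an illustration under stated assumptions: the `Queue` struct and the two statics below are stand-ins for `kernel::workqueue::Queue` and the C globals, the `name()` accessor is hypothetical, and the doc text for `system_dfl()` is a guess rather than the wording of the eventual v2.

```rust
/// Stand-in for `kernel::workqueue::Queue`; here it only carries a name.
#[derive(Debug)]
pub struct Queue {
    name: &'static str,
}

impl Queue {
    /// Hypothetical accessor, present only so the sketch can be checked.
    pub fn name(&self) -> &'static str {
        self.name
    }
}

// Stand-ins for the C globals `system_unbound_wq` and `system_dfl_wq`.
static SYSTEM_UNBOUND_WQ: Queue = Queue { name: "system_unbound_wq" };
static SYSTEM_DFL_WQ: Queue = Queue { name: "system_dfl_wq" };

/// Returns the system unbound work queue (`system_unbound_wq`).
///
/// Note: `system_unbound_wq` will be removed in a future release cycle. Use [`system_dfl`] instead.
pub fn system_unbound() -> &'static Queue {
    // The existing function keeps pointing at the queue it is named after.
    &SYSTEM_UNBOUND_WQ
}

/// Returns the system default work queue (`system_dfl_wq`).
pub fn system_dfl() -> &'static Queue {
    // The new function maps 1-to-1 onto the new C global.
    &SYSTEM_DFL_WQ
}
```

The point of this layout is that each Rust function keeps naming the C global it actually returns, so the old accessor can be dropped later without ever having silently changed its meaning.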