rust/kernel/workqueue.rs | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
Below is a summary of a discussion about the Workqueue API and CPU isolation
considerations. Details and more information are available here:

        "workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
        https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/

=== Current situation: problems ===

Consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, while for !WQ_UNBOUND the local CPU is selected.

This leads to different behavior when a work item is scheduled on an isolated
CPU, depending on whether the "delay" value is 0 or greater than 0:

        schedule_delayed_work(, 0);

This is handled by __queue_work(), which queues the work item on the
current local (isolated) CPU, while:

        schedule_delayed_work(, 1);

moves the timer to a housekeeping CPU and schedules the work there.

Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-CPU wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a CPU is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.

This lack of consistency cannot be addressed without refactoring the API.

=== Plan and future plans ===

This patchset is the first step in the refactoring needed to address the
points above; in the long term it will also have a positive impact on CPU
isolation, by moving away from per-CPU workqueues in favor of an unbound
model.

These are the main steps:
1)  API refactoring (that this patch is introducing)
    -   Make the system wq names clearer and more uniform, both per-cpu and
        unbound, to avoid any possible confusion about what should be used.

    -   Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND,
        introduced in this patchset and used on all the callers that are not
        currently using WQ_UNBOUND.

        WQ_UNBOUND will be removed in a future release cycle.

        Most users don't need to be per-cpu, because they don't have
        locality requirements; because of that, a future step will be to
        make "unbound" the default behavior.

2)  Check who really needs to be per-cpu
    -   Remove the WQ_PERCPU flag when it is not strictly required.

3)  Add a new API (prefer local cpu)
    -   There are users that don't require local execution, as mentioned
        above; despite that, local execution yields a performance gain.

        This new API will prefer local execution, without requiring it.

=== Changes introduced by this series ===

1) [P 1-2] Replace use of system_wq and system_unbound_wq

        system_wq is a per-CPU workqueue, but its name does not make that clear.
        system_unbound_wq is to be used when locality is not required.

        Because of that, system_wq has been renamed to system_percpu_wq, and
        system_unbound_wq has been renamed to system_dfl_wq.

=== For Maintainers ===

There are prerequisites for this series, already merged in the master branch.
The commits are:

128ea9f6ccfb6960293ae4212f4f97165e42222d ("workqueue: Add system_percpu_wq and
system_dfl_wq")

930c2ea566aff59e962c50b2421d5fcc3b98b8be ("workqueue: Add new WQ_PERCPU flag")

Thanks!

Marco Crivellari (2):
  rust: replace use of system_unbound_wq with system_dfl_wq
  rust: replace use of system_wq with system_percpu_wq

 rust/kernel/workqueue.rs | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--
2.51.0
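The delay-dependent behavior described under "Current situation" can be pictured with a small toy model. This is only a sketch in plain Rust, not kernel code: `Cpu` and `target_cpu` are hypothetical names made up for illustration, the model covers only the case where the caller runs on an isolated CPU, and picking the first housekeeping CPU is an arbitrary assumption (the cover letter only says "a housekeeping CPU").

```rust
/// Toy stand-in for a CPU number; not a kernel type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cpu(pub u32);

/// Models only the case from the cover letter: `local` is an isolated CPU and
/// `housekeeping` is the set of CPUs that `wq_unbound_cpumask` is restricted to.
/// Returns the CPU on which the work item would end up.
pub fn target_cpu(local: Cpu, delay: u64, housekeeping: &[Cpu]) -> Cpu {
    if delay == 0 {
        // No timer involved: the work item is queued directly and stays on
        // the local (isolated) CPU.
        local
    } else {
        // Timer path: the timer, and therefore the work, moves to a
        // housekeeping CPU. Taking the first one is an assumption of this
        // sketch, as is falling back to `local` when the set is empty.
        housekeeping.first().copied().unwrap_or(local)
    }
}
```

With housekeeping CPUs 0 and 1 and the caller on isolated CPU 3, a delay of 0 keeps the work on CPU 3 while a delay of 1 lands it on a housekeeping CPU, which is the inconsistency the series sets out to remove.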
On Fri, Sep 5, 2025 at 11:12 AM Marco Crivellari
<marco.crivellari@suse.com> wrote:
> [...]
> Marco Crivellari (2):
>   rust: replace use of system_unbound_wq with system_dfl_wq
>   rust: replace use of system_wq with system_percpu_wq
>
>  rust/kernel/workqueue.rs | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)

The functions you are changing are intended to match 1-to-1 with the
wq globals defined by the C side. Changing them so that Rust and C no
longer agree on what the wqs are called seems wrong. How about adding
new functions instead?

Alice
Hello Alice,

Thanks for your reply!

I'm not entirely sure I understood your doubt.

system_dfl_wq and system_percpu_wq are wq(s) also present in the C code.
They are part of the prerequisite already merged in:

128ea9f6ccfb6960293ae4212f4f97165e42222d
("workqueue: Add system_percpu_wq and system_dfl_wq")

Let me know if I haven't understood correctly what you meant.

Thanks!

On Fri, Sep 5, 2025 at 1:34 PM Alice Ryhl <aliceryhl@google.com> wrote:
> [...]
> The functions you are changing are intended to match 1-to-1 with the
> wq globals defined by the C side. Changing them so that Rust and C no
> longer agree on what the wqs are called seems wrong. How about adding
> new functions instead?
>
> Alice

--
Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com
On Fri, Sep 05, 2025 at 06:05:50PM +0200, Marco Crivellari wrote:
> Hello Alice,
>
> Thanks for your reply!
>
> I'm not entirely sure I understood your doubt.
>
> system_dfl_wq and system_percpu_wq are wq(s) also present in the C code.
> They are part of the prerequisite already merged in:
>
> 128ea9f6ccfb6960293ae4212f4f97165e42222d
> ("workqueue: Add system_percpu_wq and system_dfl_wq")
>
> Let me know if I haven't understood correctly what you meant.
>
> Thanks!

I mean that instead of:

+/// Returns the system unbound work queue (`system_dfl_wq`).
 ///
 /// Workers are not bound to any specific CPU, not concurrency managed, and all queued work items
 /// are executed immediately as long as `max_active` limit is not reached and resources are
 /// available.
 pub fn system_unbound() -> &'static Queue {
-    // SAFETY: `system_unbound_wq` is a C global, always available.
-    unsafe { Queue::from_raw(bindings::system_unbound_wq) }
+    // SAFETY: `system_dfl_wq` is a C global, always available.
+    unsafe { Queue::from_raw(bindings::system_dfl_wq) }
 }

you add a new function:

pub fn system_dfl() -> &'static Queue {
    // SAFETY: `system_dfl_wq` is a C global, always available.
    unsafe { Queue::from_raw(bindings::system_dfl_wq) }
}

and do *not* modify system_unbound().

Alice
On Sun, Sep 7, 2025 at 12:59 PM Alice Ryhl <aliceryhl@google.com> wrote:
> I mean that instead of:
>
> +/// Returns the system unbound work queue (`system_dfl_wq`).
>  ///
>  /// Workers are not bound to any specific CPU, not concurrency managed, and all queued work items
>  /// are executed immediately as long as `max_active` limit is not reached and resources are
>  /// available.
>  pub fn system_unbound() -> &'static Queue {
> -    // SAFETY: `system_unbound_wq` is a C global, always available.
> -    unsafe { Queue::from_raw(bindings::system_unbound_wq) }
> +    // SAFETY: `system_dfl_wq` is a C global, always available.
> +    unsafe { Queue::from_raw(bindings::system_dfl_wq) }
>  }
>
> you add a new function:
>
> pub fn system_dfl() -> &'static Queue {
>     // SAFETY: `system_dfl_wq` is a C global, always available.
>     unsafe { Queue::from_raw(bindings::system_dfl_wq) }
> }
>
> and do *not* modify system_unbound().
>
> Alice

Hello Alice,

Ah, perfect. Yes, this change makes sense, you're right.
I will send a v2 introducing the new functions for both patches
in this series.

Would it also make sense to change the above comment to mention that
system_unbound() uses a wq that will be removed in the future, so it is
better to use system_dfl() instead?

I'm thinking of something like:

+///
+/// Note: system_unbound_wq will be removed in a future release cycle. Use system_dfl_wq instead.
 pub fn system_unbound() -> &'static Queue {
     // SAFETY: `system_unbound_wq` is a C global, always available.
     unsafe { Queue::from_raw(bindings::system_unbound_wq) }
 }

+pub fn system_dfl() -> &'static Queue {
+    // SAFETY: `system_dfl_wq` is a C global, always available.
+    unsafe { Queue::from_raw(bindings::system_dfl_wq) }
+}

Sounds good?

Thanks!

--
Marco Crivellari
L3 Support Engineer, Technology & Product
marco.crivellari@suse.com
On Mon, Sep 8, 2025 at 12:24 PM Marco Crivellari
<marco.crivellari@suse.com> wrote:
> [...]
> Would it also make sense to change the above comment to mention that
> system_unbound() uses a wq that will be removed in the future, so it is
> better to use system_dfl() instead?
>
> I'm thinking of something like:
>
> +///
> +/// Note: system_unbound_wq will be removed in a future release cycle. Use system_dfl_wq instead.
>  pub fn system_unbound() -> &'static Queue {
>      // SAFETY: `system_unbound_wq` is a C global, always available.
>      unsafe { Queue::from_raw(bindings::system_unbound_wq) }
>  }
>
> +pub fn system_dfl() -> &'static Queue {
> +    // SAFETY: `system_dfl_wq` is a C global, always available.
> +    unsafe { Queue::from_raw(bindings::system_dfl_wq) }
> +}
>
> Sounds good?

That is reasonable, yes. Please make it a link:

/// Note: `system_unbound_wq` will be removed in a future release cycle. Use [`system_dfl`] instead.

Alice
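Putting the review comments together, the agreed shape might look roughly like the sketch below. It is a mock in plain Rust so that it compiles outside the kernel tree: the `Queue` struct, its `c_name` accessor, and the two statics are hypothetical stand-ins for `kernel::workqueue::Queue` and the C globals reached through `bindings`, and the doc line on `system_dfl()` is illustrative wording, not text from the series. What it is meant to show is only the structure: `system_unbound()` keeps wrapping `system_unbound_wq` and gains a note linking to [`system_dfl`], while `system_dfl()` is a separate new function.

```rust
/// Toy stand-in for the kernel's `Queue`; it only records which C global
/// it would wrap. Not the real type.
#[derive(Debug)]
pub struct Queue {
    c_name: &'static str,
}

impl Queue {
    /// Name of the C global this mock queue represents (hypothetical helper).
    pub fn c_name(&self) -> &'static str {
        self.c_name
    }
}

// Stand-ins for the C globals that the real code reaches through `bindings`.
static SYSTEM_UNBOUND_WQ: Queue = Queue { c_name: "system_unbound_wq" };
static SYSTEM_DFL_WQ: Queue = Queue { c_name: "system_dfl_wq" };

/// Returns the system unbound work queue (`system_unbound_wq`).
///
/// Note: `system_unbound_wq` will be removed in a future release cycle. Use [`system_dfl`] instead.
pub fn system_unbound() -> &'static Queue {
    // Unchanged: still maps 1-to-1 to the C global of the same name.
    &SYSTEM_UNBOUND_WQ
}

/// Returns the system default work queue (`system_dfl_wq`).
pub fn system_dfl() -> &'static Queue {
    // New function: maps 1-to-1 to the newly added C global.
    &SYSTEM_DFL_WQ
}
```

Keeping both functions preserves the 1-to-1 mapping between Rust accessors and C globals that the review asked for, and lets callers migrate to the new name at their own pace.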