[PATCH 2/2] s390: replace use of system_wq with system_percpu_wq

Marco Crivellari posted 2 patches 4 days, 12 hours ago
[PATCH 2/2] s390: replace use of system_wq with system_percpu_wq
Posted by Marco Crivellari 4 days, 12 hours ago
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.

This lack of consistentcy cannot be addressed without refactoring the API.

system_wq is a per-CPU worqueue, yet nothing in its name tells about that
CPU affinity constraint, which is very often not required by users. Make
it clear by adding a system_percpu_wq.

queue_work() / queue_delayed_work() mod_delayed_work() will now use the
new per-cpu wq: whether the user still stick on the old name a warn will
be printed along a wq redirect to the new one.

This patch add the new system_percpu_wq except for mm, fs and net
subsystem, whom are handled in separated patches.

The old wq will be kept for a few release cylces.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
 arch/s390/kernel/diag/diag324.c  | 4 ++--
 arch/s390/kernel/hiperdispatch.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/s390/kernel/diag/diag324.c b/arch/s390/kernel/diag/diag324.c
index 7fa4c0b7eb6c..f0a8b4841fb9 100644
--- a/arch/s390/kernel/diag/diag324.c
+++ b/arch/s390/kernel/diag/diag324.c
@@ -116,7 +116,7 @@ static void pibwork_handler(struct work_struct *work)
 	mutex_lock(&pibmutex);
 	timedout = ktime_add_ns(data->expire, PIBWORK_DELAY);
 	if (ktime_before(ktime_get(), timedout)) {
-		mod_delayed_work(system_wq, &pibwork, nsecs_to_jiffies(PIBWORK_DELAY));
+		mod_delayed_work(system_percpu_wq, &pibwork, nsecs_to_jiffies(PIBWORK_DELAY));
 		goto out;
 	}
 	vfree(data->pib);
@@ -174,7 +174,7 @@ long diag324_pibbuf(unsigned long arg)
 		pib_update(data);
 		data->sequence++;
 		data->expire = ktime_add_ns(ktime_get(), tod_to_ns(data->pib->intv));
-		mod_delayed_work(system_wq, &pibwork, nsecs_to_jiffies(PIBWORK_DELAY));
+		mod_delayed_work(system_percpu_wq, &pibwork, nsecs_to_jiffies(PIBWORK_DELAY));
 		first = false;
 	}
 	rc = data->rc;
diff --git a/arch/s390/kernel/hiperdispatch.c b/arch/s390/kernel/hiperdispatch.c
index e7b66d046e8d..85b5508ab62c 100644
--- a/arch/s390/kernel/hiperdispatch.c
+++ b/arch/s390/kernel/hiperdispatch.c
@@ -191,7 +191,7 @@ int hd_enable_hiperdispatch(void)
 		return 0;
 	if (hd_online_cores <= hd_entitled_cores)
 		return 0;
-	mod_delayed_work(system_wq, &hd_capacity_work, HD_DELAY_INTERVAL * hd_delay_factor);
+	mod_delayed_work(system_percpu_wq, &hd_capacity_work, HD_DELAY_INTERVAL * hd_delay_factor);
 	hd_update_capacities();
 	return 1;
 }
-- 
2.51.0
Re: [PATCH 2/2] s390: replace use of system_wq with system_percpu_wq
Posted by Alexander Gordeev 1 day, 9 hours ago
On Fri, Sep 05, 2025 at 11:08:57AM +0200, Marco Crivellari wrote:

Hi Marco,

> Currently if a user enqueue a work item using schedule_delayed_work() the
> used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
> WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
> schedule_work() that is using system_wq and queue_work(), that makes use
> again of WORK_CPU_UNBOUND.
> 
> This lack of consistentcy cannot be addressed without refactoring the API.
> 
> system_wq is a per-CPU worqueue, yet nothing in its name tells about that
> CPU affinity constraint, which is very often not required by users. Make
> it clear by adding a system_percpu_wq.

This paragraph is not exactly correct. You switch from system_wq to
system_percpu_wq - which are two different queues with the same
characteristics:

	system_wq = alloc_workqueue("events", 0, 0);                            
	system_percpu_wq = alloc_workqueue("events", 0, 0);                     

> queue_work() / queue_delayed_work() mod_delayed_work() will now use the
> new per-cpu wq: whether the user still stick on the old name a warn will
> be printed along a wq redirect to the new one.
> 
> This patch add the new system_percpu_wq except for mm, fs and net
> subsystem, whom are handled in separated patches.

I do not see this patch does anything like that.

> The old wq will be kept for a few release cylces.
> 
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
> ---
>  arch/s390/kernel/diag/diag324.c  | 4 ++--
>  arch/s390/kernel/hiperdispatch.c | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/s390/kernel/diag/diag324.c b/arch/s390/kernel/diag/diag324.c
> index 7fa4c0b7eb6c..f0a8b4841fb9 100644
> --- a/arch/s390/kernel/diag/diag324.c
> +++ b/arch/s390/kernel/diag/diag324.c
> @@ -116,7 +116,7 @@ static void pibwork_handler(struct work_struct *work)
>  	mutex_lock(&pibmutex);
>  	timedout = ktime_add_ns(data->expire, PIBWORK_DELAY);
>  	if (ktime_before(ktime_get(), timedout)) {
> -		mod_delayed_work(system_wq, &pibwork, nsecs_to_jiffies(PIBWORK_DELAY));
> +		mod_delayed_work(system_percpu_wq, &pibwork, nsecs_to_jiffies(PIBWORK_DELAY));
>  		goto out;
>  	}
>  	vfree(data->pib);
> @@ -174,7 +174,7 @@ long diag324_pibbuf(unsigned long arg)
>  		pib_update(data);
>  		data->sequence++;
>  		data->expire = ktime_add_ns(ktime_get(), tod_to_ns(data->pib->intv));
> -		mod_delayed_work(system_wq, &pibwork, nsecs_to_jiffies(PIBWORK_DELAY));
> +		mod_delayed_work(system_percpu_wq, &pibwork, nsecs_to_jiffies(PIBWORK_DELAY));
>  		first = false;
>  	}
>  	rc = data->rc;
> diff --git a/arch/s390/kernel/hiperdispatch.c b/arch/s390/kernel/hiperdispatch.c
> index e7b66d046e8d..85b5508ab62c 100644
> --- a/arch/s390/kernel/hiperdispatch.c
> +++ b/arch/s390/kernel/hiperdispatch.c
> @@ -191,7 +191,7 @@ int hd_enable_hiperdispatch(void)
>  		return 0;
>  	if (hd_online_cores <= hd_entitled_cores)
>  		return 0;
> -	mod_delayed_work(system_wq, &hd_capacity_work, HD_DELAY_INTERVAL * hd_delay_factor);
> +	mod_delayed_work(system_percpu_wq, &hd_capacity_work, HD_DELAY_INTERVAL * hd_delay_factor);
>  	hd_update_capacities();
>  	return 1;
>  }
> -- 
> 2.51.0
>
Re: [PATCH 2/2] s390: replace use of system_wq with system_percpu_wq
Posted by Marco Crivellari 11 hours ago
On Mon, Sep 8, 2025 at 1:44 PM Alexander Gordeev <agordeev@linux.ibm.com> wrote:
>
> On Fri, Sep 05, 2025 at 11:08:57AM +0200, Marco Crivellari wrote:
>
> Hi Marco,
>
> > Currently if a user enqueue a work item using schedule_delayed_work() the
> > used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
> > WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
> > schedule_work() that is using system_wq and queue_work(), that makes use
> > again of WORK_CPU_UNBOUND.
> >
> > This lack of consistentcy cannot be addressed without refactoring the API.
> >
> > system_wq is a per-CPU worqueue, yet nothing in its name tells about that
> > CPU affinity constraint, which is very often not required by users. Make
> > it clear by adding a system_percpu_wq.
>
> This paragraph is not exactly correct. You switch from system_wq to
> system_percpu_wq - which are two different queues with the same
> characteristics:
>
>         system_wq = alloc_workqueue("events", 0, 0);
>         system_percpu_wq = alloc_workqueue("events", 0, 0);

Hi Alexander,

Yes, system_percpu_wq will be in future the replacement of system_wq.

> > queue_work() / queue_delayed_work() mod_delayed_work() will now use the
> > new per-cpu wq: whether the user still stick on the old name a warn will
> > be printed along a wq redirect to the new one.
> >
> > This patch add the new system_percpu_wq except for mm, fs and net
> > subsystem, whom are handled in separated patches.
>
> I do not see this patch does anything like that.
>

I'm sorry for the confusion, I forgot to update the log after the
previous change.
There are not, indeed, the changes you mentioned.

Thanks!
-- 

Marco Crivellari

L3 Support Engineer, Technology & Product

marco.crivellari@suse.com
Re: [PATCH 2/2] s390: replace use of system_wq with system_percpu_wq
Posted by Mete Durlu 1 day, 13 hours ago
On 9/5/25 11:08 AM, Marco Crivellari wrote:
> Currently if a user enqueue a work item using schedule_delayed_work() the
> used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
> WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
> schedule_work() that is using system_wq and queue_work(), that makes use
> again of WORK_CPU_UNBOUND.
> 
> This lack of consistentcy cannot be addressed without refactoring the API.
> 
> system_wq is a per-CPU worqueue, yet nothing in its name tells about that
> CPU affinity constraint, which is very often not required by users. Make
> it clear by adding a system_percpu_wq.
> 
> queue_work() / queue_delayed_work() mod_delayed_work() will now use the
> new per-cpu wq: whether the user still stick on the old name a warn will
> be printed along a wq redirect to the new one.
> 
> This patch add the new system_percpu_wq except for mm, fs and net
> subsystem, whom are handled in separated patches.
> 
> The old wq will be kept for a few release cylces.
> 
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>

If I get this correctly system_wq will be obsolete and users will get
system_percpu_wq instead, which means local cpu gets to deal with the
delayed work and its timer and it has an affinity to that cpu via per
cpu workqueue. In that case;


> diff --git a/arch/s390/kernel/hiperdispatch.c b/arch/s390/kernel/hiperdispatch.c
> index e7b66d046e8d..85b5508ab62c 100644
> --- a/arch/s390/kernel/hiperdispatch.c
> +++ b/arch/s390/kernel/hiperdispatch.c
> @@ -191,7 +191,7 @@ int hd_enable_hiperdispatch(void)
>   		return 0;
>   	if (hd_online_cores <= hd_entitled_cores)
>   		return 0;
> -	mod_delayed_work(system_wq, &hd_capacity_work, HD_DELAY_INTERVAL * hd_delay_factor);
> +	mod_delayed_work(system_percpu_wq, &hd_capacity_work, HD_DELAY_INTERVAL * hd_delay_factor);
>   	hd_update_capacities();

Hiperdispatch's delayed work wouldn't get a noticeable benefit from
utilizing a per-cpu workqueue. We probably settled on system_wq to
utilize the global work queue at the time. Would system_unbound_wq
make more sense here?

Thanks.
Re: [PATCH 2/2] s390: replace use of system_wq with system_percpu_wq
Posted by Marco Crivellari 11 hours ago
On Mon, Sep 8, 2025 at 10:33 AM Mete Durlu <meted@linux.ibm.com> wrote:
>
> If I get this correctly system_wq will be obsolete and users will get
> system_percpu_wq instead, which means local cpu gets to deal with the
> delayed work and its timer and it has an affinity to that cpu via per
> cpu workqueue. In that case;
>
> > diff --git a/arch/s390/kernel/hiperdispatch.c b/arch/s390/kernel/hiperdispatch.c
> > index e7b66d046e8d..85b5508ab62c 100644
> > --- a/arch/s390/kernel/hiperdispatch.c
> > +++ b/arch/s390/kernel/hiperdispatch.c
> > @@ -191,7 +191,7 @@ int hd_enable_hiperdispatch(void)
> >               return 0;
> >       if (hd_online_cores <= hd_entitled_cores)
> >               return 0;
> > -     mod_delayed_work(system_wq, &hd_capacity_work, HD_DELAY_INTERVAL * hd_delay_factor);
> > +     mod_delayed_work(system_percpu_wq, &hd_capacity_work, HD_DELAY_INTERVAL * hd_delay_factor);
> >       hd_update_capacities();
>
> Hiperdispatch's delayed work wouldn't get a noticeable benefit from
> utilizing a per-cpu workqueue. We probably settled on system_wq to
> utilize the global work queue at the time. Would system_unbound_wq
> make more sense here?
>
> Thanks.

Hello,

I will check the code and if it is possible, I will send the v2 with
system_dfl_wq (eg. the current/old system_unbound_wq).

Thanks!
-- 

Marco Crivellari

L3 Support Engineer, Technology & Product

marco.crivellari@suse.com