scsi: scsi_transport_srp: Move long delayed work on system_dfl_long_wq

[RFC PATCH] scsi: scsi_transport_srp: Move long delayed work on system_dfl_long_wq

Posted by Marco Crivellari 1 month ago

Currently the code enqueues work items with {queue|mod}_delayed_work()
on system_long_wq. This workqueue is intended for long-running work and
is a per-CPU workqueue.

These functions end up calling __queue_delayed_work(), which sets a
global timer that can fire on any CPU; the work is then enqueued on the
CPU where the timer fired.

Unbound work items can benefit from scheduler task placement, which
helps optimize performance and power consumption. Long-running work
shouldn't be pinned to a single CPU.

Recently, a new unbound workqueue specifically for long-running work was
added:

    c116737e972e ("workqueue: Add system_dfl_long_wq for long unbound works")

Since the work items don't rely on per-CPU variables, there is no
obvious reason to use a per-CPU workqueue. Replace system_long_wq with
system_dfl_long_wq so that the work can benefit from scheduler task
placement.

Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
---
 drivers/scsi/scsi_transport_srp.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index d71ab5fdb758..a61cbb079ab4 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -234,7 +234,7 @@ static ssize_t store_reconnect_delay(struct device *dev,
 
 	if (rport->reconnect_delay <= 0 && delay > 0 &&
 	    rport->state != SRP_RPORT_RUNNING) {
-		queue_delayed_work(system_long_wq, &rport->reconnect_work,
+		queue_delayed_work(system_dfl_long_wq, &rport->reconnect_work,
 				   delay * HZ);
 	} else if (delay <= 0) {
 		cancel_delayed_work(&rport->reconnect_work);
@@ -390,7 +390,7 @@ static void srp_reconnect_work(struct work_struct *work)
 		delay = rport->reconnect_delay *
 			clamp(rport->failed_reconnects - 10, 1, 100);
 		if (delay > 0)
-			queue_delayed_work(system_long_wq,
+			queue_delayed_work(system_dfl_long_wq,
 					   &rport->reconnect_work, delay * HZ);
 	}
 }
@@ -474,7 +474,7 @@ static void __srp_start_tl_fail_timers(struct srp_rport *rport)
 	if (rport->state == SRP_RPORT_LOST)
 		return;
 	if (delay > 0)
-		queue_delayed_work(system_long_wq, &rport->reconnect_work,
+		queue_delayed_work(system_dfl_long_wq, &rport->reconnect_work,
 				   1UL * delay * HZ);
 	if ((fast_io_fail_tmo >= 0 || dev_loss_tmo >= 0) &&
 	    srp_rport_set_state(rport, SRP_RPORT_BLOCKED) == 0) {
@@ -482,11 +482,11 @@ static void __srp_start_tl_fail_timers(struct srp_rport *rport)
 			 rport->state);
 		scsi_block_targets(shost, &shost->shost_gendev);
 		if (fast_io_fail_tmo >= 0)
-			queue_delayed_work(system_long_wq,
+			queue_delayed_work(system_dfl_long_wq,
 					   &rport->fast_io_fail_work,
 					   1UL * fast_io_fail_tmo * HZ);
 		if (dev_loss_tmo >= 0)
-			queue_delayed_work(system_long_wq,
+			queue_delayed_work(system_dfl_long_wq,
 					   &rport->dev_loss_work,
 					   1UL * dev_loss_tmo * HZ);
 	}
-- 
2.53.0
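
For readers following the delay arguments in the hunks above, the sketch below models the arithmetic in user space: the reconnect backoff from srp_reconnect_work() and the widening multiply used in __srp_start_tl_fail_timers(). It is only an illustration, not the kernel code. The helper names (clamp_int, srp_backoff_seconds, seconds_to_ticks) and the hz parameter are hypothetical stand-ins for the kernel's clamp() macro and HZ constant.

```c
/* Hypothetical stand-in for the kernel's clamp() macro, int-only. */
static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Seconds to wait before the next reconnect attempt, mirroring
 *   delay = rport->reconnect_delay *
 *           clamp(rport->failed_reconnects - 10, 1, 100);
 * The multiplier stays at 1 for the first 11 failures, then grows by
 * one per additional failure until it is capped at 100.
 */
static int srp_backoff_seconds(int reconnect_delay, int failed_reconnects)
{
	return reconnect_delay * clamp_int(failed_reconnects - 10, 1, 100);
}

/*
 * Convert seconds to timer ticks. Leading with 1UL widens the
 * multiplication to unsigned long, as in `1UL * delay * HZ`, so the
 * product is not computed in int. Callers are assumed to pass
 * seconds >= 0, matching the `delay > 0` and `tmo >= 0` checks
 * in the patch context.
 */
static unsigned long seconds_to_ticks(int seconds, unsigned long hz)
{
	return 1UL * seconds * hz;
}
```

With a reconnect_delay of 10 seconds, this model waits 10 seconds after each of the first 11 failed reconnects, 20 seconds after the 12th, and never more than 1000 seconds. The tick value handed to queue_delayed_work() is the same whichever workqueue is used; the patch only changes where the work runs once the timer expires.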

Re: [RFC PATCH] scsi: scsi_transport_srp: Move long delayed work on system_dfl_long_wq

Posted by Martin K. Petersen 3 weeks ago

On Thu, 07 May 2026 16:34:10 +0200, Marco Crivellari wrote:

> Currently the code enqueues work items with {queue|mod}_delayed_work()
> on system_long_wq. This workqueue is intended for long-running work and
> is a per-CPU workqueue.
> 
> These functions end up calling __queue_delayed_work(), which sets a
> global timer that can fire on any CPU; the work is then enqueued on the
> CPU where the timer fired.
> 
> [...]

Applied to 7.2/scsi-queue, thanks!

[1/1] scsi: scsi_transport_srp: Move long delayed work on system_dfl_long_wq
      https://git.kernel.org/mkp/scsi/c/1039939c52f2

-- 
Martin K. Petersen

Re: [RFC PATCH] scsi: scsi_transport_srp: Move long delayed work on system_dfl_long_wq

Posted by Martin K. Petersen 4 weeks, 1 day ago

Marco,

> Currently the code enqueues work items with
> {queue|mod}_delayed_work() on system_long_wq. This workqueue is
> intended for long-running work and is a per-CPU workqueue.

Applied to 7.2/scsi-staging, thanks!

-- 
Martin K. Petersen

Re: [RFC PATCH] scsi: scsi_transport_srp: Move long delayed work on system_dfl_long_wq

Posted by Marco Crivellari 4 weeks, 1 day ago

On Fri, May 15, 2026 at 3:58 AM Martin K. Petersen
<martin.petersen@oracle.com> wrote:
>
>
> Marco,
>
> > Currently the code enqueues work items with
> > {queue|mod}_delayed_work() on system_long_wq. This workqueue is
> > intended for long-running work and is a per-CPU workqueue.
>
> Applied to 7.2/scsi-staging, thanks!
>
> --
> Martin K. Petersen

Many thanks!

-- 
Marco Crivellari
SUSE Labs

Re: [RFC PATCH] scsi: scsi_transport_srp: Move long delayed work on system_dfl_long_wq

Posted by Bart Van Assche 1 month ago

On 5/7/26 7:34 AM, Marco Crivellari wrote:
> Currently the code enqueues work items with {queue|mod}_delayed_work()
> on system_long_wq. This workqueue is intended for long-running work and
> is a per-CPU workqueue.
> 
> These functions end up calling __queue_delayed_work(), which sets a
> global timer that can fire on any CPU; the work is then enqueued on the
> CPU where the timer fired.
> 
> Unbound work items can benefit from scheduler task placement, which
> helps optimize performance and power consumption. Long-running work
> shouldn't be pinned to a single CPU.
Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Re: [RFC PATCH] scsi: scsi_transport_srp: Move long delayed work on system_dfl_long_wq

Posted by Bart Van Assche 1 month ago

On 5/7/26 7:34 AM, Marco Crivellari wrote:
> Currently the code enqueues work items with {queue|mod}_delayed_work()
> on system_long_wq. This workqueue is intended for long-running work and
> is a per-CPU workqueue.
> 
> These functions end up calling __queue_delayed_work(), which sets a
> global timer that can fire on any CPU; the work is then enqueued on the
> CPU where the timer fired.
> 
> Unbound work items can benefit from scheduler task placement, which
> helps optimize performance and power consumption. Long-running work
> shouldn't be pinned to a single CPU.
> 
> Recently, a new unbound workqueue specifically for long-running work was
> added:
> 
>      c116737e972e ("workqueue: Add system_dfl_long_wq for long unbound works")
> 
> Since the work items don't rely on per-CPU variables, there is no
> obvious reason to use a per-CPU workqueue. Replace system_long_wq with
> system_dfl_long_wq so that the work can benefit from scheduler task
> placement.

This looks like unnecessary churn to me. The motivation for introducing
system_dfl_long_wq seems very weak. Wouldn't we all be better off if
commit c116737e972e were reverted and system_long_wq were changed from
per-CPU to unbound?

Thanks,

Bart.

Re: [RFC PATCH] scsi: scsi_transport_srp: Move long delayed work on system_dfl_long_wq

Posted by Marco Crivellari 1 month ago

On Fri, May 8, 2026 at 6:11 PM Bart Van Assche <bvanassche@acm.org> wrote:
> [...]
> This looks like unnecessary churn to me. The motivation for introducing
> system_dfl_long_wq seems very weak. Wouldn't we all be better off if
> commit c116737e972e were reverted and system_long_wq were changed from
> per-CPU to unbound?

Hello Bart,

There are not many users of `queue_delayed_work(system_long_wq, ...)`, but
system_long_wq comes with API guarantees, so, in my opinion, we cannot
simply change how that workqueue behaves.

Also, consider that system_long_wq is used by more than just
queue_delayed_work(). Approximately 60 queue_work() callers specify
system_long_wq, and some of them may really need it to be per-CPU.

Thanks.

-- 
Marco Crivellari
SUSE Labs