If a task has a pending signal when create_io_thread() is called,
copy_process() will return -ERESTARTNOINTR. io_should_retry_thread()
will request a retry of create_io_thread() up to WORKER_INIT_LIMIT = 3
times. If all retries fail, the io_uring request will fail with
ECANCELED.

Commit 3918315c5dc ("io-wq: backoff when retrying worker creation")
added a linear backoff to allow the thread to handle its signal before
the retry. However, a thread receiving frequent signals may get unlucky
and have a signal pending at every retry. Since the userspace task
doesn't control when it receives signals, there's no easy way for it to
prevent the create_io_thread() failure due to pending signals. The task
may also lack the information necessary to regenerate the canceled SQE.

So always retry the create_io_thread() on the ERESTART* errors,
analogous to what a fork() syscall would do. EAGAIN can occur due to
various persistent conditions such as exceeding RLIMIT_NPROC, so respect
the WORKER_INIT_LIMIT retry limit for EAGAIN errors.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
---
io_uring/io-wq.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 1d03b2fc4b25..cd13d8aac3d2 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -803,15 +803,16 @@ static inline bool io_should_retry_thread(struct io_worker *worker, long err)
 	 * Prevent perpetual task_work retry, if the task (or its group) is
 	 * exiting.
 	 */
 	if (fatal_signal_pending(current))
 		return false;
-	if (worker->init_retries++ >= WORKER_INIT_LIMIT)
-		return false;
 
+	worker->init_retries++;
 	switch (err) {
 	case -EAGAIN:
+		return worker->init_retries <= WORKER_INIT_LIMIT;
+	/* Analogous to a fork() syscall, always retry on a restartable error */
 	case -ERESTARTSYS:
 	case -ERESTARTNOINTR:
 	case -ERESTARTNOHAND:
 		return true;
 	default:
--
2.45.2
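
For readers who want to check the resulting policy outside the kernel, here
is a rough userspace model of the patched decision logic. It is a sketch, not
the kernel code: struct worker_model, should_retry_model() and
retries_granted() are hypothetical names, the fatal_pending flag stands in
for fatal_signal_pending(current), the ERESTART* values are copied from the
kernel-internal include/linux/errno.h because userspace <errno.h> does not
provide them, and the "return false" in the default case is assumed, since
the hunk above ends before that line.

```c
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal restart codes (include/linux/errno.h); not in userspace. */
#ifndef ERESTARTSYS
#define ERESTARTSYS	512
#endif
#ifndef ERESTARTNOINTR
#define ERESTARTNOINTR	513
#endif
#ifndef ERESTARTNOHAND
#define ERESTARTNOHAND	514
#endif

#define WORKER_INIT_LIMIT 3	/* value taken from the commit message */

/* Hypothetical stand-in for the single struct io_worker field used here. */
struct worker_model {
	int init_retries;
};

/* Models the patched switch; returns true if worker creation is retried. */
static bool should_retry_model(struct worker_model *w, long err,
			       bool fatal_pending)
{
	/* Stand-in for the fatal_signal_pending(current) check. */
	if (fatal_pending)
		return false;

	/* The counter now advances on every call, whatever the error is. */
	w->init_retries++;
	switch (err) {
	case -EAGAIN:
		/* Possibly persistent (e.g. RLIMIT_NPROC): stay bounded. */
		return w->init_retries <= WORKER_INIT_LIMIT;
	case -ERESTARTSYS:
	case -ERESTARTNOINTR:
	case -ERESTARTNOHAND:
		/* Restartable: always retry, as fork() would. */
		return true;
	default:
		/* Assumed: any other error is not retried. */
		return false;
	}
}

/* Counts the retries granted for an error that never clears, up to cap. */
static int retries_granted(long err, int cap)
{
	struct worker_model w = { 0 };
	int n = 0;

	while (n < cap && should_retry_model(&w, err, false))
		n++;
	return n;
}
```

In this model a persistent -EAGAIN is granted exactly WORKER_INIT_LIMIT
retries, while a persistent -ERESTARTNOINTR is retried until the caller's cap
is reached. Because init_retries is shared across error types, an -EAGAIN
that follows several ERESTART* retries may already be at or past the limit.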
On Tue, 02 Dec 2025 13:57:44 -0700, Caleb Sander Mateos wrote:
> If a task has a pending signal when create_io_thread() is called,
> copy_process() will return -ERESTARTNOINTR. io_should_retry_thread()
> will request a retry of create_io_thread() up to WORKER_INIT_LIMIT = 3
> times. If all retries fail, the io_uring request will fail with
> ECANCELED.
>
> Commit 3918315c5dc ("io-wq: backoff when retrying worker creation")
> added a linear backoff to allow the thread to handle its signal before
> the retry. However, a thread receiving frequent signals may get unlucky
> and have a signal pending at every retry. Since the userspace task
> doesn't control when it receives signals, there's no easy way for it to
> prevent the create_io_thread() failure due to pending signals. The task
> may also lack the information necessary to regenerate the canceled SQE.
>
> So always retry the create_io_thread() on the ERESTART* errors,
> analogous to what a fork() syscall would do. EAGAIN can occur due to
> various persistent conditions such as exceeding RLIMIT_NPROC, so respect
> the WORKER_INIT_LIMIT retry limit for EAGAIN errors.
>
> [...]
Applied, thanks!
[1/1] io_uring/io-wq: always retry worker create on ERESTART*
commit: 777dfd696d3db9b7b08a41c3c03554ce0ba6c94e
Best regards,
--
Jens Axboe