[PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL

Caleb Sander Mateos posted 4 patches 1 month, 1 week ago
There is a newer version of this series
[PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
Posted by Caleb Sander Mateos 1 month, 1 week ago
Currently, creating an io_uring with IORING_SETUP_IOPOLL requires all
requests issued to it to support iopoll. This prevents, for example,
using ublk zero-copy together with IORING_SETUP_IOPOLL, as ublk
zero-copy buffer registrations are performed using a uring_cmd. There's
no technical reason why these non-iopoll uring_cmds can't be supported.
They will either complete synchronously or via an external mechanism
that calls io_uring_cmd_done(), so they don't need to be polled.

Allow uring_cmd requests to be issued to IORING_SETUP_IOPOLL io_urings
even if their files don't implement ->uring_cmd_iopoll().

Use a new REQ_F_IOPOLL flag to track whether a request is using iopoll.
This makes the iopoll_queue opcode definition flag unnecessary.
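
A minimal userspace sketch of the rule described above, not the kernel code. All `toy_*` and `TOY_*` names, the struct layouts, and the `-1` return value are hypothetical stand-ins. The only behaviour taken from this cover letter is that a uring_cmd whose file lacks ->uring_cmd_iopoll() is now accepted on an IORING_SETUP_IOPOLL ring and simply isn't marked as an iopoll request.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins: names echo the thread; values and layouts are hypothetical. */
#define TOY_SETUP_IOPOLL (1u << 0) /* models IORING_SETUP_IOPOLL on the ring */
#define TOY_REQ_F_IOPOLL (1u << 0) /* models REQ_F_IOPOLL on a request */

struct toy_file_ops {
	int (*uring_cmd_iopoll)(void); /* NULL when the file cannot be polled */
};

struct toy_req {
	unsigned int flags;
	const struct toy_file_ops *f_op;
};

/* Placeholder poll handler for a file that does support iopoll. */
static int toy_poll(void)
{
	return 0;
}

/*
 * Model of the behaviour before the series: on an IOPOLL ring, a
 * uring_cmd is rejected (-1 here) if the file has no poll handler.
 */
static int toy_issue_old(unsigned int ring_flags, struct toy_req *req)
{
	if ((ring_flags & TOY_SETUP_IOPOLL) && !req->f_op->uring_cmd_iopoll)
		return -1;
	return 0;
}

/*
 * Model of the behaviour with the series: the command is always
 * accepted, and only pollable files get the per-request iopoll flag.
 */
static int toy_issue_new(unsigned int ring_flags, struct toy_req *req)
{
	if ((ring_flags & TOY_SETUP_IOPOLL) && req->f_op->uring_cmd_iopoll)
		req->flags |= TOY_REQ_F_IOPOLL;
	return 0;
}
```

In this model, completion code can test the per-request flag instead of consulting a per-opcode table, which is the reason the cover letter gives for dropping iopoll_queue.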

The last commit removes an unnecessary IO_URING_F_IOPOLL check in
nvme_dev_uring_cmd() as NVMe admin passthru commands can be issued to
IORING_SETUP_IOPOLL io_urings now.

v3: fix REW -> REQ typo (Anuj)

v2:
- Add REQ_F_IOPOLL request flag, remove redundant iopoll_queue
- Split IORING_OP_URING_CMD128 fix to a separate commit

Caleb Sander Mateos (4):
  io_uring: add REQ_F_IOPOLL
  io_uring: remove iopoll_queue from struct io_issue_def
  io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
  nvme: remove nvme_dev_uring_cmd() IO_URING_F_IOPOLL check

 drivers/nvme/host/ioctl.c      |  4 ----
 include/linux/io_uring_types.h |  3 +++
 io_uring/io_uring.c            | 10 ++++------
 io_uring/opdef.c               | 10 ----------
 io_uring/opdef.h               |  2 --
 io_uring/rw.c                  | 11 ++++++-----
 io_uring/uring_cmd.c           |  9 ++++-----
 7 files changed, 17 insertions(+), 32 deletions(-)

-- 
2.45.2
Re: [PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
Posted by Anuj Gupta 1 month, 1 week ago
Reviewed-by: Anuj Gupta <anuj20.g@samsung.com>
Re: [PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
Posted by Kanchan Joshi 1 month, 1 week ago
Series looked good to me.

Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Re: [PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
Posted by Ming Lei 1 month, 1 week ago
On Thu, Feb 19, 2026 at 10:22:23AM -0700, Caleb Sander Mateos wrote:
> Currently, creating an io_uring with IORING_SETUP_IOPOLL requires all
> requests issued to it to support iopoll. This prevents, for example,
> using ublk zero-copy together with IORING_SETUP_IOPOLL, as ublk
> zero-copy buffer registrations are performed using a uring_cmd. There's
> no technical reason why these non-iopoll uring_cmds can't be supported.
> They will either complete synchronously or via an external mechanism
> that calls io_uring_cmd_done(), so they don't need to be polled.

For synchronous uring commands, supporting IOPOLL is fine.

However, there are also async uring commands, which may be completed in
irq context or in a multishot way. At least the latter isn't supported
in io_do_iopoll() yet.


Thanks, 
Ming
Re: [PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
Posted by Caleb Sander Mateos 1 month, 1 week ago
On Fri, Feb 20, 2026 at 6:25 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> On Thu, Feb 19, 2026 at 10:22:23AM -0700, Caleb Sander Mateos wrote:
> > Currently, creating an io_uring with IORING_SETUP_IOPOLL requires all
> > requests issued to it to support iopoll. This prevents, for example,
> > using ublk zero-copy together with IORING_SETUP_IOPOLL, as ublk
> > zero-copy buffer registrations are performed using a uring_cmd. There's
> > no technical reason why these non-iopoll uring_cmds can't be supported.
> > They will either complete synchronously or via an external mechanism
> > that calls io_uring_cmd_done(), so they don't need to be polled.
>
> For sync uring command, it is fine to support for IOPOLL.
>
> However, there are async uring command, which may be completed in irq
> context, or in multishot way, at least the later isn't supported in
> io_do_iopoll() yet.

Can you describe the issues you envision in more detail?

io_uring_cmd_done() can already be called in irq context. Even if the
request is not REQ_F_IOPOLL, any completion from irq context must call
io_uring_cmd_done() with IO_URING_F_UNLOCKED because an interrupt
handler can't acquire a mutex. That means the completion will be via
task work. The mutex acquisition in io_uring_cmd_del_cancelable()
would be a problem in irq context, so ->uring_cmd() implementations
that use io_uring_cmd_mark_cancelable() will have to call
io_uring_cmd_done() from task context, which both ublk and fuse
already do.

I missed that the SOCKET_URING_OP_TX_TIMESTAMP uring_cmd may use
apoll, so that one will probably have to be prohibited on
IORING_SETUP_IOPOLL io_urings.

Best,
Caleb
Re: [PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
Posted by Ming Lei 1 month, 1 week ago
On Fri, Feb 20, 2026 at 07:55:33AM -0800, Caleb Sander Mateos wrote:
> On Fri, Feb 20, 2026 at 6:25 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Thu, Feb 19, 2026 at 10:22:23AM -0700, Caleb Sander Mateos wrote:
> > > Currently, creating an io_uring with IORING_SETUP_IOPOLL requires all
> > > requests issued to it to support iopoll. This prevents, for example,
> > > using ublk zero-copy together with IORING_SETUP_IOPOLL, as ublk
> > > zero-copy buffer registrations are performed using a uring_cmd. There's
> > > no technical reason why these non-iopoll uring_cmds can't be supported.
> > > They will either complete synchronously or via an external mechanism
> > > that calls io_uring_cmd_done(), so they don't need to be polled.
> >
> > For sync uring command, it is fine to support for IOPOLL.
> >
> > However, there are async uring command, which may be completed in irq
> > context, or in multishot way, at least the later isn't supported in
> > io_do_iopoll() yet.
> 
> Can you describe the issues you envision in more detail?

Basically, IOPOLL doesn't support multishot requests yet.

For example, when io_uring_mshot_cmd_post_cqe() is called and a new CQE
is queued, that CQE can't be found by io_iopoll_check(), which is
called from io_uring_enter(IORING_ENTER_GETEVENTS).


Thanks,
Ming

Re: [PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
Posted by Caleb Sander Mateos 1 month, 1 week ago
On Fri, Feb 20, 2026 at 8:11 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> On Fri, Feb 20, 2026 at 07:55:33AM -0800, Caleb Sander Mateos wrote:
> > On Fri, Feb 20, 2026 at 6:25 AM Ming Lei <ming.lei@redhat.com> wrote:
> > >
> > > On Thu, Feb 19, 2026 at 10:22:23AM -0700, Caleb Sander Mateos wrote:
> > > > Currently, creating an io_uring with IORING_SETUP_IOPOLL requires all
> > > > requests issued to it to support iopoll. This prevents, for example,
> > > > using ublk zero-copy together with IORING_SETUP_IOPOLL, as ublk
> > > > zero-copy buffer registrations are performed using a uring_cmd. There's
> > > > no technical reason why these non-iopoll uring_cmds can't be supported.
> > > > They will either complete synchronously or via an external mechanism
> > > > that calls io_uring_cmd_done(), so they don't need to be polled.
> > >
> > > For sync uring command, it is fine to support for IOPOLL.
> > >
> > > However, there are async uring command, which may be completed in irq
> > > context, or in multishot way, at least the later isn't supported in
> > > io_do_iopoll() yet.
> >
> > Can you describe the issues you envision in more detail?
>
> Basically IOPOLL doesn't support multishot request yet.
>
> For example, when io_uring_mshot_cmd_post_cqe() is called and new cqe is
> queued, it can't be found from io_iopoll_check()<-io_uring_enter(IORING_ENTER_GETEVENTS).

I don't think that's a new issue, though. You're right that
io_uring_mshot_cmd_post_cqe() assumes a non-REQ_F_IOPOLL request, so
it's up to the ->uring_cmd() implementation to ensure that (which ublk
already does). Since ublk's struct file_operations doesn't provide
->uring_cmd_iopoll(), any ublk uring_cmds issued to an
IORING_SETUP_IOPOLL io_uring won't have REQ_F_IOPOLL set, so
io_uring_mshot_cmd_post_cqe() should work just fine.

Best,
Caleb
Re: [PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
Posted by Ming Lei 1 month, 1 week ago
On Fri, Feb 20, 2026 at 08:22:29AM -0800, Caleb Sander Mateos wrote:
> On Fri, Feb 20, 2026 at 8:11 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Fri, Feb 20, 2026 at 07:55:33AM -0800, Caleb Sander Mateos wrote:
> > > On Fri, Feb 20, 2026 at 6:25 AM Ming Lei <ming.lei@redhat.com> wrote:
> > > >
> > > > On Thu, Feb 19, 2026 at 10:22:23AM -0700, Caleb Sander Mateos wrote:
> > > > > Currently, creating an io_uring with IORING_SETUP_IOPOLL requires all
> > > > > requests issued to it to support iopoll. This prevents, for example,
> > > > > using ublk zero-copy together with IORING_SETUP_IOPOLL, as ublk
> > > > > zero-copy buffer registrations are performed using a uring_cmd. There's
> > > > > no technical reason why these non-iopoll uring_cmds can't be supported.
> > > > > They will either complete synchronously or via an external mechanism
> > > > > that calls io_uring_cmd_done(), so they don't need to be polled.
> > > >
> > > > For sync uring command, it is fine to support for IOPOLL.
> > > >
> > > > However, there are async uring command, which may be completed in irq
> > > > context, or in multishot way, at least the later isn't supported in
> > > > io_do_iopoll() yet.
> > >
> > > Can you describe the issues you envision in more detail?
> >
> > Basically IOPOLL doesn't support multishot request yet.
> >
> > For example, when io_uring_mshot_cmd_post_cqe() is called and new cqe is
> > queued, it can't be found from io_iopoll_check()<-io_uring_enter(IORING_ENTER_GETEVENTS).
> 
> I don't think that's a new issue, though. You're right that
> io_uring_mshot_cmd_post_cqe() assumes a non-REQ_F_IOPOLL request, so
> it's up to the ->uring_cmd() implementation to ensure that (which ublk
> already does). Since ublk's struct file_operations don't provide
> ->uring_cmd_iopoll(), any ublk uring_cmds issued to an
> IORING_SETUP_IOPOLL io_uring won't have REQ_F_IOPOLL set, so
> io_uring_mshot_cmd_post_cqe() should work just fine.

Please look at it this way:

1) Without the patch "io_uring/uring_cmd: allow non-iopoll cmds with
IORING_SETUP_IOPOLL", multishot command submission can't succeed on an
IORING_SETUP_IOPOLL ring.

2) With that patch, people may see io_uring_enter() hang forever if a
multishot command is submitted on an IORING_SETUP_IOPOLL ring.


Thanks,
Ming

Re: [PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
Posted by Caleb Sander Mateos 1 month, 1 week ago
On Fri, Feb 20, 2026 at 8:31 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> On Fri, Feb 20, 2026 at 08:22:29AM -0800, Caleb Sander Mateos wrote:
> > On Fri, Feb 20, 2026 at 8:11 AM Ming Lei <ming.lei@redhat.com> wrote:
> > >
> > > On Fri, Feb 20, 2026 at 07:55:33AM -0800, Caleb Sander Mateos wrote:
> > > > On Fri, Feb 20, 2026 at 6:25 AM Ming Lei <ming.lei@redhat.com> wrote:
> > > > >
> > > > > On Thu, Feb 19, 2026 at 10:22:23AM -0700, Caleb Sander Mateos wrote:
> > > > > > Currently, creating an io_uring with IORING_SETUP_IOPOLL requires all
> > > > > > requests issued to it to support iopoll. This prevents, for example,
> > > > > > using ublk zero-copy together with IORING_SETUP_IOPOLL, as ublk
> > > > > > zero-copy buffer registrations are performed using a uring_cmd. There's
> > > > > > no technical reason why these non-iopoll uring_cmds can't be supported.
> > > > > > They will either complete synchronously or via an external mechanism
> > > > > > that calls io_uring_cmd_done(), so they don't need to be polled.
> > > > >
> > > > > For sync uring command, it is fine to support for IOPOLL.
> > > > >
> > > > > However, there are async uring command, which may be completed in irq
> > > > > context, or in multishot way, at least the later isn't supported in
> > > > > io_do_iopoll() yet.
> > > >
> > > > Can you describe the issues you envision in more detail?
> > >
> > > Basically IOPOLL doesn't support multishot request yet.
> > >
> > > For example, when io_uring_mshot_cmd_post_cqe() is called and new cqe is
> > > queued, it can't be found from io_iopoll_check()<-io_uring_enter(IORING_ENTER_GETEVENTS).
> >
> > I don't think that's a new issue, though. You're right that
> > io_uring_mshot_cmd_post_cqe() assumes a non-REQ_F_IOPOLL request, so
> > it's up to the ->uring_cmd() implementation to ensure that (which ublk
> > already does). Since ublk's struct file_operations don't provide
> > ->uring_cmd_iopoll(), any ublk uring_cmds issued to an
> > IORING_SETUP_IOPOLL io_uring won't have REQ_F_IOPOLL set, so
> > io_uring_mshot_cmd_post_cqe() should work just fine.
>
> Please look in the following way:
>
> 1) without patch of `io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL`,
> multishot command submission can't succeed
>
> 2) with patch of "io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL", people
> may see hang forever in io_uring_enter() if multishot command is submitted
> in context IORING_SETUP_IOPOLL.

Okay, I see what you mean. If ctx->iopoll_list is nonempty and a
non-REQ_F_IOPOLL request posts a completion without going through task
work, io_iopoll_check() won't check for CQEs already posted outside of
iopoll. I think it should be simple enough to check for CQEs
unconditionally in the io_iopoll_check() loop.
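
A toy userspace model of the hang and of the proposed fix, not the kernel's io_iopoll_check(). Every `toy_*` name, the iteration budget, and the assumption that the polled request never completes are hypothetical simplifications. The sketch only shows that a loop counting polled completions alone never notices a CQE posted outside iopoll, while a loop that also checks the CQ on each pass returns promptly.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_ring {
	unsigned int cq_posted;   /* CQEs posted outside iopoll */
	unsigned int iopoll_list; /* polled requests still in flight */
	int post_at_iter;         /* pass on which an out-of-band CQE lands */
};

/* One polling pass. In this toy the polled request never completes. */
static unsigned int toy_do_iopoll(struct toy_ring *r)
{
	(void)r;
	return 0;
}

/*
 * Returns the pass on which the wait ends, or -1 if the budget runs
 * out, which stands in for "hangs forever".
 */
static int toy_iopoll_check(struct toy_ring *r, unsigned int min_events,
			    bool check_cq_always, int budget)
{
	unsigned int nr_events = 0;

	for (int i = 0; i < budget; i++) {
		/* e.g. a multishot uring_cmd posts a CQE directly */
		if (i == r->post_at_iter)
			r->cq_posted++;

		/* Proposed fix: look at the CQ on every pass. */
		if (check_cq_always && r->cq_posted > 0)
			return i;

		if (r->iopoll_list == 0)
			return i; /* nothing left to poll for */

		nr_events += toy_do_iopoll(r);
		if (nr_events >= min_events)
			return i;
	}
	return -1;
}
```

With one polled request in flight and an out-of-band CQE landing on pass 3, the model without the unconditional check exhausts its budget, while the model with the check returns on pass 3.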

Thanks,
Caleb
Re: [PATCH v3 0/4] io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL
Posted by Ming Lei 1 month, 1 week ago
On Fri, Feb 20, 2026 at 09:47:38AM -0800, Caleb Sander Mateos wrote:
> On Fri, Feb 20, 2026 at 8:31 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Fri, Feb 20, 2026 at 08:22:29AM -0800, Caleb Sander Mateos wrote:
> > > On Fri, Feb 20, 2026 at 8:11 AM Ming Lei <ming.lei@redhat.com> wrote:
> > > >
> > > > On Fri, Feb 20, 2026 at 07:55:33AM -0800, Caleb Sander Mateos wrote:
> > > > > On Fri, Feb 20, 2026 at 6:25 AM Ming Lei <ming.lei@redhat.com> wrote:
> > > > > >
> > > > > > On Thu, Feb 19, 2026 at 10:22:23AM -0700, Caleb Sander Mateos wrote:
> > > > > > > Currently, creating an io_uring with IORING_SETUP_IOPOLL requires all
> > > > > > > requests issued to it to support iopoll. This prevents, for example,
> > > > > > > using ublk zero-copy together with IORING_SETUP_IOPOLL, as ublk
> > > > > > > zero-copy buffer registrations are performed using a uring_cmd. There's
> > > > > > > no technical reason why these non-iopoll uring_cmds can't be supported.
> > > > > > > They will either complete synchronously or via an external mechanism
> > > > > > > that calls io_uring_cmd_done(), so they don't need to be polled.
> > > > > >
> > > > > > For sync uring command, it is fine to support for IOPOLL.
> > > > > >
> > > > > > However, there are async uring command, which may be completed in irq
> > > > > > context, or in multishot way, at least the later isn't supported in
> > > > > > io_do_iopoll() yet.
> > > > >
> > > > > Can you describe the issues you envision in more detail?
> > > >
> > > > Basically IOPOLL doesn't support multishot request yet.
> > > >
> > > > For example, when io_uring_mshot_cmd_post_cqe() is called and new cqe is
> > > > queued, it can't be found from io_iopoll_check()<-io_uring_enter(IORING_ENTER_GETEVENTS).
> > >
> > > I don't think that's a new issue, though. You're right that
> > > io_uring_mshot_cmd_post_cqe() assumes a non-REQ_F_IOPOLL request, so
> > > it's up to the ->uring_cmd() implementation to ensure that (which ublk
> > > already does). Since ublk's struct file_operations don't provide
> > > ->uring_cmd_iopoll(), any ublk uring_cmds issued to an
> > > IORING_SETUP_IOPOLL io_uring won't have REQ_F_IOPOLL set, so
> > > io_uring_mshot_cmd_post_cqe() should work just fine.
> >
> > Please look in the following way:
> >
> > 1) without patch of `io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL`,
> > multishot command submission can't succeed
> >
> > 2) with patch of "io_uring/uring_cmd: allow non-iopoll cmds with IORING_SETUP_IOPOLL", people
> > may see hang forever in io_uring_enter() if multishot command is submitted
> > in context IORING_SETUP_IOPOLL.
> 
> Okay, I see what you mean. If ctx->iopoll_list is nonempty and a
> non-REQ_F_IOPOLL request posts a completion without going through task
> work, io_iopoll_check() won't check for CQEs already posted outside of
> iopoll. I think it should be simple enough to check for CQEs
> unconditionally in the io_iopoll_check() loop.

Yeah, it shouldn't be hard to deal with.

Thanks,
Ming