Ring quiesce is currently used for registering/unregistering eventfds,
registering restrictions and enabling rings.
For opcodes relating to registering/unregistering eventfds, ring quiesce
can be avoided by creating a new RCU data structure (io_ev_fd) as part
of io_ring_ctx that holds the eventfd_ctx, with reads to the structure
protected by rcu_read_lock and writes (register/unregister calls)
protected by a mutex.
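
To make the reader/writer split above concrete, here is a minimal userspace sketch in C11. It is an analogy only, not the code from this series: the names (ev_fd_model, ctx_model, model_*) are hypothetical, a plain reader counter stands in for rcu_read_lock/rcu_read_unlock, and the writer spins until readers drain instead of deferring the free with call_rcu. Writers are assumed to be serialized by the caller, the way register/unregister calls are serialized in the kernel. The -EBUSY and -ENXIO returns reflect my understanding of the kernel's behaviour and should be treated as assumptions.

```c
#include <errno.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's io_ev_fd. */
struct ev_fd_model {
	int efd;	/* stands in for struct eventfd_ctx * */
};

/* Hypothetical stand-in for the relevant part of io_ring_ctx. */
struct ctx_model {
	_Atomic(struct ev_fd_model *) ev_fd;	/* NULL when nothing is registered */
	atomic_int readers;			/* readers currently in a read section */
};

/* Reader side: cheap, never blocks. Returns the registered fd or -1. */
int model_signal(struct ctx_model *ctx)
{
	int fd = -1;

	atomic_fetch_add(&ctx->readers, 1);	/* analogous to rcu_read_lock() */
	struct ev_fd_model *ev = atomic_load(&ctx->ev_fd);
	if (ev)
		fd = ev->efd;
	atomic_fetch_sub(&ctx->readers, 1);	/* analogous to rcu_read_unlock() */
	return fd;
}

/* Writer side: the caller must serialize register/unregister calls. */
int model_register(struct ctx_model *ctx, int efd)
{
	if (atomic_load(&ctx->ev_fd))
		return -EBUSY;			/* assumption: already registered */

	struct ev_fd_model *ev = malloc(sizeof(*ev));
	if (!ev)
		return -ENOMEM;
	ev->efd = efd;
	atomic_store(&ctx->ev_fd, ev);		/* publish to readers */
	return 0;
}

int model_unregister(struct ctx_model *ctx)
{
	/* Unpublish first so that new readers see NULL. */
	struct ev_fd_model *old =
		atomic_exchange(&ctx->ev_fd, (struct ev_fd_model *)NULL);
	if (!old)
		return -ENXIO;			/* assumption: nothing registered */

	/* Crude grace period: wait for readers that may still hold 'old'. */
	while (atomic_load(&ctx->readers) != 0)
		sched_yield();
	free(old);
	return 0;
}
```

The point of the sketch is that the signalling path only pays for two atomic operations and a pointer load, while the cost of waiting for readers falls entirely on the rare register/unregister path.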
With the above approach, ring quiesce, which is much more expensive than
taking the RCU read lock, can be avoided. On the system tested,
io_uring_register with IORING_REGISTER_EVENTFD takes less than 1ms with the
RCU lock, compared to 15ms before with ring quiesce.
IORING_SETUP_R_DISABLED prevents submitting requests and
so there will be no requests until IORING_REGISTER_ENABLE_RINGS
is called. And IORING_REGISTER_RESTRICTIONS works only before
IORING_REGISTER_ENABLE_RINGS is called. Hence ring quiesce is
not needed for these opcodes.
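
The ordering argument above can be modelled as a small state machine. This is a sketch under stated assumptions, not kernel code: ring_model and the model_* functions are hypothetical names, and the -EBADFD and -EBUSY returns mirror what I understand the kernel to return in these cases.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two pieces of ring state that matter here. */
struct ring_model {
	bool r_disabled;	/* models IORING_SETUP_R_DISABLED being set */
	bool restricted;	/* restrictions have already been registered */
};

/* Submission is refused while the ring is disabled. */
int model_submit(const struct ring_model *r)
{
	return r->r_disabled ? -EBADFD : 0;
}

/* Restrictions can only be registered before the ring is enabled. */
int model_register_restrictions(struct ring_model *r)
{
	if (!r->r_disabled)
		return -EBADFD;
	if (r->restricted)
		return -EBUSY;
	r->restricted = true;
	return 0;
}

/* Enabling is a one-way transition out of the disabled state. */
int model_enable_rings(struct ring_model *r)
{
	if (!r->r_disabled)
		return -EBADFD;
	r->r_disabled = false;
	return 0;
}
```

In this model both register operations succeed only in a state where model_submit fails, so there can be no requests in flight for a quiesce to wait on.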
---
v5->v6:
- Split removing ring quiesce completely from io_uring_register into
2 patches (Pavel Begunkov)
- Removed extra mutex while registering/unregistering eventfd as uring_lock
can be used (Pavel Begunkov)
- Move setting ctx->evfd to NULL from io_eventfd_put to before call_rcu
(Pavel Begunkov)
v4->v5:
- Remove ring quiesce completely from io_uring_register (Pavel Begunkov)
- Replaced rcu_barrier with unregistering flag (Jens Axboe)
- Created a faster check for ctx->io_ev_fd in io_eventfd_signal and cleaned up
io_eventfd_unregister (Jens Axboe)
v3->v4:
- Switch back to call_rcu and use rcu_barrier in case io_eventfd_register fails
to make sure all rcu callbacks have finished.
v2->v3:
- Switched to using synchronize_rcu from call_rcu in io_eventfd_unregister.
v1->v2:
- Added patch to remove eventfd from tracepoint (Patch 1) (Jens Axboe)
- Made the code of io_should_trigger_evfd as part of io_eventfd_signal (Jens Axboe)
Usama Arif (5):
  io_uring: remove trace for eventfd
  io_uring: avoid ring quiesce while registering/unregistering eventfd
  io_uring: avoid ring quiesce while registering async eventfd
  io_uring: avoid ring quiesce while registering restrictions and
    enabling rings
  io_uring: remove ring quiesce for io_uring_register

fs/io_uring.c | 179 +++++++++++++-------------------
include/trace/events/io_uring.h | 13 +--
2 files changed, 75 insertions(+), 117 deletions(-)
--
2.25.1
On 2/4/22 7:51 AM, Usama Arif wrote:
> Ring quiesce is currently used for registering/unregistering eventfds,
> registering restrictions and enabling rings.
>
> For opcodes relating to registering/unregistering eventfds, ring quiesce
> can be avoided by creating a new RCU data structure (io_ev_fd) as part
> of io_ring_ctx that holds the eventfd_ctx, with reads to the structure
> protected by rcu_read_lock and writes (register/unregister calls)
> protected by a mutex.
>
> With the above approach ring quiesce can be avoided which is much more
> expensive than using RCU lock. On the system tested, io_uring_register with
> IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms
> before with ring quiesce.
>
> IORING_SETUP_R_DISABLED prevents submitting requests and
> so there will be no requests until IORING_REGISTER_ENABLE_RINGS
> is called. And IORING_REGISTER_RESTRICTIONS works only before
> IORING_REGISTER_ENABLE_RINGS is called. Hence ring quiesce is
> not needed for these opcodes.

I wrote a simple test case just verifying register+unregister, and also
doing a loop to catch any issues around that. Here's the current kernel:

[root@archlinux liburing]# time test/eventfd-reg

real	0m7.980s
user	0m0.004s
sys	0m0.000s
[root@archlinux liburing]# time test/eventfd-reg

real	0m8.197s
user	0m0.004s
sys	0m0.000s

which is around 80ms for each register/unregister cycle, and here are
the results with this patchset:

[root@archlinux liburing]# time test/eventfd-reg

real	0m0.002s
user	0m0.001s
sys	0m0.000s
[root@archlinux liburing]# time test/eventfd-reg

real	0m0.001s
user	0m0.001s
sys	0m0.000s

which looks a lot more reasonable.

I'll look over this one and see if I've got anything to complain about,
just ran it first since I wrote the test anyway.

Here's the test case, btw:

https://git.kernel.dk/cgit/liburing/commit/?id=5bde26e4587168a439cabdbe73740454249e5204

--
Jens Axboe

On Fri, 4 Feb 2022 14:51:12 +0000, Usama Arif wrote:
> Ring quiesce is currently used for registering/unregistering eventfds,
> registering restrictions and enabling rings.
>
> For opcodes relating to registering/unregistering eventfds, ring quiesce
> can be avoided by creating a new RCU data structure (io_ev_fd) as part
> of io_ring_ctx that holds the eventfd_ctx, with reads to the structure
> protected by rcu_read_lock and writes (register/unregister calls)
> protected by a mutex.
>
> [...]
Applied, thanks!
[1/5] io_uring: remove trace for eventfd
commit: 054f8098d98be4c53ef317e9dd745bb5759f61d9
[2/5] io_uring: avoid ring quiesce while registering/unregistering eventfd
commit: b77e315a96445e5f19a83546c73d2abbcedfa5db
[3/5] io_uring: avoid ring quiesce while registering async eventfd
commit: 13bcfd43fd0ef5e0de306e6ffb566970499b6888
[4/5] io_uring: avoid ring quiesce while registering restrictions and enabling rings
commit: 1769f1468f4697409ee44f494940b5381acc1bae
[5/5] io_uring: remove ring quiesce for io_uring_register
commit: 971d72eb476604fc91a8e82f0421e6f599f9c300
Best regards,
--
Jens Axboe