fs/io_uring.c | 202 ++++++++++++++++---------------- include/trace/events/io_uring.h | 13 +- 2 files changed, 107 insertions(+), 108 deletions(-)
For opcodes relating to registering/unregistering eventfds, this is done by creating a new RCU data structure (io_ev_fd) as part of io_ring_ctx that holds the eventfd_ctx, with reads to the structure protected by rcu_read_lock and writes (register/unregister calls) protected by a mutex. With the above approach ring quiesce can be avoided which is much more expensive then using RCU lock. On the system tested, io_uring_reigster with IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms before with ring quiesce. The second patch creates the RCU protected data structure and removes ring quiesce for IORING_REGISTER_EVENTFD and IORING_UNREGISTER_EVENTFD. The third patch builds on top of the second patch and removes ring quiesce for IORING_REGISTER_EVENTFD_ASYNC. The fourth patch completely removes ring quiesce from io_uring_register, as IORING_REGISTER_ENABLE_RINGS and IORING_REGISTER_RESTRICTIONS dont need them. --- v4->v5: - Remove ring quiesce completely from io_uring_register (Pavel Begunkov) - Replaced rcu_barrier with unregistering flag (Jens Axboe) - Created a faster check for ctx->io_ev_fd in io_eventfd_signal and cleaned up io_eventfd_unregister (Jens Axboe) v3->v4: - Switch back to call_rcu and use rcu_barrier incase io_eventfd_register fails to make sure all rcu callbacks have finished. v2->v3: - Switched to using synchronize_rcu from call_rcu in io_eventfd_unregister. v1->v2: - Added patch to remove eventfd from tracepoint (Patch 1) (Jens Axboe) - Made the code of io_should_trigger_evfd as part of io_eventfd_signal (Jens Axboe) Usama Arif (4): io_uring: remove trace for eventfd io_uring: avoid ring quiesce while registering/unregistering eventfd io_uring: avoid ring quiesce for IORING_REGISTER_EVENTFD_ASYNC io_uring: remove ring quiesce for io_uring_register fs/io_uring.c | 202 ++++++++++++++++---------------- include/trace/events/io_uring.h | 13 +- 2 files changed, 107 insertions(+), 108 deletions(-) -- 2.25.1
On 2/3/22 23:34, Usama Arif wrote: > For opcodes relating to registering/unregistering eventfds, this is done by > creating a new RCU data structure (io_ev_fd) as part of io_ring_ctx that > holds the eventfd_ctx, with reads to the structure protected by > rcu_read_lock and writes (register/unregister calls) protected by a mutex. > > With the above approach ring quiesce can be avoided which is much more > expensive then using RCU lock. On the system tested, io_uring_reigster with > IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms > before with ring quiesce. > > The second patch creates the RCU protected data structure and removes ring > quiesce for IORING_REGISTER_EVENTFD and IORING_UNREGISTER_EVENTFD. > > The third patch builds on top of the second patch and removes ring quiesce > for IORING_REGISTER_EVENTFD_ASYNC. > > The fourth patch completely removes ring quiesce from io_uring_register, > as IORING_REGISTER_ENABLE_RINGS and IORING_REGISTER_RESTRICTIONS dont need > them. Let me leave it just for history: I strongly dislike it considering there is no one who uses or going to use it. Even more, I can't find a single user of io_uring_unregister_eventfd() in liburing tests, so most probably the paths are not tested at all. > --- > v4->v5: > - Remove ring quiesce completely from io_uring_register (Pavel Begunkov) > - Replaced rcu_barrier with unregistering flag (Jens Axboe) > - Created a faster check for ctx->io_ev_fd in io_eventfd_signal and cleaned up > io_eventfd_unregister (Jens Axboe) > > v3->v4: > - Switch back to call_rcu and use rcu_barrier incase io_eventfd_register fails > to make sure all rcu callbacks have finished. > > v2->v3: > - Switched to using synchronize_rcu from call_rcu in io_eventfd_unregister. > > v1->v2: > - Added patch to remove eventfd from tracepoint (Patch 1) (Jens Axboe) > - Made the code of io_should_trigger_evfd as part of io_eventfd_signal (Jens Axboe) > > Usama Arif (4): > io_uring: remove trace for eventfd > io_uring: avoid ring quiesce while registering/unregistering eventfd > io_uring: avoid ring quiesce for IORING_REGISTER_EVENTFD_ASYNC > io_uring: remove ring quiesce for io_uring_register > > fs/io_uring.c | 202 ++++++++++++++++---------------- > include/trace/events/io_uring.h | 13 +- > 2 files changed, 107 insertions(+), 108 deletions(-) > -- Pavel Begunkov
On 2/3/22 5:02 PM, Pavel Begunkov wrote: > On 2/3/22 23:34, Usama Arif wrote: >> For opcodes relating to registering/unregistering eventfds, this is done by >> creating a new RCU data structure (io_ev_fd) as part of io_ring_ctx that >> holds the eventfd_ctx, with reads to the structure protected by >> rcu_read_lock and writes (register/unregister calls) protected by a mutex. >> >> With the above approach ring quiesce can be avoided which is much more >> expensive then using RCU lock. On the system tested, io_uring_reigster with >> IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms >> before with ring quiesce. >> >> The second patch creates the RCU protected data structure and removes ring >> quiesce for IORING_REGISTER_EVENTFD and IORING_UNREGISTER_EVENTFD. >> >> The third patch builds on top of the second patch and removes ring quiesce >> for IORING_REGISTER_EVENTFD_ASYNC. >> >> The fourth patch completely removes ring quiesce from io_uring_register, >> as IORING_REGISTER_ENABLE_RINGS and IORING_REGISTER_RESTRICTIONS dont need >> them. > > Let me leave it just for history: I strongly dislike it considering > there is no one who uses or going to use it. Are you referring to the 4th patch? Or the patchset as a whole? Not clear to me, because eventfd registration is most certainly used by folks today. > Even more, I can't find a single user of io_uring_unregister_eventfd() > in liburing tests, so most probably the paths are not tested at all. That's definitely a general issue, not related to this patchset. Something that most certainly should get added! Ring exit will use the same unregister path for eventfd, however, so it does get exercised from there with existing tests too. But for this change, we definitely need a test that exercises both register and unregister, trying to trigger something funky there. -- Jens Axboe
On 2/4/22 00:15, Jens Axboe wrote: > On 2/3/22 5:02 PM, Pavel Begunkov wrote: >> On 2/3/22 23:34, Usama Arif wrote: >>> For opcodes relating to registering/unregistering eventfds, this is done by >>> creating a new RCU data structure (io_ev_fd) as part of io_ring_ctx that >>> holds the eventfd_ctx, with reads to the structure protected by >>> rcu_read_lock and writes (register/unregister calls) protected by a mutex. >>> >>> With the above approach ring quiesce can be avoided which is much more >>> expensive then using RCU lock. On the system tested, io_uring_reigster with >>> IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms >>> before with ring quiesce. >>> >>> The second patch creates the RCU protected data structure and removes ring >>> quiesce for IORING_REGISTER_EVENTFD and IORING_UNREGISTER_EVENTFD. >>> >>> The third patch builds on top of the second patch and removes ring quiesce >>> for IORING_REGISTER_EVENTFD_ASYNC. >>> >>> The fourth patch completely removes ring quiesce from io_uring_register, >>> as IORING_REGISTER_ENABLE_RINGS and IORING_REGISTER_RESTRICTIONS dont need >>> them. >> >> Let me leave it just for history: I strongly dislike it considering >> there is no one who uses or going to use it. > > Are you referring to the 4th patch? Or the patchset as a whole? Not clear > to me, because eventfd registration is most certainly used by folks > today. I refer to optimising eventfd unregister with no users of it, which lead to the RCU approach. 1/4 is good, taking ENABLE_RINGS and RESTRICTIONS out of quiesce is also great. 4/4 per se is not a problem, even if I'd need to revert it later. >> Even more, I can't find a single user of io_uring_unregister_eventfd() >> in liburing tests, so most probably the paths are not tested at all. > > That's definitely a general issue, not related to this patchset. > Something that most certainly should get added! Ring exit will use the > same unregister path for eventfd, however, so it does get exercised from > there with existing tests too. io_ring_ctx_free() -> io_eventfd_unregister() It's called after full quiesce in io_ring_exit_work() + even more extra sync, so not completely > > But for this change, we definitely need a test that exercises both > register and unregister, trying to trigger something funky there. > -- Pavel Begunkov
© 2016 - 2026 Red Hat, Inc.