[PATCH v5 0/4] io_uring: remove ring quiesce in io_uring_register

Usama Arif posted 4 patches 4 years, 4 months ago
There is a newer version of this series
fs/io_uring.c                   | 202 ++++++++++++++++----------------
include/trace/events/io_uring.h |  13 +-
2 files changed, 107 insertions(+), 108 deletions(-)
[PATCH v5 0/4] io_uring: remove ring quiesce in io_uring_register
Posted by Usama Arif 4 years, 4 months ago
For opcodes relating to registering/unregistering eventfds, this is done by
creating a new RCU data structure (io_ev_fd) as part of io_ring_ctx that
holds the eventfd_ctx, with reads to the structure protected by
rcu_read_lock and writes (register/unregister calls) protected by a mutex.

With the above approach ring quiesce can be avoided which is much more
expensive then using RCU lock. On the system tested, io_uring_reigster with
IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms
before with ring quiesce.

The second patch creates the RCU protected data structure and removes ring
quiesce for IORING_REGISTER_EVENTFD and IORING_UNREGISTER_EVENTFD.

The third patch builds on top of the second patch and removes ring quiesce
for IORING_REGISTER_EVENTFD_ASYNC.

The fourth patch completely removes ring quiesce from io_uring_register,
as IORING_REGISTER_ENABLE_RINGS and IORING_REGISTER_RESTRICTIONS dont need
them.

---
v4->v5:
- Remove ring quiesce completely from io_uring_register (Pavel Begunkov)
- Replaced rcu_barrier with unregistering flag (Jens Axboe)
- Created a faster check for ctx->io_ev_fd in io_eventfd_signal and cleaned up
io_eventfd_unregister (Jens Axboe)

v3->v4:
- Switch back to call_rcu and use rcu_barrier incase io_eventfd_register fails
to make sure all rcu callbacks have finished.

v2->v3:
- Switched to using synchronize_rcu from call_rcu in io_eventfd_unregister.

v1->v2:
- Added patch to remove eventfd from tracepoint (Patch 1) (Jens Axboe)
- Made the code of io_should_trigger_evfd as part of io_eventfd_signal (Jens Axboe)

Usama Arif (4):
  io_uring: remove trace for eventfd
  io_uring: avoid ring quiesce while registering/unregistering eventfd
  io_uring: avoid ring quiesce for IORING_REGISTER_EVENTFD_ASYNC
  io_uring: remove ring quiesce for io_uring_register

 fs/io_uring.c                   | 202 ++++++++++++++++----------------
 include/trace/events/io_uring.h |  13 +-
 2 files changed, 107 insertions(+), 108 deletions(-)

-- 
2.25.1

Re: [PATCH v5 0/4] io_uring: remove ring quiesce in io_uring_register
Posted by Pavel Begunkov 4 years, 4 months ago
On 2/3/22 23:34, Usama Arif wrote:
> For opcodes relating to registering/unregistering eventfds, this is done by
> creating a new RCU data structure (io_ev_fd) as part of io_ring_ctx that
> holds the eventfd_ctx, with reads to the structure protected by
> rcu_read_lock and writes (register/unregister calls) protected by a mutex.
> 
> With the above approach ring quiesce can be avoided which is much more
> expensive then using RCU lock. On the system tested, io_uring_reigster with
> IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms
> before with ring quiesce.
> 
> The second patch creates the RCU protected data structure and removes ring
> quiesce for IORING_REGISTER_EVENTFD and IORING_UNREGISTER_EVENTFD.
> 
> The third patch builds on top of the second patch and removes ring quiesce
> for IORING_REGISTER_EVENTFD_ASYNC.
> 
> The fourth patch completely removes ring quiesce from io_uring_register,
> as IORING_REGISTER_ENABLE_RINGS and IORING_REGISTER_RESTRICTIONS dont need
> them.

Let me leave it just for history: I strongly dislike it considering
there is no one who uses or going to use it. Even more, I can't find a
single user of io_uring_unregister_eventfd() in liburing tests, so most
probably the paths are not tested at all.



> ---
> v4->v5:
> - Remove ring quiesce completely from io_uring_register (Pavel Begunkov)
> - Replaced rcu_barrier with unregistering flag (Jens Axboe)
> - Created a faster check for ctx->io_ev_fd in io_eventfd_signal and cleaned up
> io_eventfd_unregister (Jens Axboe)
> 
> v3->v4:
> - Switch back to call_rcu and use rcu_barrier incase io_eventfd_register fails
> to make sure all rcu callbacks have finished.
> 
> v2->v3:
> - Switched to using synchronize_rcu from call_rcu in io_eventfd_unregister.
> 
> v1->v2:
> - Added patch to remove eventfd from tracepoint (Patch 1) (Jens Axboe)
> - Made the code of io_should_trigger_evfd as part of io_eventfd_signal (Jens Axboe)
> 
> Usama Arif (4):
>    io_uring: remove trace for eventfd
>    io_uring: avoid ring quiesce while registering/unregistering eventfd
>    io_uring: avoid ring quiesce for IORING_REGISTER_EVENTFD_ASYNC
>    io_uring: remove ring quiesce for io_uring_register
> 
>   fs/io_uring.c                   | 202 ++++++++++++++++----------------
>   include/trace/events/io_uring.h |  13 +-
>   2 files changed, 107 insertions(+), 108 deletions(-)
> 

-- 
Pavel Begunkov
Re: [PATCH v5 0/4] io_uring: remove ring quiesce in io_uring_register
Posted by Jens Axboe 4 years, 4 months ago
On 2/3/22 5:02 PM, Pavel Begunkov wrote:
> On 2/3/22 23:34, Usama Arif wrote:
>> For opcodes relating to registering/unregistering eventfds, this is done by
>> creating a new RCU data structure (io_ev_fd) as part of io_ring_ctx that
>> holds the eventfd_ctx, with reads to the structure protected by
>> rcu_read_lock and writes (register/unregister calls) protected by a mutex.
>>
>> With the above approach ring quiesce can be avoided which is much more
>> expensive then using RCU lock. On the system tested, io_uring_reigster with
>> IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms
>> before with ring quiesce.
>>
>> The second patch creates the RCU protected data structure and removes ring
>> quiesce for IORING_REGISTER_EVENTFD and IORING_UNREGISTER_EVENTFD.
>>
>> The third patch builds on top of the second patch and removes ring quiesce
>> for IORING_REGISTER_EVENTFD_ASYNC.
>>
>> The fourth patch completely removes ring quiesce from io_uring_register,
>> as IORING_REGISTER_ENABLE_RINGS and IORING_REGISTER_RESTRICTIONS dont need
>> them.
> 
> Let me leave it just for history: I strongly dislike it considering
> there is no one who uses or going to use it.

Are you referring to the 4th patch? Or the patchset as a whole? Not clear
to me, because eventfd registration is most certainly used by folks
today.

> Even more, I can't find a single user of io_uring_unregister_eventfd()
> in liburing tests, so most probably the paths are not tested at all.

That's definitely a general issue, not related to this patchset.
Something that most certainly should get added! Ring exit will use the
same unregister path for eventfd, however, so it does get exercised from
there with existing tests too.

But for this change, we definitely need a test that exercises both
register and unregister, trying to trigger something funky there.

-- 
Jens Axboe

Re: [PATCH v5 0/4] io_uring: remove ring quiesce in io_uring_register
Posted by Pavel Begunkov 4 years, 4 months ago
On 2/4/22 00:15, Jens Axboe wrote:
> On 2/3/22 5:02 PM, Pavel Begunkov wrote:
>> On 2/3/22 23:34, Usama Arif wrote:
>>> For opcodes relating to registering/unregistering eventfds, this is done by
>>> creating a new RCU data structure (io_ev_fd) as part of io_ring_ctx that
>>> holds the eventfd_ctx, with reads to the structure protected by
>>> rcu_read_lock and writes (register/unregister calls) protected by a mutex.
>>>
>>> With the above approach ring quiesce can be avoided which is much more
>>> expensive then using RCU lock. On the system tested, io_uring_reigster with
>>> IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms
>>> before with ring quiesce.
>>>
>>> The second patch creates the RCU protected data structure and removes ring
>>> quiesce for IORING_REGISTER_EVENTFD and IORING_UNREGISTER_EVENTFD.
>>>
>>> The third patch builds on top of the second patch and removes ring quiesce
>>> for IORING_REGISTER_EVENTFD_ASYNC.
>>>
>>> The fourth patch completely removes ring quiesce from io_uring_register,
>>> as IORING_REGISTER_ENABLE_RINGS and IORING_REGISTER_RESTRICTIONS dont need
>>> them.
>>
>> Let me leave it just for history: I strongly dislike it considering
>> there is no one who uses or going to use it.
> 
> Are you referring to the 4th patch? Or the patchset as a whole? Not clear
> to me, because eventfd registration is most certainly used by folks
> today.

I refer to optimising eventfd unregister with no users of it, which
lead to the RCU approach.

1/4 is good, taking ENABLE_RINGS and RESTRICTIONS out of quiesce is
also great. 4/4 per se is not a problem, even if I'd need to revert
it later.

>> Even more, I can't find a single user of io_uring_unregister_eventfd()
>> in liburing tests, so most probably the paths are not tested at all.
> 
> That's definitely a general issue, not related to this patchset.
> Something that most certainly should get added! Ring exit will use the
> same unregister path for eventfd, however, so it does get exercised from
> there with existing tests too.

io_ring_ctx_free()
  -> io_eventfd_unregister()

It's called after full quiesce in io_ring_exit_work() + even more
extra sync, so not completely

> 
> But for this change, we definitely need a test that exercises both
> register and unregister, trying to trigger something funky there.
> 

-- 
Pavel Begunkov