[PATCH v6 0/5] io_uring: remove ring quiesce in io_uring_register

Usama Arif posted 5 patches 4 years, 4 months ago
fs/io_uring.c                   | 179 +++++++++++++-------------------
include/trace/events/io_uring.h |  13 +--
2 files changed, 75 insertions(+), 117 deletions(-)
Posted by Usama Arif 4 years, 4 months ago
Ring quiesce is currently used for registering/unregistering eventfds,
registering restrictions and enabling rings.

For opcodes relating to registering/unregistering eventfds, ring quiesce
can be avoided by creating a new RCU data structure (io_ev_fd) as part
of io_ring_ctx that holds the eventfd_ctx, with reads to the structure
protected by rcu_read_lock and writes (register/unregister calls)
protected by a mutex.
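
To illustrate the reader/writer split described above, here is a minimal
userspace sketch, not the kernel code from the patches. It assumes C11
atomics as a stand-in for rcu_dereference/rcu_assign_pointer and a pthread
mutex for the writer side. The names ev_fd, ring_ctx and the three functions
are hypothetical. Deferred freeing (what call_rcu provides in the kernel) is
not modelled, so the sketch is only safe without concurrent readers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the io_ev_fd structure. */
struct ev_fd {
	int fd;				/* stands in for the eventfd_ctx */
};

struct ring_ctx {
	_Atomic(struct ev_fd *) ev_fd;	/* NULL while nothing is registered */
	pthread_mutex_t lock;		/* serialises register/unregister */
};

void ring_ctx_init(struct ring_ctx *ctx)
{
	assert(ctx != NULL);
	atomic_init(&ctx->ev_fd, NULL);
	pthread_mutex_init(&ctx->lock, NULL);
}

/* Writer: publish a new ev_fd. Returns -1 if one is already registered. */
int ev_fd_register(struct ring_ctx *ctx, int fd)
{
	int ret = -1;

	pthread_mutex_lock(&ctx->lock);
	if (atomic_load_explicit(&ctx->ev_fd, memory_order_relaxed) == NULL) {
		struct ev_fd *e = malloc(sizeof(*e));

		if (e) {
			e->fd = fd;
			/* Release store: readers see a fully initialised object. */
			atomic_store_explicit(&ctx->ev_fd, e, memory_order_release);
			ret = 0;
		}
	}
	pthread_mutex_unlock(&ctx->lock);
	return ret;
}

/* Writer: unpublish the ev_fd. Returns -1 if nothing was registered. */
int ev_fd_unregister(struct ring_ctx *ctx)
{
	struct ev_fd *e;

	pthread_mutex_lock(&ctx->lock);
	e = atomic_exchange_explicit(&ctx->ev_fd, NULL, memory_order_acq_rel);
	pthread_mutex_unlock(&ctx->lock);
	if (!e)
		return -1;
	/* The kernel would defer this until readers are done; the sketch
	 * assumes no concurrent readers and frees immediately. */
	free(e);
	return 0;
}

/* Reader: no lock taken. Returns the fd to signal, or -1 if none. */
int ev_fd_signal_target(struct ring_ctx *ctx)
{
	struct ev_fd *e = atomic_load_explicit(&ctx->ev_fd, memory_order_acquire);

	return e ? e->fd : -1;
}
```

The point of the split is that the signalling path never takes the lock,
so register/unregister no longer has to wait for the ring to drain.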

With the above approach ring quiesce, which is much more expensive than
using RCU lock, can be avoided. On the system tested, io_uring_register with
IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms
before with ring quiesce.

IORING_SETUP_R_DISABLED prevents submitting requests and
so there will be no requests until IORING_REGISTER_ENABLE_RINGS
is called. And IORING_REGISTER_RESTRICTIONS works only before
IORING_REGISTER_ENABLE_RINGS is called. Hence ring quiesce is
not needed for these opcodes.

---
v5->v6:
- Split removing ring quiesce completely from io_uring_register into
2 patches (Pavel Begunkov)
- Removed extra mutex while registering/unregistering eventfd as uring_lock
can be used (Pavel Begunkov)
- Move setting ctx->evfd to NULL from io_eventfd_put to before call_rcu
(Pavel Begunkov)

v4->v5:
- Remove ring quiesce completely from io_uring_register (Pavel Begunkov)
- Replaced rcu_barrier with unregistering flag (Jens Axboe)
- Created a faster check for ctx->io_ev_fd in io_eventfd_signal and cleaned up
io_eventfd_unregister (Jens Axboe)

v3->v4:
- Switch back to call_rcu and use rcu_barrier in case io_eventfd_register
fails, to make sure all rcu callbacks have finished.

v2->v3:
- Switched to using synchronize_rcu from call_rcu in io_eventfd_unregister.

v1->v2:
- Added patch to remove eventfd from tracepoint (Patch 1) (Jens Axboe)
- Made the code of io_should_trigger_evfd as part of io_eventfd_signal (Jens Axboe)

Usama Arif (5):
  io_uring: remove trace for eventfd
  io_uring: avoid ring quiesce while registering/unregistering eventfd
  io_uring: avoid ring quiesce while registering async eventfd
  io_uring: avoid ring quiesce while registering restrictions and
    enabling rings
  io_uring: remove ring quiesce for io_uring_register

 fs/io_uring.c                   | 179 +++++++++++++-------------------
 include/trace/events/io_uring.h |  13 +--
 2 files changed, 75 insertions(+), 117 deletions(-)

-- 
2.25.1

Re: [PATCH v6 0/5] io_uring: remove ring quiesce in io_uring_register
Posted by Jens Axboe 4 years, 4 months ago
On 2/4/22 7:51 AM, Usama Arif wrote:
> Ring quiesce is currently used for registering/unregistering eventfds,
> registering restrictions and enabling rings.
> 
> For opcodes relating to registering/unregistering eventfds, ring quiesce
> can be avoided by creating a new RCU data structure (io_ev_fd) as part
> of io_ring_ctx that holds the eventfd_ctx, with reads to the structure
> protected by rcu_read_lock and writes (register/unregister calls)
> protected by a mutex.
> 
> With the above approach ring quiesce, which is much more expensive than
> using RCU lock, can be avoided. On the system tested, io_uring_register with
> IORING_REGISTER_EVENTFD takes less than 1ms with RCU lock, compared to 15ms
> before with ring quiesce.
> 
> IORING_SETUP_R_DISABLED prevents submitting requests and
> so there will be no requests until IORING_REGISTER_ENABLE_RINGS
> is called. And IORING_REGISTER_RESTRICTIONS works only before
> IORING_REGISTER_ENABLE_RINGS is called. Hence ring quiesce is
> not needed for these opcodes.

I wrote a simple test case just verifying register+unregister, and also
doing a loop to catch any issues around that. Here's the current kernel:

[root@archlinux liburing]# time test/eventfd-reg 

real	0m7.980s
user	0m0.004s
sys	0m0.000s
[root@archlinux liburing]# time test/eventfd-reg 

real	0m8.197s
user	0m0.004s
sys	0m0.000s

which is around 80ms for each register/unregister cycle, and here are
the results with this patchset:

[root@archlinux liburing]# time test/eventfd-reg

real	0m0.002s
user	0m0.001s
sys	0m0.000s
[root@archlinux liburing]# time test/eventfd-reg

real	0m0.001s
user	0m0.001s
sys	0m0.000s

which looks a lot more reasonable.

I'll look over this one and see if I've got anything to complain about,
just ran it first since I wrote the test anyway. Here's the test case,
btw:

https://git.kernel.dk/cgit/liburing/commit/?id=5bde26e4587168a439cabdbe73740454249e5204
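
For readers without liburing at hand, a register/unregister loop of the kind
described above might look roughly like the sketch below. This is not the
linked test. It uses raw syscalls, assumes Linux headers that provide
<linux/io_uring.h> and __NR_io_uring_setup, and the function name
eventfd_reg_cycles and the ring size of 8 entries are made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <linux/io_uring.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Register and unregister an eventfd on a fresh ring `loops` times.
 * Returns 0 on success or a negative errno value on the first failure
 * (for example -ENOSYS or -EPERM where io_uring is unavailable).
 */
int eventfd_reg_cycles(int loops)
{
	struct io_uring_params p;
	int ring_fd, evfd, ret = 0;

	assert(loops >= 0);

	memset(&p, 0, sizeof(p));
	ring_fd = (int)syscall(__NR_io_uring_setup, 8, &p);
	if (ring_fd < 0)
		return -errno;

	evfd = eventfd(0, EFD_CLOEXEC);
	if (evfd < 0) {
		ret = -errno;
		close(ring_fd);
		return ret;
	}

	for (int i = 0; i < loops; i++) {
		/* Register takes a pointer to the eventfd and nr_args == 1. */
		if (syscall(__NR_io_uring_register, ring_fd,
			    IORING_REGISTER_EVENTFD, &evfd, 1) < 0) {
			ret = -errno;
			break;
		}
		/* Unregister takes no argument and nr_args == 0. */
		if (syscall(__NR_io_uring_register, ring_fd,
			    IORING_UNREGISTER_EVENTFD, NULL, 0) < 0) {
			ret = -errno;
			break;
		}
	}

	close(evfd);
	close(ring_fd);
	return ret;
}
```

Calling this from a small main() and running the binary under time(1) gives
the same kind of before/after comparison as the numbers quoted above.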

-- 
Jens Axboe

Re: [PATCH v6 0/5] io_uring: remove ring quiesce in io_uring_register
Posted by Jens Axboe 4 years, 4 months ago
On Fri, 4 Feb 2022 14:51:12 +0000, Usama Arif wrote:
> Ring quiesce is currently used for registering/unregistering eventfds,
> registering restrictions and enabling rings.
> 
> For opcodes relating to registering/unregistering eventfds, ring quiesce
> can be avoided by creating a new RCU data structure (io_ev_fd) as part
> of io_ring_ctx that holds the eventfd_ctx, with reads to the structure
> protected by rcu_read_lock and writes (register/unregister calls)
> protected by a mutex.
> 
> [...]

Applied, thanks!

[1/5] io_uring: remove trace for eventfd
      commit: 054f8098d98be4c53ef317e9dd745bb5759f61d9
[2/5] io_uring: avoid ring quiesce while registering/unregistering eventfd
      commit: b77e315a96445e5f19a83546c73d2abbcedfa5db
[3/5] io_uring: avoid ring quiesce while registering async eventfd
      commit: 13bcfd43fd0ef5e0de306e6ffb566970499b6888
[4/5] io_uring: avoid ring quiesce while registering restrictions and enabling rings
      commit: 1769f1468f4697409ee44f494940b5381acc1bae
[5/5] io_uring: remove ring quiesce for io_uring_register
      commit: 971d72eb476604fc91a8e82f0421e6f599f9c300

Best regards,
-- 
Jens Axboe