Do not merge this series: the measured performance effects are not
significant. I am sharing it mainly to archive the patches and in case
someone has ideas on how to improve the results.

Bernd Schubert mentioned io_uring_setup(2) flags that may improve performance:
- IORING_SETUP_SINGLE_ISSUER: optimization for when only one thread submits
  requests to an io_uring context
- IORING_SETUP_COOP_TASKRUN: avoids IPIs to the task when completions arrive
- IORING_SETUP_TASKRUN_FLAG: makes COOP_TASKRUN work with userspace CQ ring
  polling
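
For readers unfamiliar with these flags, here is a minimal sketch of how a
caller might request them and fall back when the kernel rejects them.
io_uring_setup(2) fails with EINVAL when given flags it does not know. This is
not the code from this series: ring_setup_fn, ring_setup_with_fallback() and
fake_setup() are hypothetical names, the candidate order is an assumption, and
the real setup call (for example liburing's io_uring_queue_init()) is hidden
behind a callback so the logic can be read on its own. The flag values mirror
<linux/io_uring.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flag values as found in <linux/io_uring.h>; guarded for older headers. */
#ifndef IORING_SETUP_COOP_TASKRUN
#define IORING_SETUP_COOP_TASKRUN   (1U << 8)
#endif
#ifndef IORING_SETUP_TASKRUN_FLAG
#define IORING_SETUP_TASKRUN_FLAG   (1U << 9)
#endif
#ifndef IORING_SETUP_SINGLE_ISSUER
#define IORING_SETUP_SINGLE_ISSUER  (1U << 12)
#endif

/* Hypothetical callback: returns 0 on success or a negative errno. */
typedef int (*ring_setup_fn)(unsigned entries, unsigned flags, void *opaque);

/*
 * Try the most optimized flag combination first and fall back to plainer
 * ones while the setup call reports -EINVAL. On success, store the flags
 * that were accepted in *used (if non-NULL) and return 0. Any error other
 * than -EINVAL is returned immediately.
 */
static int ring_setup_with_fallback(ring_setup_fn try_setup, unsigned entries,
                                    void *opaque, unsigned *used)
{
    /* Assumed order: all three flags, then without SINGLE_ISSUER, then none. */
    static const unsigned candidates[] = {
        IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_COOP_TASKRUN |
            IORING_SETUP_TASKRUN_FLAG,
        IORING_SETUP_COOP_TASKRUN | IORING_SETUP_TASKRUN_FLAG,
        0,
    };
    int ret = -EINVAL;

    assert(try_setup != NULL);

    for (size_t i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        ret = try_setup(entries, candidates[i], opaque);
        if (ret != -EINVAL) {
            if (ret == 0 && used) {
                *used = candidates[i];
            }
            return ret;
        }
    }
    return ret;
}

/*
 * Stand-in for a real setup call, for illustration only: accepts exactly
 * the flags contained in the mask that opaque points to.
 */
static int fake_setup(unsigned entries, unsigned flags, void *opaque)
{
    unsigned supported = *(const unsigned *)opaque;

    (void)entries;
    return (flags & ~supported) ? -EINVAL : 0;
}
```

With fake_setup() and a mask that lacks SINGLE_ISSUER, the helper settles on
COOP_TASKRUN | TASKRUN_FLAG; with an empty mask it ends up with no flags. Note
that TASKRUN_FLAG is never requested without COOP_TASKRUN, since it is only
meaningful alongside it.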

Suraj Shirvankar previously started work on SINGLE_ISSUER:
https://lore.kernel.org/qemu-devel/174293621917.22751.11381319865102029969-0@git.sr.ht/

This series differs from Suraj's work in that it works around the need for
the main loop AioContext to be shared by multiple threads (vCPU threads and
the migration thread).

Here are the performance numbers for fio bs=4k in a 4 vCPU guest with 1
IOThread using a virtio-blk disk backed by a local NVMe drive:

                      IOPS               IOPS
Benchmark             SINGLE_ISSUER      SINGLE_ISSUER|COOP_TASKRUN|TASKRUN_FLAG
randread  iodepth=1    54,045 (+1.2%)     54,189 (+1.5%)
randread  iodepth=64  318,135 (+0.1%)    315,632 (-0.68%)
randwrite iodepth=1   141,918 (-0.44%)   143,337 (+0.55%)
randwrite iodepth=64  323,948 (-0.015%)  322,755 (-0.38%)

Detailed benchmarking results, including the fio output, fio command line,
and guest libvirt domain XML, are available here:
https://gitlab.com/stefanha/virt-playbooks/-/tree/io_uring-flags/notebook/fio-output
https://gitlab.com/stefanha/virt-playbooks/-/blob/io_uring-flags/files/fio.sh
https://gitlab.com/stefanha/virt-playbooks/-/blob/io_uring-flags/files/test.xml.j2

Stefan Hajnoczi (3):
  iothread: create AioContext in iothread_run()
  aio-posix: enable IORING_SETUP_SINGLE_ISSUER
  aio-posix: enable IORING_SETUP_COOP_TASKRUN |
    IORING_SETUP_TASKRUN_FLAG

 include/system/iothread.h |   1 -
 iothread.c                | 140 +++++++++++++++++++++-----------------
 util/fdmon-io_uring.c     |  26 ++++++-
 3 files changed, 101 insertions(+), 66 deletions(-)

--
2.50.1