[RFC PATCH 0/1] add fixed file table support

Brian Song posted 1 patch 3 weeks, 2 days ago
block/io_uring.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)
Hi everyone,

I am applying to GSoC with the QEMU community this year, and I have just
completed the contribution task suggested by the project mentors,
Kevin and Stefan. The task is to register the file descriptor of a
block file that uses io_uring as its AIO method with the io_uring
instance, so that when the kernel processes I/O requests it can find
the file through its index in the registered table and skip the
per-request file lookup (fdget()). This is expected to improve I/O
performance.
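
For readers new to fixed files: once descriptors are registered, an SQE
that sets IOSQE_FIXED_FILE carries the slot index in its fd field rather
than the descriptor itself, so userspace has to remember which slot each
fd was registered in. Below is a minimal sketch of that bookkeeping only.
It does not call liburing and is not the code from the patch; the names
FixedFileTable, fixed_table_init, fixed_table_add, fixed_table_lookup,
sqe_target and the table size are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define FIXED_TABLE_SLOTS 8 /* hypothetical capacity for this sketch */

/* Hypothetical bookkeeping: fds[i] is the fd registered at slot i,
 * or -1 if the slot is free. */
typedef struct {
    int fds[FIXED_TABLE_SLOTS];
} FixedFileTable;

void fixed_table_init(FixedFileTable *t)
{
    assert(t != NULL);
    for (int i = 0; i < FIXED_TABLE_SLOTS; i++) {
        t->fds[i] = -1;
    }
}

/* Return the slot index holding fd, or -1 if fd is not in the table. */
int fixed_table_lookup(const FixedFileTable *t, int fd)
{
    assert(t != NULL);
    if (fd < 0) {
        return -1;
    }
    for (int i = 0; i < FIXED_TABLE_SLOTS; i++) {
        if (t->fds[i] == fd) {
            return i;
        }
    }
    return -1;
}

/* Record fd in the first free slot. Returns the slot index (the existing
 * one if fd is already present), or -1 if fd is invalid or the table is
 * full. */
int fixed_table_add(FixedFileTable *t, int fd)
{
    int idx = fixed_table_lookup(t, fd);

    if (idx >= 0 || fd < 0) {
        return idx;
    }
    for (int i = 0; i < FIXED_TABLE_SLOTS; i++) {
        if (t->fds[i] == -1) {
            t->fds[i] = fd;
            return i;
        }
    }
    return -1;
}

/* Decide what goes into the SQE's fd field. On a hit, *use_fixed is set
 * to 1 and the slot index is returned; the caller would then also set
 * IOSQE_FIXED_FILE in the SQE flags. On a miss, *use_fixed is 0 and the
 * plain fd is returned, so the request takes the normal lookup path. */
int sqe_target(const FixedFileTable *t, int fd, int *use_fixed)
{
    int idx = fixed_table_lookup(t, fd);

    assert(use_fixed != NULL);
    *use_fixed = (idx >= 0);
    return idx >= 0 ? idx : fd;
}
```

In a real submission path the table contents would also have to be handed
to the kernel (for example with io_uring_register_files()) and kept in
sync when a block file goes away, which is the part this RFC leaves out.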

Note that this is currently just a proof of concept to enable
benchmarking, so scenarios such as block file removal are not yet
handled. I tested with fio random reads inside the guest, and the
results show no significant I/O performance improvement: both runs
reach about 127k IOPS.

Please feel free to share any thoughts!

Thanks,
Brian

The specific testing method and results are as follows:

guest $ sudo fio --filename=/dev/vda \
                 --ioengine=io_uring \
                 --direct=1 \
                 --ramp_time=5 \
                 --name=randread \
                 --readwrite=randread \
                 --iodepth=64 \
                 --numjobs=1 \
                 --blocksize=4k \
                 --runtime=30 \
                 --time_based=1

** Guest with fixed file table support **

** vda (guest.img) **

randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=64
fio-3.39
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=502MiB/s][r=128k IOPS][eta 00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=1208: Fri Apr 11 23:18:26 2025
  read: IOPS=127k, BW=496MiB/s (520MB/s)(14.5GiB/30001msec)
    slat (usec): min=2, max=3541, avg= 5.89, stdev= 3.71
    clat (usec): min=8, max=24149, avg=496.40, stdev=149.85
     lat (usec): min=11, max=24161, avg=502.29, stdev=149.89
    clat percentiles (usec):
     |  1.00th=[  375],  5.00th=[  433], 10.00th=[  449], 20.00th=[  461],
     | 30.00th=[  469], 40.00th=[  474], 50.00th=[  482], 60.00th=[  486],
     | 70.00th=[  494], 80.00th=[  502], 90.00th=[  515], 95.00th=[  537],
     | 99.00th=[ 1287], 99.50th=[ 1516], 99.90th=[ 1827], 99.95th=[ 1958],
     | 99.99th=[ 2573]
   bw (  KiB/s): min=484856, max=530928, per=100.00%, avg=508499.90, stdev=8880.33, samples=60
   iops        : min=121214, max=132732, avg=127124.98, stdev=2220.10, samples=60
  lat (usec)   : 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%, 250=0.05%
  lat (usec)   : 500=79.75%, 750=18.05%, 1000=0.44%
  lat (msec)   : 2=1.66%, 4=0.04%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=44.52%, sys=55.43%, ctx=199, majf=0, minf=36
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=3810630,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=496MiB/s (520MB/s), 496MiB/s-496MiB/s (520MB/s-520MB/s), io=14.5GiB (15.6GB), run=30001-30001msec

Disk stats (read/write):
  vda: ios=4422643/234, sectors=35381152/14793, merge=0/20, ticks=120202/328, in_queue=120535, util=95.02%



** Guest without fixed file table support **

** vda **

randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=64
fio-3.39
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=459MiB/s][r=118k IOPS][eta 00m:00s]
randread: (groupid=0, jobs=1): err= 0: pid=1217: Fri Apr 11 23:16:24 2025
  read: IOPS=127k, BW=498MiB/s (522MB/s)(14.6GiB/30001msec)
    slat (usec): min=2, max=246, avg= 5.91, stdev= 3.19
    clat (usec): min=10, max=21817, avg=494.55, stdev=149.50
     lat (usec): min=17, max=21827, avg=500.46, stdev=149.59
    clat percentiles (usec):
     |  1.00th=[  318],  5.00th=[  392], 10.00th=[  433], 20.00th=[  457],
     | 30.00th=[  469], 40.00th=[  478], 50.00th=[  482], 60.00th=[  490],
     | 70.00th=[  494], 80.00th=[  502], 90.00th=[  529], 95.00th=[  562],
     | 99.00th=[ 1270], 99.50th=[ 1516], 99.90th=[ 1827], 99.95th=[ 1958],
     | 99.99th=[ 2376]
   bw (  KiB/s): min=441768, max=568144, per=100.00%, avg=510363.83, stdev=23076.31, samples=60
   iops        : min=110442, max=142036, avg=127590.88, stdev=5769.07, samples=60
  lat (usec)   : 20=0.01%, 50=0.01%, 100=0.02%, 250=0.10%, 500=76.37%
  lat (usec)   : 750=21.30%, 1000=0.52%
  lat (msec)   : 2=1.65%, 4=0.04%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=43.71%, sys=56.26%, ctx=133, majf=0, minf=36
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=3824929,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
   READ: bw=498MiB/s (522MB/s), 498MiB/s-498MiB/s (522MB/s-522MB/s), io=14.6GiB (15.7GB), run=30001-30001msec

Disk stats (read/write):
  vda: ios=4468557/140, sectors=35748456/8817, merge=0/18, ticks=129894/244, in_queue=130143, util=95.00%
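
To put a number on "no significant improvement", the two "iops" summary
lines above can be compared directly. The helpers below are a small
sketch for that arithmetic; relative_change_pct and within_one_stdev are
hypothetical names, and comparing against the smaller standard deviation
is just a rough sanity check, not a proper significance test.

```c
#include <assert.h>

/* Relative change of with_val against without_val, in percent. */
double relative_change_pct(double with_val, double without_val)
{
    assert(without_val != 0.0);
    return (with_val - without_val) / without_val * 100.0;
}

/* Non-zero if the two averages differ by no more than the smaller of
 * their standard deviations. */
int within_one_stdev(double a, double sd_a, double b, double sd_b)
{
    double diff = a > b ? a - b : b - a;
    double sd = sd_a < sd_b ? sd_a : sd_b;

    return diff <= sd;
}
```

With the reported averages, relative_change_pct(127124.98, 127590.88)
comes out at roughly -0.37%, and the gap of about 466 IOPS is well below
either run's standard deviation (2220.10 and 5769.07), so the two results
are indistinguishable at this sample size.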


Brian Song (1):
  This work adds support for registering block file descriptors to the
    io_uring instance and uses IOSQE_FIXED_FILE in I/O requests (SQEs)
    to avoid the cost of fdget() in the kernel. It is a basic
    implementation for testing, and does not yet handle cases where
    block devices are removed.

 block/io_uring.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

-- 
2.43.0