Hi all!
Here is an asynchronous scheme for handling fragmented qcow2
reads and writes. Both the qcow2 read and write functions loop through
sequential portions of data. The aim of this series is to parallelize
the iterations of these loops.
It improves performance for fragmented qcow2 images; I've tested it
as described below.
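
To illustrate the general idea only, here is a minimal Python sketch of running the pieces of one request in parallel with a bounded number in flight. It is not the API from block/aio_task.h: `split_request`, `run_parallel`, `fake_read` and `max_in_flight` are hypothetical names, and splitting at fixed 64k boundaries is a simplification (the real loops split by how the data is mapped in the image).

```python
import asyncio

CLUSTER = 64 * 1024  # assumption: fixed 64k pieces, for illustration only


def split_request(offset, length, chunk=CLUSTER):
    """Split [offset, offset + length) into pieces that end on chunk boundaries."""
    end = offset + length
    pieces = []
    while offset < end:
        # Stop at the next chunk boundary or at the end of the request.
        n = min(end, (offset // chunk + 1) * chunk) - offset
        pieces.append((offset, n))
        offset += n
    return pieces


async def run_parallel(offset, length, handle_piece, max_in_flight=8):
    """Run handle_piece(offset, length) for every piece, with at most
    max_in_flight pieces in progress at once. Raises if any piece fails."""
    sem = asyncio.Semaphore(max_in_flight)

    async def worker(off, n):
        async with sem:
            return await handle_piece(off, n)

    tasks = [worker(off, n) for off, n in split_request(offset, length)]
    return await asyncio.gather(*tasks)


async def fake_read(off, n):
    # Hypothetical stand-in for one sub-request to the underlying file.
    await asyncio.sleep(0.001)
    return n


def demo():
    """Handle a 1M request as parallel 64k pieces; return total bytes handled."""
    done = asyncio.run(run_parallel(0, 1024 * 1024, fake_read))
    return sum(done)
```

With a sequential loop the per-piece latencies add up; with the pool they overlap, which is where a gain on fragmented images would be expected to come from.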
v2: changed a lot, as
1. a lot of preparations around locks, hd_qiovs and threads for encryption
have been done
2. I decided to create a separate file with an async request handling API, to
reuse it for backup, stream and copy-on-read to improve their performance
too. Mirror and qemu-img convert have their own async request handling;
maybe we'll finally be able to merge all this similar code into one
feature.
Note that not all API calls are used in qcow2; some will be needed in
following steps for parallelizing other io loops.
Based-on: https://github.com/stefanha/qemu/commits/block
About testing:
I have four 4G qcow2 images (with the default 64k cluster size) on my ssd disk:
t-seq.qcow2 - sequentially written qcow2 image
t-reverse.qcow2 - filled by writing 64k portions from end to the start
t-rand.qcow2 - filled by writing 64k portions (aligned) in random order
t-part-rand.qcow2 - filled by shuffling the order of 64k writes within each 1M chunk
(see the image generation source code at the end for details)
and I've done several runs like the following (sequential io in 1M chunks):
out=/tmp/block; echo > $out
cat /tmp/files | while read file; do
    for wr in {"","-w"}; do
        echo "$file" $wr
        ./qemu-img bench -c 4096 -d 1 -f qcow2 -n -s 1m -t none $wr "$file" |
            grep 'Run completed in' | awk '{print $4}' >> $out
    done
done
short info about parameters:
-w - do writes (otherwise do reads)
-c - count of blocks
-s - block size
-t none - disable cache
-n - native aio
-d 1 - don't use parallel requests provided by qemu-img bench itself
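
For anyone who wants to script the same runs, a small Python sketch of the command construction and the result parsing may help. The flags are exactly the ones listed above; `bench_cmd` and `parse_bench_seconds` are hypothetical helper names, and the parser assumes the summary line has the form "Run completed in N seconds", which is what the grep/awk pipeline above relies on.

```python
import re


def bench_cmd(image, write=False, qemu_img="./qemu-img"):
    """Build the qemu-img bench command line used for one run."""
    cmd = [qemu_img, "bench", "-c", "4096", "-d", "1", "-f", "qcow2",
           "-n", "-s", "1m", "-t", "none"]
    if write:
        cmd.append("-w")  # do writes instead of reads
    cmd.append(image)
    return cmd


def parse_bench_seconds(output):
    """Return the run time in seconds from the bench output, or None
    if no 'Run completed in' line is found (assumed output format)."""
    m = re.search(r"Run completed in\s+([0-9.]+)\s+seconds", output)
    return float(m.group(1)) if m else None
```

The command list can be passed to `subprocess.run(..., capture_output=True, text=True)` and its `stdout` given to `parse_bench_seconds`.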
results (run time in seconds, as printed by qemu-img bench):
+---------------------------+---------+---------+
| file                      | master  | async   |
+---------------------------+---------+---------+
| /ssd/t-part-rand.qcow2    | 14.671  | 9.193   |
+---------------------------+---------+---------+
| /ssd/t-part-rand.qcow2 -w | 11.434  | 8.621   |
+---------------------------+---------+---------+
| /ssd/t-rand.qcow2         | 20.421  | 10.05   |
+---------------------------+---------+---------+
| /ssd/t-rand.qcow2 -w      | 11.097  | 8.915   |
+---------------------------+---------+---------+
| /ssd/t-reverse.qcow2      | 17.515  | 9.407   |
+---------------------------+---------+---------+
| /ssd/t-reverse.qcow2 -w   | 11.255  | 8.649   |
+---------------------------+---------+---------+
| /ssd/t-seq.qcow2          | 9.081   | 9.072   |
+---------------------------+---------+---------+
| /ssd/t-seq.qcow2 -w       | 8.761   | 8.747   |
+---------------------------+---------+---------+
| /tmp/t-part-rand.qcow2    | 41.179  | 41.37   |
+---------------------------+---------+---------+
| /tmp/t-part-rand.qcow2 -w | 54.097  | 55.323  |
+---------------------------+---------+---------+
| /tmp/t-rand.qcow2         | 711.899 | 514.339 |
+---------------------------+---------+---------+
| /tmp/t-rand.qcow2 -w      | 546.259 | 642.114 |
+---------------------------+---------+---------+
| /tmp/t-reverse.qcow2      | 86.065  | 96.522  |
+---------------------------+---------+---------+
| /tmp/t-reverse.qcow2 -w   | 46.557  | 48.499  |
+---------------------------+---------+---------+
| /tmp/t-seq.qcow2          | 33.804  | 33.862  |
+---------------------------+---------+---------+
| /tmp/t-seq.qcow2 -w       | 34.299  | 34.233  |
+---------------------------+---------+---------+
The performance gain is clear, especially for reads and especially on ssd
(for example, the /ssd/t-rand.qcow2 read time drops from 20.421 to 10.05).
For hdd there is a degradation in the reverse case, but this is the least
likely case in practice and does not seem critical.
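
To make the comparison easier to read, the ratios can be computed directly from the table. This is only a convenience sketch over the numbers reported above; `RESULTS`, `speedup` and `report` are names introduced here, and a ratio above 1 means the async column is faster.

```python
# (master, async) run times copied from the results table above.
RESULTS = {
    "/ssd/t-part-rand.qcow2": (14.671, 9.193),
    "/ssd/t-part-rand.qcow2 -w": (11.434, 8.621),
    "/ssd/t-rand.qcow2": (20.421, 10.05),
    "/ssd/t-rand.qcow2 -w": (11.097, 8.915),
    "/ssd/t-reverse.qcow2": (17.515, 9.407),
    "/ssd/t-reverse.qcow2 -w": (11.255, 8.649),
    "/ssd/t-seq.qcow2": (9.081, 9.072),
    "/ssd/t-seq.qcow2 -w": (8.761, 8.747),
    "/tmp/t-part-rand.qcow2": (41.179, 41.37),
    "/tmp/t-part-rand.qcow2 -w": (54.097, 55.323),
    "/tmp/t-rand.qcow2": (711.899, 514.339),
    "/tmp/t-rand.qcow2 -w": (546.259, 642.114),
    "/tmp/t-reverse.qcow2": (86.065, 96.522),
    "/tmp/t-reverse.qcow2 -w": (46.557, 48.499),
    "/tmp/t-seq.qcow2": (33.804, 33.862),
    "/tmp/t-seq.qcow2 -w": (34.299, 34.233),
}


def speedup(master, new):
    """Ratio of master time to async time; above 1 means async is faster."""
    return master / new


def report(results=RESULTS):
    """Return one formatted line per test case with its speedup ratio."""
    return ["{:<27} {:5.2f}x".format(name, speedup(master, new))
            for name, (master, new) in results.items()]


if __name__ == "__main__":
    print("\n".join(report()))
```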
How images are generated:
==== gen-writes ======
#!/usr/bin/env python
# Print qemu-io commands that fill a 4G image in the requested order.
import random
import sys

size = 4 * 1024 * 1024 * 1024
block = 64 * 1024       # size of each write
block2 = 1024 * 1024    # shuffle granularity for 'part-rand'

arg = sys.argv[1]

if arg in ('rand', 'reverse', 'seq'):
    writes = list(range(0, size, block))

if arg == 'rand':
    random.shuffle(writes)
elif arg == 'reverse':
    writes.reverse()
elif arg == 'part-rand':
    # Shuffle the 64k writes inside each 1M chunk, keeping chunks in order.
    writes = []
    for off in range(0, size, block2):
        wr = list(range(off, off + block2, block))
        random.shuffle(wr)
        writes.extend(wr)
elif arg != 'seq':
    sys.exit(1)

for w in writes:
    print('write -P 0xff {} {}'.format(w, block))

print('q')
==========================
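
As a quick sanity check of the four write orders, here is a scaled-down sketch of the same logic as gen-writes, wrapped in functions so it can be tested on a small size. `gen_offsets` and `check` are hypothetical names, and the 8M size is an assumption chosen only to keep the check fast.

```python
import random


def gen_offsets(mode, size, block, group):
    """Return write offsets in the order used for the given mode."""
    if mode == "part-rand":
        offsets = []
        for start in range(0, size, group):
            part = list(range(start, start + group, block))
            random.shuffle(part)
            offsets.extend(part)
        return offsets
    offsets = list(range(0, size, block))
    if mode == "rand":
        random.shuffle(offsets)
    elif mode == "reverse":
        offsets.reverse()
    elif mode != "seq":
        raise ValueError("unknown mode: " + mode)
    return offsets


def check(mode, size=8 * 1024 * 1024, block=64 * 1024, group=1024 * 1024):
    """Verify the properties the test images rely on; return the offsets."""
    offsets = gen_offsets(mode, size, block, group)
    # Every 64k block is written exactly once, whatever the order.
    assert sorted(offsets) == list(range(0, size, block))
    if mode == "part-rand":
        # Writes are shuffled only inside their own 1M group.
        per_group = group // block
        for i, off in enumerate(offsets):
            assert off // group == i // per_group
    return offsets
```

Calling `check(mode)` for each of seq, reverse, rand and part-rand confirms that every image ends up fully allocated and that only the allocation order differs.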
===== gen-test-images.sh =====
#!/bin/bash

IMG_PATH=/ssd

for name in seq reverse rand part-rand; do
    IMG=$IMG_PATH/t-$name.qcow2
    echo creating $IMG ...
    rm -f $IMG
    qemu-img create -f qcow2 $IMG 4G
    gen-writes $name | qemu-io $IMG
done
==============================
Vladimir Sementsov-Ogievskiy (4):
  block: introduce aio task pool
  block/qcow2: refactor qcow2_co_preadv_part
  block/qcow2: refactor qcow2_co_pwritev_part
  block/qcow2: introduce parallel subrequest handling in read and write

 qapi/block-core.json |   2 +-
 block/aio_task.h     |  52 +++++
 block/aio_task.c     | 119 +++++++++++
 block/qcow2.c        | 459 ++++++++++++++++++++++++++++---------------
 block/Makefile.objs  |   2 +
 block/trace-events   |   1 +
 6 files changed, 477 insertions(+), 158 deletions(-)
 create mode 100644 block/aio_task.h
 create mode 100644 block/aio_task.c
--
2.18.0