[Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW

Alberto Garcia posted 7 patches 6 years, 11 months ago
Posted by Alberto Garcia 6 years, 11 months ago
Hi all,

here's a patch series that rewrites the copy-on-write code in the
qcow2 driver to reduce the number of I/O operations.

The situation is that when a guest sends a write request and QEMU
needs to allocate new cluster(s) in a qcow2 file, the unwritten
regions of the new cluster(s) need to be filled with the existing data
(e.g. from the backing image) or with zeroes.

The whole process can require up to 5 I/O operations:

1) Write the data from the actual write request.
2) Read the existing data located before the guest data.
3) Write that data to the new clusters.
4) Read the existing data located after the guest data.
5) Write that data to the new clusters.
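As a toy model of the flow above (this is not QEMU code; all names here are invented for illustration), the pre-patch path can be expressed as an operation count over the two unwritten regions:

```c
/* Toy model, not QEMU code: count the I/O operations the pre-patch
 * COW path issues when a guest write lands in a freshly allocated
 * cluster.  "head" is the unwritten region before the guest data,
 * "tail" the region after it. */
#include <assert.h>
#include <stdint.h>

struct cow_layout {
    uint64_t head_bytes;   /* existing data to copy before the request */
    uint64_t tail_bytes;   /* existing data to copy after the request  */
};

int io_ops_before_patch(const struct cow_layout *c)
{
    int ops = 1;                       /* 1) write the guest data      */
    if (c->head_bytes) {
        ops += 2;                      /* 2) read head, 3) write head  */
    }
    if (c->tail_bytes) {
        ops += 2;                      /* 4) read tail, 5) write tail  */
    }
    return ops;                        /* up to 5 in the worst case    */
}
```

When both a head and a tail need filling, this is exactly the five operations listed above; an aligned full-cluster write needs only the first one.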

This series reduces that to only two operations:

1) Read the existing data from the original clusters.
2) Write the updated data (=original + guest request) to the new clusters.

Step (1) implies that there's data that will be read but will be
immediately discarded (because it's overwritten by the guest
request). I haven't really detected any big performance problems
because of that, but I decided to be conservative and my code includes
a simple heuristic that keeps the old behavior if the amount of data
to be discarded is higher than 16KB.
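A minimal sketch of the merged strategy and that 16KB guard, again with invented names (the real patches work with QEMU iovecs and the driver's COW region structures rather than flat byte counts):

```c
/* Toy sketch, not QEMU code: the post-patch path reads the whole
 * affected range once (head + guest area + tail) and writes the
 * merged buffer once.  The bytes covering the guest request are read
 * and then overwritten, so a guard keeps the old behavior when too
 * much read data would be thrown away. */
#include <assert.h>
#include <stdint.h>

#define MAX_DISCARDED_BYTES (16 * 1024)   /* invented name for the 16KB cap */

/* Merge the two COW reads only if the discarded middle part is small. */
int should_merge_cow_reads(uint64_t guest_request_bytes)
{
    return guest_request_bytes <= MAX_DISCARDED_BYTES;
}

/* I/O operations with the patch applied: one read + one write when the
 * reads are merged, otherwise the old count (up to 5). */
int io_ops_after_patch(uint64_t head, uint64_t guest, uint64_t tail)
{
    if (!head && !tail) {
        return 1;                     /* nothing to copy, plain write  */
    }
    if (should_merge_cow_reads(guest)) {
        return 2;                     /* one big read, one big write   */
    }
    return 1 + (head ? 2 : 0) + (tail ? 2 : 0);   /* old behavior */
}
```

With this model, a 4KB random write into a 64KB cluster with a backing file drops from 5 operations to 2, which matches the scenario benchmarked later in the thread.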

I've been testing this series in several scenarios, with different
cluster sizes (32K, 64K, 1MB) and request sizes (from 4KB up to 512KB),
and both with an SSD and a rotating HDD. The results vary depending on
the case, with an average increase of 60% in the number of IOPS in the
HDD case, and 15% in the SSD case. In some cases there are really no
big differences and the results are similar before and after this
patch.

Further work for the future includes detecting when the data that
needs to be written consists of zeroes (i.e. allocating a new cluster
with no backing image) and optimizing that case, but let's start with
this.

Regards,

Berto

Alberto Garcia (7):
  qcow2: Remove unused Error in do_perform_cow()
  qcow2: Use unsigned int for both members of Qcow2COWRegion
  qcow2: Make perform_cow() call do_perform_cow() twice
  qcow2: Split do_perform_cow() into _read(), _encrypt() and _write()
  qcow2: Allow reading both COW regions with only one request
  qcow2: Pass a QEMUIOVector to do_perform_cow_{read,write}()
  qcow2: Merge the writing of the COW regions with the guest data

 block/qcow2-cluster.c | 188 +++++++++++++++++++++++++++++++++++++-------------
 block/qcow2.c         |  58 +++++++++++++---
 block/qcow2.h         |  11 ++-
 3 files changed, 197 insertions(+), 60 deletions(-)

-- 
2.11.0


Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Eric Blake 6 years, 11 months ago
On 05/23/2017 06:22 AM, Alberto Garcia wrote:
> Hi all,
> 
> here's a patch series that rewrites the copy-on-write code in the
> qcow2 driver to reduce the number of I/O operations.

And it competes with Denis and Anton's patches:
https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg04547.html

What plan of attack should we take on merging the best parts of these
two series?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Alberto Garcia 6 years, 11 months ago
On Tue 23 May 2017 04:36:52 PM CEST, Eric Blake wrote:

>> here's a patch series that rewrites the copy-on-write code in the
>> qcow2 driver to reduce the number of I/O operations.
>
> And it competes with Denis and Anton's patches:
> https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg04547.html
>
> What plan of attack should we take on merging the best parts of these
> two series?

I took a look at that series and unless I overlooked something important
it seems that there's actually not that much overlap. Denis and Anton's
series deals with cluster preallocation and mine reduces the number of
I/O operations when there's a COW scenario. Most of my modifications are
in the perform_cow() function, but they barely touch that one.

I think we can review both separately, either of us can rebase our
series on top of the other, I don't expect big changes or conflicts.

Berto

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Anton Nefedov 6 years, 11 months ago
On 05/24/2017 05:20 PM, Alberto Garcia wrote:
> On Tue 23 May 2017 04:36:52 PM CEST, Eric Blake wrote:
>
>>> here's a patch series that rewrites the copy-on-write code in the
>>> qcow2 driver to reduce the number of I/O operations.
>>
>> And it competes with Denis and Anton's patches:
>> https://lists.gnu.org/archive/html/qemu-devel/2017-05/msg04547.html
>>
>> What plan of attack should we take on merging the best parts of these
>> two series?
>
> I took a look at that series and unless I overlooked something important
> it seems that there's actually not that much overlap. Denis and Anton's
> series deals with cluster preallocation and mine reduces the number of
> I/O operations when there's a COW scenario. Most of my modifications are
> in the perform_cow() function, but they barely touch that one.
>
> I think we can review both separately, either of us can rebase our
> series on top of the other, I don't expect big changes or conflicts.
>
> Berto
>

I agree; as mentioned we have similar patches and they don't conflict much.
We noticed a performance regression on HDD though, for the presumably 
optimized case (random 4k write over a large backed image); so the 
patches were put on hold.
SSD looked fine though.

/Anton

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Alberto Garcia 6 years, 11 months ago
On Wed 24 May 2017 06:09:42 PM CEST, Anton Nefedov wrote:

> I agree; as mentioned we have similar patches and they don't conflict
> much.  We noticed a performance regression on HDD though, for the
> presumably optimized case (random 4k write over a large backed image);
> so the patches were put on hold.

Interesting, I think that scenario was noticeably faster in my
tests. What cluster size(s) and image size(s) were you using?

Berto

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Anton Nefedov 6 years, 11 months ago

On 05/24/2017 07:20 PM, Alberto Garcia wrote:
> On Wed 24 May 2017 06:09:42 PM CEST, Anton Nefedov wrote:
> 
>> I agree; as mentioned we have similar patches and they don't conflict
>> much.  We noticed a performance regression on HDD though, for the
>> presumably optimized case (random 4k write over a large backed image);
>> so the patches were put on hold.
> 
> Interesting, I think that scenario was noticeably faster in my
> tests. What cluster size(s) and image size(s) were you using?
> 
64k cluster, 2g image, write 32m in portions of 4k at random offsets

/Anton

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Alberto Garcia 6 years, 11 months ago
On Wed 24 May 2017 06:26:23 PM CEST, Anton Nefedov wrote:
>>> I agree; as mentioned we have similar patches and they don't
>>> conflict much.  We noticed a performance regression on HDD though,
>>> for the presumably optimized case (random 4k write over a large
>>> backed image); so the patches were put on hold.
>> 
>> Interesting, I think that scenario was noticeably faster in my
>> tests. What cluster size(s) and image size(s) were you using?
>> 
> 64k cluster, 2g image, write 32m in portions of 4k at random offsets

I just tried that and the optimized case performs better (as expected),
almost twice as fast in fact:

write: io=32892KB, bw=162944B/s, iops=39, runt=206705msec
write: io=32892KB, bw=309256B/s, iops=75, runt=108911msec

I'll try in a different machine.

Berto

Re: [Qemu-devel] [Qemu-block] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Alberto Garcia 6 years, 10 months ago
On Thu 25 May 2017 01:48:39 PM CEST, Alberto Garcia wrote:
>>>> I agree; as mentioned we have similar patches and they don't
>>>> conflict much.  We noticed a performance regression on HDD though,
>>>> for the presumably optimized case (random 4k write over a large
>>>> backed image); so the patches were put on hold.
>>> 
>>> Interesting, I think that scenario was noticeably faster in my
>>> tests. What cluster size(s) and image size(s) were you using?
>>> 
>> 64k cluster, 2g image, write 32m in portions of 4k at random offsets
>
> I just tried that and the optimized case performs better (as
> expected), almost twice as fast in fact:
>
> write: io=32892KB, bw=162944B/s, iops=39, runt=206705msec
> write: io=32892KB, bw=309256B/s, iops=75, runt=108911msec
>
> I'll try in a different machine.

I ran more tests of that same scenario with a different HDD:

write: io=32892KB, bw=588588B/s, iops=143, runt= 57224msec
write: io=32892KB, bw=779951B/s, iops=190, runt= 43184msec

And here are the results without a backing file:

write: io=32892KB, bw=1510.2KB/s, iops=377, runt= 21781msec
write: io=32892KB, bw=5417.1KB/s, iops=1354, runt=  6071msec

Berto

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Kevin Wolf 6 years, 10 months ago
On 24.05.2017 at 18:09, Anton Nefedov wrote:
> I agree; as mentioned we have similar patches and they don't conflict much.
> We noticed a performance regression on HDD though, for the
> presumably optimized case (random 4k write over a large backed
> image); so the patches were put on hold.

You're talking about your own patches that should do the same thing,
right? Can you re-do the same test with Berto's patches? Maybe there was
just an implementation glitch in yours.

This approach should very obviously result in a performance improvement,
and the patches are relatively simple, so I'm very much inclined to
merge this as soon as possible.

Kevin

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Anton Nefedov 6 years, 10 months ago
On 05/26/2017 01:17 PM, Kevin Wolf wrote:
> On 24.05.2017 at 18:09, Anton Nefedov wrote:
>> I agree; as mentioned we have similar patches and they don't conflict much.
>> We noticed a performance regression on HDD though, for the
>> presumably optimized case (random 4k write over a large backed
>> image); so the patches were put on hold.
> 
> You're talking about your own patches that should do the same thing,
> right? Can you re-do the same test with Berto's patches? Maybe there was
> just an implementation glitch in yours.
> 
> This approach should very obviously result in a performance improvement,
> and the patches are relatively simple, so I'm very much inclined to
> merge this as soon as possible.
> 
> Kevin
> 


Tried another machine; about 10% improvement here

[root@dpclient centos-7.3-x86_64]# qemu-img info ./harddisk2.hdd
image: ./harddisk2.hdd
file format: qcow2
virtual size: 2.0G (2147483648 bytes)
disk size: 260K
cluster_size: 65536
backing file: harddisk2.hdd.base (actual path: ./harddisk2.hdd.base)
Format specific information:
compat: 1.1
lazy refcounts: false
refcount bits: 16
corrupt: false
[root@dpclient centos-7.3-x86_64]# qemu-img info ./harddisk2.hdd.base
image: ./harddisk2.hdd.base
file format: qcow2
virtual size: 2.0G (2147483648 bytes)
disk size: 2.0G
cluster_size: 65536
Format specific information:
compat: 1.1
lazy refcounts: true
refcount bits: 16
corrupt: false
[root@dpclient centos-7.3-x86_64]# filefrag ./harddisk2.hdd.base
./harddisk2.hdd.base: 3 extents found


[root@localhost ~]# fio --name=randwrite --blocksize=4k
--filename=/dev/sdb --rw=randwrite --direct=1 --ioengine=libaio
--size=2g --io_size=32m

/* master */
write: io=32768KB, bw=372785B/s, iops=91, runt= 90010msec
/* Berto's patches */
write: io=32768KB, bw=404304B/s, iops=98, runt= 82993msec


/Anton

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Alberto Garcia 6 years, 10 months ago
On Fri 26 May 2017 02:47:55 PM CEST, Anton Nefedov wrote:
> Tried another machine; about 10% improvement here
  [...]
> [root@localhost ~]# fio --name=randwrite --blocksize=4k
> --filename=/dev/sdb --rw=randwrite --direct=1 --ioengine=libaio
> --size=2g --io_size=32m

In my tests I sometimes detected slight performance decreases in that
HDD scenario but using 'write' instead of 'randwrite' (and --runtime=60
instead of --io_size).

Can you try and see how that works for you?

Berto

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Anton Nefedov 6 years, 10 months ago
On 05/26/2017 04:08 PM, Alberto Garcia wrote:
> On Fri 26 May 2017 02:47:55 PM CEST, Anton Nefedov wrote:
>> Tried another machine; about 10% improvement here
>    [...]
>> [root@localhost ~]# fio --name=randwrite --blocksize=4k
>> --filename=/dev/sdb --rw=randwrite --direct=1 --ioengine=libaio
>> --size=2g --io_size=32m
> 
> In my tests I sometimes detected slight performance decreases in that
> HDD scenario but using 'write' instead of 'randwrite' (and --runtime=60
> instead of --io_size).
> 
> Can you try and see how that works for you?
> 
> Berto
> 

For me it keeps giving pretty much the same result before and after

  write: io=512736KB, bw=8545.4KB/s, iops=2136, runt= 60004msec

/Anton

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Alberto Garcia 6 years, 10 months ago
On Fri 26 May 2017 03:32:49 PM CEST, Anton Nefedov wrote:
>>> [root@localhost ~]# fio --name=randwrite --blocksize=4k
>>> --filename=/dev/sdb --rw=randwrite --direct=1 --ioengine=libaio
>>> --size=2g --io_size=32m
>> 
>> In my tests I sometimes detected slight performance decreases in that
>> HDD scenario but using 'write' instead of 'randwrite' (and --runtime=60
>> instead of --io_size).
>> 
>> Can you try and see how that works for you?
>
> For me it keeps giving pretty much the same result before and after
>
>   write: io=512736KB, bw=8545.4KB/s, iops=2136, runt= 60004msec

Ok, that's the expected behavior. Good!

Berto

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Alberto Garcia 6 years, 10 months ago
ping

On Tue, May 23, 2017 at 01:22:55PM +0200, Alberto Garcia wrote:
> Hi all,
> 
> here's a patch series that rewrites the copy-on-write code in the
> qcow2 driver to reduce the number of I/O operations.
> 
> 
> [...]

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Kevin Wolf 6 years, 10 months ago
On 07.06.2017 at 13:44, Alberto Garcia wrote:
> ping

You wanted to address two or three things in the next version, so I
assumed that this version shouldn't be merged.

Kevin

Re: [Qemu-devel] [PATCH 0/7] qcow2: Reduce the number of I/O ops when doing COW
Posted by Alberto Garcia 6 years, 10 months ago
On Wed 07 Jun 2017 01:59:58 PM CEST, Kevin Wolf wrote:
> On 07.06.2017 at 13:44, Alberto Garcia wrote:
>> ping
>
> You wanted to address two or three things in the next version, so I
> assumed that this version shouldn't be merged.

Right, I had a couple of minor changes, but the core of the series is
going to remain the same and can already be reviewed.

But of course I can just send the second version with my changes; I'll
do that then.

Berto