qcow2: flush qcow2 l2 meta for new allocated clusters

[PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters

Posted by Ying Fang 5 years, 6 months ago

From: fangying <fangying1@huawei.com>

When qemu or qemu-nbd process uses a qcow2 image and configured with
'cache = none', it will write to the qcow2 image with a cache to cache
L2 tables, however the process will not use L2 tables without explicitly
calling the flush command or closing the mirror flash into the disk.
Which may cause the disk data inconsistent with the written data for
a long time. If an abnormal process exit occurs here, the issued written
data will be lost.

Therefore, in order to keep data consistency we need to flush the changes
to the L2 entry to the disk in time for the newly allocated cluster.

Signed-off-by: Ying Fang <fangying1@huawei.com>

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 7444b9c..ab6e812 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -266,6 +266,22 @@ int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
     return result;
 }
 
+#define L2_ENTRIES_PER_SECTOR 64
+int qcow2_cache_l2_write_entry(BlockDriverState *bs, Qcow2Cache *c,
+                               void *table, int index, int num)
+{
+    int ret;
+    int i = qcow2_cache_get_table_idx(c, table);
+    int start_sector = index / L2_ENTRIES_PER_SECTOR;
+    int end_sector = (index + num - 1) / L2_ENTRIES_PER_SECTOR;
+    int nr_sectors = end_sector - start_sector + 1;
+    ret = bdrv_pwrite(bs->file,
+                      c->entries[i].offset + start_sector * BDRV_SECTOR_SIZE,
+                      table + start_sector * BDRV_SECTOR_SIZE,
+                      nr_sectors * BDRV_SECTOR_SIZE);
+    return ret;
+}
+
 int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
     Qcow2Cache *dependency)
 {
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index a677ba9..ae49a83 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -998,6 +998,9 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
      }
 
 
+    ret = qcow2_cache_l2_write_entry(bs, s->l2_table_cache, l2_slice,
+                                     l2_index, m->nb_clusters);
+
     qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
 
     /*
diff --git a/block/qcow2.h b/block/qcow2.h
index 7ce2c23..168ab59 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -748,6 +748,8 @@ int qcow2_cache_destroy(Qcow2Cache *c);
 void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table);
 int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
 int qcow2_cache_write(BlockDriverState *bs, Qcow2Cache *c);
+int qcow2_cache_l2_write_entry(BlockDriverState *bs, Qcow2Cache *c,
+                               void *table, int index, int num);
 int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
     Qcow2Cache *dependency);
 void qcow2_cache_depends_on_flush(Qcow2Cache *c);
-- 
1.8.3.1

Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters

Posted by no-reply@patchew.org 5 years, 6 months ago

Patchew URL: https://patchew.org/QEMU/20200805023826.184-1-fangying1@huawei.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.






The full log is available at
http://patchew.org/logs/20200805023826.184-1-fangying1@huawei.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters

Posted by Ying Fang 5 years, 6 months ago


On 8/5/2020 10:43 AM, no-reply@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20200805023826.184-1-fangying1@huawei.com/
> 
> 
> 
> Hi,
> 
> This series failed the docker-quick@centos7 build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce it
> locally.
> I see some error message which says ** No space left on device **
However I do not know what is wrong with this build test.
Could you give me some help here?

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
error: copy-fd: write returned No space left on device
fatal: failed to copy file to 
'/var/tmp/patchew-tester-tmp-wtnwtuq5/src/.git/objects/pack/pack-518a8ad92e3ce11d2627a7221e2d360b337cb27d.pack': 
No space left on device
fatal: The remote end hung up unexpectedly
Traceback (most recent call last):
   File "patchew-tester/src/patchew-cli", line 521, in test_one
     git_clone_repo(clone, r["repo"], r["head"], logf, True)
   File "patchew-tester/src/patchew-cli", line 53, in git_clone_repo
     subprocess.check_call(clone_cmd, stderr=logf, stdout=logf)
   File "/opt/rh/rh-python36/root/usr/lib64/python3.6/subprocess.py", 
line 291, in check_call
     raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['git', 'clone', '-q', 
'/home/patchew/.cache/patchew-git-cache/httpsgithubcompatchewprojectqemu-3c8cf5a9c21ff8782164d1def7f44bd888713384', 
'/var/tmp/patchew-tester-tmp-wtnwtuq5/src']' returned non-zero exit 
status 128.

> 
> 
> 
> 
> 
> The full log is available at
> http://patchew.org/logs/20200805023826.184-1-fangying1@huawei.com/testing.docker-quick@centos7/?type=message.
> ---
> Email generated automatically by Patchew [https://patchew.org/].
> Please send your feedback to patchew-devel@redhat.com
>

Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters

Posted by Daniel P. Berrangé 5 years, 6 months ago

On Thu, Aug 06, 2020 at 05:01:51PM +0800, Ying Fang wrote:
> 
> 
> On 8/5/2020 10:43 AM, no-reply@patchew.org wrote:
> > Patchew URL: https://patchew.org/QEMU/20200805023826.184-1-fangying1@huawei.com/
> > 
> > 
> > 
> > Hi,
> > 
> > This series failed the docker-quick@centos7 build test. Please find the testing commands and
> > their output below. If you have Docker installed, you can probably reproduce it
> > locally.
> > I see some error message which says ** No space left on device **
> However I do not know what is wrong with this build test.
> Could you give me some help here?

It isn't your fault - this is just QEMU's  patchew CI that is broken yet again
due to lack of disk space. Just ignore the error report here.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters

Posted by Kevin Wolf 5 years, 6 months ago

Am 05.08.2020 um 04:38 hat Ying Fang geschrieben:
> From: fangying <fangying1@huawei.com>
> 
> When qemu or qemu-nbd process uses a qcow2 image and configured with
> 'cache = none', it will write to the qcow2 image with a cache to cache
> L2 tables, however the process will not use L2 tables without explicitly
> calling the flush command or closing the mirror flash into the disk.
> Which may cause the disk data inconsistent with the written data for
> a long time. If an abnormal process exit occurs here, the issued written
> data will be lost.
> 
> Therefore, in order to keep data consistency we need to flush the changes
> to the L2 entry to the disk in time for the newly allocated cluster.
> 
> Signed-off-by: Ying Fang <fangying1@huawei.com>

If you want to have data safely written to the disk after each write
request, you need to use cache=writethrough/directsync (in other words,
aliases that are equivalent to setting -device ...,write-cache=off).
Note that this will have a major impact on write performance.

cache=none means bypassing the kernel page cache (O_DIRECT), but not
flushing after each write request.

Kevin

Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters

Posted by Ying Fang 5 years, 6 months ago


On 8/6/2020 5:13 PM, Kevin Wolf wrote:
> Am 05.08.2020 um 04:38 hat Ying Fang geschrieben:
>> From: fangying <fangying1@huawei.com>
>>
>> When qemu or qemu-nbd process uses a qcow2 image and configured with
>> 'cache = none', it will write to the qcow2 image with a cache to cache
>> L2 tables, however the process will not use L2 tables without explicitly
>> calling the flush command or closing the mirror flash into the disk.
>> Which may cause the disk data inconsistent with the written data for
>> a long time. If an abnormal process exit occurs here, the issued written
>> data will be lost.
>>
>> Therefore, in order to keep data consistency we need to flush the changes
>> to the L2 entry to the disk in time for the newly allocated cluster.
>>
>> Signed-off-by: Ying Fang <fangying1@huawei.com>
> 
> If you want to have data safely written to the disk after each write
> request, you need to use cache=writethrough/directsync (in other words,
> aliases that are equivalent to setting -device ...,write-cache=off).
> Note that this will have a major impact on write performance.
> 
> cache=none means bypassing the kernel page cache (O_DIRECT), but not
> flushing after each write request.

Well, IIUC, cache=none does not guarantee data safety and we should not
expect that. Then this patch can be ignored.

Thanks.
> 
> Kevin
> 
> .
>

Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters

Posted by Kevin Wolf 5 years, 6 months ago

Am 07.08.2020 um 09:42 hat Ying Fang geschrieben:
> 
> 
> On 8/6/2020 5:13 PM, Kevin Wolf wrote:
> > Am 05.08.2020 um 04:38 hat Ying Fang geschrieben:
> > > From: fangying <fangying1@huawei.com>
> > > 
> > > When qemu or qemu-nbd process uses a qcow2 image and configured with
> > > 'cache = none', it will write to the qcow2 image with a cache to cache
> > > L2 tables, however the process will not use L2 tables without explicitly
> > > calling the flush command or closing the mirror flash into the disk.
> > > Which may cause the disk data inconsistent with the written data for
> > > a long time. If an abnormal process exit occurs here, the issued written
> > > data will be lost.
> > > 
> > > Therefore, in order to keep data consistency we need to flush the changes
> > > to the L2 entry to the disk in time for the newly allocated cluster.
> > > 
> > > Signed-off-by: Ying Fang <fangying1@huawei.com>
> > 
> > If you want to have data safely written to the disk after each write
> > request, you need to use cache=writethrough/directsync (in other words,
> > aliases that are equivalent to setting -device ...,write-cache=off).
> > Note that this will have a major impact on write performance.
> > 
> > cache=none means bypassing the kernel page cache (O_DIRECT), but not
> > flushing after each write request.
> 
> Well, IIUC, cache=none does not guarantee data safety and we should not
> expect that. Then this patch can be ignored.

Indeed, cache=none is a writeback cache mode with all of the
consequences. In practice, this is normally good enough because the
guest OS will send flush requests when needed (e.g. because a guest
application called fsync()), but if the guest doesn't do this, it may
suffer data loss. This behaviour is comparable to a volatile disk cache
on real hard disks and is a good default, but sometimes you need a
writethrough cache mode at the cost of a performance penalty.

Kevin

Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters

Posted by Ying Fang 5 years, 5 months ago


On 8/7/2020 4:13 PM, Kevin Wolf wrote:
> Am 07.08.2020 um 09:42 hat Ying Fang geschrieben:
>>
>>
>> On 8/6/2020 5:13 PM, Kevin Wolf wrote:
>>> Am 05.08.2020 um 04:38 hat Ying Fang geschrieben:
>>>> From: fangying <fangying1@huawei.com>
>>>>
>>>> When qemu or qemu-nbd process uses a qcow2 image and configured with
>>>> 'cache = none', it will write to the qcow2 image with a cache to cache
>>>> L2 tables, however the process will not use L2 tables without explicitly
>>>> calling the flush command or closing the mirror flash into the disk.
>>>> Which may cause the disk data inconsistent with the written data for
>>>> a long time. If an abnormal process exit occurs here, the issued written
>>>> data will be lost.
>>>>
>>>> Therefore, in order to keep data consistency we need to flush the changes
>>>> to the L2 entry to the disk in time for the newly allocated cluster.
>>>>
>>>> Signed-off-by: Ying Fang <fangying1@huawei.com>
>>>
>>> If you want to have data safely written to the disk after each write
>>> request, you need to use cache=writethrough/directsync (in other words,
>>> aliases that are equivalent to setting -device ...,write-cache=off).
>>> Note that this will have a major impact on write performance.
>>>
>>> cache=none means bypassing the kernel page cache (O_DIRECT), but not
>>> flushing after each write request.
>>
>> Well, IIUC, cache=none does not guarantee data safety and we should not
>> expect that. Then this patch can be ignored.
> 
> Indeed, cache=none is a writeback cache mode with all of the
> consequences. In practice, this is normally good enough because the
> guest OS will send flush requests when needed (e.g. because a guest
> application called fsync()), but if the guest doesn't do this, it may
> suffer data loss. This behaviour is comparable to a volatile disk cache
> on real hard disks and is a good default, but sometimes you need a
> writethrough cache mode at the cost of a performance penalty.

The late reply, thanks for your detailed explanation on the 'cache' 
option, having more understanding for it now.
> 
> Kevin
> 
> .
>