[PATCH for-5.1 1/2] qcow2: Implement v2 zero writes with discard if possible

Kevin Wolf posted 2 patches 5 years, 3 months ago
Maintainers: Kevin Wolf <kwolf@redhat.com>, Max Reitz <mreitz@redhat.com>
There is a newer version of this series
[PATCH for-5.1 1/2] qcow2: Implement v2 zero writes with discard if possible
Posted by Kevin Wolf 5 years, 3 months ago
qcow2 version 2 images don't support the zero flag for clusters, so for
write_zeroes requests, we return -ENOTSUP and get explicit zero buffer
writes. If the image doesn't have a backing file, we can do better: Just
discard the respective clusters.

This is relevant for 'qemu-img convert -O qcow2 -n', where qemu-img has
to assume that the existing target image may contain any data, so it has
to write zeroes. Without this patch, this results in a fully allocated
target image, even if the source image was empty.

Reported-by: Nir Soffer <nsoffer@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 block/qcow2-cluster.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 4b5fc8c4a7..a677ba9f5c 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1797,8 +1797,15 @@ int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
     assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
            end_offset >= bs->total_sectors << BDRV_SECTOR_BITS);
 
-    /* The zero flag is only supported by version 3 and newer */
+    /*
+     * The zero flag is only supported by version 3 and newer. However, if we
+     * have no backing file, we can resort to discard in version 2.
+     */
     if (s->qcow_version < 3) {
+        if (!bs->backing) {
+            return qcow2_cluster_discard(bs, offset, bytes,
+                                         QCOW2_DISCARD_REQUEST, false);
+        }
         return -ENOTSUP;
     }
 
-- 
2.25.4


Re: [PATCH for-5.1 1/2] qcow2: Implement v2 zero writes with discard if possible
Posted by Max Reitz 5 years, 3 months ago
On 20.07.20 15:18, Kevin Wolf wrote:
> qcow2 version 2 images don't support the zero flag for clusters, so for
> write_zeroes requests, we return -ENOTSUP and get explicit zero buffer
> writes. If the image doesn't have a backing file, we can do better: Just
> discard the respective clusters.
> 
> This is relevant for 'qemu-img convert -O qcow2 -n', where qemu-img has
> to assume that the existing target image may contain any data, so it has
> to write zeroes. Without this patch, this results in a fully allocated
> target image, even if the source image was empty.
> 
> Reported-by: Nir Soffer <nsoffer@redhat.com>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2-cluster.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)

Reviewed-by: Max Reitz <mreitz@redhat.com>

Re: [PATCH for-5.1 1/2] qcow2: Implement v2 zero writes with discard if possible
Posted by Maxim Levitsky 5 years, 3 months ago
On Mon, 2020-07-20 at 15:18 +0200, Kevin Wolf wrote:
> qcow2 version 2 images don't support the zero flag for clusters, so for
> write_zeroes requests, we return -ENOTSUP and get explicit zero buffer
> writes. If the image doesn't have a backing file, we can do better: Just
> discard the respective clusters.
> 
> This is relevant for 'qemu-img convert -O qcow2 -n', where qemu-img has
> to assume that the existing target image may contain any data, so it has
> to write zeroes. Without this patch, this results in a fully allocated
> target image, even if the source image was empty.
> 
> Reported-by: Nir Soffer <nsoffer@redhat.com>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2-cluster.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 4b5fc8c4a7..a677ba9f5c 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1797,8 +1797,15 @@ int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
>      assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
>             end_offset >= bs->total_sectors << BDRV_SECTOR_BITS);
>  
> -    /* The zero flag is only supported by version 3 and newer */
> +    /*
> +     * The zero flag is only supported by version 3 and newer. However, if we
> +     * have no backing file, we can resort to discard in version 2.
> +     */
>      if (s->qcow_version < 3) {
> +        if (!bs->backing) {
> +            return qcow2_cluster_discard(bs, offset, bytes,
> +                                         QCOW2_DISCARD_REQUEST, false);
> +        }
>          return -ENOTSUP;
>      }
>  

From my knowelege of nvme, I remember that discard doesn't have to zero the blocks.
There is special namespace capability the indicates the contents of the discarded block.
(Deallocate Logical Block Features)

If and only if the discard behavier flag indicates that discarded areas are zero,
then the write-zero command can have special 'deallocate' flag that hints the controller
to discard the sectors.

So woudn't discarding the clusters have theoretical risk of introducing garbage there?

Best regards,
	Maxim Levitsky


Re: [PATCH for-5.1 1/2] qcow2: Implement v2 zero writes with discard if possible
Posted by Kevin Wolf 5 years, 3 months ago
Am 22.07.2020 um 19:01 hat Maxim Levitsky geschrieben:
> On Mon, 2020-07-20 at 15:18 +0200, Kevin Wolf wrote:
> > qcow2 version 2 images don't support the zero flag for clusters, so for
> > write_zeroes requests, we return -ENOTSUP and get explicit zero buffer
> > writes. If the image doesn't have a backing file, we can do better: Just
> > discard the respective clusters.
> > 
> > This is relevant for 'qemu-img convert -O qcow2 -n', where qemu-img has
> > to assume that the existing target image may contain any data, so it has
> > to write zeroes. Without this patch, this results in a fully allocated
> > target image, even if the source image was empty.
> > 
> > Reported-by: Nir Soffer <nsoffer@redhat.com>
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  block/qcow2-cluster.c | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> > index 4b5fc8c4a7..a677ba9f5c 100644
> > --- a/block/qcow2-cluster.c
> > +++ b/block/qcow2-cluster.c
> > @@ -1797,8 +1797,15 @@ int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
> >      assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
> >             end_offset >= bs->total_sectors << BDRV_SECTOR_BITS);
> >  
> > -    /* The zero flag is only supported by version 3 and newer */
> > +    /*
> > +     * The zero flag is only supported by version 3 and newer. However, if we
> > +     * have no backing file, we can resort to discard in version 2.
> > +     */
> >      if (s->qcow_version < 3) {
> > +        if (!bs->backing) {
> > +            return qcow2_cluster_discard(bs, offset, bytes,
> > +                                         QCOW2_DISCARD_REQUEST, false);
> > +        }
> >          return -ENOTSUP;
> >      }
> >  
> 
> From my knowelege of nvme, I remember that discard doesn't have to zero the blocks.
> There is special namespace capability the indicates the contents of the discarded block.
> (Deallocate Logical Block Features)
> 
> If and only if the discard behavier flag indicates that discarded areas are zero,
> then the write-zero command can have special 'deallocate' flag that hints the controller
> to discard the sectors.
> 
> So woudn't discarding the clusters have theoretical risk of introducing garbage there?

No, qcow2_cluster_discard() has a defined behaviour. For v2 images, it
unallocates the cluster in the L2 table (this is only safe without a
backing file), for v3 images it converts them to zero clusters.

Kevin


Re: [PATCH for-5.1 1/2] qcow2: Implement v2 zero writes with discard if possible
Posted by Maxim Levitsky 5 years, 3 months ago
On Wed, 2020-07-22 at 19:14 +0200, Kevin Wolf wrote:
> Am 22.07.2020 um 19:01 hat Maxim Levitsky geschrieben:
> > On Mon, 2020-07-20 at 15:18 +0200, Kevin Wolf wrote:
> > > qcow2 version 2 images don't support the zero flag for clusters, so for
> > > write_zeroes requests, we return -ENOTSUP and get explicit zero buffer
> > > writes. If the image doesn't have a backing file, we can do better: Just
> > > discard the respective clusters.
> > > 
> > > This is relevant for 'qemu-img convert -O qcow2 -n', where qemu-img has
> > > to assume that the existing target image may contain any data, so it has
> > > to write zeroes. Without this patch, this results in a fully allocated
> > > target image, even if the source image was empty.
> > > 
> > > Reported-by: Nir Soffer <nsoffer@redhat.com>
> > > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > > ---
> > >  block/qcow2-cluster.c | 9 ++++++++-
> > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> > > index 4b5fc8c4a7..a677ba9f5c 100644
> > > --- a/block/qcow2-cluster.c
> > > +++ b/block/qcow2-cluster.c
> > > @@ -1797,8 +1797,15 @@ int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
> > >      assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
> > >             end_offset >= bs->total_sectors << BDRV_SECTOR_BITS);
> > >  
> > > -    /* The zero flag is only supported by version 3 and newer */
> > > +    /*
> > > +     * The zero flag is only supported by version 3 and newer. However, if we
> > > +     * have no backing file, we can resort to discard in version 2.
> > > +     */
> > >      if (s->qcow_version < 3) {
> > > +        if (!bs->backing) {
> > > +            return qcow2_cluster_discard(bs, offset, bytes,
> > > +                                         QCOW2_DISCARD_REQUEST, false);
> > > +        }
> > >          return -ENOTSUP;
> > >      }
> > >  
> > 
> > From my knowelege of nvme, I remember that discard doesn't have to zero the blocks.
> > There is special namespace capability the indicates the contents of the discarded block.
> > (Deallocate Logical Block Features)
> > 
> > If and only if the discard behavier flag indicates that discarded areas are zero,
> > then the write-zero command can have special 'deallocate' flag that hints the controller
> > to discard the sectors.
> > 
> > So woudn't discarding the clusters have theoretical risk of introducing garbage there?
> 
> No, qcow2_cluster_discard() has a defined behaviour. For v2 images, it
> unallocates the cluster in the L2 table (this is only safe without a
> backing file), for v3 images it converts them to zero clusters.

All right then!

Best regards,
	Maxim Levitsky
> 
> Kevin



Re: [PATCH for-5.1 1/2] qcow2: Implement v2 zero writes with discard if possible
Posted by Nir Soffer 5 years, 3 months ago
On Mon, Jul 20, 2020 at 4:18 PM Kevin Wolf <kwolf@redhat.com> wrote:
>
> qcow2 version 2 images don't support the zero flag for clusters, so for
> write_zeroes requests, we return -ENOTSUP and get explicit zero buffer
> writes. If the image doesn't have a backing file, we can do better: Just
> discard the respective clusters.
>
> This is relevant for 'qemu-img convert -O qcow2 -n', where qemu-img has
> to assume that the existing target image may contain any data, so it has
> to write zeroes. Without this patch, this results in a fully allocated
> target image, even if the source image was empty.
>
> Reported-by: Nir Soffer <nsoffer@redhat.com>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  block/qcow2-cluster.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> index 4b5fc8c4a7..a677ba9f5c 100644
> --- a/block/qcow2-cluster.c
> +++ b/block/qcow2-cluster.c
> @@ -1797,8 +1797,15 @@ int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
>      assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
>             end_offset >= bs->total_sectors << BDRV_SECTOR_BITS);
>
> -    /* The zero flag is only supported by version 3 and newer */
> +    /*
> +     * The zero flag is only supported by version 3 and newer. However, if we
> +     * have no backing file, we can resort to discard in version 2.
> +     */
>      if (s->qcow_version < 3) {
> +        if (!bs->backing) {
> +            return qcow2_cluster_discard(bs, offset, bytes,
> +                                         QCOW2_DISCARD_REQUEST, false);
> +        }
>          return -ENOTSUP;
>      }

Looks good to me.

>
> --
> 2.25.4
>