[PATCH] qcow2: Allow resize of images with internal snapshots

Eric Blake posted 1 patch 4 years ago
Test docker-mingw@fedora passed
Test docker-quick@centos7 passed
Test checkpatch passed
Test FreeBSD passed
Test asan passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20200422205355.274706-1-eblake@redhat.com
Maintainers: Kevin Wolf <kwolf@redhat.com>, Max Reitz <mreitz@redhat.com>
There is a newer version of this series
block/qcow2-snapshot.c     |  21 ++++--
block/qcow2.c              |  31 ++++++--
tests/qemu-iotests/061     |  23 ++++++
tests/qemu-iotests/061.out | 144 +++++++++++++++++++++++++++++++++++++
4 files changed, 211 insertions(+), 8 deletions(-)
[PATCH] qcow2: Allow resize of images with internal snapshots
Posted by Eric Blake 4 years ago
We originally refused to allow resize of images with internal
snapshots because the v2 image format did not require the tracking of
snapshot size, making it impossible to safely revert to a snapshot
with a different size than the current view of the image.  But the
snapshot size tracking was rectified in v3, and our recent fixes to
qemu-img amend (see 0a85af35) guarantee that we always have a valid
snapshot size.  Thus, we no longer need to artificially limit image
resizes, but it does become one more thing that would prevent a
downgrade back to v2.  And now that we support different-sized
snapshots, it's also easy to fix reverting to a snapshot to apply the
new size.

Upgrade iotest 61 to cover this (we previously had NO coverage of
refusal to resize while snapshots exist).  Note that the amend process
can fail but still have effects: in particular, since we break things
into upgrade, resize, downgrade, if a failure does not happen until a
later phase (such as the downgrade attempt), earlier steps are still
visible (a truncation and downgrade attempt will fail, but only after
truncating data).  But even before this patch, an attempt to upgrade
and resize would fail but only after changing the image to v3.  In
some sense, partial image changes on failure are inevitible, since we
can't avoid a mid-change EIO (if you are trying to amend more than one
thing at once, but something fails, I hope you have a backup image).

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/qcow2-snapshot.c     |  21 ++++--
 block/qcow2.c              |  31 ++++++--
 tests/qemu-iotests/061     |  23 ++++++
 tests/qemu-iotests/061.out | 144 +++++++++++++++++++++++++++++++++++++
 4 files changed, 211 insertions(+), 8 deletions(-)

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 82c32d4c9b08..3f9e48738d0b 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -23,6 +23,7 @@
  */

 #include "qemu/osdep.h"
+#include "sysemu/block-backend.h"
 #include "qapi/error.h"
 #include "qcow2.h"
 #include "qemu/bswap.h"
@@ -775,10 +776,22 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
     }

     if (sn->disk_size != bs->total_sectors * BDRV_SECTOR_SIZE) {
-        error_report("qcow2: Loading snapshots with different disk "
-            "size is not implemented");
-        ret = -ENOTSUP;
-        goto fail;
+        BlockBackend *blk = blk_new(bdrv_get_aio_context(bs),
+                                    BLK_PERM_RESIZE, BLK_PERM_ALL);
+        ret = blk_insert_bs(blk, bs, &local_err);
+        if (ret < 0) {
+            blk_unref(blk);
+            error_report_err(local_err);
+            goto fail;
+        }
+
+        ret = blk_truncate(blk, sn->disk_size, true, PREALLOC_MODE_OFF,
+                           &local_err);
+        blk_unref(blk);
+        if (ret < 0) {
+            error_report_err(local_err);
+            goto fail;
+        }
     }

     /*
diff --git a/block/qcow2.c b/block/qcow2.c
index b524b0c53f84..29047c33b7e5 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3988,14 +3988,21 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,

     qemu_co_mutex_lock(&s->lock);

-    /* cannot proceed if image has snapshots */
-    if (s->nb_snapshots) {
-        error_setg(errp, "Can't resize an image which has snapshots");
+    /*
+     * Even though we store snapshot size for all images, it was not
+     * required until v3, so it is not safe to proceed for v2.
+     */
+    if (s->nb_snapshots && s->qcow_version < 3) {
+        error_setg(errp, "Can't resize a v2 image which has snapshots");
         ret = -ENOTSUP;
         goto fail;
     }

-    /* cannot proceed if image has bitmaps */
+    /*
+     * For now, it's easier to not proceed if image has bitmaps, even
+     * though we could resize bitmaps, because it is not obvious
+     * whether new bits should be set or clear.
+     */
     if (qcow2_truncate_bitmaps_check(bs, errp)) {
         ret = -ENOTSUP;
         goto fail;
@@ -4952,6 +4959,7 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version,
     BDRVQcow2State *s = bs->opaque;
     int current_version = s->qcow_version;
     int ret;
+    int i;

     /* This is qcow2_downgrade(), not qcow2_upgrade() */
     assert(target_version < current_version);
@@ -4969,6 +4977,21 @@ static int qcow2_downgrade(BlockDriverState *bs, int target_version,
         return -ENOTSUP;
     }

+    /*
+     * If any internal snapshot has a different size than the current
+     * image size, or VM state size that exceeds 32 bits, downgrading
+     * is unsafe.  Even though we would still use v3-compliant output
+     * to preserve that data, other v2 programs might not realize
+     * those optional fields are important.
+     */
+    for (i = 0; i < s->nb_snapshots; i++) {
+        if (s->snapshots[i].vm_state_size > UINT32_MAX ||
+            s->snapshots[i].disk_size != bs->total_sectors * BDRV_SECTOR_SIZE) {
+            error_setg(errp, "Internal snapshots prevent downgrade of image");
+            return -ENOTSUP;
+        }
+    }
+
     /* clear incompatible features */
     if (s->incompatible_features & QCOW2_INCOMPAT_DIRTY) {
         ret = qcow2_mark_clean(bs);
diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061
index ce285d308408..fdfb8fab5fb6 100755
--- a/tests/qemu-iotests/061
+++ b/tests/qemu-iotests/061
@@ -111,6 +111,29 @@ $PYTHON qcow2.py "$TEST_IMG" dump-header
 $QEMU_IO -c "read -P 0x2a 42M 64k" "$TEST_IMG" | _filter_qemu_io
 _check_test_img

+echo
+echo "=== Testing resize with snapshots ==="
+echo
+_make_test_img -o "compat=0.10" 32M
+$QEMU_IO -c "write -P 0x2a 24M 64k" "$TEST_IMG" | _filter_qemu_io
+$QEMU_IMG snapshot -c foo "$TEST_IMG"
+$QEMU_IMG resize "$TEST_IMG" 64M                         # fails
+$PYTHON qcow2.py "$TEST_IMG" dump-header
+$QEMU_IMG amend -o "compat=1.1,size=128M" "$TEST_IMG"    # succeeds
+$PYTHON qcow2.py "$TEST_IMG" dump-header
+$QEMU_IMG snapshot -c bar "$TEST_IMG"
+$QEMU_IMG resize --shrink "$TEST_IMG" 64M                # succeeds
+$PYTHON qcow2.py "$TEST_IMG" dump-header
+$QEMU_IMG amend -o "compat=0.10,size=32M" "$TEST_IMG"    # fails, image left v3
+$PYTHON qcow2.py "$TEST_IMG" dump-header
+$QEMU_IMG snapshot -a bar "$TEST_IMG"                    # succeeds
+$PYTHON qcow2.py "$TEST_IMG" dump-header
+$QEMU_IMG snapshot -d bar "$TEST_IMG"
+$QEMU_IMG amend -o "compat=0.10,size=32M" "$TEST_IMG"    # succeeds
+$PYTHON qcow2.py "$TEST_IMG" dump-header
+_check_test_img
+
+
 echo
 echo "=== Testing dirty lazy_refcounts=off ==="
 echo
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
index 413cc4e0f4ab..0035210c9ae0 100644
--- a/tests/qemu-iotests/061.out
+++ b/tests/qemu-iotests/061.out
@@ -271,6 +271,150 @@ read 65536/65536 bytes at offset 44040192
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 No errors were found on the image.

+=== Testing resize with snapshots ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=33554432
+wrote 65536/65536 bytes at offset 25165824
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+qemu-img: Can't resize a v2 image which has snapshots
+magic                     0x514649fb
+version                   2
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      33554432
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              1
+snapshot_offset           0x70000
+incompatible_features     []
+compatible_features       []
+autoclear_features        []
+refcount_order            4
+header_length             72
+
+magic                     0x514649fb
+version                   3
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      134217728
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              1
+snapshot_offset           0x70000
+incompatible_features     []
+compatible_features       []
+autoclear_features        []
+refcount_order            4
+header_length             104
+
+Header extension:
+magic                     0x6803f857
+length                    288
+data                      <binary>
+
+Image resized.
+magic                     0x514649fb
+version                   3
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      67108864
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              2
+snapshot_offset           0x90000
+incompatible_features     []
+compatible_features       []
+autoclear_features        []
+refcount_order            4
+header_length             104
+
+Header extension:
+magic                     0x6803f857
+length                    288
+data                      <binary>
+
+qemu-img: Internal snapshots prevent downgrade of image
+magic                     0x514649fb
+version                   3
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      33554432
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              2
+snapshot_offset           0x90000
+incompatible_features     []
+compatible_features       []
+autoclear_features        []
+refcount_order            4
+header_length             104
+
+Header extension:
+magic                     0x6803f857
+length                    288
+data                      <binary>
+
+magic                     0x514649fb
+version                   3
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      134217728
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              2
+snapshot_offset           0x90000
+incompatible_features     []
+compatible_features       []
+autoclear_features        []
+refcount_order            4
+header_length             104
+
+Header extension:
+magic                     0x6803f857
+length                    288
+data                      <binary>
+
+magic                     0x514649fb
+version                   2
+backing_file_offset       0x0
+backing_file_size         0x0
+cluster_bits              16
+size                      33554432
+crypt_method              0
+l1_size                   1
+l1_table_offset           0x30000
+refcount_table_offset     0x10000
+refcount_table_clusters   1
+nb_snapshots              1
+snapshot_offset           0x70000
+incompatible_features     []
+compatible_features       []
+autoclear_features        []
+refcount_order            4
+header_length             72
+
+No errors were found on the image.
+
 === Testing dirty lazy_refcounts=off ===

 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
-- 
2.26.2


Re: [PATCH] qcow2: Allow resize of images with internal snapshots
Posted by Max Reitz 4 years ago
On 22.04.20 22:53, Eric Blake wrote:
> We originally refused to allow resize of images with internal
> snapshots because the v2 image format did not require the tracking of
> snapshot size, making it impossible to safely revert to a snapshot
> with a different size than the current view of the image.  But the
> snapshot size tracking was rectified in v3, and our recent fixes to
> qemu-img amend (see 0a85af35) guarantee that we always have a valid
> snapshot size.  Thus, we no longer need to artificially limit image
> resizes, but it does become one more thing that would prevent a
> downgrade back to v2.  And now that we support different-sized
> snapshots, it's also easy to fix reverting to a snapshot to apply the
> new size.
> 
> Upgrade iotest 61 to cover this (we previously had NO coverage of
> refusal to resize while snapshots exist).  Note that the amend process
> can fail but still have effects: in particular, since we break things
> into upgrade, resize, downgrade, if a failure does not happen until a
> later phase (such as the downgrade attempt), earlier steps are still
> visible (a truncation and downgrade attempt will fail, but only after
> truncating data).  But even before this patch, an attempt to upgrade
> and resize would fail but only after changing the image to v3.  In
> some sense, partial image changes on failure are inevitible, since we
> can't avoid a mid-change EIO (if you are trying to amend more than one
> thing at once, but something fails, I hope you have a backup image).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>  block/qcow2-snapshot.c     |  21 ++++--
>  block/qcow2.c              |  31 ++++++--
>  tests/qemu-iotests/061     |  23 ++++++
>  tests/qemu-iotests/061.out | 144 +++++++++++++++++++++++++++++++++++++
>  4 files changed, 211 insertions(+), 8 deletions(-)
> 
> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
> index 82c32d4c9b08..3f9e48738d0b 100644
> --- a/block/qcow2-snapshot.c
> +++ b/block/qcow2-snapshot.c
> @@ -23,6 +23,7 @@
>   */
> 
>  #include "qemu/osdep.h"
> +#include "sysemu/block-backend.h"
>  #include "qapi/error.h"
>  #include "qcow2.h"
>  #include "qemu/bswap.h"
> @@ -775,10 +776,22 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
>      }
> 
>      if (sn->disk_size != bs->total_sectors * BDRV_SECTOR_SIZE) {
> -        error_report("qcow2: Loading snapshots with different disk "
> -            "size is not implemented");
> -        ret = -ENOTSUP;
> -        goto fail;
> +        BlockBackend *blk = blk_new(bdrv_get_aio_context(bs),
> +                                    BLK_PERM_RESIZE, BLK_PERM_ALL);
> +        ret = blk_insert_bs(blk, bs, &local_err);

I wonder whether maybe we should reintroduce blk_new_with_bs().

> +        if (ret < 0) {
> +            blk_unref(blk);
> +            error_report_err(local_err);
> +            goto fail;
> +        }
> +
> +        ret = blk_truncate(blk, sn->disk_size, true, PREALLOC_MODE_OFF,
> +                           &local_err);
> +        blk_unref(blk);
> +        if (ret < 0) {
> +            error_report_err(local_err);
> +            goto fail;
> +        }
>      }
> 
>      /*

Looks good.

> diff --git a/block/qcow2.c b/block/qcow2.c
> index b524b0c53f84..29047c33b7e5 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -3988,14 +3988,21 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
> 
>      qemu_co_mutex_lock(&s->lock);
> 
> -    /* cannot proceed if image has snapshots */
> -    if (s->nb_snapshots) {
> -        error_setg(errp, "Can't resize an image which has snapshots");
> +    /*
> +     * Even though we store snapshot size for all images, it was not
> +     * required until v3, so it is not safe to proceed for v2.
> +     */
> +    if (s->nb_snapshots && s->qcow_version < 3) {
> +        error_setg(errp, "Can't resize a v2 image which has snapshots");
>          ret = -ENOTSUP;
>          goto fail;
>      }
> 
> -    /* cannot proceed if image has bitmaps */
> +    /*
> +     * For now, it's easier to not proceed if image has bitmaps, even
> +     * though we could resize bitmaps, because it is not obvious
> +     * whether new bits should be set or clear.

The previous comment was incorrect as well, but actually
qcow2_truncate_bitmaps_check() doesn’t return an error when there is any
bitmap, but only if there are non-active bitmaps, or active bitmaps that
cannot be modified.  So for non-disabled bitmaps, we generally do
happily proceed.

> +     */
>      if (qcow2_truncate_bitmaps_check(bs, errp)) {
>          ret = -ENOTSUP;
>          goto fail;

[...]

> diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061
> index ce285d308408..fdfb8fab5fb6 100755
> --- a/tests/qemu-iotests/061
> +++ b/tests/qemu-iotests/061
> @@ -111,6 +111,29 @@ $PYTHON qcow2.py "$TEST_IMG" dump-header
>  $QEMU_IO -c "read -P 0x2a 42M 64k" "$TEST_IMG" | _filter_qemu_io
>  _check_test_img
> 
> +echo
> +echo "=== Testing resize with snapshots ==="
> +echo
> +_make_test_img -o "compat=0.10" 32M
> +$QEMU_IO -c "write -P 0x2a 24M 64k" "$TEST_IMG" | _filter_qemu_io
> +$QEMU_IMG snapshot -c foo "$TEST_IMG"
> +$QEMU_IMG resize "$TEST_IMG" 64M                         # fails
> +$PYTHON qcow2.py "$TEST_IMG" dump-header

What am I looking for in the header dump?

> +$QEMU_IMG amend -o "compat=1.1,size=128M" "$TEST_IMG"    # succeeds
> +$PYTHON qcow2.py "$TEST_IMG" dump-header

The v2 -> v3 change?  I think a _img_info (| grep compat) would be
better suited (or just a “$QEMU_IMG amend ... || echo ERROR”).

(I don’t like changing that many dump-header outputs whenever the
header_length or the feature table length change for some reason.)

Also, I personally prefer self-testing tests, because I don’t trust
myself when I have to interpret the reference output on my own...  As
such, I think it would make sense to not just put those “# fails”
comments here, but an explicit test on $? instead.  (E.g. by
“|| echo ERROR”, although I can see that would be weird in the
expected-failure case as “&& echo ERROR”.)

> +$QEMU_IMG snapshot -c bar "$TEST_IMG"
> +$QEMU_IMG resize --shrink "$TEST_IMG" 64M                # succeeds
> +$PYTHON qcow2.py "$TEST_IMG" dump-header
> +$QEMU_IMG amend -o "compat=0.10,size=32M" "$TEST_IMG"    # fails, image left v3
> +$PYTHON qcow2.py "$TEST_IMG" dump-header

Again, a grep for the image size would help focus the reference output.

(In addition, _img_info would give us snapshot information.  Might be
interesting.  So maybe the best thing would be to grep the image
version, image size, and snapshot list from the image info every time.)

Speaking of the image size.  Is it intentional that the size is changed
to 32 MB?  Should amend work more like a transaction, in that we should
at least do a loose check on whether the options can be changed before
we touch the image?

Also, there’s a problem of ordering here.  The command as you’ve written
doesn’t have this specific problem, but imagine the size was still 128
MB before (just like the snapshot).  Then what the command does depends
on the order in which the operations are executed: If we change the
version first, the size cannot be changed because of the internal
snapshot.  If we change the size first, the version can no longer be
changed because the internal snapshot has a different size than the image.


(Hypothetical problems that turn out not to be any in practice:

Or, maybe more interesting: What if we try to amend to
compat=0.10,size=128M here?

If the size is changed first, the command will succeed, because then the
snapshot size matches the image size again.  If qemu-img attempts to
change the version first, the whole command will fail.

As I noted above, the size is indeed changed before the version (hence
us getting a 32 MB v3 image here), so the compat=0.10,size=128M amend
would indeed succeed.  But that’s luck.

OTOH, that means that if you have a v2 image with a snapshot and want to
make it a v3 image and resize it at the same time, that would fail by
the same logic, because the size cannot be changed for v2 images.  So
for that case it’d be bad luck.

But we always upgrade an image first in the amend process and downgrade
it last, to address specifically such cases: Allow adding new features
along with the upgrade, and strip unsupported features right before the
downgrade.  So that’s all good.  But I think it still shows that the
dependencies are getting a bit hairy.)

Max

Re: [PATCH] qcow2: Allow resize of images with internal snapshots
Posted by Eric Blake 4 years ago
On 4/23/20 8:55 AM, Max Reitz wrote:
> On 22.04.20 22:53, Eric Blake wrote:
>> We originally refused to allow resize of images with internal
>> snapshots because the v2 image format did not require the tracking of
>> snapshot size, making it impossible to safely revert to a snapshot
>> with a different size than the current view of the image.  But the
>> snapshot size tracking was rectified in v3, and our recent fixes to
>> qemu-img amend (see 0a85af35) guarantee that we always have a valid
>> snapshot size.  Thus, we no longer need to artificially limit image
>> resizes, but it does become one more thing that would prevent a
>> downgrade back to v2.  And now that we support different-sized
>> snapshots, it's also easy to fix reverting to a snapshot to apply the
>> new size.
>>
>> Upgrade iotest 61 to cover this (we previously had NO coverage of
>> refusal to resize while snapshots exist).  Note that the amend process
>> can fail but still have effects: in particular, since we break things
>> into upgrade, resize, downgrade, if a failure does not happen until a
>> later phase (such as the downgrade attempt), earlier steps are still
>> visible (a truncation and downgrade attempt will fail, but only after
>> truncating data).  But even before this patch, an attempt to upgrade
>> and resize would fail but only after changing the image to v3.  In
>> some sense, partial image changes on failure are inevitible, since we
>> can't avoid a mid-change EIO (if you are trying to amend more than one
>> thing at once, but something fails, I hope you have a backup image).
>>

>> @@ -775,10 +776,22 @@ int qcow2_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
>>       }
>>
>>       if (sn->disk_size != bs->total_sectors * BDRV_SECTOR_SIZE) {
>> -        error_report("qcow2: Loading snapshots with different disk "
>> -            "size is not implemented");
>> -        ret = -ENOTSUP;
>> -        goto fail;
>> +        BlockBackend *blk = blk_new(bdrv_get_aio_context(bs),
>> +                                    BLK_PERM_RESIZE, BLK_PERM_ALL);
>> +        ret = blk_insert_bs(blk, bs, &local_err);
> 
> I wonder whether maybe we should reintroduce blk_new_with_bs().

This code segment is copied from what 'qemu-img amend' does, so adding a 
helper function would indeed make life a bit easier for more than one 
spot in the code base.  Separate patch, obviously.


>> +++ b/block/qcow2.c
>> @@ -3988,14 +3988,21 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
>>
>>       qemu_co_mutex_lock(&s->lock);
>>
>> -    /* cannot proceed if image has snapshots */
>> -    if (s->nb_snapshots) {
>> -        error_setg(errp, "Can't resize an image which has snapshots");
>> +    /*
>> +     * Even though we store snapshot size for all images, it was not
>> +     * required until v3, so it is not safe to proceed for v2.
>> +     */
>> +    if (s->nb_snapshots && s->qcow_version < 3) {
>> +        error_setg(errp, "Can't resize a v2 image which has snapshots");
>>           ret = -ENOTSUP;
>>           goto fail;
>>       }
>>
>> -    /* cannot proceed if image has bitmaps */
>> +    /*
>> +     * For now, it's easier to not proceed if image has bitmaps, even
>> +     * though we could resize bitmaps, because it is not obvious
>> +     * whether new bits should be set or clear.
> 
> The previous comment was incorrect as well, but actually
> qcow2_truncate_bitmaps_check() doesn’t return an error when there is any
> bitmap, but only if there are non-active bitmaps, or active bitmaps that
> cannot be modified.  So for non-disabled bitmaps, we generally do
> happily proceed.

The comment change is collateral (only because I noticed it in the 
diff); but I could indeed reword it slightly more accurately as:

Check if bitmaps prevent a resize.  Although bitmaps can be resized, 
there are situations where we don't know whether to set or clear new 
bits, so for now it's easiest to just prevent resize in those cases.

And since it is a collateral change, it may even be worth splitting into 
a separate patch.

>> +++ b/tests/qemu-iotests/061
>> @@ -111,6 +111,29 @@ $PYTHON qcow2.py "$TEST_IMG" dump-header
>>   $QEMU_IO -c "read -P 0x2a 42M 64k" "$TEST_IMG" | _filter_qemu_io
>>   _check_test_img
>>
>> +echo
>> +echo "=== Testing resize with snapshots ==="
>> +echo
>> +_make_test_img -o "compat=0.10" 32M
>> +$QEMU_IO -c "write -P 0x2a 24M 64k" "$TEST_IMG" | _filter_qemu_io
>> +$QEMU_IMG snapshot -c foo "$TEST_IMG"
>> +$QEMU_IMG resize "$TEST_IMG" 64M                         # fails
>> +$PYTHON qcow2.py "$TEST_IMG" dump-header
> 
> What am I looking for in the header dump?

I was looking primarily for version (2 vs. 3), size (did it change), and 
number of snapshots.  You're right that grepping for what changes will 
make this easier to maintain.


> Also, I personally prefer self-testing tests, because I don’t trust
> myself when I have to interpret the reference output on my own...  As
> such, I think it would make sense to not just put those “# fails”
> comments here, but an explicit test on $? instead.  (E.g. by
> “|| echo ERROR”, although I can see that would be weird in the
> expected-failure case as “&& echo ERROR”.)

Good idea.

> 
>> +$QEMU_IMG snapshot -c bar "$TEST_IMG"
>> +$QEMU_IMG resize --shrink "$TEST_IMG" 64M                # succeeds
>> +$PYTHON qcow2.py "$TEST_IMG" dump-header
>> +$QEMU_IMG amend -o "compat=0.10,size=32M" "$TEST_IMG"    # fails, image left v3
>> +$PYTHON qcow2.py "$TEST_IMG" dump-header
> 
> Again, a grep for the image size would help focus the reference output.
> 
> (In addition, _img_info would give us snapshot information.  Might be
> interesting.  So maybe the best thing would be to grep the image
> version, image size, and snapshot list from the image info every time.)

Yep, that's the same list I was noticing when writing the patch.

> 
> Speaking of the image size.  Is it intentional that the size is changed
> to 32 MB?  Should amend work more like a transaction, in that we should
> at least do a loose check on whether the options can be changed before
> we touch the image?

Yes, the commit message tried to call it out.  It's a pre-existing 
problem - during amend, we DO make changes to the disk in one step, with 
no way to roll back those changes, even if a later step fails.

Pre-patch, if you request an upgrade to v3 as well as a resize, but 
resize fails, you still end up with the image being changed to v3. 
That's no different from post-patch where if you request a resize and a 
downgrade to v2, the resize happens but not the downgrade.  On the 
bright side, our current failure scenarios at least leave the resulting 
image viable, even if it is not the same as it was pre-attempt.

> 
> Also, there’s a problem of ordering here.  The command as you’ve written
> doesn’t have this specific problem, but imagine the size was still 128
> MB before (just like the snapshot).  Then what the command does depends
> on the order in which the operations are executed: If we change the
> version first, the size cannot be changed because of the internal
> snapshot.  If we change the size first, the version can no longer be
> changed because the internal snapshot has a different size than the image.

Yes, it was a pain for me while writing the tests.  At one point I even 
considered swapping things to do the resize after the downgrade, but 
that introduces a different problem: the downgrade depends on knowing 
the post-transaction size (because downgrading is safe only when all 
internal snapshots match the post-resize length), but we aren't passing 
the desired size through to the upgrade and downgrade functions.  Worse, 
if we swap the downgrade first, and it succeeds but then resize fails, 
the image is no longer viable; at least with the current ordering, even 
though user data has been truncated, it remains v3 so the size 
differences in snapshots don't break the image (and the user DID request 
an image resize, so it's not like we are discarding data accidentally).

If we want to avoid truncating an image at all costs on any potential 
failure path, we have to add a lot more plumbing (not to say it's a bad 
thing, just that it's more work).  And no matter how much plumbing we 
add, or transaction-like rollback capabilities we add, there's still the 
risk that we will hit a late EIO error leaving the image in a 
half-changed state, even if none of our sanity checks failed.  Or put 
another way, without some sort of journaling, the best we can do is 
defer all writes until we know the full set of changes, which limits the 
likelihood of a half-baked change to a write failure.  But since we can 
only reduce, and not eliminate, the possibility of a half-baked change, 
the question then becomes whether it is worth the engineering effort of 
additional complexity for even more corner cases and less risk, or just 
leave things as they are (if qemu-img amend fails, we make no guarantees 
about the state of your image).

> 
> 
> (Hypothetical problems that turn out not to be any in practice:
> 
> Or, maybe more interesting: What if we try to amend to
> compat=0.10,size=128M here?
> 
> If the size is changed first, the command will succeed, because then the
> snapshot size matches the image size again.  If qemu-img attempts to
> change the version first, the whole command will fail.
> 
> As I noted above, the size is indeed changed before the version (hence
> us getting a 32 MB v3 image here), so the compat=0.10,size=128M amend
> would indeed succeed.  But that’s luck.
> 
> OTOH, that means that if you have a v2 image with a snapshot and want to
> make it a v3 image and resize it at the same time, that would fail by
> the same logic, because the size cannot be changed for v2 images.  So
> for that case it’d be bad luck.
> 
> But we always upgrade an image first in the amend process and downgrade
> it last, to address specifically such cases: Allow adding new features
> along with the upgrade, and strip unsupported features right before the
> downgrade.  So that’s all good.  But I think it still shows that the
> dependencies are getting a bit hairy.)

I'm not sure the work in making amend more transaction-like is worth it 
- yes, we can add more validation code up front before making the first 
write change, but as some of the later changes depend on information 
that would be changed earlier, that becomes a lot more state we have to 
collect during our initial probes (my example being that the downgrade 
attempt depends on knowing the final image size, and that's a lot easier 
when the image has already been resized rather than having to pass the 
new size through).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


Re: [PATCH] qcow2: Allow resize of images with internal snapshots
Posted by Max Reitz 4 years ago
On 23.04.20 16:35, Eric Blake wrote:
> On 4/23/20 8:55 AM, Max Reitz wrote:
>> On 22.04.20 22:53, Eric Blake wrote:
>>> We originally refused to allow resize of images with internal
>>> snapshots because the v2 image format did not require the tracking of
>>> snapshot size, making it impossible to safely revert to a snapshot
>>> with a different size than the current view of the image.  But the
>>> snapshot size tracking was rectified in v3, and our recent fixes to
>>> qemu-img amend (see 0a85af35) guarantee that we always have a valid
>>> snapshot size.  Thus, we no longer need to artificially limit image
>>> resizes, but it does become one more thing that would prevent a
>>> downgrade back to v2.  And now that we support different-sized
>>> snapshots, it's also easy to fix reverting to a snapshot to apply the
>>> new size.
>>>
>>> Upgrade iotest 61 to cover this (we previously had NO coverage of
>>> refusal to resize while snapshots exist).  Note that the amend process
>>> can fail but still have effects: in particular, since we break things
>>> into upgrade, resize, downgrade, if a failure does not happen until a
>>> later phase (such as the downgrade attempt), earlier steps are still
>>> visible (a truncation and downgrade attempt will fail, but only after
>>> truncating data).  But even before this patch, an attempt to upgrade
>>> and resize would fail but only after changing the image to v3.  In
>>> some sense, partial image changes on failure are inevitible, since we
>>> can't avoid a mid-change EIO (if you are trying to amend more than one
>>> thing at once, but something fails, I hope you have a backup image).
>>>
> 
>>> @@ -775,10 +776,22 @@ int qcow2_snapshot_goto(BlockDriverState *bs,
>>> const char *snapshot_id)
>>>       }
>>>
>>>       if (sn->disk_size != bs->total_sectors * BDRV_SECTOR_SIZE) {
>>> -        error_report("qcow2: Loading snapshots with different disk "
>>> -            "size is not implemented");
>>> -        ret = -ENOTSUP;
>>> -        goto fail;
>>> +        BlockBackend *blk = blk_new(bdrv_get_aio_context(bs),
>>> +                                    BLK_PERM_RESIZE, BLK_PERM_ALL);
>>> +        ret = blk_insert_bs(blk, bs, &local_err);
>>
>> I wonder whether maybe we should reintroduce blk_new_with_bs().
> 
> This code segment is copied from what 'qemu-img amend' does, so adding a
> helper function would indeed make life a bit easier for more than one
> spot in the code base.  Separate patch, obviously.
> 
> 
>>> +++ b/block/qcow2.c
>>> @@ -3988,14 +3988,21 @@ static int coroutine_fn
>>> qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
>>>
>>>       qemu_co_mutex_lock(&s->lock);
>>>
>>> -    /* cannot proceed if image has snapshots */
>>> -    if (s->nb_snapshots) {
>>> -        error_setg(errp, "Can't resize an image which has snapshots");
>>> +    /*
>>> +     * Even though we store snapshot size for all images, it was not
>>> +     * required until v3, so it is not safe to proceed for v2.
>>> +     */
>>> +    if (s->nb_snapshots && s->qcow_version < 3) {
>>> +        error_setg(errp, "Can't resize a v2 image which has
>>> snapshots");
>>>           ret = -ENOTSUP;
>>>           goto fail;
>>>       }
>>>
>>> -    /* cannot proceed if image has bitmaps */
>>> +    /*
>>> +     * For now, it's easier to not proceed if image has bitmaps, even
>>> +     * though we could resize bitmaps, because it is not obvious
>>> +     * whether new bits should be set or clear.
>>
>> The previous comment was incorrect as well, but actually
>> qcow2_truncate_bitmaps_check() doesn’t return an error when there is any
>> bitmap, but only if there are non-active bitmaps, or active bitmaps that
>> cannot be modified.  So for non-disabled bitmaps, we generally do
>> happily proceed.
> 
> The comment change is collateral (only because I noticed it in the
> diff); but I could indeed reword it slightly more accurately as:
> 
> Check if bitmaps prevent a resize.  Although bitmaps can be resized,
> there are situations where we don't know whether to set or clear new
> bits, so for now it's easiest to just prevent resize in those cases.

But I don’t know whether that explanation is actually correct.  I mean,
that would apply for enabled active bitmaps just as well.  But we do
allow resizing an image with such bitmaps so it seems clear that we have
some idea on what to do.  (Or at least we pretend we do, for better or
worse).

(Which is that bdrv_dirty_bitmap_truncate() just calls
hbitmap_truncate(), which fills the new space with zeroes.)

The real reason we can’t resize with certain kinds of bitmaps present
seems more like:
(1) There are some bitmaps that can’t be written to, but we’d have to if
we wanted to resize the image and they’re persistent,
(2) The second case are bitmaps that are persistent but present in
memory; we just haven’t implemented that case, I presume.

So it seems less like a case of “We don’t know what to do”, but more a
combination of “We can’t“ and “We haven‘t implemented this case, but
it’s clear what to do if we did”.

> And since it is a collateral change, it may even be worth splitting into
> a separate patch.
> 
>>> +++ b/tests/qemu-iotests/061
>>> @@ -111,6 +111,29 @@ $PYTHON qcow2.py "$TEST_IMG" dump-header
>>>   $QEMU_IO -c "read -P 0x2a 42M 64k" "$TEST_IMG" | _filter_qemu_io
>>>   _check_test_img
>>>
>>> +echo
>>> +echo "=== Testing resize with snapshots ==="
>>> +echo
>>> +_make_test_img -o "compat=0.10" 32M
>>> +$QEMU_IO -c "write -P 0x2a 24M 64k" "$TEST_IMG" | _filter_qemu_io
>>> +$QEMU_IMG snapshot -c foo "$TEST_IMG"
>>> +$QEMU_IMG resize "$TEST_IMG" 64M                         # fails
>>> +$PYTHON qcow2.py "$TEST_IMG" dump-header
>>
>> What am I looking for in the header dump?
> 
> I was looking primarily for version (2 vs. 3), size (did it change), and
> number of snapshots.  You're right that grepping for what changes will
> make this easier to maintain.
> 
> 
>> Also, I personally prefer self-testing tests, because I don’t trust
>> myself when I have to interpret the reference output on my own...  As
>> such, I think it would make sense to not just put those “# fails”
>> comments here, but an explicit test on $? instead.  (E.g. by
>> “|| echo ERROR”, although I can see that would be weird in the
>> expected-failure case as “&& echo ERROR”.)
> 
> Good idea.
> 
>>
>>> +$QEMU_IMG snapshot -c bar "$TEST_IMG"
>>> +$QEMU_IMG resize --shrink "$TEST_IMG" 64M                # succeeds
>>> +$PYTHON qcow2.py "$TEST_IMG" dump-header
>>> +$QEMU_IMG amend -o "compat=0.10,size=32M" "$TEST_IMG"    # fails,
>>> image left v3
>>> +$PYTHON qcow2.py "$TEST_IMG" dump-header
>>
>> Again, a grep for the image size would help focus the reference output.
>>
>> (In addition, _img_info would give us snapshot information.  Might be
>> interesting.  So maybe the best thing would be to grep the image
>> version, image size, and snapshot list from the image info every time.)
> 
> Yep, that's the same list I was noticing when writing the patch.
> 
>>
>> Speaking of the image size.  Is it intentional that the size is changed
>> to 32 MB?  Should amend work more like a transaction, in that we should
>> at least do a loose check on whether the options can be changed before
>> we touch the image?
> 
> Yes, the commit message tried to call it out.  It's a pre-existing
> problem - during amend, we DO make changes to the disk in one step, with
> no way to roll back those changes, even if a later step fails.
> 
> Pre-patch, if you request an upgrade to v3 as well as a resize, but
> resize fails, you still end up with the image being changed to v3.
> That's no different from post-patch where if you request a resize and a
> downgrade to v2, the resize happens but not the downgrade.  On the
> bright side, our current failure scenarios at least leave the resulting
> image viable, even if it is not the same as it was pre-attempt.

Yes.  OK.

>> Also, there’s a problem of ordering here.  The command as you’ve written
>> doesn’t have this specific problem, but imagine the size was still 128
>> MB before (just like the snapshot).  Then what the command does depends
>> on the order in which the operations are executed: If we change the
>> version first, the size cannot be changed because of the internal
>> snapshot.  If we change the size first, the version can no longer be
>> changed because the internal snapshot has a different size than the
>> image.
> 
> Yes, it was a pain for me while writing the tests.  At one point I even
> considered swapping things to do the resize after the downgrade, but
> that introduces a different problem: the downgrade depends on knowing
> the post-transaction size (because downgrading is safe only when all
> internal snapshots match the post-resize length), but we aren't passing
> the desired size through to the upgrade and downgrade functions.  Worse,
> if we swap the downgrade first, and it succeeds but then resize fails,
> the image is no longer viable; at least with the current ordering, even
> though user data has been truncated, it remains v3 so the size
> differences in snapshots don't break the image (and the user DID request
> an image resize, so it's not like we are discarding data accidentally).
> 
> If we want to avoid truncating an image at all costs on any potential
> failure path, we have to add a lot more plumbing (not to say it's a bad
> thing, just that it's more work).  And no matter how much plumbing we
> add, or transaction-like rollback capabilities we add, there's still the
> risk that we will hit a late EIO error leaving the image in a
> half-changed state, even if none of our sanity checks failed.  Or put
> another way, without some sort of journaling, the best we can do is
> defer all writes until we know the full set of changes, which limits the
> likelihood of a half-baked change to a write failure.  But since we can
> only reduce, and not eliminate, the possibility of a half-baked change,
> the question then becomes whether it is worth the engineering effort of
> additional complexity for even more corner cases and less risk, or just
> leave things as they are (if qemu-img amend fails, we make no guarantees
> about the state of your image).

Yeah.  I don’t think anyone even would have realistic use for
transactional amends.  I suppose all users can easily split their amend
calls with multiple options into multiple amends in the order that would
be most likely reversible, if something went wrong along the way.  (And
that also works.  I.e., downgrading/upgrading is probably the most easy
to revert, but maybe you can only downgrade if your image has the
correct size, so you potentially need to truncate it first.  OTOH, I
can’t imagine many people actually use qemu-img amend to downgrade qcow2
images anyway...)

>> (Hypothetical problems that turn out not to be any in practice:
>>
>> Or, maybe more interesting: What if we try to amend to
>> compat=0.10,size=128M here?
>>
>> If the size is changed first, the command will succeed, because then the
>> snapshot size matches the image size again.  If qemu-img attempts to
>> change the version first, the whole command will fail.
>>
>> As I noted above, the size is indeed changed before the version (hence
>> us getting a 32 MB v3 image here), so the compat=0.10,size=128M amend
>> would indeed succeed.  But that’s luck.
>>
>> OTOH, that means that if you have a v2 image with a snapshot and want to
>> make it a v3 image and resize it at the same time, that would fail by
>> the same logic, because the size cannot be changed for v2 images.  So
>> for that case it’d be bad luck.
>>
>> But we always upgrade an image first in the amend process and downgrade
>> it last, to address specifically such cases: Allow adding new features
>> along with the upgrade, and strip unsupported features right before the
>> downgrade.  So that’s all good.  But I think it still shows that the
>> dependencies are getting a bit hairy.)
> 
> I'm not sure the work in making amend more transaction-like is worth it
> - yes, we can add more validation code up front before making the first
> write change, but as some of the later changes depend on information
> that would be changed earlier, that becomes a lot more state we have to
> collect during our initial probes

I agree, but to me that seems to suggest that it might not be ideal to
consider the amend interface a “This is a description of the post-amend
state I want” interface, because implementing that correctly seems to be
very complicated.  The alternative would be “This is a description of
what I want to change”, and you can give multiple things at once, but
there is no guarantee of whether that works or not.

I feel very much reminded of the LUKS keyslot discussion...

(That is to say, my thoughts on this have little to do with this
specific patch at this point.)

Max

Re: [PATCH] qcow2: Allow resize of images with internal snapshots
Posted by Eric Blake 4 years ago
On 4/23/20 12:21 PM, Max Reitz wrote:

>>> The previous comment was incorrect as well, but actually
>>> qcow2_truncate_bitmaps_check() doesn’t return an error when there is any
>>> bitmap, but only if there are non-active bitmaps, or active bitmaps that
>>> cannot be modified.  So for non-disabled bitmaps, we generally do
>>> happily proceed.
>>
>> The comment change is collateral (only because I noticed it in the
>> diff); but I could indeed reword it slightly more accurately as:
>>
>> Check if bitmaps prevent a resize.  Although bitmaps can be resized,
>> there are situations where we don't know whether to set or clear new
>> bits, so for now it's easiest to just prevent resize in those cases.
> 
> But I don’t know whether that explanation is actually correct.  I mean,
> that would apply for enabled active bitmaps just as well.  But we do
> allow resizing an image with such bitmaps so it seems clear that we have
> some idea on what to do.  (Or at least we pretend we do, for better or
> worse).
> 
> (Which is that bdrv_dirty_bitmap_truncate() just calls
> hbitmap_truncate(), which fills the new space with zeroes.)
> 
> The real reason we can’t resize with certain kinds of bitmaps present
> seems more like:
> (1) There are some bitmaps that can’t be written to, but we’d have to if
> we wanted to resize the image and they’re persistent,
> (2) The second case are bitmaps that are persistent but present in
> memory; we just haven’t implemented that case, I presume.
> 
> So it seems less like a case of “We don’t know what to do”, but more a
> combination of “We can’t“ and “We haven‘t implemented this case, but
> it’s clear what to do if we did”.

Indeed. So definitely a reason to split this change to a separate patch 
(and/or fix the code to finally implement it)


>>> Speaking of the image size.  Is it intentional that the size is changed
>>> to 32 MB?  Should amend work more like a transaction, in that we should
>>> at least do a loose check on whether the options can be changed before
>>> we touch the image?
>>
>> Yes, the commit message tried to call it out.  It's a pre-existing
>> problem - during amend, we DO make changes to the disk in one step, with
>> no way to roll back those changes, even if a later step fails.
>>
>> Pre-patch, if you request an upgrade to v3 as well as a resize, but
>> resize fails, you still end up with the image being changed to v3.
>> That's no different from post-patch where if you request a resize and a
>> downgrade to v2, the resize happens but not the downgrade.  On the
>> bright side, our current failure scenarios at least leave the resulting
>> image viable, even if it is not the same as it was pre-attempt.
> 
> Yes.  OK.

Okay, v2 will have a better commit message.


> Yeah.  I don’t think anyone even would have realistic use for
> transactional amends.  I suppose all users can easily split their amend
> calls with multiple options into multiple amends in the order that would
> be most likely reversible, if something went wrong along the way.  (And
> that also works.  I.e., downgrading/upgrading is probably the most easy
> to revert, but maybe you can only downgrade if your image has the
> correct size, so you potentially need to truncate it first.  OTOH, I
> can’t imagine many people actually use qemu-img amend to downgrade qcow2
> images anyway...)

Indeed - any time that you worry that an interaction of multiple changes 
might fail half-way through, you can always serialize those changes 
yourself instead of hoping the tool sequences them correctly ;)


> I feel very much reminded of the LUKS keyslot discussion...
> 
> (That is to say, my thoughts on this have little to do with this
> specific patch at this point.)

Too true !

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org