Git apply log
Switched to a new branch '20170714153749.25132-1-pbutsykin@virtuozzo.com'
Applying: qemu-img: add --shrink flag for resize
Using index info to reconstruct a base tree...
M	qemu-img-cmds.hx
M	qemu-img.c
M	qemu-img.texi
Falling back to patching base and 3-way merge...
Auto-merging qemu-img.texi
Auto-merging qemu-img.c
Auto-merging qemu-img-cmds.hx
Applying: qcow2: add qcow2_cache_discard
Applying: qcow2: add shrink image support
Applying: qemu-iotests: add shrinking image test
To https://github.com/patchew-project/qemu
 + 89abbafc5d...09ccf4b0df patchew/20170714153749.25132-1-pbutsykin@virtuozzo.com -> patchew/20170714153749.25132-1-pbutsykin@virtuozzo.com (forced update)
Test passed: FreeBSD
Test passed: s390x
Test passed: checkpatch
Test passed: docker

[Qemu-devel] [PATCH v6 0/4] Add shrink image for qcow2
Posted by Pavel Butsykin, 53 weeks ago
This series adds support for shrinking qcow2 image files. This allows us to
reduce the virtual image size and free up space on the disk without copying the
image. The image can be fragmented, and shrinking is done by punching holes in
the image file.

# ./qemu-img create -f qcow2 image.qcow2 4G
Formatting 'image.qcow2', fmt=qcow2 size=4294967296 encryption=off cluster_size=65536 lazy_refcounts=off refcount_bits=16

# ./qemu-io -c "write -P 0x22 0 1G" image.qcow2
wrote 1073741824/1073741824 bytes at offset 0
1 GiB, 1 ops; 0:00:04.59 (222.886 MiB/sec and 0.2177 ops/sec)

# ./qemu-img resize image.qcow2 512M
warning: qemu-img: Shrinking an image will delete all data beyond the shrunken image's end. Before performing such an operation, make sure there is no important data there.
error: qemu-img: Use the --shrink option to perform a shrink operation.

# ./qemu-img resize --shrink image.qcow2 128M
Image resized.

# ./qemu-img info image.qcow2
image: image.qcow2
file format: qcow2
virtual size: 128M (134217728 bytes)
disk size: 128M
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

# du -h image.qcow2
129M    image.qcow2

Changes from v1:
- add --shrink flag for qemu-img resize
- add qcow2_cache_discard
- simplify qcow2_shrink_l1_table() to reduce the likelihood of image corruption
- add new qemu-iotests for shrinking images

Changes from v2:
- replace qprintf() with error_report() (1)
- rewrite warning messages (1)
- enforce --shrink flag for all formats except raw (1)
- split qcow2_cache_discard() (2)
- minor fixes according to comments (3)
- rewrite the last part of qcow2_shrink_reftable() to avoid
  qcow2_free_clusters() calls inside (3)
- improve test for shrinking image (4)

Changes from v3:
- rebase on Alistair's "Implement a warning_report function" patch set (1)
- spelling fixes (1)
- fix the man page according to the discussion (1)
- add a call to qcow2_signal_corruption() in case of image corruption (3)

Changes from v4:
- rebase on Max's block branch (https://github.com/XanClic/qemu/commits/block)

Changes from v5:
- the condition refcount == 0 should be enough to evict the l2/refcount cluster
  from the cache (2)
- overwrite the l1/refcount table in memory with zeros, even if overwriting the
  l1/refcount table on disk has failed (3)
- replace g_try_malloc() with g_malloc() for allocating reftable_tmp (3)

Pavel Butsykin (4):
  qemu-img: add --shrink flag for resize
  qcow2: add qcow2_cache_discard
  qcow2: add shrink image support
  qemu-iotests: add shrinking image test

 block/qcow2-cache.c        |  26 +++++++
 block/qcow2-cluster.c      |  50 +++++++++++++
 block/qcow2-refcount.c     | 140 ++++++++++++++++++++++++++++++++++++-
 block/qcow2.c              |  43 +++++++++---
 block/qcow2.h              |  17 +++++
 qapi/block-core.json       |   3 +-
 qemu-img-cmds.hx           |   4 +-
 qemu-img.c                 |  23 ++++++
 qemu-img.texi              |   6 +-
 tests/qemu-iotests/102     |   4 +-
 tests/qemu-iotests/163     | 170 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/163.out |   5 ++
 tests/qemu-iotests/group   |   1 +
 13 files changed, 475 insertions(+), 17 deletions(-)
 create mode 100644 tests/qemu-iotests/163
 create mode 100644 tests/qemu-iotests/163.out

-- 
2.13.0
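
The effect of the shrink in the example above (4G down to 128M with 64 KiB clusters) on the L1 table and on the discarded range can be worked through with a small sketch. The helper names below are hypothetical and only mirror the arithmetic used by the series (size_to_l1() and ROUND_UP() in qcow2_truncate()); the sketch assumes 8-byte L2 entries, so one L2 table of one cluster maps 2^(cluster_bits - 3) clusters.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of L1 entries needed to map `size` bytes.
 * Assumes 8-byte L2 entries, i.e. l2_bits = cluster_bits - 3.
 */
uint64_t sketch_size_to_l1(uint64_t size, unsigned cluster_bits)
{
    /* Assumed valid range for qcow2 cluster sizes (512 B .. 2 MiB). */
    assert(cluster_bits >= 9 && cluster_bits <= 21);

    unsigned l2_bits = cluster_bits - 3;
    unsigned shift = cluster_bits + l2_bits; /* log2(bytes per L1 entry) */

    /* Round up to a whole number of L1 entries. */
    return (size + (UINT64_C(1) << shift) - 1) >> shift;
}

/*
 * Hypothetical helper: first guest offset that gets discarded when the
 * image is shrunk to `new_size`, i.e. new_size rounded up to a cluster
 * boundary. The discarded length is then old_size minus this value.
 */
uint64_t sketch_discard_start(uint64_t new_size, unsigned cluster_bits)
{
    uint64_t cluster_size = UINT64_C(1) << cluster_bits;

    return (new_size + cluster_size - 1) & ~(cluster_size - 1);
}
```

With cluster_bits = 16, one L1 entry covers 512 MiB, so the 4G image needs 8 L1 entries and the 128M image needs 1. The discard starts at the 128 MiB boundary and covers the remaining 3968 MiB of the old virtual size.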


Re: [Qemu-devel] [Qemu-block] [PATCH v6 0/4] Add shrink image for qcow2
Posted by John Snow, 48 weeks ago
Over a month with no replies and we're nearing the next QEMU release. If
this patchset is still applicable, can you rebase and resend for 2.11?

--js

On 07/14/2017 11:37 AM, Pavel Butsykin wrote:
> This series adds support for shrinking qcow2 image files. This allows us to
> reduce the virtual image size and free up space on the disk without copying the
> image. The image can be fragmented, and shrinking is done by punching holes in
> the image file.

Re: [Qemu-devel] [Qemu-block] [PATCH v6 0/4] Add shrink image for qcow2
Posted by Pavel Butsykin, 48 weeks ago
On 17.08.2017 00:07, John Snow wrote:
> Over a month with no replies and we're nearing the next QEMU release. If
> this patchset is still applicable, can you rebase and resend for 2.11?

Thanks for digging up these patches :) I've sent the rebased version.


[Qemu-devel] [PATCH v6 1/4] qemu-img: add --shrink flag for resize
Posted by Pavel Butsykin, 53 weeks ago
The flag is an additional precaution against data loss. Perhaps in the future
the shrink operation will be blocked without this flag for all formats, but for
now we need to maintain compatibility with raw.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 qemu-img-cmds.hx       |  4 ++--
 qemu-img.c             | 23 +++++++++++++++++++++++
 qemu-img.texi          |  6 +++++-
 tests/qemu-iotests/102 |  4 ++--
 4 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index ac5946bc4f..e36957a2ca 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -82,9 +82,9 @@ STEXI
 ETEXI
 
 DEF("resize", img_resize,
-    "resize [--object objectdef] [--image-opts] [-q] filename [+ | -]size")
+    "resize [--object objectdef] [--image-opts] [-q] [--shrink] filename [+ | -]size")
 STEXI
-@item resize [--object @var{objectdef}] [--image-opts] [-q] @var{filename} [+ | -]@var{size}
+@item resize [--object @var{objectdef}] [--image-opts] [-q] [--shrink] @var{filename} [+ | -]@var{size}
 ETEXI
 
 DEF("amend", img_amend,
diff --git a/qemu-img.c b/qemu-img.c
index 28022145d5..b4dc4bb5c4 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -64,6 +64,7 @@ enum {
     OPTION_TARGET_IMAGE_OPTS = 263,
     OPTION_SIZE = 264,
     OPTION_PREALLOCATION = 265,
+    OPTION_SHRINK = 266,
 };
 
 typedef enum OutputFormat {
@@ -3430,6 +3431,7 @@ static int img_resize(int argc, char **argv)
         },
     };
     bool image_opts = false;
+    bool shrink = false;
 
     /* Remove size from argv manually so that negative numbers are not treated
      * as options by getopt. */
@@ -3448,6 +3450,7 @@ static int img_resize(int argc, char **argv)
             {"object", required_argument, 0, OPTION_OBJECT},
             {"image-opts", no_argument, 0, OPTION_IMAGE_OPTS},
             {"preallocation", required_argument, 0, OPTION_PREALLOCATION},
+            {"shrink", no_argument, 0, OPTION_SHRINK},
             {0, 0, 0, 0}
         };
         c = getopt_long(argc, argv, ":f:hq",
@@ -3491,6 +3494,9 @@ static int img_resize(int argc, char **argv)
                 return 1;
             }
             break;
+        case OPTION_SHRINK:
+            shrink = true;
+            break;
         }
     }
     if (optind != argc - 1) {
@@ -3564,6 +3570,23 @@ static int img_resize(int argc, char **argv)
         goto out;
     }
 
+    if (total_size < current_size && !shrink) {
+        warn_report("Shrinking an image will delete all data beyond the "
+                    "shrunken image's end. Before performing such an "
+                    "operation, make sure there is no important data there.");
+
+        if (g_strcmp0(bdrv_get_format_name(blk_bs(blk)), "raw") != 0) {
+            error_report(
+              "Use the --shrink option to perform a shrink operation.");
+            ret = -1;
+            goto out;
+        } else {
+            warn_report("Using the --shrink option will suppress this message. "
+                        "Note that future versions of qemu-img may refuse to "
+                        "shrink images without this option.");
+        }
+    }
+
     ret = blk_truncate(blk, total_size, prealloc, &err);
     if (!ret) {
         qprintf(quiet, "Image resized.\n");
diff --git a/qemu-img.texi b/qemu-img.texi
index f11f6036ad..9a930f5e6d 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -529,7 +529,7 @@ qemu-img rebase -b base.img diff.qcow2
 At this point, @code{modified.img} can be discarded, since
 @code{base.img + diff.qcow2} contains the same information.
 
-@item resize [--preallocation=@var{prealloc}] @var{filename} [+ | -]@var{size}
+@item resize [--shrink] [--preallocation=@var{prealloc}] @var{filename} [+ | -]@var{size}
 
 Change the disk image as if it had been created with @var{size}.
 
@@ -537,6 +537,10 @@ Before using this command to shrink a disk image, you MUST use file system and
 partitioning tools inside the VM to reduce allocated file systems and partition
 sizes accordingly.  Failure to do so will result in data loss!
 
+When shrinking images, the @code{--shrink} option must be given. This informs
+qemu-img that the user acknowledges all loss of data beyond the truncated
+image's end.
+
 After using this command to grow a disk image, you must use file system and
 partitioning tools inside the VM to actually begin using the new space on the
 device.
diff --git a/tests/qemu-iotests/102 b/tests/qemu-iotests/102
index 87db1bb1bf..d7ad8d9840 100755
--- a/tests/qemu-iotests/102
+++ b/tests/qemu-iotests/102
@@ -54,7 +54,7 @@ _make_test_img $IMG_SIZE
 $QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io
 # Remove data cluster from image (first cluster: image header, second: reftable,
 # third: refblock, fourth: L1 table, fifth: L2 table)
-$QEMU_IMG resize -f raw "$TEST_IMG" $((5 * 64 * 1024))
+$QEMU_IMG resize -f raw --shrink "$TEST_IMG" $((5 * 64 * 1024))
 
 $QEMU_IO -c map "$TEST_IMG"
 $QEMU_IMG map "$TEST_IMG"
@@ -69,7 +69,7 @@ $QEMU_IO -c 'write 0 64k' "$TEST_IMG" | _filter_qemu_io
 
 qemu_comm_method=monitor _launch_qemu -drive if=none,file="$TEST_IMG",id=drv0
 
-$QEMU_IMG resize -f raw "$TEST_IMG" $((5 * 64 * 1024))
+$QEMU_IMG resize -f raw --shrink "$TEST_IMG" $((5 * 64 * 1024))
 
 _send_qemu_cmd $QEMU_HANDLE 'qemu-io drv0 map' 'allocated' \
     | sed -e 's/^(qemu).*qemu-io drv0 map...$/(qemu) qemu-io drv0 map/'
-- 
2.13.0
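
The decision that this patch adds to img_resize() can be summarized in a small standalone sketch. The enum and function names are hypothetical and exist only for illustration; the real code reports through warn_report() and error_report() and jumps to the out label instead of returning an action.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical outcome of the shrink check, for illustration only. */
enum sketch_resize_action {
    SKETCH_PROCEED,              /* growing, same size, or --shrink given */
    SKETCH_PROCEED_WITH_WARNING, /* raw image shrunk without --shrink */
    SKETCH_REFUSE                /* non-raw image shrunk without --shrink */
};

/*
 * Mirrors the condition in the patch: a shrink without --shrink is only
 * tolerated (with a warning) for the raw format, to stay compatible with
 * existing users.
 */
enum sketch_resize_action sketch_check_shrink(int64_t current_size,
                                              int64_t total_size,
                                              bool shrink_flag,
                                              const char *format_name)
{
    assert(format_name != NULL);

    if (total_size >= current_size || shrink_flag) {
        return SKETCH_PROCEED;
    }
    if (strcmp(format_name, "raw") == 0) {
        return SKETCH_PROCEED_WITH_WARNING;
    }
    return SKETCH_REFUSE;
}
```

For example, shrinking a qcow2 image from 8192 to 4096 bytes without the flag yields SKETCH_REFUSE, while the same request on a raw image yields SKETCH_PROCEED_WITH_WARNING.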


[Qemu-devel] [PATCH v6 2/4] qcow2: add qcow2_cache_discard
Posted by Pavel Butsykin, 53 weeks ago
Whenever L2/refcount table clusters are discarded from the file, we can
automatically drop the now-unnecessary contents of the cache tables. This
reduces the chance of evicting useful cache data and eliminates inconsistencies
between the data in the cache and the data in the file.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cache.c    | 26 ++++++++++++++++++++++++++
 block/qcow2-refcount.c | 20 ++++++++++++++++++--
 block/qcow2.h          |  3 +++
 3 files changed, 47 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 1d25147392..75746a7f43 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -411,3 +411,29 @@ void qcow2_cache_entry_mark_dirty(BlockDriverState *bs, Qcow2Cache *c,
     assert(c->entries[i].offset != 0);
     c->entries[i].dirty = true;
 }
+
+void *qcow2_cache_is_table_offset(BlockDriverState *bs, Qcow2Cache *c,
+                                  uint64_t offset)
+{
+    int i;
+
+    for (i = 0; i < c->size; i++) {
+        if (c->entries[i].offset == offset) {
+            return qcow2_cache_get_table_addr(bs, c, i);
+        }
+    }
+    return NULL;
+}
+
+void qcow2_cache_discard(BlockDriverState *bs, Qcow2Cache *c, void *table)
+{
+    int i = qcow2_cache_get_table_idx(bs, c, table);
+
+    assert(c->entries[i].ref == 0);
+
+    c->entries[i].offset = 0;
+    c->entries[i].lru_counter = 0;
+    c->entries[i].dirty = false;
+
+    qcow2_cache_table_release(bs, c, i, 1);
+}
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index c9b0dcb4f3..bbe5a2b2cc 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -861,8 +861,24 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
         }
         s->set_refcount(refcount_block, block_index, refcount);
 
-        if (refcount == 0 && s->discard_passthrough[type]) {
-            update_refcount_discard(bs, cluster_offset, s->cluster_size);
+        if (refcount == 0) {
+            void *table;
+
+            table = qcow2_cache_is_table_offset(bs, s->refcount_block_cache,
+                                                offset);
+            if (table != NULL) {
+                qcow2_cache_put(bs, s->refcount_block_cache, &refcount_block);
+                qcow2_cache_discard(bs, s->refcount_block_cache, table);
+            }
+
+            table = qcow2_cache_is_table_offset(bs, s->l2_table_cache, offset);
+            if (table != NULL) {
+                qcow2_cache_discard(bs, s->l2_table_cache, table);
+            }
+
+            if (s->discard_passthrough[type]) {
+                update_refcount_discard(bs, cluster_offset, s->cluster_size);
+            }
         }
     }
 
diff --git a/block/qcow2.h b/block/qcow2.h
index 96a8d43c17..52c374e9ed 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -649,6 +649,9 @@ int qcow2_cache_get(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
 int qcow2_cache_get_empty(BlockDriverState *bs, Qcow2Cache *c, uint64_t offset,
     void **table);
 void qcow2_cache_put(BlockDriverState *bs, Qcow2Cache *c, void **table);
+void *qcow2_cache_is_table_offset(BlockDriverState *bs, Qcow2Cache *c,
+                                  uint64_t offset);
+void qcow2_cache_discard(BlockDriverState *bs, Qcow2Cache *c, void *table);
 
 /* qcow2-bitmap.c functions */
 int qcow2_check_bitmaps_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
-- 
2.13.0
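
The lookup-then-discard pattern introduced here can be modelled with a toy cache. This is a simplified sketch, not the qcow2 cache: the struct layout, the fixed size and all sketch_* names are assumptions, and the real qcow2_cache_discard() additionally releases the table memory via qcow2_cache_table_release().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CACHE_SIZE 4

/* Simplified stand-in for a cache entry; offset 0 means "unused". */
struct sketch_entry {
    uint64_t offset;
    int ref;
    bool dirty;
    uint64_t lru_counter;
};

struct sketch_cache {
    struct sketch_entry entries[SKETCH_CACHE_SIZE];
};

/* Return the index of the entry caching `offset`, or -1 if not cached. */
int sketch_cache_find(const struct sketch_cache *c, uint64_t offset)
{
    if (offset == 0) {
        return -1; /* offset 0 marks unused entries in this sketch */
    }
    for (int i = 0; i < SKETCH_CACHE_SIZE; i++) {
        if (c->entries[i].offset == offset) {
            return i;
        }
    }
    return -1;
}

/*
 * Drop entry `i` without writing it back. The cluster was freed, so a
 * dirty table no longer needs to reach the disk.
 */
void sketch_cache_discard(struct sketch_cache *c, int i)
{
    assert(i >= 0 && i < SKETCH_CACHE_SIZE);
    assert(c->entries[i].ref == 0); /* nobody may still hold the table */

    c->entries[i].offset = 0;
    c->entries[i].lru_counter = 0;
    c->entries[i].dirty = false;
}
```

A caller that has just dropped a cluster's refcount to zero would look the cluster offset up with sketch_cache_find() and, on a hit, call sketch_cache_discard(), which is the shape of the new code in update_refcount().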


[Qemu-devel] [PATCH v6 3/4] qcow2: add shrink image support
Posted by Pavel Butsykin, 53 weeks ago
This patch adds support for shrinking qcow2 image files. This allows us to
reduce the virtual image size and free up space on the disk without copying the
image. The image can be fragmented, and shrinking is done by punching holes in
the image file.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 block/qcow2-cluster.c  |  50 +++++++++++++++++++++
 block/qcow2-refcount.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++++
 block/qcow2.c          |  43 ++++++++++++++----
 block/qcow2.h          |  14 ++++++
 qapi/block-core.json   |   3 +-
 5 files changed, 220 insertions(+), 10 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index f06c08f64c..405bc2e7af 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -32,6 +32,56 @@
 #include "qemu/bswap.h"
 #include "trace.h"
 
+int qcow2_shrink_l1_table(BlockDriverState *bs, uint64_t exact_size)
+{
+    BDRVQcow2State *s = bs->opaque;
+    int new_l1_size, i, ret;
+
+    if (exact_size >= s->l1_size) {
+        return 0;
+    }
+
+    new_l1_size = exact_size;
+
+#ifdef DEBUG_ALLOC2
+    fprintf(stderr, "shrink l1_table from %d to %d\n", s->l1_size, new_l1_size);
+#endif
+
+    BLKDBG_EVENT(bs->file, BLKDBG_L1_SHRINK_WRITE_TABLE);
+    ret = bdrv_pwrite_zeroes(bs->file, s->l1_table_offset +
+                                       new_l1_size * sizeof(uint64_t),
+                             (s->l1_size - new_l1_size) * sizeof(uint64_t), 0);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    ret = bdrv_flush(bs->file->bs);
+    if (ret < 0) {
+        goto fail;
+    }
+
+    BLKDBG_EVENT(bs->file, BLKDBG_L1_SHRINK_FREE_L2_CLUSTERS);
+    for (i = s->l1_size - 1; i > new_l1_size - 1; i--) {
+        if ((s->l1_table[i] & L1E_OFFSET_MASK) == 0) {
+            continue;
+        }
+        qcow2_free_clusters(bs, s->l1_table[i] & L1E_OFFSET_MASK,
+                            s->cluster_size, QCOW2_DISCARD_ALWAYS);
+        s->l1_table[i] = 0;
+    }
+    return 0;
+
+fail:
+    /*
+     * If the write to the L1 table failed, the image may contain a partially
+     * overwritten L1 table. In this case it is better to clear the L1 table
+     * in memory to avoid possible image corruption.
+     */
+    memset(s->l1_table + exact_size, 0,
+           (s->l1_size - new_l1_size) * sizeof(uint64_t));
+    return ret;
+}
+
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size)
 {
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index bbe5a2b2cc..6f7c3132c6 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -29,6 +29,7 @@
 #include "block/qcow2.h"
 #include "qemu/range.h"
 #include "qemu/bswap.h"
+#include "qemu/cutils.h"
 
 static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
 static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
@@ -3061,3 +3062,122 @@ done:
     qemu_vfree(new_refblock);
     return ret;
 }
+
+static int qcow2_discard_refcount_block(BlockDriverState *bs,
+                                        uint64_t discard_block_offs)
+{
+    BDRVQcow2State *s = bs->opaque;
+    uint64_t refblock_offs = get_refblock_offset(s, discard_block_offs);
+    uint64_t cluster_index = discard_block_offs >> s->cluster_bits;
+    uint32_t block_index = cluster_index & (s->refcount_block_size - 1);
+    void *refblock;
+    int ret;
+
+    assert(discard_block_offs != 0);
+
+    ret = qcow2_cache_get(bs, s->refcount_block_cache, refblock_offs,
+                          &refblock);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (s->get_refcount(refblock, block_index) != 1) {
+        qcow2_signal_corruption(bs, true, -1, -1, "Invalid refcount:"
+                                " refblock offset %#" PRIx64
+                                ", reftable index %u"
+                                ", block offset %#" PRIx64
+                                ", refcount %#" PRIx64,
+                                refblock_offs,
+                                offset_to_reftable_index(s, discard_block_offs),
+                                discard_block_offs,
+                                s->get_refcount(refblock, block_index));
+        qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
+        return -EINVAL;
+    }
+    s->set_refcount(refblock, block_index, 0);
+
+    qcow2_cache_entry_mark_dirty(bs, s->refcount_block_cache, refblock);
+
+    qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
+
+    if (cluster_index < s->free_cluster_index) {
+        s->free_cluster_index = cluster_index;
+    }
+
+    refblock = qcow2_cache_is_table_offset(bs, s->refcount_block_cache,
+                                           discard_block_offs);
+    if (refblock) {
+        /* discard refblock from the cache if refblock is cached */
+        qcow2_cache_discard(bs, s->refcount_block_cache, refblock);
+    }
+    update_refcount_discard(bs, discard_block_offs, s->cluster_size);
+
+    return 0;
+}
+
+int qcow2_shrink_reftable(BlockDriverState *bs)
+{
+    BDRVQcow2State *s = bs->opaque;
+    uint64_t *reftable_tmp =
+        g_malloc(s->refcount_table_size * sizeof(uint64_t));
+    int i, ret;
+
+    for (i = 0; i < s->refcount_table_size; i++) {
+        int64_t refblock_offs = s->refcount_table[i] & REFT_OFFSET_MASK;
+        void *refblock;
+        bool unused_block;
+
+        if (refblock_offs == 0) {
+            reftable_tmp[i] = 0;
+            continue;
+        }
+        ret = qcow2_cache_get(bs, s->refcount_block_cache, refblock_offs,
+                              &refblock);
+        if (ret < 0) {
+            goto out;
+        }
+
+        /* the refblock holds its own reference */
+        if (i == offset_to_reftable_index(s, refblock_offs)) {
+            uint64_t block_index = (refblock_offs >> s->cluster_bits) &
+                                   (s->refcount_block_size - 1);
+            uint64_t refcount = s->get_refcount(refblock, block_index);
+
+            s->set_refcount(refblock, block_index, 0);
+
+            unused_block = buffer_is_zero(refblock, s->cluster_size);
+
+            s->set_refcount(refblock, block_index, refcount);
+        } else {
+            unused_block = buffer_is_zero(refblock, s->cluster_size);
+        }
+        qcow2_cache_put(bs, s->refcount_block_cache, &refblock);
+
+        reftable_tmp[i] = unused_block ? 0 : cpu_to_be64(s->refcount_table[i]);
+    }
+
+    ret = bdrv_pwrite_sync(bs->file, s->refcount_table_offset, reftable_tmp,
+                           s->refcount_table_size * sizeof(uint64_t));
+    /*
+     * If the write to the reftable failed, the image may contain a partially
+     * overwritten reftable. In this case it is better to clear the reftable
+     * in memory to avoid possible image corruption.
+     */
+    for (i = 0; i < s->refcount_table_size; i++) {
+        if (s->refcount_table[i] && !reftable_tmp[i]) {
+            if (ret == 0) {
+                ret = qcow2_discard_refcount_block(bs, s->refcount_table[i] &
+                                                       REFT_OFFSET_MASK);
+            }
+            s->refcount_table[i] = 0;
+        }
+    }
+
+    if (!s->cache_discards) {
+        qcow2_process_discards(bs, ret);
+    }
+
+out:
+    g_free(reftable_tmp);
+    return ret;
+}
diff --git a/block/qcow2.c b/block/qcow2.c
index c144ea5620..bd281fdd04 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3120,18 +3120,43 @@ static int qcow2_truncate(BlockDriverState *bs, int64_t offset,
     }
 
     old_length = bs->total_sectors * 512;
+    new_l1_size = size_to_l1(s, offset);
 
-    /* shrinking is currently not supported */
     if (offset < old_length) {
-        error_setg(errp, "qcow2 doesn't support shrinking images yet");
-        return -ENOTSUP;
-    }
+        if (prealloc != PREALLOC_MODE_OFF) {
+            error_setg(errp,
+                       "Preallocation can't be used for shrinking an image");
+            return -EINVAL;
+        }
 
-    new_l1_size = size_to_l1(s, offset);
-    ret = qcow2_grow_l1_table(bs, new_l1_size, true);
-    if (ret < 0) {
-        error_setg_errno(errp, -ret, "Failed to grow the L1 table");
-        return ret;
+        ret = qcow2_cluster_discard(bs, ROUND_UP(offset, s->cluster_size),
+                                    old_length - ROUND_UP(offset,
+                                                          s->cluster_size),
+                                    QCOW2_DISCARD_ALWAYS, true);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to discard cropped clusters");
+            return ret;
+        }
+
+        ret = qcow2_shrink_l1_table(bs, new_l1_size);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret,
+                             "Failed to reduce the number of L2 tables");
+            return ret;
+        }
+
+        ret = qcow2_shrink_reftable(bs);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret,
+                             "Failed to discard unused refblocks");
+            return ret;
+        }
+    } else {
+        ret = qcow2_grow_l1_table(bs, new_l1_size, true);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to grow the L1 table");
+            return ret;
+        }
     }
 
     switch (prealloc) {
diff --git a/block/qcow2.h b/block/qcow2.h
index 52c374e9ed..5a289a81e2 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -521,6 +521,18 @@ static inline uint64_t refcount_diff(uint64_t r1, uint64_t r2)
     return r1 > r2 ? r1 - r2 : r2 - r1;
 }
 
+static inline
+uint32_t offset_to_reftable_index(BDRVQcow2State *s, uint64_t offset)
+{
+    return offset >> (s->refcount_block_bits + s->cluster_bits);
+}
+
+static inline uint64_t get_refblock_offset(BDRVQcow2State *s, uint64_t offset)
+{
+    uint32_t index = offset_to_reftable_index(s, offset);
+    return s->refcount_table[index] & REFT_OFFSET_MASK;
+}
+
 /* qcow2.c functions */
 int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
                   int64_t sector_num, int nb_sectors);
@@ -584,10 +596,12 @@ int qcow2_inc_refcounts_imrt(BlockDriverState *bs, BdrvCheckResult *res,
 int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order,
                                 BlockDriverAmendStatusCB *status_cb,
                                 void *cb_opaque, Error **errp);
+int qcow2_shrink_reftable(BlockDriverState *bs);
 
 /* qcow2-cluster.c functions */
 int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
                         bool exact_size);
+int qcow2_shrink_l1_table(BlockDriverState *bs, uint64_t max_size);
 int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index);
 int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
 int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index c437aa50ef..99cef55b7c 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2487,7 +2487,8 @@
             'cluster_alloc_bytes', 'cluster_free', 'flush_to_os',
             'flush_to_disk', 'pwritev_rmw_head', 'pwritev_rmw_after_head',
             'pwritev_rmw_tail', 'pwritev_rmw_after_tail', 'pwritev',
-            'pwritev_zero', 'pwritev_done', 'empty_image_prepare' ] }
+            'pwritev_zero', 'pwritev_done', 'empty_image_prepare',
+            'l1_shrink_write_table', 'l1_shrink_free_l2_clusters' ] }
 
 ##
 # @BlkdebugInjectErrorOptions:
-- 
2.13.0
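
The arithmetic in the qcow2.c and qcow2.h hunks above can be illustrated
with a small Python 3 sketch. This is not QEMU code. The function names
mirror the C helpers, discard_range() is a hypothetical name, and
refcount_block_bits is assumed to be cluster_bits - (refcount_order - 3),
as the qcow2 format defines it. The sample sizes are arbitrary.

```python
def round_up(n, d):
    """Round n up to the next multiple of d, like the ROUND_UP() macro."""
    return (n + d - 1) // d * d


def discard_range(new_size, old_size, cluster_size):
    """Return (start, length) of the byte range discarded on shrink.

    Mirrors the arguments passed to qcow2_cluster_discard() in the hunk:
    the start is the new size rounded up to a cluster boundary, so the
    cluster holding the new end of the image is kept.
    Assumes the rounded-up start does not exceed old_size.
    """
    start = round_up(new_size, cluster_size)
    return start, old_size - start


def offset_to_reftable_index(offset, cluster_bits, refcount_order):
    """Index of the reftable entry whose refblock covers a host offset."""
    # Assumption: one refblock holds cluster_size * 8 / refcount_bits entries.
    refcount_block_bits = cluster_bits - (refcount_order - 3)
    return offset >> (refcount_block_bits + cluster_bits)


if __name__ == "__main__":
    MiB = 1024 * 1024
    # Shrink a 128M image with 64K clusters to 10M plus one byte.
    print(discard_range(10 * MiB + 1, 128 * MiB, 64 * 1024))
    # 64K clusters and 16-bit refcounts: one refblock covers 2 GiB.
    print(offset_to_reftable_index(2**31, 16, 4))
```

With 64K clusters and 16-bit refcounts a single refblock covers 2 GiB of
the image file, so small images use only reftable entry 0.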


[Qemu-devel] [PATCH v6 4/4] qemu-iotests: add shrinking image test
Posted by Pavel Butsykin, 53 weeks ago
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/163     | 170 +++++++++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/163.out |   5 ++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 176 insertions(+)
 create mode 100644 tests/qemu-iotests/163
 create mode 100644 tests/qemu-iotests/163.out
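
The test below fills the image in fixed-size chunks before shrinking it.
A minimal Python 3 sketch of how its size strings turn into write offsets
follows. size_to_int() mirrors the helper in the test, chunk_offsets() is
a hypothetical name used only here, and nothing in the sketch calls
qemu-io.

```python
SUFFIXES = ['B', 'K', 'M', 'G', 'T']


def size_to_int(text):
    """Convert a size such as '16M' to bytes. A suffix is required."""
    return int(text[:-1]) * 1024 ** SUFFIXES.index(text[-1])


def chunk_offsets(image_len, chunk_size):
    """Offsets at which the test would issue one write per chunk."""
    return list(range(0, size_to_int(image_len), size_to_int(chunk_size)))


if __name__ == "__main__":
    # Default class attributes: a 128M image written in 16M chunks.
    print(len(chunk_offsets('128M', '16M')))
    # TestShrink512: a 3M image written in 256K chunks.
    print(len(chunk_offsets('3M', '256K')))
```

The list() call matters under Python 3, where range() returns a lazy
sequence that random.shuffle() cannot reorder in place.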

diff --git a/tests/qemu-iotests/163 b/tests/qemu-iotests/163
new file mode 100644
index 0000000000..403842354e
--- /dev/null
+++ b/tests/qemu-iotests/163
@@ -0,0 +1,170 @@
+#!/usr/bin/env python
+#
+# Tests for shrinking images
+#
+# Copyright (c) 2016-2017 Parallels International GmbH
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import os, random, iotests, struct, qcow2
+from iotests import qemu_img, qemu_io, image_size
+
+test_img = os.path.join(iotests.test_dir, 'test.img')
+check_img = os.path.join(iotests.test_dir, 'check.img')
+
+def size_to_int(str):
+    suff = ['B', 'K', 'M', 'G', 'T']
+    return int(str[:-1]) * 1024**suff.index(str[-1:])
+
+class ShrinkBaseClass(iotests.QMPTestCase):
+    image_len = '128M'
+    shrink_size = '10M'
+    chunk_size = '16M'
+    refcount_bits = '16'
+
+    def __qcow2_check(self, filename):
+        entry_bits = 3
+        entry_size = 1 << entry_bits
+        l1_mask = 0x00fffffffffffe00
+        div_roundup = lambda n, d: (n + d - 1) / d
+
+        def split_by_n(data, n):
+            for x in xrange(0, len(data), n):
+                yield struct.unpack('>Q', data[x:x + n])[0] & l1_mask
+
+        def check_l1_table(h, l1_data):
+            l1_list = list(split_by_n(l1_data, entry_size))
+            real_l1_size = div_roundup(h.size,
+                                       1 << (h.cluster_bits*2 - entry_bits))
+            used, unused = l1_list[:real_l1_size], l1_list[real_l1_size:]
+
+            self.assertTrue(len(used) != 0, "Verifying l1 table content")
+            self.assertFalse(any(unused), "Verifying l1 table content")
+
+        def check_reftable(fd, h, reftable):
+            for offset in split_by_n(reftable, entry_size):
+                if offset != 0:
+                    fd.seek(offset)
+                    cluster = fd.read(1 << h.cluster_bits)
+                    self.assertTrue(any(cluster), "Verifying reftable content")
+
+        with open(filename, "rb") as fd:
+            h = qcow2.QcowHeader(fd)
+
+            fd.seek(h.l1_table_offset)
+            l1_table = fd.read(h.l1_size << entry_bits)
+
+            fd.seek(h.refcount_table_offset)
+            reftable = fd.read(h.refcount_table_clusters << h.cluster_bits)
+
+            check_l1_table(h, l1_table)
+            check_reftable(fd, h, reftable)
+
+    def __raw_check(self, filename):
+        pass
+
+    image_check = {
+        'qcow2' : __qcow2_check,
+        'raw' : __raw_check
+    }
+
+    def setUp(self):
+        if iotests.imgfmt == 'raw':
+            qemu_img('create', '-f', iotests.imgfmt, test_img, self.image_len)
+            qemu_img('create', '-f', iotests.imgfmt, check_img,
+                     self.shrink_size)
+        else:
+            qemu_img('create', '-f', iotests.imgfmt,
+                     '-o', 'cluster_size=' + self.cluster_size +
+                     ',refcount_bits=' + self.refcount_bits,
+                     test_img, self.image_len)
+            qemu_img('create', '-f', iotests.imgfmt,
+                     '-o', 'cluster_size=%s'% self.cluster_size,
+                     check_img, self.shrink_size)
+        qemu_io('-c', 'write -P 0xff 0 ' + self.shrink_size, check_img)
+
+    def tearDown(self):
+        os.remove(test_img)
+        os.remove(check_img)
+
+    def image_verify(self):
+        self.assertEqual(image_size(test_img), image_size(check_img),
+                         "Verifying image size")
+        self.image_check[iotests.imgfmt](self, test_img)
+
+        if iotests.imgfmt == 'raw':
+            return
+        self.assertEqual(qemu_img('check', test_img), 0,
+                         "Verifying image corruption")
+
+    def test_empty_image(self):
+        qemu_img('resize',  '-f', iotests.imgfmt, '--shrink', test_img,
+                 self.shrink_size)
+
+        self.assertEqual(
+            qemu_io('-c', 'read -P 0x00 %s'%self.shrink_size, test_img),
+            qemu_io('-c', 'read -P 0x00 %s'%self.shrink_size, check_img),
+            "Verifying image content")
+
+        self.image_verify()
+
+    def test_sequential_write(self):
+        for offs in range(0, size_to_int(self.image_len),
+                          size_to_int(self.chunk_size)):
+            qemu_io('-c', 'write -P 0xff %d %s' % (offs, self.chunk_size),
+                    test_img)
+
+        qemu_img('resize',  '-f', iotests.imgfmt, '--shrink', test_img,
+                 self.shrink_size)
+
+        self.assertEqual(qemu_img("compare", test_img, check_img), 0,
+                         "Verifying image content")
+
+        self.image_verify()
+
+    def test_random_write(self):
+        offs_list = range(0, size_to_int(self.image_len),
+                          size_to_int(self.chunk_size))
+        random.shuffle(offs_list)
+        for offs in offs_list:
+            qemu_io('-c', 'write -P 0xff %d %s' % (offs, self.chunk_size),
+                    test_img)
+
+        qemu_img('resize',  '-f', iotests.imgfmt, '--shrink', test_img,
+                 self.shrink_size)
+
+        self.assertEqual(qemu_img("compare", test_img, check_img), 0,
+                         "Verifying image content")
+
+        self.image_verify()
+
+class TestShrink512(ShrinkBaseClass):
+    image_len = '3M'
+    shrink_size = '1M'
+    chunk_size = '256K'
+    cluster_size = '512'
+    refcount_bits = '64'
+
+class TestShrink64K(ShrinkBaseClass):
+    cluster_size = '64K'
+
+class TestShrink1M(ShrinkBaseClass):
+    cluster_size = '1M'
+    refcount_bits = '1'
+
+ShrinkBaseClass = None
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['raw', 'qcow2'])
diff --git a/tests/qemu-iotests/163.out b/tests/qemu-iotests/163.out
new file mode 100644
index 0000000000..dae404e278
--- /dev/null
+++ b/tests/qemu-iotests/163.out
@@ -0,0 +1,5 @@
+.........
+----------------------------------------------------------------------
+Ran 9 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 2aba585287..1d985c3b45 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -166,6 +166,7 @@
 159 rw auto quick
 160 rw auto quick
 162 auto quick
+163 rw auto quick
 165 rw auto quick
 170 rw auto quick
 171 rw auto quick
-- 
2.13.0
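
For reference, the L1 table check in test 163 can be sketched in Python 3
as follows. It assumes the qcow2 layout in which each L1 entry points to
one L2 table of cluster_size / 8 entries, so one L1 entry maps
2^(2 * cluster_bits - 3) bytes of guest data. The function names are
hypothetical and the sample table is made up.

```python
import struct

ENTRY_BITS = 3                      # L1 entries are 8 bytes wide
L1_MASK = 0x00fffffffffffe00        # offset bits of an L1 entry


def l1_entries_needed(virtual_size, cluster_bits):
    """Number of L1 entries required to map virtual_size bytes."""
    bytes_per_l1_entry = 1 << (2 * cluster_bits - ENTRY_BITS)
    return -(-virtual_size // bytes_per_l1_entry)   # ceiling division


def split_l1(l1_data, virtual_size, cluster_bits):
    """Split raw big-endian L1 data into (used, unused) L2 offsets."""
    entries = [e & L1_MASK for (e,) in struct.iter_unpack('>Q', l1_data)]
    n = l1_entries_needed(virtual_size, cluster_bits)
    return entries[:n], entries[n:]


if __name__ == "__main__":
    # Two-entry table: one L2 table at 0x50000 (COPIED flag set), one empty.
    table = struct.pack('>2Q', 0x8000000000050000, 0)
    used, unused = split_l1(table, 10 * 1024 * 1024, 16)
    # After a shrink, every entry past the used part should be zero.
    print(used, unused, not any(unused))
```

A shrunk image passes the check when the used part is non-empty and every
entry in the unused part is zero.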