[PATCH v2 3/4] qapi: blockdev-backup: add discard-source parameter

Vladimir Sementsov-Ogievskiy posted 4 patches 8 months, 1 week ago
Maintainers: John Snow <jsnow@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>, Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, Wen Congyang <wencongyang2@huawei.com>, Xie Changlong <xiechanglong.d@gmail.com>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>
There is a newer version of this series
[PATCH v2 3/4] qapi: blockdev-backup: add discard-source parameter
Posted by Vladimir Sementsov-Ogievskiy 8 months, 1 week ago
Add a parameter that enables discard-after-copy. That is mostly useful
in "push backup with fleecing" scheme, when source is snapshot-access
format driver node, based on copy-before-write filter snapshot-access
API:

[guest]      [snapshot-access] ~~ blockdev-backup ~~> [backup target]
   |            |
   | root       | file
   v            v
[copy-before-write]
   |             |
   | file        | target
   v             v
[active disk]   [temp.img]

In this case discard-after-copy does two things:

 - discard data in temp.img to save disk space
 - avoid further copy-before-write operation in discarded area

Note that we have to declare WRITE permission on source in
copy-before-write filter, for discard to work.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---
 block/backup.c                         |  5 +++--
 block/block-copy.c                     | 12 ++++++++++--
 block/copy-before-write.c              |  2 +-
 block/replication.c                    |  4 ++--
 blockdev.c                             |  2 +-
 include/block/block-copy.h             |  2 +-
 include/block/block_int-global-state.h |  2 +-
 qapi/block-core.json                   |  4 ++++
 8 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index ec29d6b810..0c86b5be55 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -356,7 +356,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
                   BlockDriverState *target, int64_t speed,
                   MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
                   BitmapSyncMode bitmap_mode,
-                  bool compress,
+                  bool compress, bool discard_source,
                   const char *filter_node_name,
                   BackupPerf *perf,
                   BlockdevOnError on_source_error,
@@ -491,7 +491,8 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     job->len = len;
     job->perf = *perf;
 
-    block_copy_set_copy_opts(bcs, perf->use_copy_range, compress);
+    block_copy_set_copy_opts(bcs, perf->use_copy_range, compress,
+                             discard_source);
     block_copy_set_progress_meter(bcs, &job->common.job.progress);
     block_copy_set_speed(bcs, speed);
 
diff --git a/block/block-copy.c b/block/block-copy.c
index 8fca2c3698..eebf643e86 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -137,6 +137,7 @@ typedef struct BlockCopyState {
     CoMutex lock;
     int64_t in_flight_bytes;
     BlockCopyMethod method;
+    bool discard_source;
     BlockReqList reqs;
     QLIST_HEAD(, BlockCopyCallState) calls;
     /*
@@ -282,11 +283,12 @@ static uint32_t block_copy_max_transfer(BdrvChild *source, BdrvChild *target)
 }
 
 void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
-                              bool compress)
+                              bool compress, bool discard_source)
 {
     /* Keep BDRV_REQ_SERIALISING set (or not set) in block_copy_state_new() */
     s->write_flags = (s->write_flags & BDRV_REQ_SERIALISING) |
         (compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
+    s->discard_source = discard_source;
 
     if (s->max_transfer < s->cluster_size) {
         /*
@@ -418,7 +420,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
                                     cluster_size),
     };
 
-    block_copy_set_copy_opts(s, false, false);
+    block_copy_set_copy_opts(s, false, false, false);
 
     ratelimit_init(&s->rate_limit);
     qemu_co_mutex_init(&s->lock);
@@ -589,6 +591,12 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
     co_put_to_shres(s->mem, t->req.bytes);
     block_copy_task_end(t, ret);
 
+    if (s->discard_source && ret == 0) {
+        int64_t nbytes =
+            MIN(t->req.offset + t->req.bytes, s->len) - t->req.offset;
+        bdrv_co_pdiscard(s->source, t->req.offset, nbytes);
+    }
+
     return ret;
 }
 
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 4cd9b7d91e..b208a80ade 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -370,7 +370,7 @@ cbw_child_perm(BlockDriverState *bs, BdrvChild *c, BdrvChildRole role,
              * works even if source node does not have any parents before backup
              * start
              */
-            *nperm = *nperm | BLK_PERM_CONSISTENT_READ;
+            *nperm = *nperm | BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE;
             *nshared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE);
         }
     }
diff --git a/block/replication.c b/block/replication.c
index ca6bd0a720..0415a5e8b7 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -582,8 +582,8 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode,
 
         s->backup_job = backup_job_create(
                                 NULL, s->secondary_disk->bs, s->hidden_disk->bs,
-                                0, MIRROR_SYNC_MODE_NONE, NULL, 0, false, NULL,
-                                &perf,
+                                0, MIRROR_SYNC_MODE_NONE, NULL, 0, false, false,
+                                NULL, &perf,
                                 BLOCKDEV_ON_ERROR_REPORT,
                                 BLOCKDEV_ON_ERROR_REPORT, JOB_INTERNAL,
                                 backup_job_completed, bs, NULL, &local_err);
diff --git a/blockdev.c b/blockdev.c
index 3a5e7222ec..aa64382f82 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2713,7 +2713,7 @@ static BlockJob *do_backup_common(BackupCommon *backup,
 
     job = backup_job_create(backup->job_id, bs, target_bs, backup->speed,
                             backup->sync, bmap, backup->bitmap_mode,
-                            backup->compress,
+                            backup->compress, backup->discard_source,
                             backup->filter_node_name,
                             &perf,
                             backup->on_source_error,
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 8b41643bfa..9555de2562 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -31,7 +31,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
 
 /* Function should be called prior any actual copy request */
 void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
-                              bool compress);
+                              bool compress, bool discard_source);
 void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm);
 
 void block_copy_state_free(BlockCopyState *s);
diff --git a/include/block/block_int-global-state.h b/include/block/block_int-global-state.h
index ef31c58bb3..563aabf012 100644
--- a/include/block/block_int-global-state.h
+++ b/include/block/block_int-global-state.h
@@ -187,7 +187,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
                             MirrorSyncMode sync_mode,
                             BdrvDirtyBitmap *sync_bitmap,
                             BitmapSyncMode bitmap_mode,
-                            bool compress,
+                            bool compress, bool discard_source,
                             const char *filter_node_name,
                             BackupPerf *perf,
                             BlockdevOnError on_source_error,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ca390c5700..1735769f9f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1599,6 +1599,9 @@
 #     node specified by @drive.  If this option is not given, a node
 #     name is autogenerated.  (Since: 4.2)
 #
+# @discard-source: Discard blocks on source which are already copied
+#     to the target.  (Since 9.0)
+#
 # @x-perf: Performance options.  (Since 6.0)
 #
 # Features:
@@ -1620,6 +1623,7 @@
             '*on-target-error': 'BlockdevOnError',
             '*auto-finalize': 'bool', '*auto-dismiss': 'bool',
             '*filter-node-name': 'str',
+            '*discard-source': 'bool',
             '*x-perf': { 'type': 'BackupPerf',
                          'features': [ 'unstable' ] } } }
 
-- 
2.34.1
Re: [PATCH v2 3/4] qapi: blockdev-backup: add discard-source parameter
Posted by Fiona Ebner 8 months ago
Am 17.01.24 um 17:07 schrieb Vladimir Sementsov-Ogievskiy:
> Add a parameter that enables discard-after-copy. That is mostly useful
> in "push backup with fleecing" scheme, when source is snapshot-access
> format driver node, based on copy-before-write filter snapshot-access
> API:
> 
> [guest]      [snapshot-access] ~~ blockdev-backup ~~> [backup target]
>    |            |
>    | root       | file
>    v            v
> [copy-before-write]
>    |             |
>    | file        | target
>    v             v
> [active disk]   [temp.img]
> 
> In this case discard-after-copy does two things:
> 
>  - discard data in temp.img to save disk space
>  - avoid further copy-before-write operation in discarded area
> 
> Note that we have to declare WRITE permission on source in
> copy-before-write filter, for discard to work.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Ran into another issue when the cluster_size of the fleecing image is
larger than for the backup target, e.g.

> #!/bin/bash
> rm /tmp/fleecing.qcow2
> ./qemu-img create /tmp/disk.qcow2 -f qcow2 1G
> ./qemu-img create /tmp/fleecing.qcow2 -o cluster_size=2M -f qcow2 1G
> ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \
> --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2,discard=unmap \
> --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } }
> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "discard": "unmap", "node-name": "snap0" } }
> {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target": "node2", "sync": "full", "job-id": "backup0", "discard-source": true } }
> EOF

will fail with

> qemu-system-x86_64: ../util/hbitmap.c:570: hbitmap_reset: Assertion `QEMU_IS_ALIGNED(count, gran) || (start + count == hb->orig_size)' failed.

Backtrace shows the assert happens while discarding, when resetting the
BDRVCopyBeforeWriteState access_bitmap
 > #6  0x0000555556142a2a in hbitmap_reset (hb=0x555557e01b80, start=0,
count=1048576) at ../util/hbitmap.c:570
> #7  0x0000555555f80764 in bdrv_reset_dirty_bitmap_locked (bitmap=0x55555850a660, offset=0, bytes=1048576) at ../block/dirty-bitmap.c:563
> #8  0x0000555555f807ab in bdrv_reset_dirty_bitmap (bitmap=0x55555850a660, offset=0, bytes=1048576) at ../block/dirty-bitmap.c:570
> #9  0x0000555555f7bb16 in cbw_co_pdiscard_snapshot (bs=0x5555581a7f60, offset=0, bytes=1048576) at ../block/copy-before-write.c:330
> #10 0x0000555555f8d00a in bdrv_co_pdiscard_snapshot (bs=0x5555581a7f60, offset=0, bytes=1048576) at ../block/io.c:3734
> #11 0x0000555555fd2380 in snapshot_access_co_pdiscard (bs=0x5555582b4f60, offset=0, bytes=1048576) at ../block/snapshot-access.c:55
> #12 0x0000555555f8b65d in bdrv_co_pdiscard (child=0x5555584fe790, offset=0, bytes=1048576) at ../block/io.c:3144
> #13 0x0000555555f78650 in block_copy_task_entry (task=0x555557f588f0) at ../block/block-copy.c:597

My guess for the cause is that in block_copy_calculate_cluster_size() we
only look at the target. But now that we need to discard the source,
we'll also need to consider that for the calculation?

Best Regards,
Fiona
Re: [PATCH v2 3/4] qapi: blockdev-backup: add discard-source parameter
Posted by Fiona Ebner 7 months, 4 weeks ago
Am 24.01.24 um 16:03 schrieb Fiona Ebner:
> Am 17.01.24 um 17:07 schrieb Vladimir Sementsov-Ogievskiy:
>> Add a parameter that enables discard-after-copy. That is mostly useful
>> in "push backup with fleecing" scheme, when source is snapshot-access
>> format driver node, based on copy-before-write filter snapshot-access
>> API:
>>
>> [guest]      [snapshot-access] ~~ blockdev-backup ~~> [backup target]
>>    |            |
>>    | root       | file
>>    v            v
>> [copy-before-write]
>>    |             |
>>    | file        | target
>>    v             v
>> [active disk]   [temp.img]
>>
>> In this case discard-after-copy does two things:
>>
>>  - discard data in temp.img to save disk space
>>  - avoid further copy-before-write operation in discarded area
>>
>> Note that we have to declare WRITE permission on source in
>> copy-before-write filter, for discard to work.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> 
> Ran into another issue when the cluster_size of the fleecing image is
> larger than for the backup target, e.g.
> 
>> #!/bin/bash
>> rm /tmp/fleecing.qcow2
>> ./qemu-img create /tmp/disk.qcow2 -f qcow2 1G
>> ./qemu-img create /tmp/fleecing.qcow2 -o cluster_size=2M -f qcow2 1G
>> ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G
>> ./qemu-system-x86_64 --qmp stdio \
>> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \
>> --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2,discard=unmap \
>> --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \
>> <<EOF
>> {"execute": "qmp_capabilities"}
>> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } }
>> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "discard": "unmap", "node-name": "snap0" } }
>> {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target": "node2", "sync": "full", "job-id": "backup0", "discard-source": true } }
>> EOF
> 
> will fail with
> 
>> qemu-system-x86_64: ../util/hbitmap.c:570: hbitmap_reset: Assertion `QEMU_IS_ALIGNED(count, gran) || (start + count == hb->orig_size)' failed.
> 
> Backtrace shows the assert happens while discarding, when resetting the
> BDRVCopyBeforeWriteState access_bitmap
>  > #6  0x0000555556142a2a in hbitmap_reset (hb=0x555557e01b80, start=0,
> count=1048576) at ../util/hbitmap.c:570
>> #7  0x0000555555f80764 in bdrv_reset_dirty_bitmap_locked (bitmap=0x55555850a660, offset=0, bytes=1048576) at ../block/dirty-bitmap.c:563
>> #8  0x0000555555f807ab in bdrv_reset_dirty_bitmap (bitmap=0x55555850a660, offset=0, bytes=1048576) at ../block/dirty-bitmap.c:570
>> #9  0x0000555555f7bb16 in cbw_co_pdiscard_snapshot (bs=0x5555581a7f60, offset=0, bytes=1048576) at ../block/copy-before-write.c:330
>> #10 0x0000555555f8d00a in bdrv_co_pdiscard_snapshot (bs=0x5555581a7f60, offset=0, bytes=1048576) at ../block/io.c:3734
>> #11 0x0000555555fd2380 in snapshot_access_co_pdiscard (bs=0x5555582b4f60, offset=0, bytes=1048576) at ../block/snapshot-access.c:55
>> #12 0x0000555555f8b65d in bdrv_co_pdiscard (child=0x5555584fe790, offset=0, bytes=1048576) at ../block/io.c:3144
>> #13 0x0000555555f78650 in block_copy_task_entry (task=0x555557f588f0) at ../block/block-copy.c:597
> 
> My guess for the cause is that in block_copy_calculate_cluster_size() we
> only look at the target. But now that we need to discard the source,
> we'll also need to consider that for the calculation?
> 

Just querying the source and picking the maximum won't work either,
because snapshot-access does not currently implement .bdrv_co_get_info
and because copy-before-write (doesn't implement .bdrv_co_get_info and
is a filter) will just return the info of its file child. But the
discard will go to the target child.

If I do

1. .bdrv_co_get_info in snapshot-access: return info from file child
2. .bdrv_co_get_info in copy-before-write: return maximum cluster_size
from file child and target child
3. block_copy_calculate_cluster_size: return maximum from source and target

then the issue does go away, but I don't know if that's not violating
any assumptions and probably there's a better way to avoid the issue?

Best Regards,
Fiona
Re: [PATCH v2 3/4] qapi: blockdev-backup: add discard-source parameter
Posted by Vladimir Sementsov-Ogievskiy 7 months, 4 weeks ago
On 25.01.24 15:47, Fiona Ebner wrote:
> Am 24.01.24 um 16:03 schrieb Fiona Ebner:
>> Am 17.01.24 um 17:07 schrieb Vladimir Sementsov-Ogievskiy:
>>> Add a parameter that enables discard-after-copy. That is mostly useful
>>> in "push backup with fleecing" scheme, when source is snapshot-access
>>> format driver node, based on copy-before-write filter snapshot-access
>>> API:
>>>
>>> [guest]      [snapshot-access] ~~ blockdev-backup ~~> [backup target]
>>>     |            |
>>>     | root       | file
>>>     v            v
>>> [copy-before-write]
>>>     |             |
>>>     | file        | target
>>>     v             v
>>> [active disk]   [temp.img]
>>>
>>> In this case discard-after-copy does two things:
>>>
>>>   - discard data in temp.img to save disk space
>>>   - avoid further copy-before-write operation in discarded area
>>>
>>> Note that we have to declare WRITE permission on source in
>>> copy-before-write filter, for discard to work.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>
>> Ran into another issue when the cluster_size of the fleecing image is
>> larger than for the backup target, e.g.
>>
>>> #!/bin/bash
>>> rm /tmp/fleecing.qcow2
>>> ./qemu-img create /tmp/disk.qcow2 -f qcow2 1G
>>> ./qemu-img create /tmp/fleecing.qcow2 -o cluster_size=2M -f qcow2 1G
>>> ./qemu-img create /tmp/backup.qcow2 -f qcow2 1G
>>> ./qemu-system-x86_64 --qmp stdio \
>>> --blockdev qcow2,node-name=node0,file.driver=file,file.filename=/tmp/disk.qcow2 \
>>> --blockdev qcow2,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.qcow2,discard=unmap \
>>> --blockdev qcow2,node-name=node2,file.driver=file,file.filename=/tmp/backup.qcow2 \
>>> <<EOF
>>> {"execute": "qmp_capabilities"}
>>> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } }
>>> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "discard": "unmap", "node-name": "snap0" } }
>>> {"execute": "blockdev-backup", "arguments": { "device": "snap0", "target": "node2", "sync": "full", "job-id": "backup0", "discard-source": true } }
>>> EOF
>>
>> will fail with
>>
>>> qemu-system-x86_64: ../util/hbitmap.c:570: hbitmap_reset: Assertion `QEMU_IS_ALIGNED(count, gran) || (start + count == hb->orig_size)' failed.
>>
>> Backtrace shows the assert happens while discarding, when resetting the
>> BDRVCopyBeforeWriteState access_bitmap
>>   > #6  0x0000555556142a2a in hbitmap_reset (hb=0x555557e01b80, start=0,
>> count=1048576) at ../util/hbitmap.c:570
>>> #7  0x0000555555f80764 in bdrv_reset_dirty_bitmap_locked (bitmap=0x55555850a660, offset=0, bytes=1048576) at ../block/dirty-bitmap.c:563
>>> #8  0x0000555555f807ab in bdrv_reset_dirty_bitmap (bitmap=0x55555850a660, offset=0, bytes=1048576) at ../block/dirty-bitmap.c:570
>>> #9  0x0000555555f7bb16 in cbw_co_pdiscard_snapshot (bs=0x5555581a7f60, offset=0, bytes=1048576) at ../block/copy-before-write.c:330
>>> #10 0x0000555555f8d00a in bdrv_co_pdiscard_snapshot (bs=0x5555581a7f60, offset=0, bytes=1048576) at ../block/io.c:3734
>>> #11 0x0000555555fd2380 in snapshot_access_co_pdiscard (bs=0x5555582b4f60, offset=0, bytes=1048576) at ../block/snapshot-access.c:55
>>> #12 0x0000555555f8b65d in bdrv_co_pdiscard (child=0x5555584fe790, offset=0, bytes=1048576) at ../block/io.c:3144
>>> #13 0x0000555555f78650 in block_copy_task_entry (task=0x555557f588f0) at ../block/block-copy.c:597
>>
>> My guess for the cause is that in block_copy_calculate_cluster_size() we
>> only look at the target. But now that we need to discard the source,
>> we'll also need to consider that for the calculation?
>>
> 
> Just querying the source and picking the maximum won't work either,
> because snapshot-access does not currently implement .bdrv_co_get_info
> and because copy-before-write (doesn't implement .bdrv_co_get_info and
> is a filter) will just return the info of its file child. But the
> discard will go to the target child.
> 
> If I do
> 
> 1. .bdrv_co_get_info in snapshot-access: return info from file child
> 2. .bdrv_co_get_info in copy-before-write: return maximum cluster_size
> from file child and target child
> 3. block_copy_calculate_cluster_size: return maximum from source and target
> 
> then the issue does go away, but I don't know if that's not violating
> any assumptions and probably there's a better way to avoid the issue?
> 


Hmm. Taking maximum is not optimal for usual case without discard-source: user may want to work in smaller granularity than source, to save disk space.

In case with discarding we have two possibilities:

- either take larger granularity for the whole process like you propose (but this will need and option for CBW?)
- or, fix discarding bitmap in CBW to work like normal discard: it should be aligned down. This will lead actually to discard-source option doing nothing..

==
But why do you want fleecing image with larger granularity? Is that a real case or just experimenting? Still we should fix assertion anyway.

I think:

1. fix discarding bitmap to make aligning-down (will do that for v3)

2. if we need another logic for block_copy_calculate_cluster_size() it should be an option. May be explicit "copy-cluster-size" or "granularity" option for CBW driver and for backup job. And we'll just check that given cluster-size is power of two >= target_size.


-- 
Best regards,
Vladimir
Re: [PATCH v2 3/4] qapi: blockdev-backup: add discard-source parameter
Posted by Fiona Ebner 7 months, 4 weeks ago
Am 25.01.24 um 18:22 schrieb Vladimir Sementsov-Ogievskiy:
> 
> Hmm. Taking maximum is not optimal for usual case without
> discard-source: user may want to work in smaller granularity than
> source, to save disk space.
> 
> In case with discarding we have two possibilities:
> 
> - either take larger granularity for the whole process like you propose
> (but this will need and option for CBW?)
> - or, fix discarding bitmap in CBW to work like normal discard: it
> should be aligned down. This will lead actually to discard-source option
> doing nothing..
> 
> ==
> But why do you want fleecing image with larger granularity? Is that a
> real case or just experimenting? Still we should fix assertion anyway.
> 

Yes, it's a real use case. We do support different storage types and
want to allow users to place the fleecing image on a different storage
than the original image for flexibility.

I ran into the issue when backing up to a target with 1 MiB cluster_size
while using a fleecing image on RBD (which has 4 MiB cluster_size by
default).

In theory, I guess I could look into querying the cluster_size of the
backup target and trying to allocate the fleecing image with a small
enough cluster_size. But not sure if that would work on all storage
combinations, and would require teaching our storage plugin API (which
also supports third-party plugins) to perform allocation with a specific
cluster size. So not an ideal solution for us.

> I think:
> 
> 1. fix discarding bitmap to make aligning-down (will do that for v3)
> 

Thanks!

> 2. if we need another logic for block_copy_calculate_cluster_size() it
> should be an option. May be explicit "copy-cluster-size" or
> "granularity" option for CBW driver and for backup job. And we'll just
> check that given cluster-size is power of two >= target_size.
> 

I'll try to implement point 2. That should resolve the issue for our use
case.

Best Regards,
Fiona
Re: [PATCH v2 3/4] qapi: blockdev-backup: add discard-source parameter
Posted by Fiona Ebner 8 months ago
Am 17.01.24 um 17:07 schrieb Vladimir Sementsov-Ogievskiy:
> Add a parameter that enables discard-after-copy. That is mostly useful
> in "push backup with fleecing" scheme, when source is snapshot-access
> format driver node, based on copy-before-write filter snapshot-access
> API:
> 
> [guest]      [snapshot-access] ~~ blockdev-backup ~~> [backup target]
>    |            |
>    | root       | file
>    v            v
> [copy-before-write]
>    |             |
>    | file        | target
>    v             v
> [active disk]   [temp.img]
> 
> In this case discard-after-copy does two things:
> 
>  - discard data in temp.img to save disk space
>  - avoid further copy-before-write operation in discarded area
> 
> Note that we have to declare WRITE permission on source in
> copy-before-write filter, for discard to work.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>

Unfortunately, setting BLK_PERM_WRITE unconditionally breaks
blockdev-backup for a read-only node (even when not using discard-source):

> #!/bin/bash
> ./qemu-img create /tmp/disk.raw -f raw 1G
> ./qemu-img create /tmp/backup.raw -f raw 1G
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev raw,node-name=node0,file.driver=file,file.filename=/tmp/disk.raw,read-only=true \
> --blockdev raw,node-name=node1,file.driver=file,file.filename=/tmp/backup.raw \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-backup", "arguments": { "device": "node0", "target": "node1", "sync": "full", "job-id": "backup0" } }
> EOF

The above works before applying this patch, but leads to an error
afterwards:

> {"error": {"class": "GenericError", "desc": "Block node is read-only"}}

Best Regards,
Fiona
Re: [PATCH v2 3/4] qapi: blockdev-backup: add discard-source parameter
Posted by Vladimir Sementsov-Ogievskiy 7 months, 4 weeks ago
On 19.01.24 17:46, Fiona Ebner wrote:
> Am 17.01.24 um 17:07 schrieb Vladimir Sementsov-Ogievskiy:
>> Add a parameter that enables discard-after-copy. That is mostly useful
>> in "push backup with fleecing" scheme, when source is snapshot-access
>> format driver node, based on copy-before-write filter snapshot-access
>> API:
>>
>> [guest]      [snapshot-access] ~~ blockdev-backup ~~> [backup target]
>>     |            |
>>     | root       | file
>>     v            v
>> [copy-before-write]
>>     |             |
>>     | file        | target
>>     v             v
>> [active disk]   [temp.img]
>>
>> In this case discard-after-copy does two things:
>>
>>   - discard data in temp.img to save disk space
>>   - avoid further copy-before-write operation in discarded area
>>
>> Note that we have to declare WRITE permission on source in
>> copy-before-write filter, for discard to work.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> 
> Unfortunately, setting BLK_PERM_WRITE unconditionally breaks
> blockdev-backup for a read-only node (even when not using discard-source):

Ohh, right.

So, that's the place when we have to somehow pass through discrard-souce option to CBW filter, so that it know that WRITE permission is needed. Will try in v3. Thanks again for testing!

> 
>> #!/bin/bash
>> ./qemu-img create /tmp/disk.raw -f raw 1G
>> ./qemu-img create /tmp/backup.raw -f raw 1G
>> ./qemu-system-x86_64 --qmp stdio \
>> --blockdev raw,node-name=node0,file.driver=file,file.filename=/tmp/disk.raw,read-only=true \
>> --blockdev raw,node-name=node1,file.driver=file,file.filename=/tmp/backup.raw \
>> <<EOF
>> {"execute": "qmp_capabilities"}
>> {"execute": "blockdev-backup", "arguments": { "device": "node0", "target": "node1", "sync": "full", "job-id": "backup0" } }
>> EOF
> 
> The above works before applying this patch, but leads to an error
> afterwards:
> 
>> {"error": {"class": "GenericError", "desc": "Block node is read-only"}}
> 
> Best Regards,
> Fiona
> 

-- 
Best regards,
Vladimir