[Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode

mahaocong posted 1 patch 5 years, 2 months ago
Test asan passed
Test docker-clang@ubuntu passed
Test docker-mingw@fedora passed
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20190214064312.44794-1-mahaocong_work@163.com
Maintainers: Markus Armbruster <armbru@redhat.com>, Max Reitz <mreitz@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Eric Blake <eblake@redhat.com>, Jeff Cody <jcody@redhat.com>
block/mirror.c            | 47 +++++++++++++++++++++++++++++++++++------------
blockdev.c                | 36 ++++++++++++++++++++++++++++++++++--
include/block/block_int.h |  3 ++-
qapi/block-core.json      | 14 ++++++++++++--
4 files changed, 83 insertions(+), 17 deletions(-)
[Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode
Posted by mahaocong 5 years, 2 months ago
From: mahaocong <mahaocong@didichuxing.com>

This patch adds the possibility to start mirroring with a user-created bitmap.
In full mode, mirror creates an anonymous bitmap by scanning the whole block
chain, and in top mode it creates a bitmap by scanning the top block layer. So
I think I can copy a user-created bitmap and use it as the initial state of the
mirror, the same as incremental mode drive-backup, and I call this new mode
incremental mode drive-mirror.

A possible usage scenario for incremental mode mirror is live migration. To
preserve the block data and recover after a malfunction, someone may back up
data to Ceph or other distributed storage. For QEMU incremental backup, we need
to create a new bitmap and attach it to the block device before the first
backup job; the bitmap then records the changes made after each backup job. If
we want to migrate this VM, we can migrate the block data between source and
destination by using drive-mirror directly, or use the backup data and the
backup bitmap to reduce the amount of data to synchronize. To do the latter, we
first create a new image on the destination with the backup image as its
backing file, then use the backup bitmap as the initial state of an incremental
mode drive-mirror, synchronizing the dirty blocks starting from the state left
by the last incremental backup job. When the mirror completes, we get an active
layer on the destination whose backing file is the backup image on Ceph. We can
then live-copy data from the backing file into the overlay image by using
block-stream, or continue doing backups.

In this scenario, if the guest OS has not written much data since the last
backup, incremental mode may transmit less data than full mode or top mode.
However, if a lot of data has been written, incremental mode has no advantage
over the other modes.

This scenario can be described by the following steps:
1. Create an rbd image in Ceph, and map an nbd device to the rbd image.
2. Create a new bitmap and attach it to the block device.
3. Do a full mode backup of the nbd device, syncing it to the rbd image.
4. Do several incremental mode backups.
5. Create a new image on the destination with the backup image as its backing file.
6. Do live migration, migrating block data with an incremental mode drive-mirror
   using the bitmap created in step 2.

Signed-off-by: Ma Haocong <mahaocong@didichuxing.com>
---
 Compared with version 3, there are the following changes:
 1. Move the error checking for incremental mode into blockdev_mirror_common, which
    lets drive-mirror and blockdev-mirror share the error checking for incremental mode.
 2. Delete the error check on granularity, which seems to be a non-fatal condition,
    and add a warning log to flag it instead.
 3. Add incremental mode to blockdev-mirror by adding a 'bitmap' parameter to the QMP
    command 'blockdev-mirror'.
 4. Adjust the description of the 'bitmap' parameter of the QMP commands 'drive-mirror'
    and 'blockdev-mirror'.

 I used an old version of libvirt (before 3.9.0) to test live migration, and found
 that this version uses 'qemuMigrationDriveMirror' to migrate block devices. Therefore,
 I previously focused only on drive-mirror; this version of the patch also adds
 incremental mode to blockdev-mirror. However, I am not sure about all the differences
 between 'drive-mirror' and 'blockdev-mirror'. I have learned that I need to use
 'blockdev-add' to add a new node before using 'blockdev-mirror', but beyond that I
 don't know what else differs, so there may be some defects in the blockdev-mirror
 part. I'm still learning about 'blockdev-mirror'; please give me some advice, thank
 you.

 block/mirror.c            | 47 +++++++++++++++++++++++++++++++++++------------
 blockdev.c                | 36 ++++++++++++++++++++++++++++++++++--
 include/block/block_int.h |  3 ++-
 qapi/block-core.json      | 14 ++++++++++++--
 4 files changed, 83 insertions(+), 17 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index ab59ad77e8..1fe0c767cf 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -50,6 +50,7 @@ typedef struct MirrorBlockJob {
     /* Used to block operations on the drive-mirror-replace target */
     Error *replace_blocker;
     bool is_none_mode;
+    BdrvDirtyBitmap *src_bitmap;
     BlockMirrorBackingMode backing_mode;
     MirrorCopyMode copy_mode;
     BlockdevOnError on_source_error, on_target_error;
@@ -814,6 +815,16 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
     return 0;
 }
 
+/*
+ * Initialize the dirty bitmap from the user bitmap: the user bitmap's
+ * hbitmap is copied into the mirror's dirty bitmap rather than reused.
+ */
+static void coroutine_fn mirror_dirty_init_incremental(MirrorBlockJob *s,
+                                                       Error **errp)
+{
+    bdrv_merge_dirty_bitmap(s->dirty_bitmap, s->src_bitmap, NULL, errp);
+}
+
 /* Called when going out of the streaming phase to flush the bulk of the
  * data to the medium, or just before completing.
  */
@@ -839,6 +850,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
     char backing_filename[2]; /* we only need 2 characters because we are only
                                  checking for a NULL string */
     int ret = 0;
+    Error *local_err = NULL;
 
     if (job_is_cancelled(&s->common.job)) {
         goto immediate_exit;
@@ -913,9 +925,19 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
 
     s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     if (!s->is_none_mode) {
-        ret = mirror_dirty_init(s);
-        if (ret < 0 || job_is_cancelled(&s->common.job)) {
-            goto immediate_exit;
+        /* incremental mode */
+        if (s->src_bitmap) {
+            mirror_dirty_init_incremental(s, &local_err);
+            if (local_err) {
+                error_propagate(errp, local_err);
+                ret = -1;
+                goto immediate_exit;
+            }
+        } else {
+            ret = mirror_dirty_init(s);
+            if (ret < 0 || job_is_cancelled(&s->common.job)) {
+                goto immediate_exit;
+            }
         }
     }
 
@@ -1484,7 +1506,8 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
                              BlockCompletionFunc *cb,
                              void *opaque,
                              const BlockJobDriver *driver,
-                             bool is_none_mode, BlockDriverState *base,
+                             bool is_none_mode, BdrvDirtyBitmap *src_bitmap,
+                             BlockDriverState *base,
                              bool auto_complete, const char *filter_node_name,
                              bool is_mirror, MirrorCopyMode copy_mode,
                              Error **errp)
@@ -1598,6 +1621,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
     s->on_source_error = on_source_error;
     s->on_target_error = on_target_error;
     s->is_none_mode = is_none_mode;
+    s->src_bitmap = src_bitmap;
     s->backing_mode = backing_mode;
     s->copy_mode = copy_mode;
     s->base = base;
@@ -1664,7 +1688,8 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
                   BlockDriverState *target, const char *replaces,
                   int creation_flags, int64_t speed,
                   uint32_t granularity, int64_t buf_size,
-                  MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
+                  MirrorSyncMode mode, BdrvDirtyBitmap *src_bitmap,
+                  BlockMirrorBackingMode backing_mode,
                   BlockdevOnError on_source_error,
                   BlockdevOnError on_target_error,
                   bool unmap, const char *filter_node_name,
@@ -1673,17 +1698,14 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
     bool is_none_mode;
     BlockDriverState *base;
 
-    if (mode == MIRROR_SYNC_MODE_INCREMENTAL) {
-        error_setg(errp, "Sync mode 'incremental' not supported");
-        return;
-    }
     is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
     base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
     mirror_start_job(job_id, bs, creation_flags, target, replaces,
                      speed, granularity, buf_size, backing_mode,
                      on_source_error, on_target_error, unmap, NULL, NULL,
-                     &mirror_job_driver, is_none_mode, base, false,
-                     filter_node_name, true, copy_mode, errp);
+                     &mirror_job_driver, is_none_mode,
+                     src_bitmap, base, false, filter_node_name, true,
+                     copy_mode, errp);
 }
 
 void commit_active_start(const char *job_id, BlockDriverState *bs,
@@ -1707,7 +1729,8 @@ void commit_active_start(const char *job_id, BlockDriverState *bs,
     mirror_start_job(job_id, bs, creation_flags, base, NULL, speed, 0, 0,
                      MIRROR_LEAVE_BACKING_CHAIN,
                      on_error, on_error, true, cb, opaque,
-                     &commit_active_job_driver, false, base, auto_complete,
+                     &commit_active_job_driver, false,
+                     NULL, base, auto_complete,
                      filter_node_name, false, MIRROR_COPY_MODE_BACKGROUND,
                      &local_err);
     if (local_err) {
diff --git a/blockdev.c b/blockdev.c
index a6f71f9d83..9da06c0806 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3661,6 +3661,8 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
                                    BlockDriverState *target,
                                    bool has_replaces, const char *replaces,
                                    enum MirrorSyncMode sync,
+                                   bool has_bitmap,
+                                   const char *bitmap_name,
                                    BlockMirrorBackingMode backing_mode,
                                    bool has_speed, int64_t speed,
                                    bool has_granularity, uint32_t granularity,
@@ -3678,6 +3680,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
                                    Error **errp)
 {
     int job_flags = JOB_DEFAULT;
+    BdrvDirtyBitmap *src_bitmap = NULL;
 
     if (!has_speed) {
         speed = 0;
@@ -3700,6 +3703,10 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
     if (!has_filter_node_name) {
         filter_node_name = NULL;
     }
+    if (!has_bitmap) {
+        bitmap_name = NULL;
+    }
+
     if (!has_copy_mode) {
         copy_mode = MIRROR_COPY_MODE_BACKGROUND;
     }
@@ -3731,13 +3738,35 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs,
     if (!bs->backing && sync == MIRROR_SYNC_MODE_TOP) {
         sync = MIRROR_SYNC_MODE_FULL;
     }
+    if (!bitmap_name && (sync == MIRROR_SYNC_MODE_INCREMENTAL)) {
+        error_setg(errp, "Incremental mode requires a bitmap name");
+        return;
+    }
+    /*
+     * In incremental mode, the job's anonymous dirty bitmap is
+     * created with the user bitmap's granularity.
+     */
+    if (sync == MIRROR_SYNC_MODE_INCREMENTAL) {
+        assert(bitmap_name);
+        src_bitmap = bdrv_find_dirty_bitmap(bs, bitmap_name);
+        if (!src_bitmap) {
+            error_setg(errp, "Cannot find dirty bitmap '%s' "
+                       "for incremental mirror", bitmap_name);
+            return;
+        }
+        if (granularity) {
+            warn_report("In incremental mode, granularity is ignored; "
+                        "the bitmap's granularity is used instead");
+        }
+        granularity = bdrv_dirty_bitmap_granularity(src_bitmap);
+    }
 
     /* pass the node name to replace to mirror start since it's loose coupling
      * and will allow to check whether the node still exist at mirror completion
      */
     mirror_start(job_id, bs, target,
                  has_replaces ? replaces : NULL, job_flags,
-                 speed, granularity, buf_size, sync, backing_mode,
+                 speed, granularity, buf_size, sync, src_bitmap, backing_mode,
                  on_source_error, on_target_error, unmap, filter_node_name,
                  copy_mode, errp);
 }
@@ -3878,6 +3907,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
 
     blockdev_mirror_common(arg->has_job_id ? arg->job_id : NULL, bs, target_bs,
                            arg->has_replaces, arg->replaces, arg->sync,
+                           arg->has_bitmap, arg->bitmap,
                            backing_mode, arg->has_speed, arg->speed,
                            arg->has_granularity, arg->granularity,
                            arg->has_buf_size, arg->buf_size,
@@ -3899,6 +3929,7 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
                          const char *device, const char *target,
                          bool has_replaces, const char *replaces,
                          MirrorSyncMode sync,
+                         bool has_bitmap, const char *bitmap,
                          bool has_speed, int64_t speed,
                          bool has_granularity, uint32_t granularity,
                          bool has_buf_size, int64_t buf_size,
@@ -3935,7 +3966,8 @@ void qmp_blockdev_mirror(bool has_job_id, const char *job_id,
     bdrv_set_aio_context(target_bs, aio_context);
 
     blockdev_mirror_common(has_job_id ? job_id : NULL, bs, target_bs,
-                           has_replaces, replaces, sync, backing_mode,
+                           has_replaces, replaces, sync, has_bitmap,
+                           bitmap, backing_mode,
                            has_speed, speed,
                            has_granularity, granularity,
                            has_buf_size, buf_size,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f605622216..57a441f992 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1054,7 +1054,8 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
                   BlockDriverState *target, const char *replaces,
                   int creation_flags, int64_t speed,
                   uint32_t granularity, int64_t buf_size,
-                  MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
+                  MirrorSyncMode mode, BdrvDirtyBitmap *src_bitmap,
+                  BlockMirrorBackingMode backing_mode,
                   BlockdevOnError on_source_error,
                   BlockdevOnError on_target_error,
                   bool unmap, const char *filter_node_name,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 762000f31f..4be810132e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1727,6 +1727,11 @@
 #        (all the disk, only the sectors allocated in the topmost image, or
 #        only new I/O).
 #
+# @bitmap: The name of a bitmap to use in incremental mode. This argument must
+#          be present for incremental mode and absent otherwise. In incremental
+#          mode, @granularity is ignored; the bitmap's granularity is used
+#          instead. (Since 4.0)
+#
 # @granularity: granularity of the dirty bitmap, default is 64K
 #               if the image format doesn't have clusters, 4K if the clusters
 #               are smaller than that, else the cluster size.  Must be a
@@ -1768,7 +1773,7 @@
 { 'struct': 'DriveMirror',
   'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
             '*format': 'str', '*node-name': 'str', '*replaces': 'str',
-            'sync': 'MirrorSyncMode', '*mode': 'NewImageMode',
+            'sync': 'MirrorSyncMode', '*bitmap': 'str', '*mode': 'NewImageMode',
             '*speed': 'int', '*granularity': 'uint32',
             '*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
             '*on-target-error': 'BlockdevOnError',
@@ -2017,6 +2022,11 @@
 #        (all the disk, only the sectors allocated in the topmost image, or
 #        only new I/O).
 #
+# @bitmap: The name of a bitmap to use in incremental mode. This argument must
+#          be present for incremental mode and absent otherwise. In incremental
+#          mode, @granularity is ignored; the bitmap's granularity is used
+#          instead. (Since 4.0)
+#
 # @granularity: granularity of the dirty bitmap, default is 64K
 #               if the image format doesn't have clusters, 4K if the clusters
 #               are smaller than that, else the cluster size.  Must be a
@@ -2069,7 +2079,7 @@
 { 'command': 'blockdev-mirror',
   'data': { '*job-id': 'str', 'device': 'str', 'target': 'str',
             '*replaces': 'str',
-            'sync': 'MirrorSyncMode',
+            'sync': 'MirrorSyncMode', '*bitmap': 'str',
             '*speed': 'int', '*granularity': 'uint32',
             '*buf-size': 'int', '*on-source-error': 'BlockdevOnError',
             '*on-target-error': 'BlockdevOnError',
-- 
2.14.1



Re: [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode
Posted by Max Reitz 5 years, 2 months ago
CC-ing John because of the keyword "incremental".

On 14.02.19 07:43, mahaocong wrote:
> From: mahaocong <mahaocong@didichuxing.com>
> 
> This patch adds possibility to start mirroring with user-created-bitmap.
> On full mode, mirror create a non-named-bitmap by scanning whole block-chain,
> and on top mode, mirror create a bitmap by scanning the top block layer. So I
> think I can copy a user-created-bitmap and use it as the initial state of the
> mirror, the same as incremental mode drive-backup, and I call this new mode
> as incremental mode drive-mirror.
> 
> A possible usage scene of incremental mode mirror is live migration. For maintain
> the block data and recover after a malfunction, someone may backup data to ceph
> or other distributed storage. On qemu incremental backup, we need to create a new
> bitmap and attach to block device before the first backup job. Then the bitmap
> records the change after the backup job. If we want to migration this vm, we can
> migrate block data between source and destionation by using drive-mirror directly,
> or use backup data and backup-bitmap to reduce the data should be synchronize.
> To do this, we should first create a new image in destination and set backing file
> as backup image, then set backup-bitmap as the initial state of incremental mode
> drive-mirror, and synchronize dirty block starting with the state set by the last
> incremental backup job. When the mirror complete, we get an active layer on destination,
> and its backing file is backup image on ceph. Then we can do live copy data from
> backing files into overlay images by using block-stream, or do backup continually.
> 
> In this scene, It seems that If the guest os doesn't write too many data after the
> last backup, the incremental mode may transmit less data than full mode or top
> mode. However, if the write data is too many, there is no advantage on incremental
> mode compare with other mode.
> 
> This scene can be described as following steps:
> 1.create rbd image in ceph, and map nbd device with rbd image.
> 2.create a new bitmap and attach to block device.
> 3.do full mode backup on nbd device and thus sync it to the rbd image.
> 4.severl times incremental mode backup.
> 5.create new image in destination and set its backing file as backup image.
> 6.do live-migration, and migrate block data by incremental mode drive-mirror
>   with bitmap created from step 2.
> 
> Signed-off-by: Ma Haocong <mahaocong@didichuxing.com>
> ---

So one important point about incremental backups is that you can
actually do them incrementally: The bitmap is effectively cleared at the
beginning of the backup process (a successor bitmap is installed that is
cleared and receives all changes; at the end of the backup, it either
replaces the old bitmap (on success) or is merged into it (on failure)).
 Therefore, you can do the next incremental backup by using the same bitmap.
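The successor mechanism described here can be modeled in a few lines. This is a simplified sketch using plain Python sets of dirty block numbers, not QEMU's actual HBitmap code:

```python
# Simplified model of the backup successor-bitmap lifecycle described
# above, using Python sets of dirty block numbers instead of QEMU's
# HBitmap structures.
class Bitmap:
    def __init__(self, dirty=()):
        self.dirty = set(dirty)   # blocks changed since the last backup
        self.successor = None

    def start_backup(self):
        # A cleared successor is installed; guest writes during the
        # backup land in it while the frozen original drives the copy.
        self.successor = Bitmap()

    def record_write(self, block):
        target = self.successor if self.successor else self
        target.dirty.add(block)

    def finish_backup(self, success):
        if success:
            # Successor replaces the old bitmap: only writes made
            # during the backup remain dirty for the next run.
            self.dirty = self.successor.dirty
        else:
            # On failure, merge back so nothing is lost for a retry.
            self.dirty |= self.successor.dirty
        self.successor = None
```

With this model, a successful backup leaves only the in-flight writes dirty, which is exactly why the same bitmap can drive the next incremental backup.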

How would this work with mirroring?  Instead of clearing the bitmap at
the start of the process, it'd need to be cleared at the end (because we
reach synchronization between source and target then).  But how would
error handling work?

I suppose the named bitmap would need to be copied to act as the dirty
bitmap for the mirror job (at the start of the job).  If a failure
occurs, the copy is simply discarded.  On success, the named bitmap is
cleared when the job is completed.  Hm, that seems to make sense.  Did I
forget anything, John?
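The semantics sketched in the preceding paragraph could be modeled roughly as follows. Again this is a set-based illustration of the proposal, not actual QEMU code, and the function name is made up:

```python
# Set-based model of the proposed mirror semantics: copy the named
# bitmap at job start, discard the copy on failure, and clear the
# named bitmap only on successful completion.
def run_incremental_mirror(named_bitmap, copy_blocks, succeeds):
    job_bitmap = set(named_bitmap)   # private copy drives the mirror
    copy_blocks(job_bitmap)          # mirror loop consumes the copy
    if succeeds:
        named_bitmap.clear()         # source and target are in sync now
    # on failure the copy is simply discarded; named_bitmap is untouched
```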

In any case, I don't think this patch implemented anything to this
regard...?  So it doesn't really implement incremental mirroring.
However, I think it should, if possible.

Max

Re: [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode
Posted by 马昊骢 (滴滴云) 5 years, 2 months ago
> CC-ing John because of the keyword "incremental".
Ok
>
>On 14.02.19 07:43, mahaocong wrote:
>> From: mahaocong <address@hidden>
>>
>> This patch adds possibility to start mirroring with user-created-bitmap.
>> On full mode, mirror create a non-named-bitmap by scanning whole block-
>> chain, and on top mode, mirror create a bitmap by scanning the top block
>> layer. So I think I can copy a user-created-bitmap and use it as the
>> initial state of the mirror, the same as incremental mode drive-backup,
>> and I call this new mode as incremental mode drive-mirror.
>>
>> A possible usage scene of incremental mode mirror is live migration.
>> For maintain the block data and recover after a malfunction, someone may
>> backup data to ceph or other distributed storage. On qemu incremental
>> backup, we need to create a new bitmap and attach to block device before
>> the first backup job. Then the bitmap records the change after the backup
>> job. If we want to migration this vm, we can migrate block data between
>> source and destionation by using drive-mirror directly, or use backup data
>> and backup-bitmap to reduce the data should be synchronize. To do this,
>> we should first create a new image in destination and set backing file
>> as backup image, then set backup-bitmap as the initial state of incremental
>> mode drive-mirror, and synchronize dirty block starting with the state set
>> by the last incremental backup job. When the mirror complete, we get an
>> active layer on destination, and its backing file is backup image on ceph.
>> Then we can do live copy data from backing files into overlay images by
>> using block-stream, or do backup continually.
>>
>> In this scene, It seems that If the guest os doesn't write too many data
>> after the last backup, the incremental mode may transmit less data than
>> full mode or top mode. However, if the write data is too many, there is
>> no advantage on incremental mode compare with other mode.
>>
>> This scene can be described as following steps:
>> 1.create rbd image in ceph, and map nbd device with rbd image.
>> 2.create a new bitmap and attach to block device.
>> 3.do full mode backup on nbd device and thus sync it to the rbd image.
>> 4.severl times incremental mode backup.
>> 5.create new image in destination and set its backing file as backup image.
>> 6.do live-migration, and migrate block data by incremental mode drive-
>> mirror with bitmap created from step 2.
>>
>> Signed-off-by: Ma Haocong <address@hidden>
>> ---
>
>So one important point about incremental backups is that you can
>actually do them incrementally: The bitmap is effectively cleared at the
>beginning of the backup process (a successor bitmap is installed that is
>cleared and receives all changes; at the end of the backup, it either
>replaces the old bitmap (on success) or is merged into it (on failure)).
> Therefore, you can do the next incremental backup by using the same bitmap.
>
>How would this work with mirroring?  Instead of clearing the bitmap at
>the start of the process, it'd need to be cleared at the end (because we
>reach synchronization between source and target then).  But how would
>error handling work?
>
>I suppose the named bitmap would need to be copied to act as the dirty
>bitmap for the mirror job (at the start of the job).  If a failure
>occurs, the copy is simply discarded.  On success, the named bitmap is
>cleared when the job is completed.  Hm, that seems to make sense.  Did I
>forget anything, John?
>
>In any case, I don't think this patch implemented anything to this
>regard...?  So it doesn't really implement incremental mirroring.
>However, I think it should, if possible.
>
>Max

I think you are right: the bitmap will be cleared at the end of mirroring, so
I can't use the same bitmap for the next incremental mirror the way
incremental backup does. So it is not a real incremental mirror.

In our scenario, the main purpose is to let the mirror job use the same bitmap
as the backup job. Since the bitmap used by the backup job tracks the data
dirtied since the last backup, with this patch a mirror job that uses the
backup bitmap transfers less data than before, which shortens the time the
mirror job takes. To get the full data, a stream job can then pull the backup
data into the mirrored image. If the mirror job is used for live migration,
this patch shortens migration time, reduces the data transferred, and lets the
resources used by the QEMU process on the source host be released earlier.

So this patch is an optimization for our live migration, although it is not
accurate to call it incremental mode mirroring.

Thanks

Ma haocong 

Re: [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode
Posted by Vladimir Sementsov-Ogievskiy 5 years, 1 month ago
27.02.2019 18:25, Max Reitz wrote:
> CC-ing John because of the keyword "incremental".
> 
> On 14.02.19 07:43, mahaocong wrote:
>> From: mahaocong <mahaocong@didichuxing.com>
>>
>> This patch adds possibility to start mirroring with user-created-bitmap.
>> On full mode, mirror create a non-named-bitmap by scanning whole block-chain,
>> and on top mode, mirror create a bitmap by scanning the top block layer. So I
>> think I can copy a user-created-bitmap and use it as the initial state of the
>> mirror, the same as incremental mode drive-backup, and I call this new mode
>> as incremental mode drive-mirror.
>>
>> A possible usage scene of incremental mode mirror is live migration. For maintain
>> the block data and recover after a malfunction, someone may backup data to ceph
>> or other distributed storage. On qemu incremental backup, we need to create a new
>> bitmap and attach to block device before the first backup job. Then the bitmap
>> records the change after the backup job. If we want to migration this vm, we can
>> migrate block data between source and destionation by using drive-mirror directly,
>> or use backup data and backup-bitmap to reduce the data should be synchronize.
>> To do this, we should first create a new image in destination and set backing file
>> as backup image, then set backup-bitmap as the initial state of incremental mode
>> drive-mirror, and synchronize dirty block starting with the state set by the last
>> incremental backup job. When the mirror complete, we get an active layer on destination,
>> and its backing file is backup image on ceph. Then we can do live copy data from
>> backing files into overlay images by using block-stream, or do backup continually.
>>
>> In this scene, It seems that If the guest os doesn't write too many data after the
>> last backup, the incremental mode may transmit less data than full mode or top
>> mode. However, if the write data is too many, there is no advantage on incremental
>> mode compare with other mode.
>>
>> This scene can be described as following steps:
>> 1.create rbd image in ceph, and map nbd device with rbd image.
>> 2.create a new bitmap and attach to block device.
>> 3.do full mode backup on nbd device and thus sync it to the rbd image.
>> 4.severl times incremental mode backup.
>> 5.create new image in destination and set its backing file as backup image.
>> 6.do live-migration, and migrate block data by incremental mode drive-mirror
>>    with bitmap created from step 2.
>>
>> Signed-off-by: Ma Haocong <mahaocong@didichuxing.com>
>> ---
> 
> So one important point about incremental backups is that you can
> actually do them incrementally: The bitmap is effectively cleared at the
> beginning of the backup process (a successor bitmap is installed that is
> cleared and receives all changes; at the end of the backup, it either
> replaces the old bitmap (on success) or is merged into it (on failure)).
>   Therefore, you can do the next incremental backup by using the same bitmap.

Hmm, I heard in some other thread that Eric finally decided to always start
backup from a copied bitmap in libvirt, so this logic is skipped. Do we need
it for mirror? Sorry if I'm wrong; Eric, am I?

> 
> How would this work with mirroring?  Instead of clearing the bitmap at
> the start of the process, it'd need to be cleared at the end (because we
> reach synchronization between source and target then).  But how would
> error handling work?
> 
> I suppose the named bitmap would need to be copied to act as the dirty
> bitmap for the mirror job (at the start of the job).  If a failure
> occurs, the copy is simply discarded.  On success, the named bitmap is
> cleared when the job is completed.  Hm, that seems to make sense.  Did I
> forget anything, John?
> 
> In any case, I don't think this patch implemented anything to this
> regard...?  So it doesn't really implement incremental mirroring.
> However, I think it should, if possible.
> 
> Max
> 


-- 
Best regards,
Vladimir
Re: [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode
Posted by Eric Blake 5 years, 1 month ago
On 3/12/19 3:19 AM, Vladimir Sementsov-Ogievskiy wrote:

>> So one important point about incremental backups is that you can
>> actually do them incrementally: The bitmap is effectively cleared at the
>> beginning of the backup process (a successor bitmap is installed that is
>> cleared and receives all changes; at the end of the backup, it either
>> replaces the old bitmap (on success) or is merged into it (on failure)).
>>   Therefore, you can do the next incremental backup by using the same bitmap.
> 
> Hmm, I heard in some other thread that Eric finally decided to always start
> backup on copied bitmap in libvirt, so this logic is skipped. Do we need it
> for mirror? Sorry if I'm wrong, Eric, am I?

You are correct that my current libvirt patches do incremental backup
mode with a temporary bitmap (so whether the temporary bitmap is cleared
on success or left alone is irrelevant; the temporary bitmap goes away
afterwards). But just because libvirt isn't using that particular feature
of qemu's incremental backups does not excuse qemu from thinking about
other clients that might be impacted by doing incremental backup with a
live bitmap, where qemu's management of the bitmap matters.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode
Posted by John Snow 5 years, 1 month ago

On 3/12/19 9:43 AM, Eric Blake wrote:
> On 3/12/19 3:19 AM, Vladimir Sementsov-Ogievskiy wrote:
> 
>>> So one important point about incremental backups is that you can
>>> actually do them incrementally: The bitmap is effectively cleared at the
>>> beginning of the backup process (a successor bitmap is installed that is
>>> cleared and receives all changes; at the end of the backup, it either
>>> replaces the old bitmap (on success) or is merged into it (on failure)).
>>>   Therefore, you can do the next incremental backup by using the same bitmap.
>>
>> Hmm, I heard in some other thread that Eric finally decided to always start
>> backup on copied bitmap in libvirt, so this logic is skipped. Do we need it
>> for mirror? Sorry if I'm wrong, Eric, am I?
> 
> You are correct that my current libvirt patches do incremental backup
> mode with a temporary bitmap (so regardless of whether the temporary
> bitmap is cleared on success or left alone is irrelevant, the temporary
> bitmap goes away afterwards). But just because libvirt isn't using that
> particular feature of qemu's incremental backups does not excuse qemu
> from not thinking about other clients that might be impacted by doing
> incremental backup with a live bitmap, where qemu management of the
> bitmap matters.
> 

I'd prefer adding SYNC=DIFFERENTIAL to intuit that the bitmap isn't
modified after use, if there's no reason to add an actual "incremental"
mode to mirror.

Then it would be fine for mirror to implement differential and not
incremental, and still use the bitmap.
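
As a sketch, the proposal would sit next to the existing sync values. The payloads below are illustrative QMP commands written as Python dicts; `"sync": "differential"` for drive-mirror is the proposed value and does not exist in QEMU at the time of this thread:

```python
# Contrast the existing incremental drive-backup with the proposed
# differential drive-mirror. Only the first command is real QEMU API.

existing_backup = {
    "execute": "drive-backup",
    "arguments": {
        "device": "drive0",
        "target": "backup.qcow2",
        "sync": "incremental",   # bitmap is cleared on success
        "bitmap": "bitmap0",
    },
}

proposed_mirror = {
    "execute": "drive-mirror",
    "arguments": {
        "device": "drive0",
        "target": "mirror.qcow2",
        "sync": "differential",  # proposed: bitmap left unmodified
        "bitmap": "bitmap0",
    },
}
```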

--js

Re: [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode
Posted by Vladimir Sementsov-Ogievskiy 5 years, 1 month ago
12.03.2019 18:58, John Snow wrote:
> 
> 
> On 3/12/19 9:43 AM, Eric Blake wrote:
>> On 3/12/19 3:19 AM, Vladimir Sementsov-Ogievskiy wrote:
>>
>>>> So one important point about incremental backups is that you can
>>>> actually do them incrementally: The bitmap is effectively cleared at the
>>>> beginning of the backup process (a successor bitmap is installed that is
>>>> cleared and receives all changes; at the end of the backup, it either
>>>> replaces the old bitmap (on success) or is merged into it (on failure)).
>>>>    Therefore, you can do the next incremental backup by using the same bitmap.
>>>
>>> Hmm, I heard in some other thread that Eric finally decided to always start
>>> backup on copied bitmap in libvirt, so this logic is skipped. Do we need it
>>> for mirror? Sorry if I'm wrong, Eric, am I?
>>
>> You are correct that my current libvirt patches do incremental backup
>> mode with a temporary bitmap (so regardless of whether the temporary
>> bitmap is cleared on success or left alone is irrelevant, the temporary
>> bitmap goes away afterwards). But just because libvirt isn't using that
>> particular feature of qemu's incremental backups does not excuse qemu
>> from not thinking about other clients that might be impacted by doing
>> incremental backup with a live bitmap, where qemu management of the
>> bitmap matters.
>>
> 
> I'd prefer adding SYNC=DIFFERENTIAL to intuit that the bitmap isn't
> modified after use, if there's no reason to add an actual "incremental"
> mode to mirror.
> 
> Then it would be fine for mirror to implement differential and not
> incremental, and still use the bitmap.
> 

Agreed, this is better.

But I have an idea: could we just add a parameter @x-bitmap, independent of @sync?

In this case, we could get NONE mode restricted by a bitmap, which may be usable too.
And FULL mode with a bitmap would be what you called DIFFERENTIAL. Not sure whether
TOP mode with a bitmap could be meaningful.


Also, it may be useful at some point to share a bitmap between jobs, so we could
start backup(sync=none) from the active disk to a local temp node, and then mirror
from temp to target. That gives a backup which (1) doesn't slow down guest writes
if the target is far and slow, and (2) is fast, since it is mostly a mirror,
keeping in mind that mirror is faster than backup because it uses an async pattern
and block_status.

And to make such a backup incremental, we need to share the dirty bitmap between
backup and mirror, and this bitmap should be cleared when something is copied out
from the source (it may be cleared both by backup(sync=none) and by mirror)... But
I'm not sure that the shared bitmap should be the same as the bitmap which
initializes differential/incremental mode. And this is why I propose the new
parameter to be experimental.
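
The shared-bitmap idea can be sketched as a toy model: guest writes set bits, and whichever job copies a cluster out of the source clears its bit, so the data is copied only once. This is purely illustrative; no such bitmap sharing exists in QEMU at the time of this thread, and the function names are invented:

```python
# Toy model of a dirty bitmap shared between backup(sync=none) and a
# mirror job. Either job clears a bit once the cluster has left the
# source node, so the other job skips it.

shared_bitmap = set()

def guest_write(cluster):
    shared_bitmap.add(cluster)

def copy_out(cluster, by):
    # Called by either job after copying the cluster from the source.
    shared_bitmap.discard(cluster)
    return (by, cluster)

guest_write(4)
guest_write(9)
copy_out(4, by="backup-cbw")  # copy-before-write grabbed cluster 4
copy_out(9, by="mirror")      # mirror's copy loop took cluster 9
print(shared_bitmap)          # set()
```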

-- 
Best regards,
Vladimir
Re: [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode
Posted by John Snow 5 years, 1 month ago

On 3/12/19 12:32 PM, Vladimir Sementsov-Ogievskiy wrote:
> 12.03.2019 18:58, John Snow wrote:
>>
>>
>> On 3/12/19 9:43 AM, Eric Blake wrote:
>>> On 3/12/19 3:19 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>
>>>>> So one important point about incremental backups is that you can
>>>>> actually do them incrementally: The bitmap is effectively cleared at the
>>>>> beginning of the backup process (a successor bitmap is installed that is
>>>>> cleared and receives all changes; at the end of the backup, it either
>>>>> replaces the old bitmap (on success) or is merged into it (on failure)).
>>>>>    Therefore, you can do the next incremental backup by using the same bitmap.
>>>>
>>>> Hmm, I heard in some other thread that Eric finally decided to always start
>>>> backup on copied bitmap in libvirt, so this logic is skipped. Do we need it
>>>> for mirror? Sorry if I'm wrong, Eric, am I?
>>>
>>> You are correct that my current libvirt patches do incremental backup
>>> mode with a temporary bitmap (so regardless of whether the temporary
>>> bitmap is cleared on success or left alone is irrelevant, the temporary
>>> bitmap goes away afterwards). But just because libvirt isn't using that
>>> particular feature of qemu's incremental backups does not excuse qemu
>>> from not thinking about other clients that might be impacted by doing
>>> incremental backup with a live bitmap, where qemu management of the
>>> bitmap matters.
>>>
>>
>> I'd prefer adding SYNC=DIFFERENTIAL to intuit that the bitmap isn't
>> modified after use, if there's no reason to add an actual "incremental"
>> mode to mirror.
>>
>> Then it would be fine for mirror to implement differential and not
>> incremental, and still use the bitmap.
>>
> 
> Agree, this is better.
> 
> But I have an idea: could we just add parameter @x-bitmap, independent of @sync ?
> 

It's a thought. Fam's original sketch for bitmaps included a separate
parameter for how to clean up the bitmap at the end, but it was removed
because we didn't have use cases for it at the time.

> In this case, we can get NONE mode reduced by bitmap, which may be usable too.

What do you have in mind for the use case?

> And FULL mode with bitmap would be what you called DIFFERENTIAL. Not sure about
> could TOP mode with bitmap be meaningful.
> 
> 
> Also it may be useful at some point to share bitmap between jobs, so we could
> 
> start backup(sync=none) from active disk to local temp node, and then mirror from temp
> to target. Which gives backup which (1) don't slow down guest writes if target is far and slow
> and (2) fast as it is mostly mirror, keeping in mind that mirror is faster than backup as it uses
> async pattern and block_status.
> 

In effect, a disk-backed buffer so we don't lose performance on slow
network links, right?

> And to make such a backup incremental we need to share dirty bitmap between backup and mirror,
> and this bitmap should be cleared when something is copied out from source (it may be cleared
> both by backup(sync=none) and mirror)... But I'm not sure that shared bitmap should be the same
> as bitmap which initializes differential/incremental mode. And this is why I propose new parameter
> to be experimental.
> 

This could be very cool; how does this vision fit in with the vision for
a unified block-copy job? (i.e. unifying mirror, stream, backup and commit.)

Worth thinking about, because we also want the ability to "resume"
mirror jobs, too.

Regardless, I think the correct thing to do in the short term is to add
a differential sync, since we already have sync modes that determine how
to use the bitmap, and this would be part of a newer paradigm that we
can introduce later.

--js

Re: [Qemu-devel] [PATCH RESEND v4] drive-mirror: add incremental mode
Posted by Vladimir Sementsov-Ogievskiy 5 years, 1 month ago

On 13.03.2019 19:52, John Snow wrote:
> 
> 
> On 3/12/19 12:32 PM, Vladimir Sementsov-Ogievskiy wrote:
>> 12.03.2019 18:58, John Snow wrote:
>>>
>>>
>>> On 3/12/19 9:43 AM, Eric Blake wrote:
>>>> On 3/12/19 3:19 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>>
>>>>>> So one important point about incremental backups is that you can
>>>>>> actually do them incrementally: The bitmap is effectively cleared at the
>>>>>> beginning of the backup process (a successor bitmap is installed that is
>>>>>> cleared and receives all changes; at the end of the backup, it either
>>>>>> replaces the old bitmap (on success) or is merged into it (on failure)).
>>>>>>     Therefore, you can do the next incremental backup by using the same bitmap.
>>>>>
>>>>> Hmm, I heard in some other thread that Eric finally decided to always start
>>>>> backup on copied bitmap in libvirt, so this logic is skipped. Do we need it
>>>>> for mirror? Sorry if I'm wrong, Eric, am I?
>>>>
>>>> You are correct that my current libvirt patches do incremental backup
>>>> mode with a temporary bitmap (so regardless of whether the temporary
>>>> bitmap is cleared on success or left alone is irrelevant, the temporary
>>>> bitmap goes away afterwards). But just because libvirt isn't using that
>>>> particular feature of qemu's incremental backups does not excuse qemu
>>>> from not thinking about other clients that might be impacted by doing
>>>> incremental backup with a live bitmap, where qemu management of the
>>>> bitmap matters.
>>>>
>>>
>>> I'd prefer adding SYNC=DIFFERENTIAL to intuit that the bitmap isn't
>>> modified after use, if there's no reason to add an actual "incremental"
>>> mode to mirror.
>>>
>>> Then it would be fine for mirror to implement differential and not
>>> incremental, and still use the bitmap.
>>>
>>
>> Agree, this is better.
>>
>> But I have an idea: could we just add parameter @x-bitmap, independent of @sync ?
>>
> 
> It's a thought. Fam's original sketch for bitmaps included a separate
> parameter for how to clean up the bitmap at the end, but it was removed
> because we didn't have use cases for it at the time.
> 
>> In this case, we can get NONE mode reduced by bitmap, which may be usable too.
> 
> What do you have in mind for the use case?

I'm thinking about backup(sync=none), which lacks bitmap support, for the idea
described below. For mirror, I don't know :) To imagine a use case, I'd need to
know the original reason for incremental mirror and the original reason for
none-mode mirror :)

> 
>> And FULL mode with bitmap would be what you called DIFFERENTIAL. Not sure about
>> could TOP mode with bitmap be meaningful.
>>
>>
>> Also it may be useful at some point to share bitmap between jobs, so we could
>>
>> start backup(sync=none) from active disk to local temp node, and then mirror from temp
>> to target. Which gives backup which (1) don't slow down guest writes if target is far and slow
>> and (2) fast as it is mostly mirror, keeping in mind that mirror is faster than backup as it uses
>> async pattern and block_status.
>>
> 
> In effect, a disk-backed buffer so we don't lose performance on slow
> network links, right?

Yes

> 
>> And to make such a backup incremental we need to share dirty bitmap between backup and mirror,
>> and this bitmap should be cleared when something is copied out from source (it may be cleared
>> both by backup(sync=none) and mirror)... But I'm not sure that shared bitmap should be the same
>> as bitmap which initializes differential/incremental mode. And this is why I propose new parameter
>> to be experimental.
>>
> 
> This could be very cool; how does this vision fit in with the vision for
> a unified block-copy job? (i.e. unifying mirror, stream, backup and commit.)

My current plan is to finish the backup-top filter first. Then maybe implement a
disk-backed buffer inside backup, so we don't have to start two jobs. And then
make backup as fast as mirror. Or maybe it will be the backup-top filter that
does most of the work, so it should be fast too.

I often think about unifying the jobs code, but when I try to dig in, it becomes
too difficult. I think we should start by separating the common code into some
kind of library, a separate file. And the first thing to separate seems to be the
async copying loop backed by coroutines and block_status, to share it between all
jobs and qemu-img convert. But even here there are differences; for example,
mirror is the only job which needs several iterations of the loop. Backup needs
to share block_status results with the backup-top filter, to make them useful for
copy-before-write operations. Mirror needs the block_status of the active disk,
while backup actually needs the block_status of the point in time.

And the mirror code is too difficult, which is a problem too ;)

But there must be some generic part which we can separate out as a starting point.
And the idea is to start by unifying the code path, not the interface.
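
The generic coroutine-backed copy loop mentioned above might look roughly like this sketch. It assumes stand-in `block_status` and `copy` callables (not QEMU APIs) and uses Python's asyncio in place of QEMU's coroutine machinery:

```python
import asyncio

# Sketch of a generic async copy loop driven by block_status, of the
# kind that could be shared between jobs and qemu-img convert.

CLUSTER = 64 * 1024

async def copy_loop(block_status, copy, disk_size, max_workers=8):
    sem = asyncio.Semaphore(max_workers)  # bound in-flight requests

    async def copy_one(offset, length):
        async with sem:
            await copy(offset, length)

    tasks = []
    offset = 0
    while offset < disk_size:
        # block_status reports whether this extent holds data and how
        # far the answer extends, so zero/unallocated ranges are skipped.
        allocated, length = await block_status(offset, disk_size - offset)
        if allocated:
            tasks.append(asyncio.create_task(copy_one(offset, length)))
        offset += length
    await asyncio.gather(*tasks)

# Tiny demo: only the middle of three clusters is allocated.
async def demo():
    copied = []

    async def block_status(offset, _max):
        return (offset == CLUSTER, CLUSTER)

    async def copy(offset, length):
        copied.append((offset, length))

    await copy_loop(block_status, copy, 3 * CLUSTER)
    return copied

print(asyncio.run(demo()))  # [(65536, 65536)]
```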

> 
> Worth thinking about, because we also want the ability to "resume"
> mirror jobs, too.
> 
> Regardless, I think the correct thing to do in the short term is to add
> a differential sync, since we already have sync modes that determine how
> to use the bitmap, and this would be part of a newer paradigm that we
> can introduce later.

No objections :)