[RFC v2] migration: Add migrate-set-bitmap-node-mapping

Max Reitz posted 1 patch 3 years, 11 months ago
Test docker-mingw@fedora passed
Test checkpatch passed
Test asan passed
Test docker-quick@centos7 passed
Test FreeBSD passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20200513145610.1484567-1-mreitz@redhat.com
qapi/migration.json            | 36 ++++++++++++++++++++
migration/block-dirty-bitmap.c | 60 ++++++++++++++++++++++++++++++++--
2 files changed, 94 insertions(+), 2 deletions(-)
[RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Max Reitz 3 years, 11 months ago
This command allows mapping block node names to aliases for the purpose
of block dirty bitmap migration.

This way, management tools can use different node names on the source
and destination and pass the mapping of how bitmaps are to be
transferred to qemu (on the source, the destination, or even both with
arbitrary aliases in the migration stream).

Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
Branch: https://github.com/XanClic/qemu.git migration-bitmap-mapping-rfc-v2
Branch: https://git.xanclic.moe/XanClic/qemu.git migration-bitmap-mapping-rfc-v2

(Sorry, v1 was just broken.  This one should work better.)

Vladimir has proposed something like this in April:
https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg00171.html

Now I’ve been asked by my manager to look at this, so I decided to just
write a patch to see how it’d play out.

This is an RFC, because I’d like to tack on tests to the final version,
but I’m not sure whether I can come up with something before the end of
the week (and I’ll be on PTO for the next two weeks).

Also, I don’t know whether migration/block-dirty-bitmap.c is the best
place to put qmp_migrate_set_bitmap_mapping(), but it appears we already
have some QMP handlers in migration/, so I suppose it isn’t too bad.
---
 qapi/migration.json            | 36 ++++++++++++++++++++
 migration/block-dirty-bitmap.c | 60 ++++++++++++++++++++++++++++++++--
 2 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index d5000558c6..97037ea635 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1621,3 +1621,39 @@
 ##
 { 'event': 'UNPLUG_PRIMARY',
   'data': { 'device-id': 'str' } }
+
+##
+# @MigrationBlockNodeMapping:
+#
+# Maps a block node name to an alias for migration.
+#
+# @node-name: A block node name.
+#
+# @alias: An alias name for migration (for example the node name on
+#         the opposite site).
+#
+# Since: 5.1
+##
+{ 'struct': 'MigrationBlockNodeMapping',
+  'data': {
+      'node-name': 'str',
+      'alias': 'str'
+  } }
+
+##
+# @migrate-set-bitmap-node-mapping:
+#
+# Maps block node names to arbitrary aliases for the purpose of dirty
+# bitmap migration.  Such aliases may for example be the corresponding
+# node names on the opposite site.
+#
+# By default, every node name is mapped to itself.
+#
+# @mapping: The mapping; must be one-to-one, but not necessarily
+#           complete.  Any mapping not given will be reset to the
+#           default (i.e. the identity mapping).
+#
+# Since: 5.1
+##
+{ 'command': 'migrate-set-bitmap-node-mapping',
+  'data': { 'mapping': ['MigrationBlockNodeMapping'] } }
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 7eafface61..73f400e7ea 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -73,6 +73,8 @@
 #include "qemu/hbitmap.h"
 #include "qemu/cutils.h"
 #include "qapi/error.h"
+#include "qapi/qapi-commands-migration.h"
+#include "qapi/qmp/qdict.h"
 #include "trace.h"
 
 #define CHUNK_SIZE     (1 << 10)
@@ -121,6 +123,9 @@ typedef struct DirtyBitmapMigState {
     bool bulk_completed;
     bool no_bitmaps;
 
+    QDict *node_in_mapping;
+    QDict *node_out_mapping;
+
     /* for send_bitmap_bits() */
     BlockDriverState *prev_bs;
     BdrvDirtyBitmap *prev_bitmap;
@@ -281,8 +286,13 @@ static int init_dirty_bitmap_migration(void)
     dirty_bitmap_mig_state.no_bitmaps = false;
 
     for (bs = bdrv_next_all_states(NULL); bs; bs = bdrv_next_all_states(bs)) {
+        const QDict *map = dirty_bitmap_mig_state.node_out_mapping;
         const char *name = bdrv_get_device_or_node_name(bs);
 
+        if (map) {
+            name = qdict_get_try_str(map, name) ?: name;
+        }
+
         FOR_EACH_DIRTY_BITMAP(bs, bitmap) {
             if (!bdrv_dirty_bitmap_name(bitmap)) {
                 continue;
@@ -600,6 +610,8 @@ static int dirty_bitmap_load_bits(QEMUFile *f, DirtyBitmapLoadState *s)
 
 static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
 {
+    const QDict *map = dirty_bitmap_mig_state.node_in_mapping;
+    const char *mapped_node = "(none)";
     Error *local_err = NULL;
     bool nothing;
     s->flags = qemu_get_bitmap_flags(f);
@@ -612,7 +624,13 @@ static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
             error_report("Unable to read node name string");
             return -EINVAL;
         }
-        s->bs = bdrv_lookup_bs(s->node_name, s->node_name, &local_err);
+
+        mapped_node = s->node_name;
+        if (map) {
+            mapped_node = qdict_get_try_str(map, mapped_node) ?: mapped_node;
+        }
+
+        s->bs = bdrv_lookup_bs(mapped_node, mapped_node, &local_err);
         if (!s->bs) {
             error_report_err(local_err);
             return -EINVAL;
@@ -634,7 +652,7 @@ static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
         if (!s->bitmap && !(s->flags & DIRTY_BITMAP_MIG_FLAG_START)) {
             error_report("Error: unknown dirty bitmap "
                          "'%s' for block device '%s'",
-                         s->bitmap_name, s->node_name);
+                         s->bitmap_name, mapped_node);
             return -EINVAL;
         }
     } else if (!s->bitmap && !nothing) {
@@ -713,6 +731,44 @@ static bool dirty_bitmap_has_postcopy(void *opaque)
     return true;
 }
 
+void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList *mapping,
+                                         Error **errp)
+{
+    QDict *in_mapping = qdict_new();
+    QDict *out_mapping = qdict_new();
+
+    for (; mapping; mapping = mapping->next) {
+        MigrationBlockNodeMapping *entry = mapping->value;
+
+        if (qdict_haskey(out_mapping, entry->node_name)) {
+            error_setg(errp, "Cannot map node name '%s' twice",
+                       entry->node_name);
+            goto fail;
+        }
+
+        if (qdict_haskey(in_mapping, entry->alias)) {
+            error_setg(errp, "Cannot use alias '%s' twice",
+                       entry->alias);
+            goto fail;
+        }
+
+        qdict_put_str(in_mapping, entry->alias, entry->node_name);
+        qdict_put_str(out_mapping, entry->node_name, entry->alias);
+    }
+
+    qobject_unref(dirty_bitmap_mig_state.node_in_mapping);
+    qobject_unref(dirty_bitmap_mig_state.node_out_mapping);
+
+    dirty_bitmap_mig_state.node_in_mapping = in_mapping;
+    dirty_bitmap_mig_state.node_out_mapping = out_mapping;
+
+    return;
+
+fail:
+    qobject_unref(in_mapping);
+    qobject_unref(out_mapping);
+}
+
 static SaveVMHandlers savevm_dirty_bitmap_handlers = {
     .save_setup = dirty_bitmap_save_setup,
     .save_live_complete_postcopy = dirty_bitmap_save_complete,
-- 
2.26.2


Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Eric Blake 3 years, 11 months ago
On 5/13/20 9:56 AM, Max Reitz wrote:
> This command allows mapping block node names to aliases for the purpose
> of block dirty bitmap migration.
> 
> This way, management tools can use different node names on the source
> and destination and pass the mapping of how bitmaps are to be
> transferred to qemu (on the source, the destination, or even both with
> arbitrary aliases in the migration stream).
> 
> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---

> @@ -713,6 +731,44 @@ static bool dirty_bitmap_has_postcopy(void *opaque)
>       return true;
>   }
>   
> +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList *mapping,
> +                                         Error **errp)
> +{
> +    QDict *in_mapping = qdict_new();
> +    QDict *out_mapping = qdict_new();
> +
> +    for (; mapping; mapping = mapping->next) {
> +        MigrationBlockNodeMapping *entry = mapping->value;
> +
> +        if (qdict_haskey(out_mapping, entry->node_name)) {
> +            error_setg(errp, "Cannot map node name '%s' twice",
> +                       entry->node_name);
> +            goto fail;
> +        }

Can we call this command more than once?  Is it cumulative (call it once 
to set mapping for "a", second time to also set mapping for "b"), or 
should it reset (second call wipes out all mappings from first call, any 
mappings that must exist must be passed in the final call)?

The idea makes sense, and the interface seems usable.  It's nice that 
either source, destination, or both sides of migration can use it (which 
helps in upgrade vs. downgrade scenarios).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Max Reitz 3 years, 11 months ago
On 13.05.20 18:11, Eric Blake wrote:
> On 5/13/20 9:56 AM, Max Reitz wrote:
>> This command allows mapping block node names to aliases for the purpose
>> of block dirty bitmap migration.
>>
>> This way, management tools can use different node names on the source
>> and destination and pass the mapping of how bitmaps are to be
>> transferred to qemu (on the source, the destination, or even both with
>> arbitrary aliases in the migration stream).
>>
>> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
> 
>> @@ -713,6 +731,44 @@ static bool dirty_bitmap_has_postcopy(void *opaque)
>>       return true;
>>   }
>>   +void
>> qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList
>> *mapping,
>> +                                         Error **errp)
>> +{
>> +    QDict *in_mapping = qdict_new();
>> +    QDict *out_mapping = qdict_new();
>> +
>> +    for (; mapping; mapping = mapping->next) {
>> +        MigrationBlockNodeMapping *entry = mapping->value;
>> +
>> +        if (qdict_haskey(out_mapping, entry->node_name)) {
>> +            error_setg(errp, "Cannot map node name '%s' twice",
>> +                       entry->node_name);
>> +            goto fail;
>> +        }
> 
> Can we call this command more than once?  Is it cumulative (call it once
> to set mapping for "a", second time to also set mapping for "b"), or
> should it reset (second call wipes out all mappings from first call, any
> mappings that must exist must be passed in the final call)?

I tried to make it clear in the documentation:

> +# @mapping: The mapping; must be one-to-one, but not necessarily
> +#           complete.  Any mapping not given will be reset to the
> +#           default (i.e. the identity mapping).

So everything that isn’t set in the second call is reset.  I thought
about what you proposed (because I guess that’s the most intuitive
idea), but after consideration I didn’t see why we’d need different
behavior, so it would only serve to make the code more complicated.

Max

> The idea makes sense, and the interface seems usable.  It's nice that
> either source, destination, or both sides of migration can use it (which
> helps in upgrade vs. downgrade scenarios).

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Kevin Wolf 3 years, 11 months ago
Am 14.05.2020 um 09:13 hat Max Reitz geschrieben:
> On 13.05.20 18:11, Eric Blake wrote:
> > On 5/13/20 9:56 AM, Max Reitz wrote:
> >> This command allows mapping block node names to aliases for the purpose
> >> of block dirty bitmap migration.
> >>
> >> This way, management tools can use different node names on the source
> >> and destination and pass the mapping of how bitmaps are to be
> >> transferred to qemu (on the source, the destination, or even both with
> >> arbitrary aliases in the migration stream).
> >>
> >> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> >> Signed-off-by: Max Reitz <mreitz@redhat.com>
> >> ---
> > 
> >> @@ -713,6 +731,44 @@ static bool dirty_bitmap_has_postcopy(void *opaque)
> >>       return true;
> >>   }
> >>   +void
> >> qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList
> >> *mapping,
> >> +                                         Error **errp)
> >> +{
> >> +    QDict *in_mapping = qdict_new();
> >> +    QDict *out_mapping = qdict_new();
> >> +
> >> +    for (; mapping; mapping = mapping->next) {
> >> +        MigrationBlockNodeMapping *entry = mapping->value;
> >> +
> >> +        if (qdict_haskey(out_mapping, entry->node_name)) {
> >> +            error_setg(errp, "Cannot map node name '%s' twice",
> >> +                       entry->node_name);
> >> +            goto fail;
> >> +        }
> > 
> > Can we call this command more than once?  Is it cumulative (call it once
> > to set mapping for "a", second time to also set mapping for "b"), or
> > should it reset (second call wipes out all mappings from first call, any
> > mappings that must exist must be passed in the final call)?
> 
> I tried to make it clear in the documentation:
> 
> > +# @mapping: The mapping; must be one-to-one, but not necessarily
> > +#           complete.  Any mapping not given will be reset to the
> > +#           default (i.e. the identity mapping).
> 
> So everything that isn’t set in the second call is reset.  I thought
> about what you proposed (because I guess that’s the most intuitive
> idea), but after consideration I didn’t see why we’d need different
> behavior, so it would only serve to make the code more complicated.

Also, if it were cumulative, we would need a separate reset command
because you probably don't want to use the same mapping you used for an
incoming migration when you later migrate away again to a third host.

Kevin
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Vladimir Sementsov-Ogievskiy 3 years, 11 months ago
13.05.2020 17:56, Max Reitz wrote:
> This command allows mapping block node names to aliases for the purpose
> of block dirty bitmap migration.
> 
> This way, management tools can use different node names on the source
> and destination and pass the mapping of how bitmaps are to be
> transferred to qemu (on the source, the destination, or even both with
> arbitrary aliases in the migration stream).
> 
> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
> Branch: https://github.com/XanClic/qemu.git migration-bitmap-mapping-rfc-v2
> Branch: https://git.xanclic.moe/XanClic/qemu.git migration-bitmap-mapping-rfc-v2
> 
> (Sorry, v1 was just broken.  This one should work better.)
> 
> Vladimir has proposed something like this in April:
> https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg00171.html
> 
> Now I’ve been asked by my manager to look at this, so I decided to just
> write a patch to see how it’d play out.

Great! Sometimes I remember about this thing, but never start implementing :)

> 
> This is an RFC, because I’d like to tack on tests to the final version,
> but I’m not sure whether I can come up with something before the end of
> the week (and I’ll be on PTO for the next two weeks).
> 
> Also, I don’t know whether migration/block-dirty-bitmap.c is the best
> place to put qmp_migrate_set_bitmap_mapping(), but it appears we already
> have some QMP handlers in migration/, so I suppose it isn’t too bad.
> ---
>   qapi/migration.json            | 36 ++++++++++++++++++++
>   migration/block-dirty-bitmap.c | 60 ++++++++++++++++++++++++++++++++--
>   2 files changed, 94 insertions(+), 2 deletions(-)
> 
> diff --git a/qapi/migration.json b/qapi/migration.json
> index d5000558c6..97037ea635 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1621,3 +1621,39 @@
>   ##
>   { 'event': 'UNPLUG_PRIMARY',
>     'data': { 'device-id': 'str' } }
> +
> +##
> +# @MigrationBlockNodeMapping:
> +#
> +# Maps a block node name to an alias for migration.
> +#
> +# @node-name: A block node name.
> +#
> +# @alias: An alias name for migration (for example the node name on
> +#         the opposite site).
> +#
> +# Since: 5.1
> +##
> +{ 'struct': 'MigrationBlockNodeMapping',
> +  'data': {
> +      'node-name': 'str',
> +      'alias': 'str'
> +  } }
> +
> +##
> +# @migrate-set-bitmap-node-mapping:
> +#
> +# Maps block node names to arbitrary aliases for the purpose of dirty
> +# bitmap migration.  Such aliases may for example be the corresponding
> +# node names on the opposite site.
> +#
> +# By default, every node name is mapped to itself.
> +#
> +# @mapping: The mapping; must be one-to-one, but not necessarily
> +#           complete.  Any mapping not given will be reset to the
> +#           default (i.e. the identity mapping).
> +#
> +# Since: 5.1
> +##
> +{ 'command': 'migrate-set-bitmap-node-mapping',
> +  'data': { 'mapping': ['MigrationBlockNodeMapping'] } }

Hm. I like it, it's simpler and clearer than what I was thinking about.

1. So, you decided to make only node-mapping, not bitmap-mapping, so we can't rename bitmaps in-flight and can't migrate bitmaps from one node to several and visa-versa. I think it's OK, nothing good in such possibilities, and this simplifies things.

2. If I understand correctly, default to node-name matching doesn't make real sense for libvirt.. But on the other hand, libvirt should not be considered as the ony user of Qemu. Still, the default seems unsafe.. Could we make it optional? Or add an option to disable this default for absolutely strict behavior?

May be, add a parameter

fallback: node-name | error | drop

where
   node-name: use node-name as an alias, if found bitmap on the node not mentioned in @mapping [should not be useful for libvirt, but may be for others]
   error: just error-out if such bitmap found [libvirt should use it, I propose it as a default value for @fallback]
   drop: just ignore such bitmap - it will be lost [just and idea, I doubt that it is really useful]

=======

Also, we should understand (and document here), that default behavior of this command and default behavior of bitmap migration (without this command) are different things:

if command is not called, device names may be used instead of node-names.

> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> index 7eafface61..73f400e7ea 100644
> --- a/migration/block-dirty-bitmap.c
> +++ b/migration/block-dirty-bitmap.c

conflicts with my series "[PATCH v2 00/22] Fix error handling during bitmap postcopy"..Still, not too difficult to rebase my series if this goes first.

> @@ -73,6 +73,8 @@
>   #include "qemu/hbitmap.h"
>   #include "qemu/cutils.h"
>   #include "qapi/error.h"
> +#include "qapi/qapi-commands-migration.h"
> +#include "qapi/qmp/qdict.h"
>   #include "trace.h"
>   
>   #define CHUNK_SIZE     (1 << 10)
> @@ -121,6 +123,9 @@ typedef struct DirtyBitmapMigState {
>       bool bulk_completed;
>       bool no_bitmaps;
>   
> +    QDict *node_in_mapping;
> +    QDict *node_out_mapping;
> +
>       /* for send_bitmap_bits() */
>       BlockDriverState *prev_bs;
>       BdrvDirtyBitmap *prev_bitmap;
> @@ -281,8 +286,13 @@ static int init_dirty_bitmap_migration(void)
>       dirty_bitmap_mig_state.no_bitmaps = false;
>   
>       for (bs = bdrv_next_all_states(NULL); bs; bs = bdrv_next_all_states(bs)) {
> +        const QDict *map = dirty_bitmap_mig_state.node_out_mapping;
>           const char *name = bdrv_get_device_or_node_name(bs);
>   
> +        if (map) {
> +            name = qdict_get_try_str(map, name) ?: name;
> +        }
> +
>           FOR_EACH_DIRTY_BITMAP(bs, bitmap) {
>               if (!bdrv_dirty_bitmap_name(bitmap)) {
>                   continue;
> @@ -600,6 +610,8 @@ static int dirty_bitmap_load_bits(QEMUFile *f, DirtyBitmapLoadState *s)
>   
>   static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
>   {
> +    const QDict *map = dirty_bitmap_mig_state.node_in_mapping;
> +    const char *mapped_node = "(none)";
>       Error *local_err = NULL;
>       bool nothing;
>       s->flags = qemu_get_bitmap_flags(f);
> @@ -612,7 +624,13 @@ static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
>               error_report("Unable to read node name string");
>               return -EINVAL;
>           }
> -        s->bs = bdrv_lookup_bs(s->node_name, s->node_name, &local_err);
> +
> +        mapped_node = s->node_name;

I think, we can rename s->node_name to alias.. as well as in other places in the code, including migration format specification in the comment at file top.

> +        if (map) {
> +            mapped_node = qdict_get_try_str(map, mapped_node) ?: mapped_node;
> +        }
> +
> +        s->bs = bdrv_lookup_bs(mapped_node, mapped_node, &local_err);

Should we search by device name? I think, that if command set-mapping was called, it means that user is node-name oriented, and we should work only with node-names.

>           if (!s->bs) {
>               error_report_err(local_err);
>               return -EINVAL;
> @@ -634,7 +652,7 @@ static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
>           if (!s->bitmap && !(s->flags & DIRTY_BITMAP_MIG_FLAG_START)) {
>               error_report("Error: unknown dirty bitmap "
>                            "'%s' for block device '%s'",
> -                         s->bitmap_name, s->node_name);
> +                         s->bitmap_name, mapped_node);
>               return -EINVAL;
>           }
>       } else if (!s->bitmap && !nothing) {
> @@ -713,6 +731,44 @@ static bool dirty_bitmap_has_postcopy(void *opaque)
>       return true;
>   }
>   
> +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList *mapping,
> +                                         Error **errp)
> +{

Ok, so, here we don't check existing of the nodes or bitmaps in them, and the command may safely called before creating referenced nodes. It's only mapping, nothing more.

> +    QDict *in_mapping = qdict_new();
> +    QDict *out_mapping = qdict_new();
> +
> +    for (; mapping; mapping = mapping->next) {
> +        MigrationBlockNodeMapping *entry = mapping->value;
> +
> +        if (qdict_haskey(out_mapping, entry->node_name)) {
> +            error_setg(errp, "Cannot map node name '%s' twice",
> +                       entry->node_name);
> +            goto fail;
> +        }
> +
> +        if (qdict_haskey(in_mapping, entry->alias)) {
> +            error_setg(errp, "Cannot use alias '%s' twice",
> +                       entry->alias);
> +            goto fail;
> +        }
> +
> +        qdict_put_str(in_mapping, entry->alias, entry->node_name);
> +        qdict_put_str(out_mapping, entry->node_name, entry->alias);
> +    }
> +
> +    qobject_unref(dirty_bitmap_mig_state.node_in_mapping);
> +    qobject_unref(dirty_bitmap_mig_state.node_out_mapping);
> +
> +    dirty_bitmap_mig_state.node_in_mapping = in_mapping;
> +    dirty_bitmap_mig_state.node_out_mapping = out_mapping;
> +
> +    return;
> +
> +fail:
> +    qobject_unref(in_mapping);
> +    qobject_unref(out_mapping);
> +}
> +
>   static SaveVMHandlers savevm_dirty_bitmap_handlers = {
>       .save_setup = dirty_bitmap_save_setup,
>       .save_live_complete_postcopy = dirty_bitmap_save_complete,
> 


-- 
Best regards,
Vladimir

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Max Reitz 3 years, 11 months ago
On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote:
[...]
> 1. So, you decided to make only node-mapping, not bitmap-mapping, so we
> can't rename bitmaps in-flight and can't migrate bitmaps from one node
> to several and visa-versa. I think it's OK, nothing good in such
> possibilities, and this simplifies things.

On second thought, I wonder whether it would be useful to migrate
bitmaps from multiple nodes to a single one.  But probably not, this
would only make sense for filters, really, and in such a case the
bitmaps should probably just be moved prior to migration.

(And I can’t imagine any other case.  When flattening backing chains,
the bitmaps from the dropped layers basically become useless.)

Max

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Vladimir Sementsov-Ogievskiy 3 years, 11 months ago
14.05.2020 12:09, Max Reitz wrote:
> On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote:
> [...]
>> 1. So, you decided to make only node-mapping, not bitmap-mapping, so we
>> can't rename bitmaps in-flight and can't migrate bitmaps from one node
>> to several and visa-versa. I think it's OK, nothing good in such
>> possibilities, and this simplifies things.
> 
> On second thought, I wonder whether it would be useful to migrate
> bitmaps from multiple nodes to a single one.  But probably not, this
> would only make sense for filters, really, and in such a case the
> bitmaps should probably just be moved prior to migration.

Yes, I think the moving before migration is enough.


-- 
Best regards,
Vladimir

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Kevin Wolf 3 years, 11 months ago
Am 14.05.2020 um 11:09 hat Max Reitz geschrieben:
> On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote:
> [...]
> > 1. So, you decided to make only node-mapping, not bitmap-mapping, so we
> > can't rename bitmaps in-flight and can't migrate bitmaps from one node
> > to several and visa-versa. I think it's OK, nothing good in such
> > possibilities, and this simplifies things.
> 
> On second thought, I wonder whether it would be useful to migrate
> bitmaps from multiple nodes to a single one.  But probably not, this
> would only make sense for filters, really, and in such a case the
> bitmaps should probably just be moved prior to migration.
> 
> (And I can’t imagine any other case.  When flattening backing chains,
> the bitmaps from the dropped layers basically become useless.)

I agree, you can always move the bitmaps in a separate step, either on
the source before migration or on the destination afterwards. Migration
is already complicated enough, let's not move things into it that don't
necessarily have to be there.

You would get complications like possibly conflicting bitmap names when
you migrate the bitmaps of two nodes into a single one.

Kevin
Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Max Reitz 3 years, 11 months ago
On 13.05.20 22:09, Vladimir Sementsov-Ogievskiy wrote:
> 13.05.2020 17:56, Max Reitz wrote:
>> This command allows mapping block node names to aliases for the purpose
>> of block dirty bitmap migration.
>>
>> This way, management tools can use different node names on the source
>> and destination and pass the mapping of how bitmaps are to be
>> transferred to qemu (on the source, the destination, or even both with
>> arbitrary aliases in the migration stream).
>>
>> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>> Branch: https://github.com/XanClic/qemu.git
>> migration-bitmap-mapping-rfc-v2
>> Branch: https://git.xanclic.moe/XanClic/qemu.git
>> migration-bitmap-mapping-rfc-v2
>>
>> (Sorry, v1 was just broken.  This one should work better.)
>>
>> Vladimir has proposed something like this in April:
>> https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg00171.html
>>
>> Now I’ve been asked by my manager to look at this, so I decided to just
>> write a patch to see how it’d play out.
> 
> Great! Sometimes I remember about this thing, but never start
> implementing :)
> 
>>
>> This is an RFC, because I’d like to tack on tests to the final version,
>> but I’m not sure whether I can come up with something before the end of
>> the week (and I’ll be on PTO for the next two weeks).
>>
>> Also, I don’t know whether migration/block-dirty-bitmap.c is the best
>> place to put qmp_migrate_set_bitmap_mapping(), but it appears we already
>> have some QMP handlers in migration/, so I suppose it isn’t too bad.
>> ---
>>   qapi/migration.json            | 36 ++++++++++++++++++++
>>   migration/block-dirty-bitmap.c | 60 ++++++++++++++++++++++++++++++++--
>>   2 files changed, 94 insertions(+), 2 deletions(-)
>>
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index d5000558c6..97037ea635 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -1621,3 +1621,39 @@
>>   ##
>>   { 'event': 'UNPLUG_PRIMARY',
>>     'data': { 'device-id': 'str' } }
>> +
>> +##
>> +# @MigrationBlockNodeMapping:
>> +#
>> +# Maps a block node name to an alias for migration.
>> +#
>> +# @node-name: A block node name.
>> +#
>> +# @alias: An alias name for migration (for example the node name on
>> +#         the opposite site).
>> +#
>> +# Since: 5.1
>> +##
>> +{ 'struct': 'MigrationBlockNodeMapping',
>> +  'data': {
>> +      'node-name': 'str',
>> +      'alias': 'str'
>> +  } }
>> +
>> +##
>> +# @migrate-set-bitmap-node-mapping:
>> +#
>> +# Maps block node names to arbitrary aliases for the purpose of dirty
>> +# bitmap migration.  Such aliases may for example be the corresponding
>> +# node names on the opposite site.
>> +#
>> +# By default, every node name is mapped to itself.
>> +#
>> +# @mapping: The mapping; must be one-to-one, but not necessarily
>> +#           complete.  Any mapping not given will be reset to the
>> +#           default (i.e. the identity mapping).
>> +#
>> +# Since: 5.1
>> +##
>> +{ 'command': 'migrate-set-bitmap-node-mapping',
>> +  'data': { 'mapping': ['MigrationBlockNodeMapping'] } }
> 
> Hm. I like it, it's simpler and clearer than what I was thinking about.
> 
> 1. So, you decided to make only node-mapping, not bitmap-mapping, so we
> can't rename bitmaps in-flight and can't migrate bitmaps from one node
> to several and visa-versa. I think it's OK, nothing good in such
> possibilities, and this simplifies things.

If it turns out that we’d want it, I suppose we can also still always
extend MigrationBlockNodeMapping by another mapping array for bitmaps.

> 2. If I understand correctly, default to node-name matching doesn't make
> real sense for libvirt.. But on the other hand, libvirt should not be
> considered as the ony user of Qemu. Still, the default seems unsafe..
> Could we make it optional? Or add an option to disable this default for
> absolutely strict behavior?

It was my understanding that libvirt (which should know about all
bitmaps on all nodes) would and could ensure itself that all nodes are
mapped according to what it needs.  (But that’s why Peter is CC’d, to
get his input.)

But your idea seems simple, so why not.

> May be, add a parameter
> 
> fallback: node-name | error | drop
> 
> where
>   node-name: use node-name as an alias, if found bitmap on the node not
> mentioned in @mapping [should not be useful for libvirt, but may be for
> others]
>   error: just error-out if such bitmap found [libvirt should use it, I
> propose it as a default value for @fallback]

You mean error out during migration?  Hm.  I suppose that’s OK, if some
mapping erroneously isn’t set and the node name doesn’t exist in the
destination, we’ll error out, too, so...

Shouldn’t be too difficult to implement, just put the enum in
dirty_bitmap_mig_state, and then do what it says when no entry can be
found in the mapping QDict.

>   drop: just ignore such bitmap - it will be lost [just and idea, I
> doubt that it is really useful]
> 
> =======
> 
> Also, we should understand (and document here), that default behavior of
> this command and default behavior of bitmap migration (without this
> command) are different things:
> 
> if command is not called, device names may be used instead of node-names.

Ah, yes.  Well, that’s actually a real problem with my current
implementation then, too, because...

>> diff --git a/migration/block-dirty-bitmap.c
>> b/migration/block-dirty-bitmap.c
>> index 7eafface61..73f400e7ea 100644
>> --- a/migration/block-dirty-bitmap.c
>> +++ b/migration/block-dirty-bitmap.c
> 
> conflicts with my series "[PATCH v2 00/22] Fix error handling during
> bitmap postcopy"..Still, not too difficult to rebase my series if this
> goes first.
> 
>> @@ -73,6 +73,8 @@
>>   #include "qemu/hbitmap.h"
>>   #include "qemu/cutils.h"
>>   #include "qapi/error.h"
>> +#include "qapi/qapi-commands-migration.h"
>> +#include "qapi/qmp/qdict.h"
>>   #include "trace.h"
>>     #define CHUNK_SIZE     (1 << 10)
>> @@ -121,6 +123,9 @@ typedef struct DirtyBitmapMigState {
>>       bool bulk_completed;
>>       bool no_bitmaps;
>>   +    QDict *node_in_mapping;
>> +    QDict *node_out_mapping;
>> +
>>       /* for send_bitmap_bits() */
>>       BlockDriverState *prev_bs;
>>       BdrvDirtyBitmap *prev_bitmap;
>> @@ -281,8 +286,13 @@ static int init_dirty_bitmap_migration(void)
>>       dirty_bitmap_mig_state.no_bitmaps = false;
>>         for (bs = bdrv_next_all_states(NULL); bs; bs =
>> bdrv_next_all_states(bs)) {
>> +        const QDict *map = dirty_bitmap_mig_state.node_out_mapping;
>>           const char *name = bdrv_get_device_or_node_name(bs);
>>   +        if (map) {
>> +            name = qdict_get_try_str(map, name) ?: name;

@name may be a device name, so it doesn’t match the interface
description.  We must use the node name for the map.

>> +        }
>> +
>>           FOR_EACH_DIRTY_BITMAP(bs, bitmap) {
>>               if (!bdrv_dirty_bitmap_name(bitmap)) {
>>                   continue;
>> @@ -600,6 +610,8 @@ static int dirty_bitmap_load_bits(QEMUFile *f,
>> DirtyBitmapLoadState *s)
>>     static int dirty_bitmap_load_header(QEMUFile *f,
>> DirtyBitmapLoadState *s)
>>   {
>> +    const QDict *map = dirty_bitmap_mig_state.node_in_mapping;
>> +    const char *mapped_node = "(none)";
>>       Error *local_err = NULL;
>>       bool nothing;
>>       s->flags = qemu_get_bitmap_flags(f);
>> @@ -612,7 +624,13 @@ static int dirty_bitmap_load_header(QEMUFile *f,
>> DirtyBitmapLoadState *s)
>>               error_report("Unable to read node name string");
>>               return -EINVAL;
>>           }
>> -        s->bs = bdrv_lookup_bs(s->node_name, s->node_name, &local_err);
>> +
>> +        mapped_node = s->node_name;
> 
> I think, we can rename s->node_name to alias.. as well as in other
> places in the code, including migration format specification in the
> comment at file top.

Why not.

>> +        if (map) {
>> +            mapped_node = qdict_get_try_str(map, mapped_node) ?:
>> mapped_node;
>> +        }
>> +
>> +        s->bs = bdrv_lookup_bs(mapped_node, mapped_node, &local_err);
> 
> Should we search by device name? I think, that if command set-mapping
> was called, it means that user is node-name oriented, and we should work
> only with node-names.

Yes, we shouldn’t.

>>           if (!s->bs) {
>>               error_report_err(local_err);
>>               return -EINVAL;
>> @@ -634,7 +652,7 @@ static int dirty_bitmap_load_header(QEMUFile *f,
>> DirtyBitmapLoadState *s)
>>           if (!s->bitmap && !(s->flags & DIRTY_BITMAP_MIG_FLAG_START)) {
>>               error_report("Error: unknown dirty bitmap "
>>                            "'%s' for block device '%s'",
>> -                         s->bitmap_name, s->node_name);
>> +                         s->bitmap_name, mapped_node);
>>               return -EINVAL;
>>           }
>>       } else if (!s->bitmap && !nothing) {
>> @@ -713,6 +731,44 @@ static bool dirty_bitmap_has_postcopy(void *opaque)
>>       return true;
>>   }
>>   +void
>> qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList
>> *mapping,
>> +                                         Error **errp)
>> +{
> 
> Ok, so, here we don't check existing of the nodes or bitmaps in them,
> and the command may safely called before creating referenced nodes. It's
> only mapping, nothing more.

Yes.  That seemed simpler to me than, say, look up the BDSs and take
references on them (and keep them until migration).  I’m not sure
whether that’s an interface we’d want in the block layer (because there
node names are to reference nodes at the time of invoking the command),
but this isn’t really the block layer, so I think/hope it should be fine.

Max

>> +    QDict *in_mapping = qdict_new();
>> +    QDict *out_mapping = qdict_new();
>> +
>> +    for (; mapping; mapping = mapping->next) {
>> +        MigrationBlockNodeMapping *entry = mapping->value;
>> +
>> +        if (qdict_haskey(out_mapping, entry->node_name)) {
>> +            error_setg(errp, "Cannot map node name '%s' twice",
>> +                       entry->node_name);
>> +            goto fail;
>> +        }
>> +
>> +        if (qdict_haskey(in_mapping, entry->alias)) {
>> +            error_setg(errp, "Cannot use alias '%s' twice",
>> +                       entry->alias);
>> +            goto fail;
>> +        }
>> +
>> +        qdict_put_str(in_mapping, entry->alias, entry->node_name);
>> +        qdict_put_str(out_mapping, entry->node_name, entry->alias);
>> +    }
>> +
>> +    qobject_unref(dirty_bitmap_mig_state.node_in_mapping);
>> +    qobject_unref(dirty_bitmap_mig_state.node_out_mapping);
>> +
>> +    dirty_bitmap_mig_state.node_in_mapping = in_mapping;
>> +    dirty_bitmap_mig_state.node_out_mapping = out_mapping;
>> +
>> +    return;
>> +
>> +fail:
>> +    qobject_unref(in_mapping);
>> +    qobject_unref(out_mapping);
>> +}
>> +
>>   static SaveVMHandlers savevm_dirty_bitmap_handlers = {
>>       .save_setup = dirty_bitmap_save_setup,
>>       .save_live_complete_postcopy = dirty_bitmap_save_complete,
>>
> 
> 


Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Peter Krempa 3 years, 11 months ago
On Wed, May 13, 2020 at 16:56:10 +0200, Max Reitz wrote:
> This command allows mapping block node names to aliases for the purpose
> of block dirty bitmap migration.
> 
> This way, management tools can use different node names on the source
> and destination and pass the mapping of how bitmaps are to be
> transferred to qemu (on the source, the destination, or even both with
> arbitrary aliases in the migration stream).
> 
> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---

[...]

> ---
>  qapi/migration.json            | 36 ++++++++++++++++++++
>  migration/block-dirty-bitmap.c | 60 ++++++++++++++++++++++++++++++++--
>  2 files changed, 94 insertions(+), 2 deletions(-)

I just started to write some quick and dirty hacks to test use of this
infrastructure in libvirt. I have 2 quick observations for now:

> 
> diff --git a/qapi/migration.json b/qapi/migration.json
> index d5000558c6..97037ea635 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -1621,3 +1621,39 @@
>  ##
>  { 'event': 'UNPLUG_PRIMARY',
>    'data': { 'device-id': 'str' } }
> +
> +##
> +# @MigrationBlockNodeMapping:
> +#
> +# Maps a block node name to an alias for migration.
> +#
> +# @node-name: A block node name.
> +#
> +# @alias: An alias name for migration (for example the node name on
> +#         the opposite site).
> +#
> +# Since: 5.1
> +##
> +{ 'struct': 'MigrationBlockNodeMapping',
> +  'data': {
> +      'node-name': 'str',
> +      'alias': 'str'
> +  } }

We'd probably like a
'nodename:bitmapname' -> 'alias' mapping so that we can select which
bitmaps to migrate and where to migrate them to. The specific use case
is following:

Libvirt supports migration without shared storage (NFS/etc) where we
copy the disk images prior to the memory migration using qemu's NBD
server and the blockdev-mirror job. By default and the most simple way
which doesn't require users fudging with the disk images and copying
backing images themselves is that we flatten all the backing chain
during the copy ("sync":"full"). In this mode we'll need to do some
merging of the bitmaps prior to finalizing the copy.

It's not a problem to do it ourselves, but in the end we'll need to copy
only certain bitmaps which will be created temporarily on the source to
the destination where they'll be persisted.

For now (until I implement the use of the dirty-bitmap-populate blockjob
which I'm also doing in parallel with this kind of) when creating a
snapshot we create a new active bitmap in the overlay for every active
bitmap in the snapshotted image.

When flattening we'll then need to merge the appropriate ones. As said
it's not a problem to prepare all the bitmaps before but we then don't
need to migrate all of them.

By the way, that brings me to another question:

Is there any difference of handling of persistent and non-persistent
bitmaps? Specifically I'm curious what's the correct approach to do the
shared storage migration case when the source and destination image is
the same one. Should we also instruct to migrate the active ones? or are
they migrated by writing them to the image and reloading them?

> +##
> +# @migrate-set-bitmap-node-mapping:

qemu-5.0 deprecated a bunch of migrate-set- specific commands in favor
of migrate-set-parameters. Is it worth/necessary to add a new command
here?

> +#
> +# Maps block node names to arbitrary aliases for the purpose of dirty
> +# bitmap migration.  Such aliases may for example be the corresponding
> +# node names on the opposite site.
> +#
> +# By default, every node name is mapped to itself.
> +#
> +# @mapping: The mapping; must be one-to-one, but not necessarily
> +#           complete.  Any mapping not given will be reset to the
> +#           default (i.e. the identity mapping).
> +#
> +# Since: 5.1
> +##



Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Vladimir Sementsov-Ogievskiy 3 years, 11 months ago
[add Nikolay]

18.05.2020 19:26, Peter Krempa wrote:
> On Wed, May 13, 2020 at 16:56:10 +0200, Max Reitz wrote:
>> This command allows mapping block node names to aliases for the purpose
>> of block dirty bitmap migration.
>>
>> This way, management tools can use different node names on the source
>> and destination and pass the mapping of how bitmaps are to be
>> transferred to qemu (on the source, the destination, or even both with
>> arbitrary aliases in the migration stream).
>>
>> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
> 
> [...]
> 
>> ---
>>   qapi/migration.json            | 36 ++++++++++++++++++++
>>   migration/block-dirty-bitmap.c | 60 ++++++++++++++++++++++++++++++++--
>>   2 files changed, 94 insertions(+), 2 deletions(-)
> 
> I just started to write some quick and dirty hacks to test use of this
> infrastructure in libvirt. I have 2 quick observations for now:
> 
>>
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index d5000558c6..97037ea635 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -1621,3 +1621,39 @@
>>   ##
>>   { 'event': 'UNPLUG_PRIMARY',
>>     'data': { 'device-id': 'str' } }
>> +
>> +##
>> +# @MigrationBlockNodeMapping:
>> +#
>> +# Maps a block node name to an alias for migration.
>> +#
>> +# @node-name: A block node name.
>> +#
>> +# @alias: An alias name for migration (for example the node name on
>> +#         the opposite site).
>> +#
>> +# Since: 5.1
>> +##
>> +{ 'struct': 'MigrationBlockNodeMapping',
>> +  'data': {
>> +      'node-name': 'str',
>> +      'alias': 'str'
>> +  } }
> 
> We'd probably like a
> 'nodename:bitmapname' -> 'alias' mapping so that we can select which

If we need bitmap mapping, than it should be nodename:bitmapname -> nodealias:bitmapalias mapping, for backward-compatibility with current qemu.

> bitmaps to migrate and where to migrate them to. The specific use case
> is following:
> 
> Libvirt supports migration without shared storage (NFS/etc) where we
> copy the disk images prior to the memory migration using qemu's NBD
> server and the blockdev-mirror job. By default and the most simple way
> which doesn't require users fudging with the disk images and copying
> backing images themselves is that we flatten all the backing chain
> during the copy ("sync":"full"). In this mode we'll need to do some
> merging of the bitmaps prior to finalizing the copy.
> 
> It's not a problem to do it ourselves, but in the end we'll need to copy
> only certain bitmaps which will be created temporarily on the source to
> the destination where they'll be persisted.
> 
> For now (until I implement the use of the dirty-bitmap-populate blockjob
> which I'm also doing in parallel with this kind of) when creating a
> snapshot we create a new active bitmap in the overlay for every active
> bitmap in the snapshotted image.
> 
> When flattening we'll then need to merge the appropriate ones. As said
> it's not a problem to prepare all the bitmaps before but we then don't
> need to migrate all of them.
> 
> By the way, that brings me to another question:
> 
> Is there any difference of handling of persistent and non-persistent
> bitmaps? Specifically I'm curious what's the correct approach to do the
> shared storage migration case when the source and destination image is
> the same one. Should we also instruct to migrate the active ones? or are
> they migrated by writing them to the image and reloading them?

About migration of persistent bitmaps with shared storage: both variants are possible:

1. if you do nothing (i.e. not enable bitmaps migration capability), persistent bitmaps are stored on inactivation of source Qemu and loaded on activation of target Qemu

2. if you enable bitmap migration capability, then bitmaps are _NOT_ stored, they migrate through migration channel

The difference is in downtime: if we have a lot of bitmap data (big disks, small bitmap granularity, a lot of bitmaps) and/or slow shared storage, then with [1] downtime will increase, as we'll have to store all bitmaps to the disk and load them during migration downtime. With [2] bitmaps migrate in postcopy time, when target is already running, so downtime is not increased.

So, in general [2] is better, and I think we should always use it, not making extra difference between shared and non-shared migration.

==

And in this way, no difference between persistent and non-persistent bitmaps migration, let's always migrate them by postcopy [and we go this way (I hope ;) in Virtuozzo]

> 
>> +##
>> +# @migrate-set-bitmap-node-mapping:
> 
> qemu-5.0 deprecated a bunch of migrate-set- specific commands in favor
> of migrate-set-parameters. Is it worth/necessary to add a new command
> here?

Hmm probably, we should include mapping into MigrateSetParameters structure?

> 
>> +#
>> +# Maps block node names to arbitrary aliases for the purpose of dirty
>> +# bitmap migration.  Such aliases may for example be the corresponding
>> +# node names on the opposite site.
>> +#
>> +# By default, every node name is mapped to itself.
>> +#
>> +# @mapping: The mapping; must be one-to-one, but not necessarily
>> +#           complete.  Any mapping not given will be reset to the
>> +#           default (i.e. the identity mapping).
>> +#
>> +# Since: 5.1
>> +##
> 
> 


-- 
Best regards,
Vladimir

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Peter Krempa 3 years, 11 months ago
On Mon, May 18, 2020 at 20:52:32 +0300, Vladimir Sementsov-Ogievskiy wrote:
> [add Nikolay]
> 
> 18.05.2020 19:26, Peter Krempa wrote:
> > On Wed, May 13, 2020 at 16:56:10 +0200, Max Reitz wrote:

[...]

> > Is there any difference of handling of persistent and non-persistent
> > bitmaps? Specifically I'm curious what's the correct approach to do the
> > shared storage migration case when the source and destination image is
> > the same one. Should we also instruct to migrate the active ones? or are
> > they migrated by writing them to the image and reloading them?
> 
> About migration of persistent bitmaps with shared storage: both variants are possible:
> 
> 1. if you do nothing (i.e. not enable bitmaps migration capability), persistent bitmaps are stored on inactivation of source Qemu and loaded on activation of target Qemu
> 
> 2. if you enable bitmap migration capability, then bitmaps are _NOT_ stored, they migrate through migration channel
> 
> The difference is in downtime: if we have a lot of bitmap data (big disks, small bitmap granularity, a lot of bitmaps) and/or slow shared storage, then with [1] downtime will increase, as we'll have to store all bitmaps to the disk and load them during migration downtime. With [2] bitmaps migrate in postcopy time, when target is already running, so downtime is not increased.
> 
> So, in general [2] is better, and I think we should always use it, not making extra difference between shared and non-shared migration.

Thanks for the explanation! What about the inactive bitmaps though? Are
they treated the same when opened? Should we consider them for migration
through the migration stream?

> 
> ==
> 
> And in this way, no difference between persistent and non-persistent bitmaps migration, let's always migrate them by postcopy [and we go this way (I hope ;) in Virtuozzo]

Yeah, that's probably going to be what libvirt does as well.


> > > +# @migrate-set-bitmap-node-mapping:
> > 
> > qemu-5.0 deprecated a bunch of migrate-set- specific commands in favor
> > of migrate-set-parameters. Is it worth/necessary to add a new command
> > here?
> 
> Hmm probably, we should include mapping into MigrateSetParameters structure?

IMO yes. I think it also conveniently sidesteps some of the issues that
were discussed in the other threads regarding addition/multiple calls
etc. The mapping will be set at once and any other invocation sets a new
mapping and that's it. Setting nothing will clear the mapping.


Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Vladimir Sementsov-Ogievskiy 3 years, 11 months ago
18.05.2020 21:20, Peter Krempa wrote:
> On Mon, May 18, 2020 at 20:52:32 +0300, Vladimir Sementsov-Ogievskiy wrote:
>> [add Nikolay]
>>
>> 18.05.2020 19:26, Peter Krempa wrote:
>>> On Wed, May 13, 2020 at 16:56:10 +0200, Max Reitz wrote:
> 
> [...]
> 
>>> Is there any difference of handling of persistent and non-persistent
>>> bitmaps? Specifically I'm curious what's the correct approach to do the
>>> shared storage migration case when the source and destination image is
>>> the same one. Should we also instruct to migrate the active ones? or are
>>> they migrated by writing them to the image and reloading them?
>>
>> About migration of persistent bitmaps with shared storage: both variants are possible:
>>
>> 1. if you do nothing (i.e. not enable bitmaps migration capability), persistent bitmaps are stored on inactivation of source Qemu and loaded on activation of target Qemu
>>
>> 2. if you enable bitmap migration capability, then bitmaps are _NOT_ stored, they migrate through migration channel
>>
>> The difference is in downtime: if we have a lot of bitmap data (big disks, small bitmap granularity, a lot of bitmaps) and/or slow shared storage, then with [1] downtime will increase, as we'll have to store all bitmaps to the disk and load them during migration downtime. With [2] bitmaps migrate in postcopy time, when target is already running, so downtime is not increased.
>>
>> So, in general [2] is better, and I think we should always use it, not making extra difference between shared and non-shared migration.
> 
> Thanks for the explanation! What about the inactive bitmaps though? Are
> they treated the same when opened? Should we consider them for migration
> through the migration stream?

Yes, we should migrate them, as even disabled (inactive) bitmap may be diverged with its on-disk version: user may disable it after some modifications.

And again, we don't want to save/load them during downtime. Another option would be implement a way to flush-and-unload inactive bitmap when it is not needed (e.g. prior to migration start), and do not load inactive bitmaps at Qemu start, but only later on demand... But this is another story and may have own underwater rocks. So, may be we implement such logic one day, but let it be separate.

> 
>>
>> ==
>>
>> And in this way, no difference between persistent and non-persistent bitmaps migration, let's always migrate them by postcopy [and we go this way (I hope ;) in Virtuozzo]
> 
> Yeah, that's probably going to be what libvirt does as well.
> 
> 
>>>> +# @migrate-set-bitmap-node-mapping:
>>>
>>> qemu-5.0 deprecated a bunch of migrate-set- specific commands in favor
>>> of migrate-set-parameters. Is it worth/necessary to add a new command
>>> here?
>>
>> Hmm probably, we should include mapping into MigrateSetParameters structure?
> 
> IMO yes. I think it also conveniently sidesteps some of the issues that
> were discussed in the other threads regarding addition/multiple calls
> etc. The mapping will be set at once and any other invocation sets a new
> mapping and that's it. Setting nothing will clear the mapping.
> 


-- 
Best regards,
Vladimir

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Max Reitz 3 years, 10 months ago
On 18.05.20 18:26, Peter Krempa wrote:
> On Wed, May 13, 2020 at 16:56:10 +0200, Max Reitz wrote:
>> This command allows mapping block node names to aliases for the purpose
>> of block dirty bitmap migration.
>>
>> This way, management tools can use different node names on the source
>> and destination and pass the mapping of how bitmaps are to be
>> transferred to qemu (on the source, the destination, or even both with
>> arbitrary aliases in the migration stream).
>>
>> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
> 
> [...]
> 
>> ---
>>  qapi/migration.json            | 36 ++++++++++++++++++++
>>  migration/block-dirty-bitmap.c | 60 ++++++++++++++++++++++++++++++++--
>>  2 files changed, 94 insertions(+), 2 deletions(-)
> 
> I just started to write some quick and dirty hacks to test use of this
> infrastructure in libvirt. I have 2 quick observations for now:
> 
>>
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index d5000558c6..97037ea635 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -1621,3 +1621,39 @@
>>  ##
>>  { 'event': 'UNPLUG_PRIMARY',
>>    'data': { 'device-id': 'str' } }
>> +
>> +##
>> +# @MigrationBlockNodeMapping:
>> +#
>> +# Maps a block node name to an alias for migration.
>> +#
>> +# @node-name: A block node name.
>> +#
>> +# @alias: An alias name for migration (for example the node name on
>> +#         the opposite site).
>> +#
>> +# Since: 5.1
>> +##
>> +{ 'struct': 'MigrationBlockNodeMapping',
>> +  'data': {
>> +      'node-name': 'str',
>> +      'alias': 'str'
>> +  } }
> 
> We'd probably like a
> 'nodename:bitmapname' -> 'alias' mapping so that we can select which
> bitmaps to migrate and where to migrate them to.

Sure, entirely doable.

I think Vladimir is right that we’d want separate node and bitmap
aliases then, though, because the migration stream has both fields.
(Also, if we only had a single alias, you’d always need to call
migrate-set-bitmap-(node-)mapping on both ends to unpack that alias.
With separate node and bitmap aliases, it suffices to call it on one
end, just like the version in thie patch.)

> The specific use case
> is following:
> 
> Libvirt supports migration without shared storage (NFS/etc) where we
> copy the disk images prior to the memory migration using qemu's NBD
> server and the blockdev-mirror job. By default and the most simple way
> which doesn't require users fudging with the disk images and copying
> backing images themselves is that we flatten all the backing chain
> during the copy ("sync":"full"). In this mode we'll need to do some
> merging of the bitmaps prior to finalizing the copy.
> 
> It's not a problem to do it ourselves, but in the end we'll need to copy
> only certain bitmaps which will be created temporarily on the source to
> the destination where they'll be persisted.
> 
> For now (until I implement the use of the dirty-bitmap-populate blockjob
> which I'm also doing in parallel with this kind of) when creating a
> snapshot we create a new active bitmap in the overlay for every active
> bitmap in the snapshotted image.
> 
> When flattening we'll then need to merge the appropriate ones. As said
> it's not a problem to prepare all the bitmaps before but we then don't
> need to migrate all of them.
> 
> By the way, that brings me to another question:
> 
> Is there any difference of handling of persistent and non-persistent
> bitmaps? Specifically I'm curious what's the correct approach to do the
> shared storage migration case when the source and destination image is
> the same one. Should we also instruct to migrate the active ones? or are
> they migrated by writing them to the image and reloading them?

I hope Vladimir has answered your question sufficiently extensively. :)

>> +##
>> +# @migrate-set-bitmap-node-mapping:
> 
> qemu-5.0 deprecated a bunch of migrate-set- specific commands in favor
> of migrate-set-parameters. Is it worth/necessary to add a new command
> here?

I wasn’t aware of that.  It would probably indeed make sense from a
user’s perspective.

It would make the implementation rather different, though, because
instead of putting the mapping locally (and statically) into
migration/block-dirty-bitmap.c, it would require putting it into the
central MigrationState.  Which isn’t to say it’d be worse.  I suppose
it’d be better, actually, but I’ll have to try to say for sure.

You also suggested “setting nothing will clear the mapping” in your
second mail.  That would be a weird default.  Right now, the default for
all migration parameters is to leave them as-is, so it would be different.

The first question would be: What do you mean by “clear the mapping”?
Reset it to the original identity mapping?  Or actually clear it, so
that no bitmap is migrated?  I presume the former, because the latter
can easily be achieved by passing an empty array.

I understand that it seems to make sense to be able to return to the
original identity mapping, but is there actually a use for this?  After
you have started using a custom mapping, wouldn’t you always use custom
mappings?

If there is a use for it, I think the better way to do it would be to
use an alternate type where an explicit null resets the mapping to the
identity mapping.

Max

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Peter Krempa 3 years, 10 months ago
On Tue, Jun 02, 2020 at 12:56:32 +0200, Max Reitz wrote:
> On 18.05.20 18:26, Peter Krempa wrote:
> > On Wed, May 13, 2020 at 16:56:10 +0200, Max Reitz wrote:
> >> This command allows mapping block node names to aliases for the purpose
> >> of block dirty bitmap migration.
> >>
> >> This way, management tools can use different node names on the source
> >> and destination and pass the mapping of how bitmaps are to be
> >> transferred to qemu (on the source, the destination, or even both with
> >> arbitrary aliases in the migration stream).
> >>
> >> Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> >> Signed-off-by: Max Reitz <mreitz@redhat.com>
> >> ---

[...]

> >> +# @migrate-set-bitmap-node-mapping:
> > 
> > qemu-5.0 deprecated a bunch of migrate-set- specific commands in favor
> > of migrate-set-parameters. Is it worth/necessary to add a new command
> > here?
> 
> I wasn’t aware of that.  It would probably indeed make sense from a
> user’s perspective.
> 
> It would make the implementation rather different, though, because
> instead of putting the mapping locally (and statically) into
> migration/block-dirty-bitmap.c, it would require putting it into the
> central MigrationState.  Which isn’t to say it’d be worse.  I suppose
> it’d be better, actually, but I’ll have to try to say for sure.
> 
> You also suggested “setting nothing will clear the mapping” in your
> second mail.  That would be a weird default.  Right now, the default for
> all migration parameters is to leave them as-is, so it would be different.
> 
> The first question would be: What do you mean by “clear the mapping”?
> Reset it to the original identity mapping?  Or actually clear it, so
> that no bitmap is migrated?  I presume the former, because the latter
> can easily be achieved by passing an empty array.

Yeah, lucikily this is easy with json:

default mapping:

"mapping": null

empty mapping:

"mapping": [ ]

> I understand that it seems to make sense to be able to return to the
> original identity mapping, but is there actually a use for this?  After
> you have started using a custom mapping, wouldn’t you always use custom
> mappings?

Libvirt mostly doesn't care. Downgrade of libvirt version is unsupported
so this should not be a problem.

> If there is a use for it, I think the better way to do it would be to
> use an alternate type where an explicit null resets the mapping to the
> identity mapping.

yes :)


Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Dr. David Alan Gilbert 3 years, 11 months ago
* Max Reitz (mreitz@redhat.com) wrote:

<snip>

> +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList *mapping,
> +                                         Error **errp)
> +{
> +    QDict *in_mapping = qdict_new();
> +    QDict *out_mapping = qdict_new();
> +
> +    for (; mapping; mapping = mapping->next) {
> +        MigrationBlockNodeMapping *entry = mapping->value;
> +
> +        if (qdict_haskey(out_mapping, entry->node_name)) {
> +            error_setg(errp, "Cannot map node name '%s' twice",
> +                       entry->node_name);
> +            goto fail;
> +        }

I'm not too clear exactly which case this is protecting against;
I think that's protecting against mapping

  'src1'->'dst1' and 'src1'->'dst2'
which is a good check.s (or maybe it's checking against dst2 twice?)

What about cases where there is no mapping - e.g. imagine
that we have b1/b2 on the source and b2/b3 on the dest; now
if we add just a mapping:

  b1->b2

then we end up with:

  b1 -> b2
  b2 -> b2  (non-mapped)
        b3

so we have a clash there - are we protected against that?

Dave

> +        if (qdict_haskey(in_mapping, entry->alias)) {
> +            error_setg(errp, "Cannot use alias '%s' twice",
> +                       entry->alias);
> +            goto fail;
> +        }
> +
> +        qdict_put_str(in_mapping, entry->alias, entry->node_name);
> +        qdict_put_str(out_mapping, entry->node_name, entry->alias);
> +    }
> +
> +    qobject_unref(dirty_bitmap_mig_state.node_in_mapping);
> +    qobject_unref(dirty_bitmap_mig_state.node_out_mapping);
> +
> +    dirty_bitmap_mig_state.node_in_mapping = in_mapping;
> +    dirty_bitmap_mig_state.node_out_mapping = out_mapping;
> +
> +    return;
> +
> +fail:
> +    qobject_unref(in_mapping);
> +    qobject_unref(out_mapping);
> +}
> +
>  static SaveVMHandlers savevm_dirty_bitmap_handlers = {
>      .save_setup = dirty_bitmap_save_setup,
>      .save_live_complete_postcopy = dirty_bitmap_save_complete,
> -- 
> 2.26.2
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Max Reitz 3 years, 11 months ago
On 14.05.20 10:42, Dr. David Alan Gilbert wrote:
> * Max Reitz (mreitz@redhat.com) wrote:
> 
> <snip>
> 
>> +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList *mapping,
>> +                                         Error **errp)
>> +{
>> +    QDict *in_mapping = qdict_new();
>> +    QDict *out_mapping = qdict_new();
>> +
>> +    for (; mapping; mapping = mapping->next) {
>> +        MigrationBlockNodeMapping *entry = mapping->value;
>> +
>> +        if (qdict_haskey(out_mapping, entry->node_name)) {
>> +            error_setg(errp, "Cannot map node name '%s' twice",
>> +                       entry->node_name);
>> +            goto fail;
>> +        }
> 
> I'm not too clear exactly which case this is protecting against;
> I think that's protecting against mapping
> 
>   'src1'->'dst1' and 'src1'->'dst2'
> which is a good check.s (or maybe it's checking against dst2 twice?)

This one is against mapping src1 twice.  Both checks together check that
it’s a one-to-one bijective mapping.

The technical reason why it needs to be one-to-one is because we base
two QDicts off of it, so the inverse mapping needs to work.

> What about cases where there is no mapping - e.g. imagine
> that we have b1/b2 on the source and b2/b3 on the dest; now
> if we add just a mapping:
> 
>   b1->b2
> 
> then we end up with:
> 
>   b1 -> b2
>   b2 -> b2  (non-mapped)
>         b3
> 
> so we have a clash there - are we protected against that?

Oh, no, we aren’t.  That wasn’t intentional.  However, I’m not sure how
we’d protect against it.  We can’t check it in
qmp_migrate_set_bitmap_node_mapping(), because we don’t know yet which
nodes will exist at the time of migration, and which of those will have
bitmaps.

So we’d need to check it as part of the migration process (by looking up
any unmapped entries that default to the identity mapping in the
respective reverse mapping, to see whether anything maps to the same name).

OTOH, Vladimir proposed adding a parameter to
migrate-set-bitmap-node-mapping that would make migration fail if any
bitmaps should be migrated off of unmapped nodes, or if any incoming
alias is unmapped (instead of defaulting to the identity mapping).  If
we just make that the only behavior, then we wouldn’t have a problem
with that at all, because all unmapped nodes would always throw an error.

(And on the third hand, I wonder whether we should actually allow
migrating bitmaps from multiple nodes to a single one, but I suppose
that would require two separate commands, one for incoming and one for
outgoing...)

Max

Re: [RFC v2] migration: Add migrate-set-bitmap-node-mapping
Posted by Dr. David Alan Gilbert 3 years, 11 months ago
* Max Reitz (mreitz@redhat.com) wrote:
> On 14.05.20 10:42, Dr. David Alan Gilbert wrote:
> > * Max Reitz (mreitz@redhat.com) wrote:
> > 
> > <snip>
> > 
> >> +void qmp_migrate_set_bitmap_node_mapping(MigrationBlockNodeMappingList *mapping,
> >> +                                         Error **errp)
> >> +{
> >> +    QDict *in_mapping = qdict_new();
> >> +    QDict *out_mapping = qdict_new();
> >> +
> >> +    for (; mapping; mapping = mapping->next) {
> >> +        MigrationBlockNodeMapping *entry = mapping->value;
> >> +
> >> +        if (qdict_haskey(out_mapping, entry->node_name)) {
> >> +            error_setg(errp, "Cannot map node name '%s' twice",
> >> +                       entry->node_name);
> >> +            goto fail;
> >> +        }
> > 
> > I'm not too clear exactly which case this is protecting against;
> > I think that's protecting against mapping
> > 
> >   'src1'->'dst1' and 'src1'->'dst2'
> > which is a good check.s (or maybe it's checking against dst2 twice?)
> 
> This one is against mapping src1 twice.  Both checks together check that
> it’s a one-to-one bijective mapping.
> 
> The technical reason why it needs to be one-to-one is because we base
> two QDicts off of it, so the inverse mapping needs to work.
> 
> > What about cases where there is no mapping - e.g. imagine
> > that we have b1/b2 on the source and b2/b3 on the dest; now
> > if we add just a mapping:
> > 
> >   b1->b2
> > 
> > then we end up with:
> > 
> >   b1 -> b2
> >   b2 -> b2  (non-mapped)
> >         b3
> > 
> > so we have a clash there - are we protected against that?
> 
> Oh, no, we aren’t.  That wasn’t intentional.  However, I’m not sure how
> we’d protect against it.  We can’t check it in
> qmp_migrate_set_bitmap_node_mapping(), because we don’t know yet which
> nodes will exist at the time of migration, and which of those will have
> bitmaps.
> 
> So we’d need to check it as part of the migration process (by looking up
> any unmapped entries that default to the identity mapping in the
> respective reverse mapping, to see whether anything maps to the same name).

Yeh a once through check of all the nodes at the start of the migration
would probably fix it.

> OTOH, Vladimir proposed adding a parameter to
> migrate-set-bitmap-node-mapping that would make migration fail if any
> bitmaps should be migrated off of unmapped nodes, or if any incoming
> alias is unmapped (instead of defaulting to the identity mapping).  If
> we just make that the only behavior, then we wouldn’t have a problem
> with that at all, because all unmapped nodes would always throw an error.

Yeh that would force you to put a full mapping table in place.

> (And on the third hand, I wonder whether we should actually allow
> migrating bitmaps from multiple nodes to a single one, but I suppose
> that would require two separate commands, one for incoming and one for
> outgoing...)

Wouldn't that get very messy?

Dave

> Max
> 



--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK