[PATCH] qemu: snapshot: Collect 'query-named-block-nodes' prior to memory migration

Peter Krempa posted 1 patch 3 years, 8 months ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/fe75e453a8072df8a35a33751d8995a507263663.1596204648.git.pkrempa@redhat.com
src/qemu/qemu_driver.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
[PATCH] qemu: snapshot: Collect 'query-named-block-nodes' prior to memory migration
Posted by Peter Krempa 3 years, 8 months ago
When doing an external snapshot we migrate memory to a file as a form of
taking the memory state. This creates a problem as qemu deactivates all
active bitmaps after a successful migration. This means that calling
'query-named-block-nodes' will return an empty list of bitmaps for
devices. We use the bitmap list to propagate the active bitmaps into the
overlay files being created which is required for backups to work after
a snapshot. Since we wouldn't propagate anythign a subsequent backup
will fail with:

invalid argument: missing or broken bitmap 'testchck' for disk 'vda'

To fix this, we can simply collect the bitmap list prior to the
migration.

https://bugzilla.redhat.com/show_bug.cgi?id=1862472

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---

Note that with current qemu the above steps will still fail as qemu
fails to 'cont' after a migration if backing images contain bitmaps
which is the case here.

See: https://lists.nongnu.org/archive/html/qemu-block/2020-07/msg01833.html

 src/qemu/qemu_driver.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 0ad6359102..b655df8c98 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -15365,6 +15365,7 @@ static int
 qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver,
                                    virDomainObjPtr vm,
                                    virDomainMomentObjPtr snap,
+                                   virHashTablePtr blockNamedNodeData,
                                    unsigned int flags,
                                    virQEMUDriverConfigPtr cfg,
                                    qemuDomainAsyncJob asyncJob)
@@ -15378,17 +15379,12 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver,
     qemuDomainSnapshotDiskDataPtr diskdata = NULL;
     size_t ndiskdata = 0;
     bool blockdev =  virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKDEV);
-    g_autoptr(virHashTable) blockNamedNodeData = NULL;

     if (virDomainObjCheckActive(vm) < 0)
         return -1;

     actions = virJSONValueNewArray();

-    if (blockdev &&
-        !(blockNamedNodeData = qemuBlockGetNamedNodeData(vm, asyncJob)))
-        return -1;
-
     /* prepare a list of objects to use in the vm definition so that we don't
      * have to roll back later */
     if (qemuDomainSnapshotDiskPrepare(driver, vm, snap, cfg, reuse, blockdev,
@@ -15455,6 +15451,7 @@ qemuDomainSnapshotCreateActiveExternal(virQEMUDriverPtr driver,
     int compressed;
     g_autoptr(virCommand) compressor = NULL;
     virQEMUSaveDataPtr data = NULL;
+    g_autoptr(virHashTable) blockNamedNodeData = NULL;

     /* If quiesce was requested, then issue a freeze command, and a
      * counterpart thaw command when it is actually sent to agent.
@@ -15509,6 +15506,13 @@ qemuDomainSnapshotCreateActiveExternal(virQEMUDriverPtr driver,
         }
     }

+    /* We need to collect reply from 'query-named-block-nodes' prior to the
+     * migration step as qemu deactivates bitmaps after migration so the result
+     * would be wrong */
+    if (virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKDEV) &&
+        !(blockNamedNodeData = qemuBlockGetNamedNodeData(vm, QEMU_ASYNC_JOB_SNAPSHOT)))
+        goto cleanup;
+
     /* do the memory snapshot if necessary */
     if (memory) {
         /* check if migration is possible */
@@ -15553,7 +15557,8 @@ qemuDomainSnapshotCreateActiveExternal(virQEMUDriverPtr driver,

     /* the domain is now paused if a memory snapshot was requested */

-    if ((ret = qemuDomainSnapshotCreateDiskActive(driver, vm, snap, flags, cfg,
+    if ((ret = qemuDomainSnapshotCreateDiskActive(driver, vm, snap,
+                                                  blockNamedNodeData, flags, cfg,
                                                   QEMU_ASYNC_JOB_SNAPSHOT)) < 0)
         goto cleanup;

-- 
2.26.2

Re: [PATCH] qemu: snapshot: Collect 'query-named-block-nodes' prior to memory migration
Posted by Eric Blake 3 years, 8 months ago
On 7/31/20 9:13 AM, Peter Krempa wrote:
> When doing an external snapshot we migrate memory to a file as a form of
> taking the memory state. This creates a problem as qemu deactivates all
> active bitmaps after a successful migration. This means that calling
> 'query-named-block-nodes' will return an empty list of bitmaps for
> devices. We use the bitmap list to propagate the active bitmaps into the
> overlay files being created which is required for backups to work after
> a snapshot. Since we wouldn't propagate anythign a subsequent backup

anything

> will fail with:
> 
> invalid argument: missing or broken bitmap 'testchck' for disk 'vda'
> 
> To fix this, we can simply collect the bitmap list prior to the
> migration.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1862472
> 
> Signed-off-by: Peter Krempa <pkrempa@redhat.com>
> ---
> 
> Note that with current qemu the above steps will still fail as qemu
> fails to 'cont' after a migration if backing images contain bitmaps
> which is the case here.
> 
> See: https://lists.nongnu.org/archive/html/qemu-block/2020-07/msg01833.html

But that is on track to be fixed in 5.1-rc3, so that shouldn't be a 
show-stopper.

> 
>   src/qemu/qemu_driver.c | 17 +++++++++++------
>   1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 0ad6359102..b655df8c98 100644
> --- a/src/qemu/qemu_driver.c

Reviewed-by: Eric Blake <eblake@redhat.com>

I'll leave it up to you to decide if this is a mandatory fix during freeze.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org