[PATCH] migration: do not restart VM after successful snapshot-load

Paolo Bonzini posted 1 patch 2 years, 10 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20210504165826.618801-1-pbonzini@redhat.com
Maintainers: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Juan Quintela <quintela@redhat.com>
[PATCH] migration: do not restart VM after successful snapshot-load
Posted by Paolo Bonzini 2 years, 10 months ago
The HMP loadvm code is calling load_snapshot rather than
qmp_snapshot_load, in order to bypass the job infrastructure.  The code
around it is almost the same, with one difference: hmp_loadvm is
restarting the VM if load_snapshot fails, qmp_snapshot_load is doing so
if load_snapshot succeeds.

Fix the bug in QMP by moving the common code to load_snapshot.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 migration/savevm.c | 16 ++++++++--------
 monitor/hmp-cmds.c |  7 +------
 2 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 52e2d72e4b..a899191cbf 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2992,6 +2992,7 @@ bool load_snapshot(const char *name, const char *vmstate,
     int ret;
     AioContext *aio_context;
     MigrationIncomingState *mis = migration_incoming_get_current();
+    int saved_vm_running  = runstate_is_running();
 
     if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
         return false;
@@ -3024,6 +3025,8 @@ bool load_snapshot(const char *name, const char *vmstate,
         return false;
     }
 
+    vm_stop(RUN_STATE_RESTORE_VM);
+
     /*
      * Flush the record/replay queue. Now the VM state is going
      * to change. Therefore we don't need to preserve its consistency
@@ -3061,13 +3064,17 @@ bool load_snapshot(const char *name, const char *vmstate,
 
     if (ret < 0) {
         error_setg(errp, "Error %d while loading VM state", ret);
-        return false;
+        goto err_restart;
     }
 
     return true;
 
 err_drain:
     bdrv_drain_all_end();
+err_restart:
+    if (saved_vm_running) {
+        vm_start();
+    }
     return false;
 }
 
@@ -3135,17 +3142,10 @@ static void snapshot_load_job_bh(void *opaque)
 {
     Job *job = opaque;
     SnapshotJob *s = container_of(job, SnapshotJob, common);
-    int orig_vm_running;
 
     job_progress_set_remaining(&s->common, 1);
 
-    orig_vm_running = runstate_is_running();
-    vm_stop(RUN_STATE_RESTORE_VM);
-
     s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
-    if (s->ret && orig_vm_running) {
-        vm_start();
-    }
 
     job_progress_update(&s->common, 1);
 
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index 0ad5b77477..a39436c8cb 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1127,15 +1127,10 @@ void hmp_balloon(Monitor *mon, const QDict *qdict)
 
 void hmp_loadvm(Monitor *mon, const QDict *qdict)
 {
-    int saved_vm_running  = runstate_is_running();
     const char *name = qdict_get_str(qdict, "name");
     Error *err = NULL;
 
-    vm_stop(RUN_STATE_RESTORE_VM);
-
-    if (!load_snapshot(name, NULL, false, NULL, &err) && saved_vm_running) {
-        vm_start();
-    }
+    load_snapshot(name, NULL, false, NULL, &err);
     hmp_handle_error(mon, err);
 }
 
-- 
2.26.2


Re: [PATCH] migration: do not restart VM after successful snapshot-load
Posted by Dr. David Alan Gilbert 2 years, 10 months ago
* Paolo Bonzini (pbonzini@redhat.com) wrote:
> The HMP loadvm code is calling load_snapshot rather than
> qmp_snapshot_load, in order to bypass the job infrastructure.  The code
> around it is almost the same, with one difference: hmp_loadvm is
> restarting the VM if load_snapshot fails, qmp_snapshot_load is doing so
> if load_snapshot succeeds.
> 
> Fix the bug in QMP by moving the common code to load_snapshot.

So I agree it's nice to have them consistent, but hmm.

> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  migration/savevm.c | 16 ++++++++--------
>  monitor/hmp-cmds.c |  7 +------
>  2 files changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 52e2d72e4b..a899191cbf 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2992,6 +2992,7 @@ bool load_snapshot(const char *name, const char *vmstate,
>      int ret;
>      AioContext *aio_context;
>      MigrationIncomingState *mis = migration_incoming_get_current();
> +    int saved_vm_running  = runstate_is_running();
>  
>      if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
>          return false;
> @@ -3024,6 +3025,8 @@ bool load_snapshot(const char *name, const char *vmstate,
>          return false;
>      }
>  
> +    vm_stop(RUN_STATE_RESTORE_VM);
> +
>      /*
>       * Flush the record/replay queue. Now the VM state is going
>       * to change. Therefore we don't need to preserve its consistency
> @@ -3061,13 +3064,17 @@ bool load_snapshot(const char *name, const char *vmstate,
>  
>      if (ret < 0) {
>          error_setg(errp, "Error %d while loading VM state", ret);
> -        return false;
> +        goto err_restart;
>      }

I don't think this is safe.
If qemu_loadvm_state(f) fails, then depending on the point at which it
fails, the VM is left in a partly loaded, indeterminate state - it
doesn't seem right to auto-restart it.

Note, that's a distinct failure from the earlier failures, e.g. trying
to find the snapshot and noticing it doesn't exist - in that case it's
OK to restart the VM.

Then there's the question of what to do if the load_snapshot succeeds;
qmp's behaviour of running the loaded-VM doesn't seem wrong to me;
although you'd think that would be based on the loaded state not the
original; but that's probably a different question.
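
As a rough illustration of the distinction drawn above, here is a small
standalone model. It is not QEMU code: struct vm_model, enum load_stage and
model_load_snapshot() are hypothetical names made up for this sketch, and
the single "running" flag only stands in for the real run state. It shows
one possible policy, where the guest is resumed only when the failure
happened before any guest state was overwritten.

```c
#include <stdbool.h>

/* Hypothetical outcome of a snapshot load; not a QEMU type. */
enum load_stage {
    LOAD_FAIL_EARLY,   /* e.g. snapshot not found: guest state untouched */
    LOAD_FAIL_MIDWAY,  /* the state load itself failed: guest partly loaded */
    LOAD_OK,
};

/* Hypothetical stand-in for the VM run state. */
struct vm_model {
    bool running;
};

/*
 * Model of a load that restarts the guest only when the failure happened
 * before any device state was modified.
 */
static bool model_load_snapshot(struct vm_model *vm, enum load_stage outcome)
{
    bool saved_vm_running = vm->running;

    vm->running = false;    /* models vm_stop(RUN_STATE_RESTORE_VM) */

    switch (outcome) {
    case LOAD_FAIL_EARLY:
        /* Nothing was overwritten, so resuming is safe. */
        vm->running = saved_vm_running;
        return false;
    case LOAD_FAIL_MIDWAY:
        /* State is indeterminate: leave the guest stopped. */
        return false;
    case LOAD_OK:
        /* Whether to resume here is the separate question raised above. */
        return true;
    }
    return false;
}
```

With this policy a guest that was running stays running after an early
failure, and stays stopped after a failure in the middle of the load.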

Dave

>      return true;
>  
>  err_drain:
>      bdrv_drain_all_end();
> +err_restart:
> +    if (saved_vm_running) {
> +        vm_start();
> +    }
>      return false;
>  }
>  
> @@ -3135,17 +3142,10 @@ static void snapshot_load_job_bh(void *opaque)
>  {
>      Job *job = opaque;
>      SnapshotJob *s = container_of(job, SnapshotJob, common);
> -    int orig_vm_running;
>  
>      job_progress_set_remaining(&s->common, 1);
>  
> -    orig_vm_running = runstate_is_running();
> -    vm_stop(RUN_STATE_RESTORE_VM);
> -
>      s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
> -    if (s->ret && orig_vm_running) {
> -        vm_start();
> -    }
>  
>      job_progress_update(&s->common, 1);
>  
> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
> index 0ad5b77477..a39436c8cb 100644
> --- a/monitor/hmp-cmds.c
> +++ b/monitor/hmp-cmds.c
> @@ -1127,15 +1127,10 @@ void hmp_balloon(Monitor *mon, const QDict *qdict)
>  
>  void hmp_loadvm(Monitor *mon, const QDict *qdict)
>  {
> -    int saved_vm_running  = runstate_is_running();
>      const char *name = qdict_get_str(qdict, "name");
>      Error *err = NULL;
>  
> -    vm_stop(RUN_STATE_RESTORE_VM);
> -
> -    if (!load_snapshot(name, NULL, false, NULL, &err) && saved_vm_running) {
> -        vm_start();
> -    }
> +    load_snapshot(name, NULL, false, NULL, &err);
>      hmp_handle_error(mon, err);
>  }
>  
> -- 
> 2.26.2
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH] migration: do not restart VM after successful snapshot-load
Posted by Daniel P. Berrangé 2 years, 10 months ago
On Tue, May 04, 2021 at 12:58:26PM -0400, Paolo Bonzini wrote:
> The HMP loadvm code is calling load_snapshot rather than
> qmp_snapshot_load, in order to bypass the job infrastructure.  The code
> around it is almost the same, with one difference: hmp_loadvm is
> restarting the VM if load_snapshot fails, qmp_snapshot_load is doing so
> if load_snapshot succeeds.
> 
> Fix the bug in QMP by moving the common code to load_snapshot.

I'm wondering how you discovered this bug?

> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  migration/savevm.c | 16 ++++++++--------
>  monitor/hmp-cmds.c |  7 +------
>  2 files changed, 9 insertions(+), 14 deletions(-)

We ought to assert this behaviour in some test cases.

We have qemu-iotests/068 for HMP and
qemu-iotests/tests/internal-snapshots-qapi for QMP...

...doh, I just realized we never got the latter merged.
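
For reference, the run state such tests would pin down can be written as a
small truth table. This is only a sketch of how the diff in this thread
reads, not code from QEMU: enum entry_point and running_after_load() are
hypothetical names, and "load_ok" models the return value of
load_snapshot().

```c
#include <stdbool.h>

/* Hypothetical labels for the three code paths compared in this thread. */
enum entry_point {
    OLD_HMP_LOADVM,          /* hmp_loadvm before the patch */
    OLD_QMP_SNAPSHOT_LOAD,   /* snapshot_load_job_bh before the patch */
    PATCHED_LOAD_SNAPSHOT,   /* either caller, with the patch applied */
};

/*
 * Returns whether the guest ends up running after the load attempt,
 * given whether it was running beforehand and whether the load succeeded.
 */
static bool running_after_load(enum entry_point ep, bool was_running,
                               bool load_ok)
{
    switch (ep) {
    case OLD_QMP_SNAPSHOT_LOAD:
        /* vm_start() was called only if load_snapshot() returned true. */
        return load_ok && was_running;
    case OLD_HMP_LOADVM:
    case PATCHED_LOAD_SNAPSHOT:
        /* vm_start() is called only on failure. */
        return !load_ok && was_running;
    }
    return false;
}
```

A test for each monitor could then check the run state after both a
successful and a failing load against whichever row is agreed to be correct.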

> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 52e2d72e4b..a899191cbf 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2992,6 +2992,7 @@ bool load_snapshot(const char *name, const char *vmstate,
>      int ret;
>      AioContext *aio_context;
>      MigrationIncomingState *mis = migration_incoming_get_current();
> +    int saved_vm_running  = runstate_is_running();
>  
>      if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
>          return false;
> @@ -3024,6 +3025,8 @@ bool load_snapshot(const char *name, const char *vmstate,
>          return false;
>      }
>  
> +    vm_stop(RUN_STATE_RESTORE_VM);
> +
>      /*
>       * Flush the record/replay queue. Now the VM state is going
>       * to change. Therefore we don't need to preserve its consistency
> @@ -3061,13 +3064,17 @@ bool load_snapshot(const char *name, const char *vmstate,
>  
>      if (ret < 0) {
>          error_setg(errp, "Error %d while loading VM state", ret);
> -        return false;
> +        goto err_restart;
>      }
>  
>      return true;
>  
>  err_drain:
>      bdrv_drain_all_end();
> +err_restart:
> +    if (saved_vm_running) {
> +        vm_start();
> +    }
>      return false;
>  }
>  
> @@ -3135,17 +3142,10 @@ static void snapshot_load_job_bh(void *opaque)
>  {
>      Job *job = opaque;
>      SnapshotJob *s = container_of(job, SnapshotJob, common);
> -    int orig_vm_running;
>  
>      job_progress_set_remaining(&s->common, 1);
>  
> -    orig_vm_running = runstate_is_running();
> -    vm_stop(RUN_STATE_RESTORE_VM);
> -
>      s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
> -    if (s->ret && orig_vm_running) {
> -        vm_start();
> -    }
>  
>      job_progress_update(&s->common, 1);
>  
> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
> index 0ad5b77477..a39436c8cb 100644
> --- a/monitor/hmp-cmds.c
> +++ b/monitor/hmp-cmds.c
> @@ -1127,15 +1127,10 @@ void hmp_balloon(Monitor *mon, const QDict *qdict)
>  
>  void hmp_loadvm(Monitor *mon, const QDict *qdict)
>  {
> -    int saved_vm_running  = runstate_is_running();
>      const char *name = qdict_get_str(qdict, "name");
>      Error *err = NULL;
>  
> -    vm_stop(RUN_STATE_RESTORE_VM);
> -
> -    if (!load_snapshot(name, NULL, false, NULL, &err) && saved_vm_running) {
> -        vm_start();
> -    }
> +    load_snapshot(name, NULL, false, NULL, &err);
>      hmp_handle_error(mon, err);
>  }
>  
> -- 
> 2.26.2
> 
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH] migration: do not restart VM after successful snapshot-load
Posted by Eric Blake 2 years, 10 months ago
On 5/4/21 11:58 AM, Paolo Bonzini wrote:
> The HMP loadvm code is calling load_snapshot rather than
> qmp_snapshot_load, in order to bypass the job infrastructure.  The code
> around it is almost the same, with one difference: hmp_loadvm is
> restarting the VM if load_snapshot fails, qmp_snapshot_load is doing so
> if load_snapshot succeeds.
> 
> Fix the bug in QMP by moving the common code to load_snapshot.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  migration/savevm.c | 16 ++++++++--------
>  monitor/hmp-cmds.c |  7 +------
>  2 files changed, 9 insertions(+), 14 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org