[PATCH V7 00/12] fix migration of suspended runstate

Steve Sistare posted 12 patches 11 months, 3 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/1701883417-356268-1-git-send-email-steven.sistare@oracle.com
Maintainers: Stefan Berger <stefanb@linux.vnet.ibm.com>, Gerd Hoffmann <kraxel@redhat.com>, Stefano Stabellini <sstabellini@kernel.org>, Anthony Perard <anthony.perard@citrix.com>, Paul Durrant <paul@xen.org>, Juan Quintela <quintela@redhat.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Leonardo Bras <leobras@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Thomas Huth <thuth@redhat.com>, Laurent Vivier <lvivier@redhat.com>
There is a newer version of this series
[PATCH V7 00/12] fix migration of suspended runstate
Posted by Steve Sistare 11 months, 3 weeks ago
Migration of a guest in the suspended runstate is broken.  The incoming
migration code automatically tries to wake the guest, which is wrong;
the guest should end migration in the same runstate it started.  Further,
after saving a snapshot in the suspended state and loading it, the vm_start
fails.  The runstate is RUNNING, but the guest is not.

See the commit messages for the details.
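
The intended behavior described above can be illustrated with a small,
self-contained C model.  This is only a sketch, not code from the series:
the ModelRunState enum and the functions incoming_runstate_buggy(),
incoming_runstate_fixed(), runstate_preserved() and check_runstate_model()
are hypothetical names that stand in for QEMU's real runstate handling,
and the "buggy" variant simplifies the wake-up to an immediate transition
to running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for QEMU's RunState enum. */
typedef enum {
    MODEL_PAUSED,
    MODEL_RUNNING,
    MODEL_SUSPENDED,
} ModelRunState;

/*
 * Model of the broken behavior: the incoming side wakes a guest that was
 * suspended on the source, so it ends migration running.
 */
static ModelRunState incoming_runstate_buggy(ModelRunState source_state)
{
    if (source_state == MODEL_SUSPENDED) {
        return MODEL_RUNNING;
    }
    return source_state;
}

/*
 * Model of the intended behavior: the guest ends migration in the same
 * runstate it started in.
 */
static ModelRunState incoming_runstate_fixed(ModelRunState source_state)
{
    return source_state;
}

/* True if the given incoming policy preserves the source runstate. */
static bool runstate_preserved(ModelRunState (*incoming)(ModelRunState),
                               ModelRunState source_state)
{
    return incoming(source_state) == source_state;
}

/* Self-check: only the fixed policy preserves the suspended runstate. */
static void check_runstate_model(void)
{
    assert(runstate_preserved(incoming_runstate_fixed, MODEL_SUSPENDED));
    assert(!runstate_preserved(incoming_runstate_buggy, MODEL_SUSPENDED));
    assert(runstate_preserved(incoming_runstate_buggy, MODEL_RUNNING));
}
```

Calling check_runstate_model() from a test harness exercises both policies;
the real fix is spread across the "cpus" and "migration" patches listed
in the shortlog.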

Changes in V2:
  * simplify "start on wakeup request"
  * fix postcopy, snapshot, and background migration
  * refactor fixes for each type of migration
  * explicitly handled suspended events and runstate in tests
  * add test for postcopy and background migration

Changes in V3:
  * rebase to tip
  * fix hang in new function migrate_wait_for_dirty_mem

Changes in V4:
  * rebase to tip
  * add patch for vm_prepare_start (thanks Peter)
  * add patch to preserve cpu ticks

Changes in V5:
  * rebase to tip
  * added patches to completely stop vm in suspended state:
      cpus: refactor vm_stop
      cpus: stop vm in suspended state
  * added patch to partially resume vm in suspended state:
      cpus: start vm in suspended state
  * modified "preserve suspended ..." patches to use the above.
  * deleted patch "preserve cpu ticks if suspended".  stop ticks in
    vm_stop_force_state instead.
  * deleted patch "add runstate function".  defined new helper function
    migrate_new_runstate in "preserve suspended runstate"
  * Added some RB's, but removed other RB's because the patches changed.

Changes in V6:
  * all vm_stop calls completely stop the suspended state
  * refactored and updated the "cpus" patches
  * simplified the "preserve suspended" patches
  * added patch "bootfile per vm"

Changes in V7:
  * rebase to tip, add RB-s
  * fix backwards compatibility for global_state.vm_was_suspended
  * delete vm_prepare_start state argument, and rename patch
    "pass runstate to vm_prepare_start" to
    "check running not RUN_STATE_RUNNING"
  * drop patches:
      tests/qtest: bootfile per vm
      tests/qtest: background migration with suspend
  * rename runstate_is_started to runstate_is_live
  * move wait_for_suspend in tests
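
The V7 notes rename runstate_is_started to runstate_is_live without
giving its definition here.  One plausible reading, given that the series
distinguishes a stopped vm from a suspended one, is a predicate that is
true for both the running and the suspended runstates.  The sketch below
is an assumption for illustration only: SketchRunState and
sketch_runstate_is_live() are hypothetical names, not QEMU's generated
RunState type or the helper added by the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of runstates; QEMU's real enum has many more. */
typedef enum {
    SKETCH_RUN_STATE_PAUSED,
    SKETCH_RUN_STATE_RUNNING,
    SKETCH_RUN_STATE_SUSPENDED,
    SKETCH_RUN_STATE__MAX,
} SketchRunState;

/*
 * Assumed semantics: a guest is "live" if it is running or suspended,
 * that is, it has not been stopped by vm_stop.
 */
static bool sketch_runstate_is_live(SketchRunState state)
{
    assert(state < SKETCH_RUN_STATE__MAX);
    return state == SKETCH_RUN_STATE_RUNNING ||
           state == SKETCH_RUN_STATE_SUSPENDED;
}
```

Under this assumption a caller can treat the suspended runstate as live
while still telling it apart from RUN_STATE_RUNNING, which matches the
intent of "check running not RUN_STATE_RUNNING".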

Steve Sistare (12):
  cpus: vm_was_suspended
  cpus: stop vm in suspended runstate
  cpus: check running not RUN_STATE_RUNNING
  cpus: vm_resume
  migration: propagate suspended runstate
  migration: preserve suspended runstate
  migration: preserve suspended for snapshot
  migration: preserve suspended for bg_migration
  tests/qtest: migration events
  tests/qtest: option to suspend during migration
  tests/qtest: precopy migration with suspend
  tests/qtest: postcopy migration with suspend

 backends/tpm/tpm_emulator.c          |   2 +-
 hw/usb/hcd-ehci.c                    |   2 +-
 hw/usb/redirect.c                    |   2 +-
 hw/xen/xen-hvm-common.c              |   2 +-
 include/migration/snapshot.h         |   7 ++
 include/sysemu/runstate.h            |  16 ++++
 migration/global_state.c             |  35 ++++++-
 migration/migration-hmp-cmds.c       |   8 +-
 migration/migration.c                |  15 +--
 migration/savevm.c                   |  23 +++--
 qapi/misc.json                       |  10 +-
 system/cpus.c                        |  47 +++++++--
 system/runstate.c                    |   9 ++
 system/vl.c                          |   2 +
 tests/migration/i386/Makefile        |   5 +-
 tests/migration/i386/a-b-bootblock.S |  50 +++++++++-
 tests/migration/i386/a-b-bootblock.h |  26 +++--
 tests/qtest/migration-helpers.c      |  27 ++----
 tests/qtest/migration-helpers.h      |  11 ++-
 tests/qtest/migration-test.c         | 181 +++++++++++++++++++++++++----------
 20 files changed, 354 insertions(+), 126 deletions(-)

-- 
1.8.3.1
Re: [PATCH V7 00/12] fix migration of suspended runstate
Posted by Steven Sistare 11 months, 3 weeks ago
FYI, these patches still need RB:
  migration: propagate suspended runstate
  tests/qtest: precopy migration with suspend
  tests/qtest: postcopy migration with suspend

This patch has an RB, but the interaction between vm_start and
vm_prepare_start changed, so it needs another look:
    cpus: stop vm in suspended runstate

- Steve

On 12/6/2023 12:23 PM, Steve Sistare wrote:
> Migration of a guest in the suspended runstate is broken.  The incoming
> migration code automatically tries to wake the guest, which is wrong;
> the guest should end migration in the same runstate it started.  Further,
> after saving a snapshot in the suspended state and loading it, the vm_start
> fails.  The runstate is RUNNING, but the guest is not.
> 
> See the commit messages for the details.
Re: [PATCH V7 00/12] fix migration of suspended runstate
Posted by Peter Xu 11 months, 3 weeks ago
On Wed, Dec 06, 2023 at 12:30:02PM -0500, Steven Sistare wrote:
>     cpus: stop vm in suspended runstate

This patch still didn't copy the QAPI maintainers, please remember to do so
in a new post.

Maybe it would be easier to move the QAPI doc changes into a separate
patch?

Thanks,

-- 
Peter Xu
Re: [PATCH V7 00/12] fix migration of suspended runstate
Posted by Steven Sistare 11 months, 3 weeks ago
On 12/11/2023 1:56 AM, Peter Xu wrote:
> On Wed, Dec 06, 2023 at 12:30:02PM -0500, Steven Sistare wrote:
>>     cpus: stop vm in suspended runstate
> 
> This patch still didn't copy the QAPI maintainers, please remember to do so
> in a new post.
> 
> Maybe it would be easier to move the QAPI doc changes into a separate
> patch?

This was intentional.  I did not cc them on the whole series, to spare them
excess email.  You cc'd them on "[PATCH V6 03/14] cpus: stop vm in suspended runstate",
they had no comments, and there is no change in V7, so I assumed we were OK.  I will
cc them again on that patch to be sure.

- Steve
Re: [PATCH V7 00/12] fix migration of suspended runstate
Posted by Peter Xu 11 months, 2 weeks ago
On Mon, Dec 11, 2023 at 08:31:17AM -0500, Steven Sistare wrote:
> On 12/11/2023 1:56 AM, Peter Xu wrote:
> > On Wed, Dec 06, 2023 at 12:30:02PM -0500, Steven Sistare wrote:
> >>     cpus: stop vm in suspended runstate
> > 
> > This patch still didn't copy the QAPI maintainers, please remember to do so
> > in a new post.
> > 
> > Maybe it would be easier to move the QAPI doc changes into a separate
> > patch?
> 
> This was intentional.  I did not cc them on the whole series, to spare them
> excess email.  You cc'd them on "[PATCH V6 03/14] cpus: stop vm in suspended runstate",
> they had no comments, and there is no change in V7, so I assumed we were OK.  I will
> cc them again on that patch to be sure.

Yes, please do so.  In my experience with QEMU, we normally need an ACK from
all sides to merge a patch.  There can be exceptions when the changes are very
trivial and one maintainer just pulls them in, but it is always good to copy
the relevant maintainers.

Thanks,

-- 
Peter Xu