This series introduces a new POSTCOPY_DEVICE state that is active on both
the source and the destination side while the destination loads the device
state. Before this series, if the destination machine failed during the
device load, the source side would stay stuck in POSTCOPY_ACTIVE with no
way of recovery. With this series, if the migration fails while in the
POSTCOPY_DEVICE state, the source side can safely resume, as the
destination has not started yet.
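The intended behaviour can be illustrated with a minimal, self-contained
sketch. It is not code from this series: the Sketch* enum and both helper
functions are hypothetical names made up for illustration, and treating
POSTCOPY_ACTIVE as "not safely resumable" is a simplifying assumption of
the sketch, based only on the description above.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified subset of migration states.
 * This is NOT QEMU's MigrationStatus enum, only an illustration.
 */
typedef enum {
    SKETCH_POSTCOPY_DEVICE,  /* destination is still loading device state */
    SKETCH_POSTCOPY_ACTIVE,  /* device state loaded, destination may run */
    SKETCH_FAILED,           /* migration failed */
} SketchState;

/*
 * Can the source side safely resume its guest if the migration fails
 * while in state 's'?  Assumption: only while the destination has not
 * started yet, i.e. in the POSTCOPY_DEVICE phase.
 */
bool sketch_source_can_resume(SketchState s)
{
    return s == SKETCH_POSTCOPY_DEVICE;
}

/*
 * State that follows POSTCOPY_DEVICE once the destination reports the
 * result of the device state load.
 */
SketchState sketch_after_device_load(bool load_ok)
{
    return load_ok ? SKETCH_POSTCOPY_ACTIVE : SKETCH_FAILED;
}
```

In this sketch a failed device load leads to SKETCH_FAILED from a state in
which sketch_source_can_resume() still returns true, which corresponds to
the recovery path described above.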
RFC: https://lore.kernel.org/all/20250807114922.1013286-1-jmarcin@redhat.com/
V1: https://lore.kernel.org/all/20250915115918.3520735-1-jmarcin@redhat.com/
V2 changes:
- removed old patch 2, which changed migration_has_failed()
Patch 2:
- moved postcopy_ram_listen_thread() to postcopy-ram.c as per TODO,
suggested by Fabiano
Patch 3:
- introduced a separate postcopy-exit-on-error setting instead of reusing
the existing exit-on-error setting, suggested by Fabiano and Jirka
- merged migration_incoming_finish() and
migration_incoming_state_destroy() into migration_incoming_cleanup()
and added migration_incoming_cleanup_bh(), suggested by Fabiano
Patch 4:
- introduced the POSTCOPY_DEVICE state on the destination as well,
suggested by Jirka
- moved the POSTCOPY_DEVICE->POSTCOPY_ACTIVE transition from the return
path thread to the main migration thread, suggested by Peter
Juraj Marcin (3):
migration: Move postcopy_ram_listen_thread() to postcopy-ram.c
migration: Refactor all incoming cleanup into
migration_incoming_cleanup()
migration: Introduce POSTCOPY_DEVICE state
Peter Xu (1):
migration: Do not try to start VM if disk activation fails
migration/migration-hmp-cmds.c | 2 +-
migration/migration.c | 148 +++++++++++++++++---------
migration/migration.h | 7 +-
migration/postcopy-ram.c | 137 ++++++++++++++++++++++++
migration/postcopy-ram.h | 3 +
migration/savevm.c | 137 ++----------------------
migration/savevm.h | 2 +
migration/trace-events | 1 +
qapi/migration.json | 17 ++-
system/vl.c | 3 +-
tests/qtest/migration/precopy-tests.c | 3 +-
11 files changed, 274 insertions(+), 186 deletions(-)
--
2.51.0