On Tue, May 10, 2022 at 05:20:21PM +0200, Jiri Denemark wrote:
> This series implements a new VIR_MIGRATE_POSTCOPY_RESUME flag (virsh
> migrate --postcopy-resume) for recovering from a failed post-copy
> migration.
>
> You can also fetch the series from my gitlab fork:
>
> git fetch https://gitlab.com/jirkade/libvirt.git post-copy-recovery
>
> Jiri Denemark (80):
> qemu: Add debug messages to job recovery code
> qemumonitorjsontest: Test more migration capabilities
> qemu: Return state from qemuMonitorGetMigrationCapabilities
> qemu: Enable migration events only when disabled
> Introduce VIR_DOMAIN_RUNNING_POSTCOPY_FAILED
> qemu: Keep domain running on dst on failed post-copy migration
> qemu: Explicitly emit events on post-copy failure
> qemu: Make qemuDomainCleanupAdd return void
> conf: Introduce virDomainObjIsFailedPostcopy helper
> conf: Introduce virDomainObjIsPostcopy helper
> qemu: Introduce qemuProcessCleanupMigrationJob
> qemu: Rename qemuDomainObjRestoreJob as qemuDomainObjPreserveJob
> qemu: Add qemuDomainObjRestoreAsyncJob
> qemu: Keep migration job active after failed post-copy
> qemu: Abort failed post-copy when we haven't called Finish yet
> qemu: Restore failed migration job on reconnect
> qemu: Restore async job start timestamp on reconnect
> qemu: Drop forward declarations in migration code
> qemu: Don't wait for migration job when migration is running
> qemu: Use switch in qemuDomainGetJobInfoMigrationStats
> qemu: Fetch paused migration stats
> qemu: Handle 'postcopy-paused' migration state
> qemu: Add support for postcopy-recover QEMU migration state
> qemu: Create domain object at the end of qemuMigrationDstFinish
> qemu: Move success-only code out of endjob in qemuMigrationDstFinish
> qemu: Separate success and failure path in qemuMigrationDstFinish
> qemu: Rename "endjob" label in qemuMigrationDstFinish
> qemu: Generate migration cookie in Finish phase earlier
> qemu: Make final part of migration Finish phase reusable
> qemu: Drop obsolete comment in qemuMigrationDstFinish
> qemu: Preserve error in qemuMigrationDstFinish
> qemu: Introduce qemuMigrationDstFinishFresh
> qemu: Introduce qemuMigrationDstFinishOffline
> qemu: Separate cookie parsing for qemuMigrationDstFinishOffline
> qemu: Introduce qemuMigrationDstFinishActive
> qemu: Handle migration job in qemuMigrationDstFinish
> qemu: Make final part of migration Confirm phase reusable
> qemu: Make sure migrationPort is released even in callbacks
> qemu: Pass qemuDomainJobObj to qemuMigrationDstComplete
> qemu: Finish completed unattended migration
> qemu: Ignore missing memory statistics in query-migrate
> qemu: Improve post-copy migration handling on reconnect
> qemu: Check flags incompatible with offline migration earlier
> qemu: Introduce qemuMigrationSrcBeginXML helper
> qemu: Add new migration phases for post-copy recovery
> qemu: Separate protocol checks from qemuMigrationJobSetPhase
> qemu: Make qemuMigrationCheckPhase failure fatal
> qemu: Refactor qemuDomainObjSetJobPhase
> qemu: Do not set job owner in qemuMigrationJobSetPhase
> qemu: Use QEMU_MIGRATION_PHASE_POSTCOPY_FAILED
> Introduce VIR_MIGRATE_POSTCOPY_RESUME flag
> virsh: Add --postcopy-resume option for migrate command
> qemu: Don't set VIR_MIGRATE_PAUSED for post-copy resume
> qemu: Implement VIR_MIGRATE_POSTCOPY_RESUME for Begin phase
> qemu: Refactor qemuMigrationSrcPerformPhase
> qemu: Separate starting migration from qemuMigrationSrcRun
> qemu: Add support for 'resume' parameter of migrate QMP command
> qemu: Implement VIR_MIGRATE_POSTCOPY_RESUME for Perform phase
> qemu: Implement VIR_MIGRATE_POSTCOPY_RESUME for Confirm phase
> qemu: Introduce qemuMigrationDstPrepareFresh
> qemu: Refactor qemuMigrationDstPrepareFresh
> qemu: Simplify cleanup in qemuMigrationDstPrepareFresh
> qemu: Add support for migrate-recover QMP command
> qemu: Rename qemuMigrationSrcCleanup
> qemu: Refactor qemuMigrationAnyConnectionClosed
> qemu: Handle incoming migration in qemuMigrationAnyConnectionClosed
> qemu: Start a migration phase in qemuMigrationAnyConnectionClosed
> qemu: Implement VIR_MIGRATE_POSTCOPY_RESUME for Prepare phase
> qemu: Implement VIR_MIGRATE_POSTCOPY_RESUME for Finish phase
> qemu: Create completed jobData in qemuMigrationSrcComplete
> qemu: Register qemuProcessCleanupMigrationJob after Begin phase
> qemu: Call qemuDomainCleanupAdd from qemuMigrationJobContinue
> qemu: Implement VIR_MIGRATE_POSTCOPY_RESUME for peer-to-peer migration
> qemu: Enable support for VIR_MIGRATE_POSTCOPY_RESUME
> Add virDomainAbortJobFlags public API
> qemu: Implement virDomainAbortJobFlags
> Add VIR_DOMAIN_ABORT_JOB_POSTCOPY flag for virDomainAbortJobFlags
> qemu: Implement VIR_DOMAIN_ABORT_JOB_POSTCOPY flag
> virsh: Add --postcopy option for domjobabort command
> NEWS: Add support for post-copy recovery
The Reviewed-by below applies to the patches that need only simple fixes,
and to the remaining ones once the questions asked by Peter are resolved.
There will still be a v2 to handle pausing vs. keeping the VM running on
dst after a failed post-copy migration.

Reviewed-by: Pavel Hrdina <phrdina@redhat.com>