[PATCH v2 0/5] vfio/migration: Block VFIO migration with postcopy and background snapshot

Avihai Horon posted 5 patches 8 months ago
Failed in applying to current master (apply log)
There is a newer version of this series
include/migration/register.h |  5 +++++
migration/migration.h        |  4 ++--
migration/savevm.h           |  1 +
hw/vfio/common.c             |  7 +++++--
hw/vfio/migration.c          | 31 +++++++++++++++++++++++++++++++
migration/migration.c        | 10 +++++++---
migration/savevm.c           | 31 ++++++++++++++++++++++++++++++-
migration/target.c           |  8 ++++----
8 files changed, 85 insertions(+), 12 deletions(-)
[PATCH v2 0/5] vfio/migration: Block VFIO migration with postcopy and background snapshot
Posted by Avihai Horon 8 months ago
Hello,

Recently added VFIO migration is not compatible with some of the
pre-existing migration features. This was overlooked and today these
combinations are not blocked by QEMU. This series fixes it.

Postcopy migration:
VFIO migration is not compatible with postcopy migration. A VFIO device
in the destination can't handle page faults for pages that have not been
sent yet. Doing such migration will cause the VM to crash in the
destination.

Background snapshot:
Background snapshot allows creating a snapshot of the VM while it's
running and keeping it small by not including dirty RAM pages.

The way it works is by first stopping the VM, saving the non-iterable
devices' state and then starting the VM and saving the RAM while write
protecting it with UFFD. The resulting snapshot represents the VM state
at snapshot start.

VFIO migration is not compatible with background snapshot.
First of all, VFIO device state is not even saved in background snapshot
because only non-iterable device state is saved. But even if it was
saved, after starting the VM, a VFIO device could dirty pages without it
being detected by UFFD write protection. This would corrupt the
snapshot, as the RAM in it would not represent the RAM at snapshot
start.

This series fixes it by blocking these combinations. This is done by
adding a .save_prepare() handler to struct SaveVMHandler. The
.save_prepare() handler is called early, even before migration starts,
and is used by VFIO migration to check the migration capabilities and
fail migration if needed.

Note that this series is based on the P2P series [1] sent a few weeks
ago.

Comments and suggestions will be greatly appreciated.

Thanks.

Changes from v1 [2]:
* Adopted Peter's suggestion to block migration upon migrate command
  using a new .save_prepare() handler in SaveVMHandlers.
* Added R-bs by Cedric.

[1]
https://lore.kernel.org/qemu-devel/20230802081449.2528-1-avihaih@nvidia.com/

[2]
https://lore.kernel.org/qemu-devel/20230828151842.11303-1-avihaih@nvidia.com/

Avihai Horon (5):
  migration: Add migration prefix to functions in target.c
  vfio/migration: Fail adding device with enable-migration=on and
    existing blocker
  migration: Add .save_prepare() handler to struct SaveVMHandlers
  vfio/migration: Block VFIO migration with postcopy migration
  vfio/migration: Block VFIO migration with background snapshot

 include/migration/register.h |  5 +++++
 migration/migration.h        |  4 ++--
 migration/savevm.h           |  1 +
 hw/vfio/common.c             |  7 +++++--
 hw/vfio/migration.c          | 31 +++++++++++++++++++++++++++++++
 migration/migration.c        | 10 +++++++---
 migration/savevm.c           | 31 ++++++++++++++++++++++++++++++-
 migration/target.c           |  8 ++++----
 8 files changed, 85 insertions(+), 12 deletions(-)

-- 
2.26.3