VFIO migration is not compatible with postcopy migration. A VFIO device
in the destination can't handle page faults for pages that have not been
sent yet.

Doing such migration will cause the VM to crash in the destination:

  qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
  qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
  qemu: hardware error: vfio: DMA mapping failed, unable to continue

To prevent this, block VFIO migration with postcopy migration.

Reported-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
hw/vfio/migration.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 71855468fe..20994dc1d6 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)

 /* ---------------------------------------------------------------------- */

+static int vfio_save_prepare(void *opaque, Error **errp)
+{
+    VFIODevice *vbasedev = opaque;
+
+    /*
+     * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
+     */
+    if (runstate_check(RUN_STATE_SAVE_VM)) {
+        return 0;
+    }
+
+    if (migrate_postcopy_ram()) {
+        error_setg(
+            errp, "%s: VFIO migration is not supported with postcopy migration",
+            vbasedev->name);
+        return -EOPNOTSUPP;
+    }
+
+    return 0;
+}
+
 static int vfio_save_setup(QEMUFile *f, void *opaque)
 {
     VFIODevice *vbasedev = opaque;
@@ -640,6 +661,7 @@ static bool vfio_switchover_ack_needed(void *opaque)
 }

 static const SaveVMHandlers savevm_vfio_handlers = {
+    .save_prepare = vfio_save_prepare,
     .save_setup = vfio_save_setup,
     .save_cleanup = vfio_save_cleanup,
     .state_pending_estimate = vfio_state_pending_estimate,
--
2.26.3
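
For readers without a QEMU tree at hand, the control flow the patch adds
can be sketched in standalone C. This is only an illustration under stated
assumptions: MockMigrationState, MockDevice, MockSaveHandlers and
mock_migration_start() are hypothetical stand-ins invented here, and the
error is written into a plain buffer instead of QEMU's Error object. Only
the order of the checks and the -EOPNOTSUPP return mirror the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for global migration state (not a QEMU type). */
typedef struct MockMigrationState {
    bool postcopy_ram;    /* models what migrate_postcopy_ram() reports */
    bool saving_snapshot; /* models runstate_check(RUN_STATE_SAVE_VM) */
} MockMigrationState;

/* Hypothetical stand-in for a VFIO device. */
typedef struct MockDevice {
    const char *name;
    const MockMigrationState *ms;
} MockDevice;

/* Hypothetical handler table with just a prepare hook. */
typedef struct MockSaveHandlers {
    int (*save_prepare)(void *opaque, char *errbuf, size_t errlen);
} MockSaveHandlers;

/* Same order of checks as vfio_save_prepare() in the patch. */
static int mock_vfio_save_prepare(void *opaque, char *errbuf, size_t errlen)
{
    const MockDevice *dev = opaque;

    assert(dev != NULL && dev->ms != NULL);

    /* Snapshots never use postcopy, so let them through. */
    if (dev->ms->saving_snapshot) {
        return 0;
    }

    if (dev->ms->postcopy_ram) {
        if (errbuf != NULL && errlen > 0) {
            snprintf(errbuf, errlen,
                     "%s: VFIO migration is not supported with postcopy migration",
                     dev->name);
        }
        return -EOPNOTSUPP;
    }

    return 0;
}

static const MockSaveHandlers mock_vfio_handlers = {
    .save_prepare = mock_vfio_save_prepare,
};

/* Hypothetical caller: run the prepare hook before any state is sent. */
static int mock_migration_start(MockDevice *dev, char *errbuf, size_t errlen)
{
    if (mock_vfio_handlers.save_prepare != NULL) {
        int ret = mock_vfio_handlers.save_prepare(dev, errbuf, errlen);
        if (ret != 0) {
            return ret; /* refuse to start; the source VM keeps running */
        }
    }
    return 0;
}
```

In this model, calling mock_migration_start() with postcopy_ram set returns
-EOPNOTSUPP and fills the error buffer, while setting saving_snapshot makes
the same call succeed. The point of checking in a prepare step is that the
failure surfaces on the source before migration starts, rather than as a
DMA mapping failure on the destination.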
On Thu, Aug 31, 2023 at 03:57:01PM +0300, Avihai Horon wrote:
> VFIO migration is not compatible with postcopy migration. A VFIO device
> in the destination can't handle page faults for pages that have not been
> sent yet.
>
> Doing such migration will cause the VM to crash in the destination:
>
> qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
> qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
> qemu: hardware error: vfio: DMA mapping failed, unable to continue
>
> To prevent this, block VFIO migration with postcopy migration.
>
> Reported-by: Yanghang Liu <yanghliu@redhat.com>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
> hw/vfio/migration.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 71855468fe..20994dc1d6 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
>
> /* ---------------------------------------------------------------------- */
>
> +static int vfio_save_prepare(void *opaque, Error **errp)
> +{
> + VFIODevice *vbasedev = opaque;
> +
> + /*
> + * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
> + */
> + if (runstate_check(RUN_STATE_SAVE_VM)) {
> + return 0;
> + }
Just purely curious: will it really work to save a snapshot for the GPU
assigned use case?
> +
> + if (migrate_postcopy_ram()) {
> + error_setg(
> + errp, "%s: VFIO migration is not supported with postcopy migration",
> + vbasedev->name);
> + return -EOPNOTSUPP;
> + }
> +
> + return 0;
> +}
--
Peter Xu
On 01/09/2023 18:51, Peter Xu wrote:
> On Thu, Aug 31, 2023 at 03:57:01PM +0300, Avihai Horon wrote:
>> VFIO migration is not compatible with postcopy migration. A VFIO device
>> in the destination can't handle page faults for pages that have not been
>> sent yet.
>>
>> Doing such migration will cause the VM to crash in the destination:
>>
>> qemu-system-x86_64: VFIO_MAP_DMA failed: Bad address
>> qemu-system-x86_64: vfio_dma_map(0x55a28c7659d0, 0xc0000, 0xb000, 0x7f1b11a00000) = -14 (Bad address)
>> qemu: hardware error: vfio: DMA mapping failed, unable to continue
>>
>> To prevent this, block VFIO migration with postcopy migration.
>>
>> Reported-by: Yanghang Liu <yanghliu@redhat.com>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>> hw/vfio/migration.c | 22 ++++++++++++++++++++++
>> 1 file changed, 22 insertions(+)
>>
>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>> index 71855468fe..20994dc1d6 100644
>> --- a/hw/vfio/migration.c
>> +++ b/hw/vfio/migration.c
>> @@ -335,6 +335,27 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev)
>>
>> /* ---------------------------------------------------------------------- */
>>
>> +static int vfio_save_prepare(void *opaque, Error **errp)
>> +{
>> + VFIODevice *vbasedev = opaque;
>> +
>> + /*
>> + * Snapshot doesn't use postcopy, so allow snapshot even if postcopy is on.
>> + */
>> + if (runstate_check(RUN_STATE_SAVE_VM)) {
>> + return 0;
>> + }
> Just purely curious: will it really work to save a snapshot for the GPU
> assigned use case?
I have never tried that.
Adding Tarun, maybe he can answer that.
Thanks.
>> +
>> + if (migrate_postcopy_ram()) {
>> + error_setg(
>> + errp, "%s: VFIO migration is not supported with postcopy migration",
>> + vbasedev->name);
>> + return -EOPNOTSUPP;
>> + }
>> +
>> + return 0;
>> +}
> --
> Peter Xu
>
When trying to do a VFIO postcopy migration, we now get the expected
internal error: "unable to execute QEMU command 'migrate':
0000:b1:00.2: VFIO migration is not supported with postcopy migration"
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Best Regards,
YangHang Liu