[PATCH 1/3] vfio/migration: Max in-flight VFIO device state buffer count limit

Maciej S. Szmigiero posted 3 patches 4 months, 3 weeks ago
There is a newer version of this series
[PATCH 1/3] vfio/migration: Max in-flight VFIO device state buffer count limit
Posted by Maciej S. Szmigiero 4 months, 3 weeks ago
From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

Allow capping the maximum count of in-flight VFIO device state buffers
queued at the destination; otherwise, a malicious QEMU source could
theoretically cause the target QEMU to allocate unlimited amounts of
memory for buffers-in-flight.

Since this is not expected to be a realistic threat in most VFIO live
migration use cases, and the right value depends on the particular setup,
disable the limit by default by setting it to UINT64_MAX.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
 docs/devel/migration/vfio.rst | 13 +++++++++++++
 hw/vfio/migration-multifd.c   | 16 ++++++++++++++++
 hw/vfio/pci.c                 |  9 +++++++++
 include/hw/vfio/vfio-device.h |  1 +
 4 files changed, 39 insertions(+)

diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
index 673e354754c8..f4a6bfa4619b 100644
--- a/docs/devel/migration/vfio.rst
+++ b/docs/devel/migration/vfio.rst
@@ -247,3 +247,16 @@ The multifd VFIO device state transfer is controlled by
 "x-migration-multifd-transfer" VFIO device property. This property defaults to
 AUTO, which means that VFIO device state transfer via multifd channels is
 attempted in configurations that otherwise support it.
+
+Since the target QEMU needs to load device state buffers in order, it has to
+queue incoming buffers until they can be loaded into the device.
+This means that a malicious QEMU source could theoretically cause the target
+QEMU to allocate unlimited amounts of memory for such buffers-in-flight.
+
+The "x-migration-max-queued-buffers" property allows capping the maximum count
+of these VFIO device state buffers queued at the destination.
+
+Because a malicious QEMU source causing an OOM on the target is not expected
+to be a realistic threat in most VFIO live migration use cases, and the right
+value depends on the particular setup, this queued buffers limit is disabled
+by default by setting it to UINT64_MAX.
diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c
index 850a31948878..f26c112090b4 100644
--- a/hw/vfio/migration-multifd.c
+++ b/hw/vfio/migration-multifd.c
@@ -56,6 +56,7 @@ typedef struct VFIOMultifd {
     QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */
     uint32_t load_buf_idx;
     uint32_t load_buf_idx_last;
+    uint32_t load_buf_queued_pending_buffers;
 } VFIOMultifd;
 
 static void vfio_state_buffer_clear(gpointer data)
@@ -127,6 +128,17 @@ static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev,
 
     assert(packet->idx >= multifd->load_buf_idx);
 
+    multifd->load_buf_queued_pending_buffers++;
+    if (multifd->load_buf_queued_pending_buffers >
+        vbasedev->migration_max_queued_buffers) {
+        error_setg(errp,
+                   "%s: queuing state buffer %" PRIu32
+                   " would exceed the max of %" PRIu64,
+                   vbasedev->name, packet->idx,
+                   vbasedev->migration_max_queued_buffers);
+        return false;
+    }
+
     lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet));
     lb->len = packet_total_size - sizeof(*packet);
     lb->is_present = true;
@@ -387,6 +399,9 @@ static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp)
             goto thread_exit;
         }
 
+        assert(multifd->load_buf_queued_pending_buffers > 0);
+        multifd->load_buf_queued_pending_buffers--;
+
         if (multifd->load_buf_idx == multifd->load_buf_idx_last - 1) {
             trace_vfio_load_state_device_buffer_end(vbasedev->name);
         }
@@ -423,6 +438,7 @@ static VFIOMultifd *vfio_multifd_new(void)
 
     multifd->load_buf_idx = 0;
     multifd->load_buf_idx_last = UINT32_MAX;
+    multifd->load_buf_queued_pending_buffers = 0;
     qemu_cond_init(&multifd->load_bufs_buffer_ready_cond);
 
     multifd->load_bufs_thread_running = false;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index fa25bded25c5..2765a39f9df1 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3524,6 +3524,8 @@ static const Property vfio_pci_dev_properties[] = {
                 vbasedev.migration_multifd_transfer,
                 vfio_pci_migration_multifd_transfer_prop, OnOffAuto,
                 .set_default = true, .defval.i = ON_OFF_AUTO_AUTO),
+    DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice,
+                       vbasedev.migration_max_queued_buffers, UINT64_MAX),
     DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice,
                      vbasedev.migration_events, false),
     DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
@@ -3698,6 +3700,13 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, const void *data)
                                           "x-migration-multifd-transfer",
                                           "Transfer this device state via "
                                           "multifd channels when live migrating it");
+    object_class_property_set_description(klass, /* 10.1 */
+                                          "x-migration-max-queued-buffers",
+                                          "Maximum count of in-flight VFIO "
+                                          "device state buffers queued at the "
+                                          "destination when doing live "
+                                          "migration of device state via "
+                                          "multifd channels");
 }
 
 static const TypeInfo vfio_pci_dev_info = {
diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index d45e5a68a24e..0ee34aaf668b 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -66,6 +66,7 @@ typedef struct VFIODevice {
     bool ram_block_discard_allowed;
     OnOffAuto enable_migration;
     OnOffAuto migration_multifd_transfer;
+    uint64_t migration_max_queued_buffers;
     bool migration_events;
     bool use_region_fds;
     VFIODeviceOps *ops;
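
For reference, with this patch applied the cap can be set per device on the
destination command line, for example (the device address here is purely
illustrative):

  -device vfio-pci,host=0000:01:00.0,x-migration-max-queued-buffers=64

With such a setting, queuing a 65th in-flight state buffer for that device
makes vfio_load_state_buffer_insert() set the error above and return false,
which is expected to fail the incoming migration instead of letting the queue
grow without bound.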
Re: [PATCH 1/3] vfio/migration: Max in-flight VFIO device state buffer count limit
Posted by Avihai Horon 4 months, 1 week ago
On 24/06/2025 20:51, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> Allow capping the maximum count of in-flight VFIO device state buffers
> queued at the destination; otherwise, a malicious QEMU source could
> theoretically cause the target QEMU to allocate unlimited amounts of
> memory for buffers-in-flight.
>
> Since this is not expected to be a realistic threat in most VFIO live
> migration use cases, and the right value depends on the particular setup,
> disable the limit by default by setting it to UINT64_MAX.
>
> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>

Reviewed-by: Avihai Horon <avihaih@nvidia.com>

But do we really need both x-migration-max-queued-buffers and 
x-migration-max-queued-buffers-size?
I think one is sufficient.

I vote for x-migration-max-queued-buffers-size as the actual memory 
limit won't change depending on VFIO migration buffer size.
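
For illustration, a size-based cap could mirror the count check in this
patch; the names below (load_buf_queued_pending_bytes,
migration_max_queued_buffers_size) are hypothetical and not part of the
posted series:

    /* Sketch only: account queued bytes instead of a queued buffer count */
    size_t buf_len = packet_total_size - sizeof(*packet);

    multifd->load_buf_queued_pending_bytes += buf_len;
    if (multifd->load_buf_queued_pending_bytes >
        vbasedev->migration_max_queued_buffers_size) {
        error_setg(errp,
                   "%s: queuing state buffer %" PRIu32
                   " (%zu bytes) would exceed the size limit of %" PRIu64,
                   vbasedev->name, packet->idx, buf_len,
                   vbasedev->migration_max_queued_buffers_size);
        return false;
    }

    /* ...with the load thread subtracting lb->len once a buffer has been
     * loaded, mirroring how load_buf_queued_pending_buffers is dropped. */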

However, if you pick x-migration-max-queued-buffers, maybe it's worth
mentioning in the docs what the size of a buffer is?

Thanks.

Re: [PATCH 1/3] vfio/migration: Max in-flight VFIO device state buffer count limit
Posted by Maciej S. Szmigiero 4 months, 1 week ago
On 7.07.2025 11:29, Avihai Horon wrote:
> 
> Reviewed-by: Avihai Horon <avihaih@nvidia.com>

Thanks.

> But do we really need both x-migration-max-queued-buffers and x-migration-max-queued-buffers-size?
> I think one is sufficient.
> 
> I vote for x-migration-max-queued-buffers-size as the actual memory limit won't change depending on VFIO migration buffer size.

If just one of these limits were to be implemented, my vote would also be for the size limit rather than the count limit.

> However, if you pick x-migration-max-queued-buffers, maybe it's worth mentioning in the docs what the size of a buffer is?

I think it's not a constant number, since it's MIN(VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE, stop_copy_size)?
On the other hand, one could say it's "at most VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE".
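
Purely as an illustration of the memory bound a count limit implies (assuming
each queued buffer is indeed at most MIN(VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE,
stop_copy_size), with the default here assumed to be 1 MiB):

    /* Standalone sketch, not QEMU code: worst-case queued bytes at the
     * destination for a given x-migration-max-queued-buffers value. */
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define MIB (1024ULL * 1024ULL)
    /* Assumed default; see VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE in QEMU */
    #define DEFAULT_DATA_BUFFER_SIZE (1 * MIB)

    static uint64_t worst_case_queued_bytes(uint64_t max_queued_buffers,
                                            uint64_t stop_copy_size)
    {
        uint64_t per_buf = stop_copy_size < DEFAULT_DATA_BUFFER_SIZE ?
                           stop_copy_size : DEFAULT_DATA_BUFFER_SIZE;
        return max_queued_buffers * per_buf;
    }

    int main(void)
    {
        /* e.g. a limit of 64 buffers with a large stop-copy size bounds the
         * queued device state at 64 MiB (67108864 bytes) */
        printf("%" PRIu64 " bytes\n", worst_case_queued_bytes(64, 512 * MIB));
        return 0;
    }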

> Thanks.
Thanks,
Maciej
Re: [PATCH 1/3] vfio/migration: Max in-flight VFIO device state buffer count limit
Posted by Cédric Le Goater 4 months ago
>> But do we really need both x-migration-max-queued-buffers and x-migration-max-queued-buffers-size?
>> I think one is sufficient.
>>
>> I vote for x-migration-max-queued-buffers-size as the actual memory limit won't change depending on VFIO migration buffer size.
> 
> If just one of these limits were to be implemented, my vote would also be for the size limit rather than the count limit.

Let's keep the size limit only.

Thanks,

C.
Re: [PATCH 1/3] vfio/migration: Max in-flight VFIO device state buffer count limit
Posted by Fabiano Rosas 4 months, 2 weeks ago
"Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:

> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> Allow capping the maximum count of in-flight VFIO device state buffers
> queued at the destination; otherwise, a malicious QEMU source could
> theoretically cause the target QEMU to allocate unlimited amounts of
> memory for buffers-in-flight.
>
> Since this is not expected to be a realistic threat in most VFIO live
> migration use cases, and the right value depends on the particular setup,
> disable the limit by default by setting it to UINT64_MAX.
>
> Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>