From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
Allow capping the maximum count of in-flight VFIO device state buffers
queued at the destination, otherwise a malicious QEMU source could
theoretically cause the target QEMU to allocate unlimited amounts of memory
for buffers-in-flight.
Since this is not expected to be a realistic threat in most VFIO live
migration use cases, and the right value depends on the particular setup,
disable the limit by default by setting it to UINT64_MAX.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
---
hw/vfio/migration-multifd.c | 14 ++++++++++++++
hw/vfio/pci.c | 2 ++
include/hw/vfio/vfio-common.h | 1 +
3 files changed, 17 insertions(+)
diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c
index 18a5ff964a37..04aa3f4a6596 100644
--- a/hw/vfio/migration-multifd.c
+++ b/hw/vfio/migration-multifd.c
@@ -53,6 +53,7 @@ typedef struct VFIOMultifd {
QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */
uint32_t load_buf_idx;
uint32_t load_buf_idx_last;
+ uint32_t load_buf_queued_pending_buffers;
} VFIOMultifd;
static void vfio_state_buffer_clear(gpointer data)
@@ -121,6 +122,15 @@ static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev,
assert(packet->idx >= multifd->load_buf_idx);
+ multifd->load_buf_queued_pending_buffers++;
+ if (multifd->load_buf_queued_pending_buffers >
+ vbasedev->migration_max_queued_buffers) {
+ error_setg(errp,
+ "queuing state buffer %" PRIu32 " would exceed the max of %" PRIu64,
+ packet->idx, vbasedev->migration_max_queued_buffers);
+ return false;
+ }
+
lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet));
lb->len = packet_total_size - sizeof(*packet);
lb->is_present = true;
@@ -374,6 +384,9 @@ static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp)
goto ret_signal;
}
+ assert(multifd->load_buf_queued_pending_buffers > 0);
+ multifd->load_buf_queued_pending_buffers--;
+
if (multifd->load_buf_idx == multifd->load_buf_idx_last - 1) {
trace_vfio_load_state_device_buffer_end(vbasedev->name);
}
@@ -408,6 +421,7 @@ VFIOMultifd *vfio_multifd_new(void)
multifd->load_buf_idx = 0;
multifd->load_buf_idx_last = UINT32_MAX;
+ multifd->load_buf_queued_pending_buffers = 0;
qemu_cond_init(&multifd->load_bufs_buffer_ready_cond);
multifd->load_bufs_thread_running = false;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 9111805ae06c..247418f0fce2 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3383,6 +3383,8 @@ static const Property vfio_pci_dev_properties[] = {
vbasedev.migration_multifd_transfer,
qdev_prop_on_off_auto_mutable, OnOffAuto,
.set_default = true, .defval.i = ON_OFF_AUTO_AUTO),
+ DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice,
+ vbasedev.migration_max_queued_buffers, UINT64_MAX),
DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice,
vbasedev.migration_events, false),
DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 3006931accf6..30a5bb9af61b 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -155,6 +155,7 @@ typedef struct VFIODevice {
bool ram_block_discard_allowed;
OnOffAuto enable_migration;
OnOffAuto migration_multifd_transfer;
+ uint64_t migration_max_queued_buffers;
bool migration_events;
VFIODeviceOps *ops;
unsigned int num_irqs;
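For reference, this adds a per-device experimental vfio-pci property, so with the patch applied it could be set along these lines (a hypothetical command line; the host address and the value 64 are placeholders):

    -device vfio-pci,host=0000:3b:00.0,x-migration-max-queued-buffers=64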
On 19/02/2025 22:34, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> Allow capping the maximum count of in-flight VFIO device state buffers
> queued at the destination, otherwise a malicious QEMU source could
> theoretically cause the target QEMU to allocate unlimited amounts of memory
> for buffers-in-flight.

I still think it's better to limit the number of bytes rather than the number of buffers:

1. To the average user the number of buffers doesn't really mean anything. They have to open the code, see what the size of a single buffer is, and then choose their value.
2. Currently the VFIO migration buffer size is 1 MB. If it's later changed to 2 MB, for example, users will have to adjust their configuration accordingly. With a number of bytes, the configuration remains the same no matter what the VFIO migration buffer size is.

[...]

> +    multifd->load_buf_queued_pending_buffers++;
> +    if (multifd->load_buf_queued_pending_buffers >
> +        vbasedev->migration_max_queued_buffers) {
> +        error_setg(errp,
> +                   "queuing state buffer %" PRIu32 " would exceed the max of %" PRIu64,
> +                   packet->idx, vbasedev->migration_max_queued_buffers);

Let's add vbasedev->name to the error message so we know which device caused the error.

Thanks.
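For illustration, the check with the device name added could look roughly like this (a sketch of the requested change, not code from the posted series):

    error_setg(errp,
               "%s: queuing state buffer %" PRIu32
               " would exceed the max of %" PRIu64,
               vbasedev->name, packet->idx,
               vbasedev->migration_max_queued_buffers);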
On 2.03.2025 15:53, Avihai Horon wrote:
> I still think it's better to limit the number of bytes rather than the
> number of buffers:
> 1. To the average user the number of buffers doesn't really mean anything. [...]
> 2. Currently the VFIO migration buffer size is 1 MB. [...]

Sorry Avihai, but we're a little more than a week from code freeze, so it's really not the time for more than cosmetic changes.

Thanks,
Maciej
On 2.03.2025 15:54, Maciej S. Szmigiero wrote:
> Sorry Avihai, but we're a little more than a week from code freeze,
> so it's really not the time for more than cosmetic changes.

And if you really, really want to have a queued buffers size limit, that's something that could be added later as an additional x-migration-max-queued-buffers-size (or similar) property, since these limits aren't exclusive.

Thanks,
Maciej
On 02/03/2025 16:59, Maciej S. Szmigiero wrote:
> And if you really, really want to have a queued buffers size limit,
> that's something that could be added later as an additional
> x-migration-max-queued-buffers-size (or similar) property,
> since these limits aren't exclusive.

Sure, I agree. It's not urgent nor mandatory for now, I just wanted to express my opinion :)

Thanks.
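If a byte-based cap were later added as the separate property suggested above, the accounting in vfio_load_state_buffer_insert() might look roughly like the sketch below; the load_buf_queued_pending_bytes field and the migration_max_queued_buffers_size property are hypothetical names, not part of this series:

    /* Hypothetical byte-based cap; field and property names are illustrative only. */
    uint64_t data_size = packet_total_size - sizeof(*packet);

    multifd->load_buf_queued_pending_bytes += data_size;
    if (multifd->load_buf_queued_pending_bytes >
        vbasedev->migration_max_queued_buffers_size) {
        error_setg(errp,
                   "%s: queuing state buffer %" PRIu32
                   " would exceed the max queued size of %" PRIu64 " bytes",
                   vbasedev->name, packet->idx,
                   vbasedev->migration_max_queued_buffers_size);
        return false;
    }

The load thread would presumably decrement the byte counter by the buffer's length once it consumes the buffer, mirroring the buffer-count accounting in this patch.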
On 2/19/25 21:34, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> Allow capping the maximum count of in-flight VFIO device state buffers
> queued at the destination, otherwise a malicious QEMU source could
> theoretically cause the target QEMU to allocate unlimited amounts of memory
> for buffers-in-flight.
>
> Since this is not expected to be a realistic threat in most VFIO live
> migration use cases, and the right value depends on the particular setup,
> disable the limit by default by setting it to UINT64_MAX.

[...]

> +    DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice,
> +                       vbasedev.migration_max_queued_buffers, UINT64_MAX),

UINT64_MAX doesn't make sense to me. What would be a reasonable value? Have you monitored the max? Should we collect some statistics on this value and raise a warning if a high-water mark is reached? I think this would be more useful.

>      DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice,
>                       vbasedev.migration_events, false),
>      DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),

Please add property documentation in vfio_pci_dev_class_init().

Thanks,

C.
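A minimal sketch of the kind of statistics Cédric suggests could track a high-water mark in vfio_load_state_buffer_insert(); the peak field and the trace event below are hypothetical names, not part of the posted series:

    /* Hypothetical high-water-mark tracking; names are illustrative only. */
    if (multifd->load_buf_queued_pending_buffers >
        multifd->load_buf_queued_pending_buffers_peak) {
        multifd->load_buf_queued_pending_buffers_peak =
            multifd->load_buf_queued_pending_buffers;
        trace_vfio_load_state_buffer_queued_peak(vbasedev->name,
            multifd->load_buf_queued_pending_buffers_peak);
    }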
On 27.02.2025 07:48, Cédric Le Goater wrote:
> On 2/19/25 21:34, Maciej S. Szmigiero wrote:
>> +    DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice,
>> +                       vbasedev.migration_max_queued_buffers, UINT64_MAX),
>
> UINT64_MAX doesn't make sense to me. What would be a reasonable value?

It's the value that effectively disables this limit.

> Have you monitored the max? Should we collect some statistics on this
> value and raise a warning if a high-water mark is reached? I think
> this would be more useful.

It's an additional mechanism, which is not expected to be necessary in most real-world setups, hence it's disabled by default:

> Since this is not expected to be a realistic threat in most VFIO live
> migration use cases, and the right value depends on the particular setup,
> disable the limit by default by setting it to UINT64_MAX.

The minimum value that works with a particular setup depends on the number of multifd channels, probably also the number of NIC queues, etc., so it's not something we should propose a hard default for - unless it's a very high default like 100 buffers, but then why have it set by default?

IMHO setting it to UINT64_MAX clearly shows that it is disabled by default, since it obviously couldn't be set higher.

>> DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice,
>>                  vbasedev.migration_events, false),
>> DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
>
> Please add property documentation in vfio_pci_dev_class_init()

I'm not sure what you mean by that, vfio_pci_dev_class_init() doesn't contain any documentation or even references to either x-migration-max-queued-buffers or x-migration-multifd-transfer:

> static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
> {
>     DeviceClass *dc = DEVICE_CLASS(klass);
>     PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);
>
>     device_class_set_legacy_reset(dc, vfio_pci_reset);
>     device_class_set_props(dc, vfio_pci_dev_properties);
> #ifdef CONFIG_IOMMUFD
>     object_class_property_add_str(klass, "fd", NULL, vfio_pci_set_fd);
> #endif
>     dc->desc = "VFIO-based PCI device assignment";
>     set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>     pdc->realize = vfio_realize;
>     pdc->exit = vfio_exitfn;
>     pdc->config_read = vfio_pci_read_config;
>     pdc->config_write = vfio_pci_write_config;
> }

Thanks,
Maciej
On 2/27/25 23:01, Maciej S. Szmigiero wrote:
> On 27.02.2025 07:48, Cédric Le Goater wrote:
>> UINT64_MAX doesn't make sense to me. What would be a reasonable value?
>
> It's the value that effectively disables this limit.
>
>> Have you monitored the max? Should we collect some statistics on this
>> value and raise a warning if a high-water mark is reached? I think
>> this would be more useful.
>
> It's an additional mechanism, which is not expected to be necessary
> in most real-world setups, hence it's disabled by default.
>
> The minimum value that works with a particular setup depends on the number
> of multifd channels, probably also the number of NIC queues, etc., so it's
> not something we should propose a hard default for - unless it's a very
> high default like 100 buffers, but then why have it set by default?
>
> IMHO setting it to UINT64_MAX clearly shows that it is disabled by
> default, since it obviously couldn't be set higher.

This doesn't convince me that we should take this patch in QEMU 10.0. Please keep it for now. We will decide in v6.

>> Please add property documentation in vfio_pci_dev_class_init()
>
> I'm not sure what you mean by that, vfio_pci_dev_class_init() doesn't
> contain any documentation or even references to either
> x-migration-max-queued-buffers or x-migration-multifd-transfer:

Indeed :/ I am trying to fix the documentation here:

https://lore.kernel.org/qemu-devel/20250217173455.449983-1-clg@redhat.com/

Please do something similar. I will polish the edges when merging if necessary.

Overall, we should improve the VFIO documentation; migration is one sub-feature among many.

Thanks,

C.
On 28.02.2025 09:53, Cédric Le Goater wrote:
> On 2/27/25 23:01, Maciej S. Szmigiero wrote:
>> IMHO setting it to UINT64_MAX clearly shows that it is disabled by
>> default, since it obviously couldn't be set higher.
>
> This doesn't convince me that we should take this patch in QEMU 10.0.
> Please keep it for now. We will decide in v6.

Okay, let's decide at v6 time then.

>> I'm not sure what you mean by that, vfio_pci_dev_class_init() doesn't
>> contain any documentation or even references to either
>> x-migration-max-queued-buffers or x-migration-multifd-transfer:
>
> Indeed :/ I am trying to fix the documentation here:
>
> https://lore.kernel.org/qemu-devel/20250217173455.449983-1-clg@redhat.com/
>
> Please do something similar. I will polish the edges when merging
> if necessary.

Ahh, I see now - that patch set of yours isn't merged upstream yet, so that's why I did not know what you had in mind.

> Overall, we should improve the VFIO documentation; migration is one
> sub-feature among many.

Sure - I've now added an object_class_property_set_description() description for all 3 newly added parameters: x-migration-multifd-transfer, x-migration-load-config-after-iter and x-migration-max-queued-buffers.

Thanks,
Maciej
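Those descriptions would presumably be set via object_class_property_set_description() in vfio_pci_dev_class_init(); a rough sketch of the entry for this property (the exact wording used in v6 is not shown in this thread) could be:

    object_class_property_set_description(klass,
            "x-migration-max-queued-buffers",
            "Maximum count of in-flight VFIO device state buffers queued at the "
            "destination during multifd device state transfer "
            "(default: UINT64_MAX, i.e. no limit)");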