From nobody Sat Nov 15 14:49:51 2025
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750787665223434.8869830552708; Tue, 24 Jun 2025 10:54:25 -0700 (PDT)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uU7ps-0002tD-Ib; Tue, 24 Jun 2025 13:53:33 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uU7pm-0002rt-Jv; Tue, 24 Jun 2025 13:53:26 -0400
Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uU7pk-0002in-J1; Tue, 24 Jun 2025 13:53:26 -0400
Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98.1) (envelope-from ) id 1uU7pb-00000005WmI-1ZgI; Tue, 24 Jun 2025 19:53:15 +0200
From: "Maciej S. Szmigiero"
To: Alex Williamson, Cédric Le Goater, Peter Xu, Fabiano Rosas
Cc: Peter Maydell, Avihai Horon, qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 1/3] vfio/migration: Max in-flight VFIO device state buffer count limit
Date: Tue, 24 Jun 2025 19:51:56 +0200
Message-ID: <0e88a253e06647f6c01bdeba45848501b3631bd3.1750787338.git.maciej.szmigiero@oracle.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To:
References:
MIME-Version: 1.0
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZM-MESSAGEID: 1750787668478116600
Content-Type: text/plain; charset="utf-8"

From: "Maciej S. Szmigiero"

Allow capping the maximum count of in-flight VFIO device state buffers
queued at the destination; otherwise, a malicious QEMU source could
theoretically cause the target QEMU to allocate unlimited amounts of
memory for buffers-in-flight.

Since this is not expected to be a realistic threat in most VFIO live
migration use cases and the right value depends on the particular setup,
disable the limit by default by setting it to UINT64_MAX.

Signed-off-by: Maciej S. Szmigiero
Reviewed-by: Avihai Horon
Reviewed-by: Fabiano Rosas
---
 docs/devel/migration/vfio.rst | 13 +++++++++++++
 hw/vfio/migration-multifd.c   | 16 ++++++++++++++++
 hw/vfio/pci.c                 |  9 +++++++++
 include/hw/vfio/vfio-device.h |  1 +
 4 files changed, 39 insertions(+)

diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
index 673e354754c8..f4a6bfa4619b 100644
--- a/docs/devel/migration/vfio.rst
+++ b/docs/devel/migration/vfio.rst
@@ -247,3 +247,16 @@ The multifd VFIO device state transfer is controlled by
 "x-migration-multifd-transfer" VFIO device property. This property defaults
 to AUTO, which means that VFIO device state transfer via multifd channels
 is attempted in configurations that otherwise support it.
+
+Since the target QEMU needs to load device state buffers in-order it needs to
+queue incoming buffers until they can be loaded into the device.
+This means that a malicious QEMU source could theoretically cause the target
+QEMU to allocate unlimited amounts of memory for such buffers-in-flight.
+
+The "x-migration-max-queued-buffers" property allows capping the maximum count
+of these VFIO device state buffers queued at the destination.
+
+Because a malicious QEMU source causing OOM on the target is not expected to be
+a realistic threat in most of VFIO live migration use cases and the right value
+depends on the particular setup by default this queued buffers limit is
+disabled by setting it to UINT64_MAX.
diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c
index 850a31948878..f26c112090b4 100644
--- a/hw/vfio/migration-multifd.c
+++ b/hw/vfio/migration-multifd.c
@@ -56,6 +56,7 @@ typedef struct VFIOMultifd {
     QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */
     uint32_t load_buf_idx;
     uint32_t load_buf_idx_last;
+    uint32_t load_buf_queued_pending_buffers;
 } VFIOMultifd;
 
 static void vfio_state_buffer_clear(gpointer data)
@@ -127,6 +128,17 @@ static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev,
 
     assert(packet->idx >= multifd->load_buf_idx);
 
+    multifd->load_buf_queued_pending_buffers++;
+    if (multifd->load_buf_queued_pending_buffers >
+        vbasedev->migration_max_queued_buffers) {
+        error_setg(errp,
+                   "%s: queuing state buffer %" PRIu32
+                   " would exceed the max of %" PRIu64,
+                   vbasedev->name, packet->idx,
+                   vbasedev->migration_max_queued_buffers);
+        return false;
+    }
+
     lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet));
     lb->len = packet_total_size - sizeof(*packet);
     lb->is_present = true;
@@ -387,6 +399,9 @@ static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp)
             goto thread_exit;
         }
 
+        assert(multifd->load_buf_queued_pending_buffers > 0);
+        multifd->load_buf_queued_pending_buffers--;
+
         if (multifd->load_buf_idx == multifd->load_buf_idx_last - 1) {
             trace_vfio_load_state_device_buffer_end(vbasedev->name);
         }
@@ -423,6 +438,7 @@ static VFIOMultifd *vfio_multifd_new(void)
 
     multifd->load_buf_idx = 0;
     multifd->load_buf_idx_last = UINT32_MAX;
+    multifd->load_buf_queued_pending_buffers = 0;
     qemu_cond_init(&multifd->load_bufs_buffer_ready_cond);
 
     multifd->load_bufs_thread_running = false;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index fa25bded25c5..2765a39f9df1 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3524,6 +3524,8 @@ static const Property vfio_pci_dev_properties[] = {
                             vbasedev.migration_multifd_transfer,
                             vfio_pci_migration_multifd_transfer_prop, OnOffAuto,
                             .set_default = true, .defval.i = ON_OFF_AUTO_AUTO),
+    DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice,
+                       vbasedev.migration_max_queued_buffers, UINT64_MAX),
     DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice,
                      vbasedev.migration_events, false),
     DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
@@ -3698,6 +3700,13 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, const void *data)
                                           "x-migration-multifd-transfer",
                                           "Transfer this device state via "
                                           "multifd channels when live migrating it");
+    object_class_property_set_description(klass, /* 10.1 */
+                                          "x-migration-max-queued-buffers",
+                                          "Maximum count of in-flight VFIO "
+                                          "device state buffers queued at the "
+                                          "destination when doing live "
+                                          "migration of device state via "
+                                          "multifd channels");
 }
 
 static const TypeInfo vfio_pci_dev_info = {
diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index d45e5a68a24e..0ee34aaf668b 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -66,6 +66,7 @@ typedef struct VFIODevice {
     bool ram_block_discard_allowed;
     OnOffAuto enable_migration;
     OnOffAuto migration_multifd_transfer;
+    uint64_t migration_max_queued_buffers;
     bool migration_events;
     bool use_region_fds;
     VFIODeviceOps *ops;

From nobody Sat Nov 15 14:49:51 2025
From: "Maciej S. Szmigiero"
To: Alex Williamson, Cédric Le Goater, Peter Xu, Fabiano Rosas
Cc: Peter Maydell, Avihai Horon, qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 2/3] vfio/migration: Add x-migration-load-config-after-iter VFIO property
Date: Tue, 24 Jun 2025 19:51:57 +0200
Message-ID: <22e94f25448f9ff42b84c84df3960c4ecc94cbdc.1750787338.git.maciej.szmigiero@oracle.com>
X-Mailer: git-send-email 2.49.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: "Maciej S. Szmigiero"

This property allows configuring whether to start the config load only
after all iterables were loaded. Such interlocking is required for ARM64
due to this platform's VFIO dependency on the interrupt controller being
loaded first.

The property defaults to AUTO, which means ON for ARM, OFF for other
platforms.

Signed-off-by: Maciej S. Szmigiero
Reviewed-by: Avihai Horon
Reviewed-by: Fabiano Rosas
---
 docs/devel/migration/vfio.rst     |  6 +++
 hw/core/machine.c                 |  1 +
 hw/vfio/migration-multifd.c       | 88 +++++++++++++++++++++++++++++++
 hw/vfio/migration-multifd.h       |  3 ++
 hw/vfio/migration.c               | 10 +++-
 hw/vfio/pci.c                     |  9 ++++
 hw/vfio/vfio-migration-internal.h |  1 +
 include/hw/vfio/vfio-device.h     |  1 +
 8 files changed, 118 insertions(+), 1 deletion(-)

diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
index f4a6bfa4619b..7c9cb7bdbf87 100644
--- a/docs/devel/migration/vfio.rst
+++ b/docs/devel/migration/vfio.rst
@@ -260,3 +260,9 @@ Because a malicious QEMU source causing OOM on the target is not expected to be
 a realistic threat in most of VFIO live migration use cases and the right value
 depends on the particular setup by default this queued buffers limit is
 disabled by setting it to UINT64_MAX.
+
+Some host platforms (like ARM64) require that VFIO device config is loaded only
+after all iterables were loaded.
+Such interlocking is controlled by "x-migration-load-config-after-iter" VFIO
+device property, which in its default setting (AUTO) does so only on platforms
+that actually require it.
diff --git a/hw/core/machine.c b/hw/core/machine.c
index e869821b2246..16640b700f2e 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -39,6 +39,7 @@
 
 GlobalProperty hw_compat_10_0[] = {
     { "scsi-hd", "dpofua", "off" },
+    { "vfio-pci", "x-migration-load-config-after-iter", "off" },
 };
 const size_t hw_compat_10_0_len = G_N_ELEMENTS(hw_compat_10_0);
 
diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c
index f26c112090b4..a12ec1ead74a 100644
--- a/hw/vfio/migration-multifd.c
+++ b/hw/vfio/migration-multifd.c
@@ -17,6 +17,7 @@
 #include "qemu/error-report.h"
 #include "qemu/lockable.h"
 #include "qemu/main-loop.h"
+#include "qemu/target-info.h"
 #include "qemu/thread.h"
 #include "io/channel-buffer.h"
 #include "migration/qemu-file.h"
@@ -35,6 +36,27 @@ typedef struct VFIODeviceStatePacket {
     uint8_t data[0];
 } QEMU_PACKED VFIODeviceStatePacket;
 
+bool vfio_load_config_after_iter(VFIODevice *vbasedev)
+{
+    if (vbasedev->migration_load_config_after_iter == ON_OFF_AUTO_ON) {
+        return true;
+    } else if (vbasedev->migration_load_config_after_iter == ON_OFF_AUTO_OFF) {
+        return false;
+    }
+
+    assert(vbasedev->migration_load_config_after_iter == ON_OFF_AUTO_AUTO);
+
+    /*
+     * Starting the config load only after all iterables were loaded is required
+     * for ARM64 due to this platform VFIO dependency on interrupt controller
+     * being loaded first.
+     *
+     * See commit d329f5032e17 ("vfio: Move the saving of the config space to
+     * the right place in VFIO migration").
+     */
+    return strcmp(target_name(), "aarch64") == 0;
+}
+
 /* type safety */
 typedef struct VFIOStateBuffers {
     GArray *array;
@@ -50,6 +72,9 @@ typedef struct VFIOMultifd {
     bool load_bufs_thread_running;
     bool load_bufs_thread_want_exit;
 
+    bool load_bufs_iter_done;
+    QemuCond load_bufs_iter_done_cond;
+
     VFIOStateBuffers load_bufs;
     QemuCond load_bufs_buffer_ready_cond;
     QemuCond load_bufs_thread_finished_cond;
@@ -409,6 +434,22 @@ static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp)
         multifd->load_buf_idx++;
     }
 
+    if (vfio_load_config_after_iter(vbasedev)) {
+        while (!multifd->load_bufs_iter_done) {
+            qemu_cond_wait(&multifd->load_bufs_iter_done_cond,
+                           &multifd->load_bufs_mutex);
+
+            /*
+             * Need to re-check cancellation immediately after wait in case
+             * cond was signalled by vfio_load_cleanup_load_bufs_thread().
+             */
+            if (vfio_load_bufs_thread_want_exit(multifd, should_quit)) {
+                error_setg(errp, "operation cancelled");
+                goto thread_exit;
+            }
+        }
+    }
+
     if (!vfio_load_bufs_thread_load_config(vbasedev, errp)) {
         goto thread_exit;
     }
@@ -428,6 +469,48 @@ thread_exit:
     return ret;
 }
 
+int vfio_load_state_config_load_ready(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIOMultifd *multifd = migration->multifd;
+    int ret = 0;
+
+    if (!vfio_multifd_transfer_enabled(vbasedev)) {
+        error_report("%s: got DEV_CONFIG_LOAD_READY outside multifd transfer",
+                     vbasedev->name);
+        return -EINVAL;
+    }
+
+    if (!vfio_load_config_after_iter(vbasedev)) {
+        error_report("%s: got DEV_CONFIG_LOAD_READY but was disabled",
+                     vbasedev->name);
+        return -EINVAL;
+    }
+
+    assert(multifd);
+
+    /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */
+    bql_unlock();
+    WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) {
+        if (multifd->load_bufs_iter_done) {
+            /* Can't print error here as we're outside BQL */
+            ret = -EINVAL;
+            break;
+        }
+
+        multifd->load_bufs_iter_done = true;
+        qemu_cond_signal(&multifd->load_bufs_iter_done_cond);
+    }
+    bql_lock();
+
+    if (ret) {
+        error_report("%s: duplicate DEV_CONFIG_LOAD_READY",
+                     vbasedev->name);
+    }
+
+    return ret;
+}
+
 static VFIOMultifd *vfio_multifd_new(void)
 {
     VFIOMultifd *multifd = g_new(VFIOMultifd, 1);
@@ -441,6 +524,9 @@ static VFIOMultifd *vfio_multifd_new(void)
     multifd->load_buf_queued_pending_buffers = 0;
     qemu_cond_init(&multifd->load_bufs_buffer_ready_cond);
 
+    multifd->load_bufs_iter_done = false;
+    qemu_cond_init(&multifd->load_bufs_iter_done_cond);
+
     multifd->load_bufs_thread_running = false;
     multifd->load_bufs_thread_want_exit = false;
     qemu_cond_init(&multifd->load_bufs_thread_finished_cond);
@@ -464,6 +550,7 @@ static void vfio_load_cleanup_load_bufs_thread(VFIOMultifd *multifd)
         multifd->load_bufs_thread_want_exit = true;
 
         qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond);
+        qemu_cond_signal(&multifd->load_bufs_iter_done_cond);
         qemu_cond_wait(&multifd->load_bufs_thread_finished_cond,
                        &multifd->load_bufs_mutex);
     }
@@ -476,6 +563,7 @@ static void vfio_multifd_free(VFIOMultifd *multifd)
     vfio_load_cleanup_load_bufs_thread(multifd);
 
     qemu_cond_destroy(&multifd->load_bufs_thread_finished_cond);
+    qemu_cond_destroy(&multifd->load_bufs_iter_done_cond);
     vfio_state_buffers_destroy(&multifd->load_bufs);
     qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond);
     qemu_mutex_destroy(&multifd->load_bufs_mutex);
diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h
index 0bab63211d30..487f457282df 100644
--- a/hw/vfio/migration-multifd.h
+++ b/hw/vfio/migration-multifd.h
@@ -20,9 +20,12 @@ void vfio_multifd_cleanup(VFIODevice *vbasedev);
 bool vfio_multifd_transfer_supported(void);
 bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev);
 
+bool vfio_load_config_after_iter(VFIODevice *vbasedev);
 bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size,
                                     Error **errp);
 
+int vfio_load_state_config_load_ready(VFIODevice *vbasedev);
+
 void vfio_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f);
 
 bool
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index b76697bd1a23..7c6436d4c344 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -675,7 +675,11 @@ static void vfio_save_state(QEMUFile *f, void *opaque)
     int ret;
 
     if (vfio_multifd_transfer_enabled(vbasedev)) {
-        vfio_multifd_emit_dummy_eos(vbasedev, f);
+        if (vfio_load_config_after_iter(vbasedev)) {
+            qemu_put_be64(f, VFIO_MIG_FLAG_DEV_CONFIG_LOAD_READY);
+        } else {
+            vfio_multifd_emit_dummy_eos(vbasedev, f);
+        }
         return;
     }
 
@@ -784,6 +788,10 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
 
             return ret;
         }
+        case VFIO_MIG_FLAG_DEV_CONFIG_LOAD_READY:
+        {
+            return vfio_load_state_config_load_ready(vbasedev);
+        }
         default:
             error_report("%s: Unknown tag 0x%"PRIx64, vbasedev->name, data);
             return -EINVAL;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 2765a39f9df1..01e48e39de75 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3524,6 +3524,9 @@ static const Property vfio_pci_dev_properties[] = {
                             vbasedev.migration_multifd_transfer,
                             vfio_pci_migration_multifd_transfer_prop, OnOffAuto,
                             .set_default = true, .defval.i = ON_OFF_AUTO_AUTO),
+    DEFINE_PROP_ON_OFF_AUTO("x-migration-load-config-after-iter", VFIOPCIDevice,
+                            vbasedev.migration_load_config_after_iter,
+                            ON_OFF_AUTO_AUTO),
     DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice,
                        vbasedev.migration_max_queued_buffers, UINT64_MAX),
     DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice,
@@ -3700,6 +3703,12 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, const void *data)
                                           "x-migration-multifd-transfer",
                                           "Transfer this device state via "
                                           "multifd channels when live migrating it");
+    object_class_property_set_description(klass, /* 10.1 */
+                                          "x-migration-load-config-after-iter",
+                                          "Start the config load only after "
+                                          "all iterables were loaded when doing "
+                                          "live migration of device state via "
+                                          "multifd channels");
     object_class_property_set_description(klass, /* 10.1 */
                                           "x-migration-max-queued-buffers",
                                           "Maximum count of in-flight VFIO "
diff --git a/hw/vfio/vfio-migration-internal.h b/hw/vfio/vfio-migration-internal.h
index a8b456b239df..54141e27e6b2 100644
--- a/hw/vfio/vfio-migration-internal.h
+++ b/hw/vfio/vfio-migration-internal.h
@@ -32,6 +32,7 @@
 #define VFIO_MIG_FLAG_DEV_SETUP_STATE       (0xffffffffef100003ULL)
 #define VFIO_MIG_FLAG_DEV_DATA_STATE        (0xffffffffef100004ULL)
 #define VFIO_MIG_FLAG_DEV_INIT_DATA_SENT    (0xffffffffef100005ULL)
+#define VFIO_MIG_FLAG_DEV_CONFIG_LOAD_READY (0xffffffffef100006ULL)
 
 typedef struct VFIODevice VFIODevice;
 typedef struct VFIOMultifd VFIOMultifd;
diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index 0ee34aaf668b..359d553b916a 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -66,6 +66,7 @@ typedef struct VFIODevice {
     bool ram_block_discard_allowed;
     OnOffAuto enable_migration;
     OnOffAuto migration_multifd_transfer;
+    OnOffAuto migration_load_config_after_iter;
    uint64_t migration_max_queued_buffers;
     bool migration_events;
     bool use_region_fds;

From nobody Sat Nov 15 14:49:51 2025
From: "Maciej S. Szmigiero"
To: Alex Williamson, Cédric Le Goater, Peter Xu, Fabiano Rosas
Cc: Peter Maydell, Avihai Horon, qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH 3/3] vfio/migration: Add also max in-flight VFIO device state buffers size limit
Date: Tue, 24 Jun 2025 19:51:58 +0200
Message-ID: <65a132db72807ec6341015e7079b10352d718b96.1750787338.git.maciej.szmigiero@oracle.com>
X-Mailer: git-send-email 2.49.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: "Maciej S. Szmigiero"

There's already a max in-flight VFIO device state buffers *count* limit;
add also a max queued buffers *size* limit.

Signed-off-by: Maciej S. Szmigiero
Reviewed-by: Avihai Horon
Reviewed-by: Fabiano Rosas
---
 docs/devel/migration/vfio.rst |  8 +++++---
 hw/vfio/migration-multifd.c   | 21 +++++++++++++++++++--
 hw/vfio/pci.c                 |  9 +++++++++
 include/hw/vfio/vfio-device.h |  1 +
 4 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
index 7c9cb7bdbf87..127a1db35949 100644
--- a/docs/devel/migration/vfio.rst
+++ b/docs/devel/migration/vfio.rst
@@ -254,12 +254,14 @@ This means that a malicious QEMU source could theoretically cause the target
 QEMU to allocate unlimited amounts of memory for such buffers-in-flight.
 
 The "x-migration-max-queued-buffers" property allows capping the maximum count
-of these VFIO device state buffers queued at the destination.
+of these VFIO device state buffers queued at the destination while
+"x-migration-max-queued-buffers-size" property allows capping their total queued
+size.
 
 Because a malicious QEMU source causing OOM on the target is not expected to be
 a realistic threat in most of VFIO live migration use cases and the right value
-depends on the particular setup by default this queued buffers limit is
-disabled by setting it to UINT64_MAX.
+depends on the particular setup by default these queued buffers limits are
+disabled by setting them to UINT64_MAX.
 
 Some host platforms (like ARM64) require that VFIO device config is loaded only
 after all iterables were loaded.
diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c
index a12ec1ead74a..c76f1f2181f9 100644
--- a/hw/vfio/migration-multifd.c
+++ b/hw/vfio/migration-multifd.c
@@ -82,6 +82,7 @@ typedef struct VFIOMultifd {
     uint32_t load_buf_idx;
     uint32_t load_buf_idx_last;
     uint32_t load_buf_queued_pending_buffers;
+    size_t load_buf_queued_pending_buffers_size;
 } VFIOMultifd;
 
 static void vfio_state_buffer_clear(gpointer data)
@@ -138,6 +139,7 @@ static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev,
     VFIOMigration *migration = vbasedev->migration;
     VFIOMultifd *multifd = migration->multifd;
     VFIOStateBuffer *lb;
+    size_t data_size = packet_total_size - sizeof(*packet);
 
     vfio_state_buffers_assert_init(&multifd->load_bufs);
     if (packet->idx >= vfio_state_buffers_size_get(&multifd->load_bufs)) {
@@ -164,8 +166,19 @@ static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev,
         return false;
     }
 
-    lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet));
-    lb->len = packet_total_size - sizeof(*packet);
+    multifd->load_buf_queued_pending_buffers_size += data_size;
+    if (multifd->load_buf_queued_pending_buffers_size >
+        vbasedev->migration_max_queued_buffers_size) {
+        error_setg(errp,
+                   "%s: queuing state buffer %" PRIu32
+                   " would exceed the size max of %" PRIu64,
+                   vbasedev->name, packet->idx,
+                   vbasedev->migration_max_queued_buffers_size);
+        return false;
+    }
+
+    lb->data = g_memdup2(&packet->data, data_size);
+    lb->len = data_size;
     lb->is_present = true;
 
     return true;
@@ -349,6 +362,9 @@ static bool vfio_load_state_buffer_write(VFIODevice *vbasedev,
         assert(wr_ret <= buf_len);
         buf_len -= wr_ret;
         buf_cur += wr_ret;
+
+        assert(multifd->load_buf_queued_pending_buffers_size >= wr_ret);
+        multifd->load_buf_queued_pending_buffers_size -= wr_ret;
     }
 
     trace_vfio_load_state_device_buffer_load_end(vbasedev->name,
@@ -522,6 +538,7 @@ static VFIOMultifd *vfio_multifd_new(void)
     multifd->load_buf_idx = 0;
     multifd->load_buf_idx_last = UINT32_MAX;
     multifd->load_buf_queued_pending_buffers = 0;
+    multifd->load_buf_queued_pending_buffers_size = 0;
     qemu_cond_init(&multifd->load_bufs_buffer_ready_cond);
 
     multifd->load_bufs_iter_done = false;
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 01e48e39de75..944813ee7bdb 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3529,6 +3529,8 @@ static const Property vfio_pci_dev_properties[] = {
                             ON_OFF_AUTO_AUTO),
     DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice,
                        vbasedev.migration_max_queued_buffers, UINT64_MAX),
+    DEFINE_PROP_SIZE("x-migration-max-queued-buffers-size", VFIOPCIDevice,
+                     vbasedev.migration_max_queued_buffers_size, UINT64_MAX),
     DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice,
                      vbasedev.migration_events, false),
     DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
@@ -3716,6 +3718,13 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, const void *data)
                                           "destination when doing live "
                                           "migration of device state via "
                                           "multifd channels");
+    object_class_property_set_description(klass, /* 10.1 */
+                                          "x-migration-max-queued-buffers-size",
+                                          "Maximum size of in-flight VFIO "
+                                          "device state buffers queued at the "
+                                          "destination when doing live "
+                                          "migration of device state via "
+                                          "multifd channels");
 }
 
 static const TypeInfo vfio_pci_dev_info = {
diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index 359d553b916a..3e86d07347d6 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -68,6 +68,7 @@ typedef struct VFIODevice {
     OnOffAuto migration_multifd_transfer;
     OnOffAuto migration_load_config_after_iter;
     uint64_t migration_max_queued_buffers;
+    uint64_t migration_max_queued_buffers_size;
     bool migration_events;
     bool use_region_fds;
     VFIODeviceOps *ops;