From: "Maciej S. Szmigiero"
To: Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Cédric Le Goater, Eric Blake, Markus Armbruster, Daniel P. Berrangé, Avihai Horon, Joao Martins, qemu-devel@nongnu.org
Subject: [PATCH v6 26/36] vfio/migration: Multifd device state transfer support - received buffers queuing
Date: Tue, 4 Mar 2025 23:03:53 +0100

From: "Maciej S. Szmigiero"

The multifd received data needs to be reassembled since device state
packets sent via different multifd channels can arrive out-of-order.

Therefore, each VFIO device state packet carries a header indicating its
position in the stream.
The raw device state data is saved into a VFIOStateBuffer for later
in-order loading into the device.

The last such VFIO device state packet should have
VFIO_DEVICE_STATE_CONFIG_STATE flag set and carry the device config state.

Signed-off-by: Maciej S. Szmigiero
Reviewed-by: Cédric Le Goater
---
 hw/vfio/migration-multifd.c | 163 ++++++++++++++++++++++++++++++++++++
 hw/vfio/migration-multifd.h |   3 +
 hw/vfio/migration.c         |   1 +
 hw/vfio/trace-events        |   1 +
 4 files changed, 168 insertions(+)

diff --git a/hw/vfio/migration-multifd.c b/hw/vfio/migration-multifd.c
index 091dc43210ad..79df11b7baa9 100644
--- a/hw/vfio/migration-multifd.c
+++ b/hw/vfio/migration-multifd.c
@@ -32,18 +32,181 @@ typedef struct VFIODeviceStatePacket {
     uint8_t data[0];
 } QEMU_PACKED VFIODeviceStatePacket;
 
+/* type safety */
+typedef struct VFIOStateBuffers {
+    GArray *array;
+} VFIOStateBuffers;
+
+typedef struct VFIOStateBuffer {
+    bool is_present;
+    char *data;
+    size_t len;
+} VFIOStateBuffer;
+
 typedef struct VFIOMultifd {
+    VFIOStateBuffers load_bufs;
+    QemuCond load_bufs_buffer_ready_cond;
+    QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */
+    uint32_t load_buf_idx;
+    uint32_t load_buf_idx_last;
 } VFIOMultifd;
 
+static void vfio_state_buffer_clear(gpointer data)
+{
+    VFIOStateBuffer *lb = data;
+
+    if (!lb->is_present) {
+        return;
+    }
+
+    g_clear_pointer(&lb->data, g_free);
+    lb->is_present = false;
+}
+
+static void vfio_state_buffers_init(VFIOStateBuffers *bufs)
+{
+    bufs->array = g_array_new(FALSE, TRUE, sizeof(VFIOStateBuffer));
+    g_array_set_clear_func(bufs->array, vfio_state_buffer_clear);
+}
+
+static void vfio_state_buffers_destroy(VFIOStateBuffers *bufs)
+{
+    g_clear_pointer(&bufs->array, g_array_unref);
+}
+
+static void vfio_state_buffers_assert_init(VFIOStateBuffers *bufs)
+{
+    assert(bufs->array);
+}
+
+static unsigned int vfio_state_buffers_size_get(VFIOStateBuffers *bufs)
+{
+    return bufs->array->len;
+}
+
+static void vfio_state_buffers_size_set(VFIOStateBuffers *bufs,
+                                        unsigned int size)
+{
+    g_array_set_size(bufs->array, size);
+}
+
+static VFIOStateBuffer *vfio_state_buffers_at(VFIOStateBuffers *bufs,
+                                              unsigned int idx)
+{
+    return &g_array_index(bufs->array, VFIOStateBuffer, idx);
+}
+
+/* called with load_bufs_mutex locked */
+static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev,
+                                          VFIODeviceStatePacket *packet,
+                                          size_t packet_total_size,
+                                          Error **errp)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIOMultifd *multifd = migration->multifd;
+    VFIOStateBuffer *lb;
+
+    vfio_state_buffers_assert_init(&multifd->load_bufs);
+    if (packet->idx >= vfio_state_buffers_size_get(&multifd->load_bufs)) {
+        vfio_state_buffers_size_set(&multifd->load_bufs, packet->idx + 1);
+    }
+
+    lb = vfio_state_buffers_at(&multifd->load_bufs, packet->idx);
+    if (lb->is_present) {
+        error_setg(errp, "%s: state buffer %" PRIu32 " already filled",
+                   vbasedev->name, packet->idx);
+        return false;
+    }
+
+    assert(packet->idx >= multifd->load_buf_idx);
+
+    lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet));
+    lb->len = packet_total_size - sizeof(*packet);
+    lb->is_present = true;
+
+    return true;
+}
+
+bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size,
+                                    Error **errp)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+    VFIOMultifd *multifd = migration->multifd;
+    VFIODeviceStatePacket *packet = (VFIODeviceStatePacket *)data;
+
+    if (!vfio_multifd_transfer_enabled(vbasedev)) {
+        error_setg(errp,
+                   "%s: got device state packet but not doing multifd transfer",
+                   vbasedev->name);
+        return false;
+    }
+
+    assert(multifd);
+
+    if (data_size < sizeof(*packet)) {
+        error_setg(errp, "%s: packet too short at %zu (min is %zu)",
+                   vbasedev->name, data_size, sizeof(*packet));
+        return false;
+    }
+
+    if (packet->version != VFIO_DEVICE_STATE_PACKET_VER_CURRENT) {
+        error_setg(errp, "%s: packet has unknown version %" PRIu32,
+                   vbasedev->name, packet->version);
+        return false;
+    }
+
+    if (packet->idx == UINT32_MAX) {
+        error_setg(errp, "%s: packet index is invalid", vbasedev->name);
+        return false;
+    }
+
+    trace_vfio_load_state_device_buffer_incoming(vbasedev->name, packet->idx);
+
+    /*
+     * Holding BQL here would violate the lock order and can cause
+     * a deadlock once we attempt to lock load_bufs_mutex below.
+     */
+    assert(!bql_locked());
+
+    WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) {
+        /* config state packet should be the last one in the stream */
+        if (packet->flags & VFIO_DEVICE_STATE_CONFIG_STATE) {
+            multifd->load_buf_idx_last = packet->idx;
+        }
+
+        if (!vfio_load_state_buffer_insert(vbasedev, packet, data_size,
+                                           errp)) {
+            return false;
+        }
+
+        qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond);
+    }
+
+    return true;
+}
+
 static VFIOMultifd *vfio_multifd_new(void)
 {
     VFIOMultifd *multifd = g_new(VFIOMultifd, 1);
 
+    vfio_state_buffers_init(&multifd->load_bufs);
+
+    qemu_mutex_init(&multifd->load_bufs_mutex);
+
+    multifd->load_buf_idx = 0;
+    multifd->load_buf_idx_last = UINT32_MAX;
+    qemu_cond_init(&multifd->load_bufs_buffer_ready_cond);
+
     return multifd;
 }
 
 static void vfio_multifd_free(VFIOMultifd *multifd)
 {
+    vfio_state_buffers_destroy(&multifd->load_bufs);
+    qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond);
+    qemu_mutex_destroy(&multifd->load_bufs_mutex);
+
     g_free(multifd);
 }
 
diff --git a/hw/vfio/migration-multifd.h b/hw/vfio/migration-multifd.h
index 2a7a76164f29..8c6320fcb484 100644
--- a/hw/vfio/migration-multifd.h
+++ b/hw/vfio/migration-multifd.h
@@ -20,4 +20,7 @@ void vfio_multifd_cleanup(VFIODevice *vbasedev);
 bool vfio_multifd_transfer_supported(void);
 bool vfio_multifd_transfer_enabled(VFIODevice *vbasedev);
 
+bool vfio_multifd_load_state_buffer(void *opaque, char *data, size_t data_size,
+                                    Error **errp);
+
 #endif
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 3c8286ae6230..ecc4ee940567 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -801,6 +801,7 @@ static const SaveVMHandlers savevm_vfio_handlers = {
     .load_setup = vfio_load_setup,
     .load_cleanup = vfio_load_cleanup,
     .load_state = vfio_load_state,
+    .load_state_buffer = vfio_multifd_load_state_buffer,
     .switchover_ack_needed = vfio_switchover_ack_needed,
 };
 
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index a02c668f28a4..404ea079b25c 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -154,6 +154,7 @@ vfio_load_device_config_state_start(const char *name) " (%s)"
 vfio_load_device_config_state_end(const char *name) " (%s)"
 vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64
 vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d"
+vfio_load_state_device_buffer_incoming(const char *name, uint32_t idx) " (%s) idx %"PRIu32
 vfio_migration_realize(const char *name) " (%s)"
 vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s"
 vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s"
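
For readers following the series, the reassembly scheme described in the
commit message (buffers indexed by their position in the stream, queued in
any order, consumed in order) can be sketched outside QEMU roughly as
follows. This is a minimal standalone illustration, not code from this
patch: the Sketch* types and sketch_queue_* functions are hypothetical
names, plain malloc/realloc stands in for GArray and g_memdup2, there is no
locking or condition variable, and the in-order consumer
(sketch_queue_pop) is an assumption about how queued buffers might later be
drained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for VFIOStateBuffer; not QEMU code. */
typedef struct SketchBuffer {
    bool is_present;
    char *data;
    size_t len;
} SketchBuffer;

/* Hypothetical stand-in for the load_bufs fields of VFIOMultifd. */
typedef struct SketchQueue {
    SketchBuffer *bufs;
    uint32_t count;    /* number of allocated slots */
    uint32_t next_idx; /* next index to hand out in order */
    uint32_t last_idx; /* UINT32_MAX until the final buffer is seen */
} SketchQueue;

static void sketch_queue_init(SketchQueue *q)
{
    q->bufs = NULL;
    q->count = 0;
    q->next_idx = 0;
    q->last_idx = UINT32_MAX;
}

/* Store a copy of data at slot idx; reject duplicate or stale indexes. */
static bool sketch_queue_insert(SketchQueue *q, uint32_t idx,
                                const char *data, size_t len, bool is_last)
{
    char *copy;

    if (idx == UINT32_MAX || idx < q->next_idx) {
        return false;
    }

    if (idx >= q->count) {
        size_t new_count = (size_t)idx + 1;
        SketchBuffer *nb = realloc(q->bufs, new_count * sizeof(*nb));

        if (!nb) {
            return false;
        }
        /* New slots start out empty, like zero-initialized GArray elements. */
        memset(nb + q->count, 0, (new_count - q->count) * sizeof(*nb));
        q->bufs = nb;
        q->count = idx + 1;
    }

    if (q->bufs[idx].is_present) {
        return false;
    }

    copy = malloc(len ? len : 1);
    if (!copy) {
        return false;
    }
    if (len) {
        memcpy(copy, data, len);
    }

    q->bufs[idx].data = copy;
    q->bufs[idx].len = len;
    q->bufs[idx].is_present = true;
    if (is_last) {
        q->last_idx = idx;
    }
    return true;
}

/* Return the next in-order buffer (caller frees it), or NULL if not queued. */
static char *sketch_queue_pop(SketchQueue *q, size_t *len)
{
    SketchBuffer *b;
    char *data;

    if (q->next_idx >= q->count || !q->bufs[q->next_idx].is_present) {
        return NULL;
    }

    b = &q->bufs[q->next_idx];
    data = b->data;
    *len = b->len;
    b->data = NULL;
    b->is_present = false;
    q->next_idx++;
    return data;
}

/* True once the buffer flagged as last has been consumed. */
static bool sketch_queue_done(const SketchQueue *q)
{
    return q->last_idx != UINT32_MAX && q->next_idx > q->last_idx;
}

static void sketch_queue_destroy(SketchQueue *q)
{
    for (uint32_t i = 0; i < q->count; i++) {
        free(q->bufs[i].data);
    }
    free(q->bufs);
    sketch_queue_init(q);
}
```

A caller would insert buffers as they arrive from any channel and call
sketch_queue_pop() until it returns NULL; sketch_queue_done() then reports
whether the buffer marked as last has been consumed. Unlike the patch, which
asserts that an incoming index is not below load_buf_idx, the sketch simply
rejects such an index.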