From: "Maciej S. Szmigiero"
To: Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Cédric Le Goater, Eric Blake, Markus Armbruster, Daniel P. Berrangé, Avihai Horon, Joao Martins, qemu-devel@nongnu.org
Subject: [PATCH v2 12/17] migration/multifd: Device state transfer support - send side
Date: Tue, 27 Aug 2024 19:54:31 +0200

A new function, multifd_queue_device_state(), is provided for a device to
queue its state for transmission via a multifd channel.

Signed-off-by: Maciej S. Szmigiero
---
 include/migration/misc.h         |  4 ++
 migration/meson.build            |  1 +
 migration/multifd-device-state.c | 99 ++++++++++++++++++++++++++++++++
 migration/multifd-nocomp.c       |  6 +-
 migration/multifd-qpl.c          |  2 +-
 migration/multifd-uadk.c         |  2 +-
 migration/multifd-zlib.c         |  2 +-
 migration/multifd-zstd.c         |  2 +-
 migration/multifd.c              | 65 +++++++++++++++------
 migration/multifd.h              | 29 +++++++++-
 10 files changed, 184 insertions(+), 28 deletions(-)
 create mode 100644 migration/multifd-device-state.c

diff --git a/include/migration/misc.h b/include/migration/misc.h
index bfadc5613bac..7266b1b77d1f 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -111,4 +111,8 @@ bool migration_in_bg_snapshot(void);
 /* migration/block-dirty-bitmap.c */
 void dirty_bitmap_mig_init(void);
 
+/* migration/multifd-device-state.c */
+bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
+                                char *data, size_t len);
+
 #endif
diff --git a/migration/meson.build b/migration/meson.build
index 77f3abf08eb1..00853595894f 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -21,6 +21,7 @@ system_ss.add(files(
   'migration-hmp-cmds.c',
   'migration.c',
   'multifd.c',
+  'multifd-device-state.c',
   'multifd-nocomp.c',
   'multifd-zlib.c',
   'multifd-zero-page.c',
diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c
new file mode 100644
index 000000000000..c9b44f0b5ab9
--- /dev/null
+++ b/migration/multifd-device-state.c
@@ -0,0 +1,99 @@
+/*
+ * Multifd device state migration
+ *
+ * Copyright (C) 2024 Oracle and/or its affiliates.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/lockable.h"
+#include "migration/misc.h"
+#include "multifd.h"
+
+static QemuMutex queue_job_mutex;
+
+static MultiFDSendData *device_state_send;
+
+size_t multifd_device_state_payload_size(void)
+{
+    return sizeof(MultiFDDeviceState_t);
+}
+
+void multifd_device_state_save_setup(void)
+{
+    qemu_mutex_init(&queue_job_mutex);
+
+    device_state_send = multifd_send_data_alloc();
+}
+
+void multifd_device_state_clear(MultiFDDeviceState_t *device_state)
+{
+    g_clear_pointer(&device_state->idstr, g_free);
+    g_clear_pointer(&device_state->buf, g_free);
+}
+
+void multifd_device_state_save_cleanup(void)
+{
+    g_clear_pointer(&device_state_send, multifd_send_data_free);
+
+    qemu_mutex_destroy(&queue_job_mutex);
+}
+
+static void multifd_device_state_fill_packet(MultiFDSendParams *p)
+{
+    MultiFDDeviceState_t *device_state = &p->data->u.device_state;
+    MultiFDPacketDeviceState_t *packet = p->packet_device_state;
+
+    packet->hdr.flags = cpu_to_be32(p->flags);
+    strncpy(packet->idstr, device_state->idstr, sizeof(packet->idstr));
+    packet->instance_id = cpu_to_be32(device_state->instance_id);
+    packet->next_packet_size = cpu_to_be32(p->next_packet_size);
+}
+
+void multifd_device_state_send_prepare(MultiFDSendParams *p)
+{
+    MultiFDDeviceState_t *device_state = &p->data->u.device_state;
+
+    assert(multifd_payload_device_state(p->data));
+
+    multifd_send_prepare_header_device_state(p);
+
+    assert(!(p->flags & MULTIFD_FLAG_SYNC));
+
+    p->next_packet_size = device_state->buf_len;
+    if (p->next_packet_size > 0) {
+        p->iov[p->iovs_num].iov_base = device_state->buf;
+        p->iov[p->iovs_num].iov_len = p->next_packet_size;
+        p->iovs_num++;
+    }
+
+    p->flags |= MULTIFD_FLAG_NOCOMP | MULTIFD_FLAG_DEVICE_STATE;
+
+    multifd_device_state_fill_packet(p);
+}
+
+bool multifd_queue_device_state(char *idstr, uint32_t instance_id,
+                                char *data, size_t len)
+{
+    /* Device state submissions can come from multiple threads */
+    QEMU_LOCK_GUARD(&queue_job_mutex);
+    MultiFDDeviceState_t *device_state;
+
+    assert(multifd_payload_empty(device_state_send));
+
+    multifd_set_payload_type(device_state_send, MULTIFD_PAYLOAD_DEVICE_STATE);
+    device_state = &device_state_send->u.device_state;
+    device_state->idstr = g_strdup(idstr);
+    device_state->instance_id = instance_id;
+    device_state->buf = g_memdup2(data, len);
+    device_state->buf_len = len;
+
+    if (!multifd_send(&device_state_send)) {
+        multifd_send_data_clear(device_state_send);
+        return false;
+    }
+
+    return true;
+}
diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
index 39eb77c9b3b7..0b7b543f44db 100644
--- a/migration/multifd-nocomp.c
+++ b/migration/multifd-nocomp.c
@@ -116,13 +116,13 @@ static int multifd_nocomp_send_prepare(MultiFDSendParams *p, Error **errp)
          * Only !zerocopy needs the header in IOV; zerocopy will
          * send it separately.
          */
-        multifd_send_prepare_header(p);
+        multifd_send_prepare_header_ram(p);
     }
 
     multifd_send_prepare_iovs(p);
     p->flags |= MULTIFD_FLAG_NOCOMP;
 
-    multifd_send_fill_packet(p);
+    multifd_send_fill_packet_ram(p);
 
     if (use_zero_copy_send) {
         /* Send header first, without zerocopy */
@@ -371,7 +371,7 @@ bool multifd_send_prepare_common(MultiFDSendParams *p)
         return false;
     }
 
-    multifd_send_prepare_header(p);
+    multifd_send_prepare_header_ram(p);
 
     return true;
 }
diff --git a/migration/multifd-qpl.c b/migration/multifd-qpl.c
index 75041a4c4dfe..bd6b5b6a3868 100644
--- a/migration/multifd-qpl.c
+++ b/migration/multifd-qpl.c
@@ -490,7 +490,7 @@ static int multifd_qpl_send_prepare(MultiFDSendParams *p, Error **errp)
 
 out:
     p->flags |= MULTIFD_FLAG_QPL;
-    multifd_send_fill_packet(p);
+    multifd_send_fill_packet_ram(p);
     return 0;
 }
 
diff --git a/migration/multifd-uadk.c b/migration/multifd-uadk.c
index db2549f59bfe..6e2d26010742 100644
--- a/migration/multifd-uadk.c
+++ b/migration/multifd-uadk.c
@@ -198,7 +198,7 @@ static int multifd_uadk_send_prepare(MultiFDSendParams *p, Error **errp)
     }
 out:
     p->flags |= MULTIFD_FLAG_UADK;
-    multifd_send_fill_packet(p);
+    multifd_send_fill_packet_ram(p);
     return 0;
 }
 
diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
index 6787538762d2..62a1fe59ad3e 100644
--- a/migration/multifd-zlib.c
+++ b/migration/multifd-zlib.c
@@ -156,7 +156,7 @@ static int multifd_zlib_send_prepare(MultiFDSendParams *p, Error **errp)
 
 out:
     p->flags |= MULTIFD_FLAG_ZLIB;
-    multifd_send_fill_packet(p);
+    multifd_send_fill_packet_ram(p);
     return 0;
 }
 
diff --git a/migration/multifd-zstd.c b/migration/multifd-zstd.c
index 1576b1e2adc6..f98b07e7f9f5 100644
--- a/migration/multifd-zstd.c
+++ b/migration/multifd-zstd.c
@@ -143,7 +143,7 @@ static int multifd_zstd_send_prepare(MultiFDSendParams *p, Error **errp)
 
 out:
     p->flags |= MULTIFD_FLAG_ZSTD;
-    multifd_send_fill_packet(p);
+    multifd_send_fill_packet_ram(p);
     return 0;
 }
 
diff --git a/migration/multifd.c b/migration/multifd.c
index a74e8a5cc891..bebe5b5a9b9c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -12,6 +12,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/cutils.h"
+#include "qemu/iov.h"
 #include "qemu/rcu.h"
 #include "exec/target_page.h"
 #include "sysemu/sysemu.h"
@@ -19,6 +20,7 @@
 #include "qemu/error-report.h"
 #include "qapi/error.h"
 #include "file.h"
+#include "migration/misc.h"
 #include "migration.h"
 #include "migration-stats.h"
 #include "savevm.h"
@@ -107,7 +109,9 @@ MultiFDSendData *multifd_send_data_alloc(void)
      * added to the union in the future are larger than
      * (MultiFDPages_t + flex array).
      */
-    max_payload_size = MAX(multifd_ram_payload_size(), sizeof(MultiFDPayload));
+    max_payload_size = MAX(multifd_ram_payload_size(),
+                           multifd_device_state_payload_size());
+    max_payload_size = MAX(max_payload_size, sizeof(MultiFDPayload));
 
     /*
      * Account for any holes the compiler might insert. We can't pack
@@ -126,6 +130,9 @@ void multifd_send_data_clear(MultiFDSendData *data)
     }
 
     switch (data->type) {
+    case MULTIFD_PAYLOAD_DEVICE_STATE:
+        multifd_device_state_clear(&data->u.device_state);
+        break;
     default:
         /* Nothing to do */
         break;
@@ -228,7 +235,7 @@ static int multifd_recv_initial_packet(QIOChannel *c, Error **errp)
     return msg.id;
 }
 
-void multifd_send_fill_packet(MultiFDSendParams *p)
+void multifd_send_fill_packet_ram(MultiFDSendParams *p)
 {
     MultiFDPacket_t *packet = p->packet;
     uint64_t packet_num;
@@ -397,20 +404,16 @@ bool multifd_send(MultiFDSendData **send_data)
 
         p = &multifd_send_state->params[i];
         /*
-         * Lockless read to p->pending_job is safe, because only multifd
-         * sender thread can clear it.
+         * Lockless RMW on p->pending_job_preparing is safe, because only multifd
+         * sender thread can clear it after it had seen p->pending_job being set.
+         *
+         * Pairs with qatomic_store_release() in multifd_send_thread().
          */
-        if (qatomic_read(&p->pending_job) == false) {
+        if (qatomic_cmpxchg(&p->pending_job_preparing, false, true) == false) {
             break;
         }
     }
 
-    /*
-     * Make sure we read p->pending_job before all the rest. Pairs with
-     * qatomic_store_release() in multifd_send_thread().
-     */
-    smp_mb_acquire();
-
     assert(multifd_payload_empty(p->data));
 
     /*
@@ -534,6 +537,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp)
     p->name = NULL;
     g_clear_pointer(&p->data, multifd_send_data_free);
     p->packet_len = 0;
+    g_clear_pointer(&p->packet_device_state, g_free);
     g_free(p->packet);
     p->packet = NULL;
     multifd_send_state->ops->send_cleanup(p, errp);
@@ -545,6 +549,7 @@ static void multifd_send_cleanup_state(void)
 {
     file_cleanup_outgoing_migration();
     socket_cleanup_outgoing_migration();
+    multifd_device_state_save_cleanup();
     qemu_sem_destroy(&multifd_send_state->channels_created);
     qemu_sem_destroy(&multifd_send_state->channels_ready);
     g_free(multifd_send_state->params);
@@ -670,19 +675,29 @@ static void *multifd_send_thread(void *opaque)
          * qatomic_store_release() in multifd_send().
          */
         if (qatomic_load_acquire(&p->pending_job)) {
+            bool is_device_state = multifd_payload_device_state(p->data);
+            size_t total_size;
+
             p->flags = 0;
             p->iovs_num = 0;
             assert(!multifd_payload_empty(p->data));
 
-            ret = multifd_send_state->ops->send_prepare(p, &local_err);
-            if (ret != 0) {
-                break;
+            if (is_device_state) {
+                multifd_device_state_send_prepare(p);
+            } else {
+                ret = multifd_send_state->ops->send_prepare(p, &local_err);
+                if (ret != 0) {
+                    break;
+                }
             }
 
             if (migrate_mapped_ram()) {
+                assert(!is_device_state);
+
                 ret = file_write_ramblock_iov(p->c, p->iov, p->iovs_num,
                                               &p->data->u.ram, &local_err);
             } else {
+                total_size = iov_size(p->iov, p->iovs_num);
                 ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num,
                                                   NULL, 0, p->write_flags,
                                                   &local_err);
@@ -692,18 +707,27 @@ static void *multifd_send_thread(void *opaque)
                 break;
             }
 
-            stat64_add(&mig_stats.multifd_bytes,
-                       p->next_packet_size + p->packet_len);
+            if (is_device_state) {
+                stat64_add(&mig_stats.multifd_bytes, total_size);
+            } else {
+                /*
+                 * Can't just always add total_size since IOVs do not include
+                 * packet header in the zerocopy RAM case.
+                 */
+                stat64_add(&mig_stats.multifd_bytes,
+                           p->next_packet_size + p->packet_len);
+            }
 
             p->next_packet_size = 0;
             multifd_send_data_clear(p->data);
 
             /*
              * Making sure p->data is published before saying "we're
-             * free". Pairs with the smp_mb_acquire() in
+             * free". Pairs with the qatomic_cmpxchg() in
              * multifd_send().
              */
             qatomic_store_release(&p->pending_job, false);
+            qatomic_store_release(&p->pending_job_preparing, false);
         } else {
             /*
              * If not a normal job, must be a sync request. Note that
@@ -714,7 +738,7 @@ static void *multifd_send_thread(void *opaque)
 
             if (use_packets) {
                 p->flags = MULTIFD_FLAG_SYNC;
-                multifd_send_fill_packet(p);
+                multifd_send_fill_packet_ram(p);
                 ret = qio_channel_write_all(p->c, (void *)p->packet,
                                             p->packet_len, &local_err);
                 if (ret != 0) {
@@ -910,6 +934,9 @@ bool multifd_send_setup(void)
             p->packet_len = sizeof(MultiFDPacket_t)
                           + sizeof(uint64_t) * page_count;
             p->packet = g_malloc0(p->packet_len);
+            p->packet_device_state = g_malloc0(sizeof(*p->packet_device_state));
+            p->packet_device_state->hdr.magic = cpu_to_be32(MULTIFD_MAGIC);
+            p->packet_device_state->hdr.version = cpu_to_be32(MULTIFD_VERSION);
         }
         p->name = g_strdup_printf("mig/src/send_%d", i);
         p->write_flags = 0;
@@ -944,6 +971,8 @@ bool multifd_send_setup(void)
         }
     }
 
+    multifd_device_state_save_setup();
+
     return true;
 
 err:
diff --git a/migration/multifd.h b/migration/multifd.h
index a0853622153e..c15c83104c8b 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -120,10 +120,12 @@ typedef struct {
 typedef enum {
     MULTIFD_PAYLOAD_NONE,
     MULTIFD_PAYLOAD_RAM,
+    MULTIFD_PAYLOAD_DEVICE_STATE,
 } MultiFDPayloadType;
 
 typedef union MultiFDPayload {
     MultiFDPages_t ram;
+    MultiFDDeviceState_t device_state;
 } MultiFDPayload;
 
 struct MultiFDSendData {
@@ -136,6 +138,11 @@ static inline bool multifd_payload_empty(MultiFDSendData *data)
     return data->type == MULTIFD_PAYLOAD_NONE;
 }
 
+static inline bool multifd_payload_device_state(MultiFDSendData *data)
+{
+    return data->type == MULTIFD_PAYLOAD_DEVICE_STATE;
+}
+
 static inline void multifd_set_payload_type(MultiFDSendData *data,
                                             MultiFDPayloadType type)
 {
@@ -182,13 +189,15 @@ typedef struct {
      * cleared by the multifd sender threads.
      */
     bool pending_job;
+    bool pending_job_preparing;
     bool pending_sync;
     MultiFDSendData *data;
 
     /* thread local variables. No locking required */
 
-    /* pointer to the packet */
+    /* pointers to the possible packet types */
     MultiFDPacket_t *packet;
+    MultiFDPacketDeviceState_t *packet_device_state;
     /* size of the next packet that contains pages */
     uint32_t next_packet_size;
     /* packets sent through this channel */
@@ -276,18 +285,25 @@ typedef struct {
 } MultiFDMethods;
 
 void multifd_register_ops(int method, MultiFDMethods *ops);
-void multifd_send_fill_packet(MultiFDSendParams *p);
+void multifd_send_fill_packet_ram(MultiFDSendParams *p);
 bool multifd_send_prepare_common(MultiFDSendParams *p);
 void multifd_send_zero_page_detect(MultiFDSendParams *p);
 void multifd_recv_zero_page_process(MultiFDRecvParams *p);
 
-static inline void multifd_send_prepare_header(MultiFDSendParams *p)
+static inline void multifd_send_prepare_header_ram(MultiFDSendParams *p)
 {
     p->iov[0].iov_len = p->packet_len;
     p->iov[0].iov_base = p->packet;
     p->iovs_num++;
 }
 
+static inline void multifd_send_prepare_header_device_state(MultiFDSendParams *p)
+{
+    p->iov[0].iov_len = sizeof(*p->packet_device_state);
+    p->iov[0].iov_base = p->packet_device_state;
+    p->iovs_num++;
+}
+
 void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc);
 bool multifd_send(MultiFDSendData **send_data);
 MultiFDSendData *multifd_send_data_alloc(void);
@@ -310,4 +326,11 @@ int multifd_ram_flush_and_sync(void);
 size_t multifd_ram_payload_size(void);
 void multifd_ram_fill_packet(MultiFDSendParams *p);
 int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp);
+
+size_t multifd_device_state_payload_size(void);
+void multifd_device_state_save_setup(void);
+void multifd_device_state_clear(MultiFDDeviceState_t *device_state);
+void multifd_device_state_save_cleanup(void);
+void multifd_device_state_send_prepare(MultiFDSendParams *p);
+
 #endif
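
Illustration only, not part of the patch: the multifd_send() hunk above replaces a plain read of p->pending_job with a compare-and-exchange on p->pending_job_preparing, which matters once several threads may submit device state concurrently. The sketch below shows that claim/publish/release pattern in isolation. It assumes C11 <stdatomic.h> in place of QEMU's qatomic_*() helpers, and the Channel type, NUM_CHANNELS, claim_channel(), publish_job() and finish_job() are hypothetical names introduced here. Unlike multifd_send(), which keeps scanning the channels until one is free, the sketch simply returns -1 when nothing can be claimed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CHANNELS 4 /* hypothetical channel count */

/* Simplified stand-in for the two flags kept per send channel. */
typedef struct {
    atomic_bool pending_job_preparing; /* a producer has claimed the channel */
    atomic_bool pending_job;           /* payload is published for the sender */
} Channel;

static Channel channels[NUM_CHANNELS];

/*
 * Producer side: try to claim a free channel. Returns its index, or -1 if
 * every channel is already claimed. The compare-and-exchange makes the
 * "test and set" a single atomic step, so two producers can never both
 * claim the same channel.
 */
static int claim_channel(void)
{
    for (int i = 0; i < NUM_CHANNELS; i++) {
        bool expected = false;

        if (atomic_compare_exchange_strong(&channels[i].pending_job_preparing,
                                           &expected, true)) {
            return i;
        }
    }
    return -1;
}

/* Producer side: publish the job once the payload has been attached. */
static void publish_job(int i)
{
    atomic_store_explicit(&channels[i].pending_job, true,
                          memory_order_release);
}

/* Sender side: mark the job done, then make the channel claimable again. */
static void finish_job(int i)
{
    assert(atomic_load_explicit(&channels[i].pending_job,
                                memory_order_acquire));
    atomic_store_explicit(&channels[i].pending_job, false,
                          memory_order_release);
    atomic_store_explicit(&channels[i].pending_job_preparing, false,
                          memory_order_release);
}
```

A producer would call claim_channel(), attach its payload, then publish_job(); the sender thread calls finish_job() after writing the data out. Clearing pending_job_preparing last mirrors the ordering of the two qatomic_store_release() calls added to multifd_send_thread().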