From nobody Sat Apr 11 23:02:57 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=suse.de ARC-Seal: i=1; a=rsa-sha256; t=1773061738; cv=none; d=zohomail.com; s=zohoarc; b=B1nReGBZbHiXzHNHEhbZv4GGq3NAtb1nOxoH9N8VWyHjgPPUOZ1mH8bIv6UzgyutcpeHG2tdBs5ifhtsa7jiyV+Of3FONdbP1iYpf0uuFItxplMSiianB+SeTLytf7/Yp4EmAk9TPJdvR+kjuwlkZ1XvyILtU/ZbzbKZlE/jck0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1773061738; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=EI/hm+oCGbJNJPXLYwaaghjCssMy0e6uep2ugqXpmMk=; b=Goauzdq5Bui4vtN3s1p8KrpJBl6bn1HewuSToc5uA7QTIEM50/GuLJjzyzyy+YDQ1Zr8QNF6/zGLCPOHsNb0rdr5Wc2su9EogbVIkKJkmQzZnrmpiBXS7N2fWWI8N+NNPLkTCu3ALXDHnJna/uuY74xrzSTG+y+ZW13xRxSox/w= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1773061738687740.079874566809; Mon, 9 Mar 2026 06:08:58 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1vzaL8-0006gW-M4; Mon, 09 Mar 2026 09:08:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1vzaL5-0006g0-CA for qemu-devel@nongnu.org; Mon, 09 Mar 2026 09:08:04 -0400 Received: from smtp-out2.suse.de ([195.135.223.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1vzaL3-00088E-8l for qemu-devel@nongnu.org; Mon, 09 Mar 2026 09:08:03 -0400 Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id BF7295BDDF; Mon, 9 Mar 2026 13:07:49 +0000 (UTC) Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 5952F3EEF2; Mon, 9 Mar 2026 13:07:48 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id kC7uByTGrmlJcAAAD6G6ig (envelope-from ); Mon, 09 Mar 2026 13:07:48 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1773061669; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EI/hm+oCGbJNJPXLYwaaghjCssMy0e6uep2ugqXpmMk=; b=nWSbdZfQQhG9BXC+CsfRVo2RqD9jUgPxwUzyPswkJQ4x9OX8XmUmoa5sWStJms9PGFHCtk Flqmqz7yxHQ/YcEBiuuWvo1l98qk2GKEKg6MNpcy71QphG60RC54z+5Is41Cef297t8kBz O8orKrMpdiVoD6g+nFJH4RRXjLmjjuk= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1773061669; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EI/hm+oCGbJNJPXLYwaaghjCssMy0e6uep2ugqXpmMk=; b=SHkkb3o30JcFY2bJO7Fz23BLQmyI0ie/bNMH1eUURbmiT0aOLjo4qbVBG6CX6yebB5xx8V Pu6O5UDXYNijPCCw== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1773061669; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EI/hm+oCGbJNJPXLYwaaghjCssMy0e6uep2ugqXpmMk=; b=nWSbdZfQQhG9BXC+CsfRVo2RqD9jUgPxwUzyPswkJQ4x9OX8XmUmoa5sWStJms9PGFHCtk Flqmqz7yxHQ/YcEBiuuWvo1l98qk2GKEKg6MNpcy71QphG60RC54z+5Is41Cef297t8kBz O8orKrMpdiVoD6g+nFJH4RRXjLmjjuk= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1773061669; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EI/hm+oCGbJNJPXLYwaaghjCssMy0e6uep2ugqXpmMk=; b=SHkkb3o30JcFY2bJO7Fz23BLQmyI0ie/bNMH1eUURbmiT0aOLjo4qbVBG6CX6yebB5xx8V Pu6O5UDXYNijPCCw== From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Peter Xu , Lukas Straub , Juan Quintela Subject: [PULL 09/22] multifd: Add COLO support Date: Mon, 9 Mar 2026 10:07:14 -0300 Message-ID: <20260309130730.20526-10-farosas@suse.de> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20260309130730.20526-1-farosas@suse.de> References: <20260309130730.20526-1-farosas@suse.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Spamd-Result: default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; MID_CONTAINS_FROM(1.00)[]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-0.999]; MIME_GOOD(-0.10)[text/plain]; TO_MATCH_ENVRCPT_ALL(0.00)[]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FREEMAIL_CC(0.00)[redhat.com,web.de]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid,suse.de:email,imap1.dmz-prg2.suse.org:helo]; FUZZY_RATELIMITED(0.00)[rspamd.com]; FROM_HAS_DN(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FREEMAIL_ENVRCPT(0.00)[web.de] X-Spam-Score: -2.80 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=195.135.223.131; envelope-from=farosas@suse.de; helo=smtp-out2.suse.de X-Spam_score_int: -26 X-Spam_score: -2.7 X-Spam_bar: -- X-Spam_report: (-2.7 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.819, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.903, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: qemu development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @suse.de) X-ZM-MESSAGEID: 1773061739691154100 Content-Type: text/plain; charset="utf-8" From: Lukas Straub Like in the normal ram_load() path, put the received pages into the colo cache and mark the pages in the bitmap so that they will be flushed to the guest later. Multifd with COLO is useful to reduce the VM pause time during checkpointing for latency sensitive workloads. In such workloads the worst-case latency is especially important. Also, this is already worth it for the precopy phase as it helps with converging. Moreover, multifd migration is the preferred way to do migration nowadays and this allows to use multifd compression with COLO. Benchmark: Cluster nodes - Intel Xenon E5-2630 v3 - 48Gb RAM - 10G Ethernet Guest - Windows Server 2016 - 6Gb RAM - 4 cores Workload - Upload a file to the guest with SMB to simulate moderate memory dirtying - Measure the memory transfer time portion of each checkpoint - 600ms COLO checkpoint interval Results Plain idle mean: 4.50ms 99per: 10.33ms load mean: 24.30ms 99per: 78.05ms Multifd-4 idle mean: 6.48ms 99per: 10.41ms load mean: 14.12ms 99per: 31.27ms Evaluation While multifd has slightly higher latency when the guest idles, it is 10ms faster under load and more importantly it's worst case latency is less than 1/2 of plain under load as can be seen in the 99. Percentile. Co-authored-by: Juan Quintela [farosas: changed SoB to coauthored as Juan doesn't own that email address = anymore] Reviewed-by: Fabiano Rosas Reviewed-by: Peter Xu Signed-off-by: Lukas Straub Link: https://lore.kernel.org/qemu-devel/20260302-colo_unit_test_multifd-v1= 1-8-d653fb3b1d80@web.de [removed license boilerplate] Signed-off-by: Fabiano Rosas --- MAINTAINERS | 1 + migration/meson.build | 2 +- migration/multifd-colo.c | 41 ++++++++++++++++++++++++++++++++++++++ migration/multifd-colo.h | 23 +++++++++++++++++++++ migration/multifd-nocomp.c | 10 +++++++++- migration/multifd.c | 8 ++++++++ migration/multifd.h | 5 ++++- 7 files changed, 87 insertions(+), 3 deletions(-) create mode 100644 migration/multifd-colo.c create mode 100644 migration/multifd-colo.h diff --git a/MAINTAINERS b/MAINTAINERS index df78865809..d8dc4c8d7b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3883,6 +3883,7 @@ COLO Framework M: Lukas Straub S: Maintained F: migration/colo* +F: migration/multifd-colo.* F: include/migration/colo.h F: include/migration/failover.h F: docs/COLO-FT.txt diff --git a/migration/meson.build b/migration/meson.build index c7f39bdb55..c9f0f5f9f2 100644 --- a/migration/meson.build +++ b/migration/meson.build @@ -39,7 +39,7 @@ system_ss.add(files( ), gnutls, zlib) =20 if get_option('replication').allowed() - system_ss.add(files('colo-failover.c', 'colo.c')) + system_ss.add(files('colo-failover.c', 'colo.c', 'multifd-colo.c')) else system_ss.add(files('colo-stubs.c')) endif diff --git a/migration/multifd-colo.c b/migration/multifd-colo.c new file mode 100644 index 0000000000..fb13e38b9b --- /dev/null +++ b/migration/multifd-colo.c @@ -0,0 +1,41 @@ +/* + * SPDX-License-Identifier: GPL-2.0-or-later + * + * multifd colo implementation + * + * Copyright (c) Lukas Straub + */ + +#include "qemu/osdep.h" +#include "multifd.h" +#include "multifd-colo.h" +#include "migration/colo.h" +#include "system/ramblock.h" + +void multifd_colo_prepare_recv(MultiFDRecvParams *p) +{ + /* + * While we're still in precopy state (not yet in colo state), we copy + * received pages to both guest and cache. No need to set dirty bits, + * since guest and cache memory are in sync. + */ + if (migration_incoming_in_colo_state()) { + colo_record_bitmap(p->block, p->normal, p->normal_num); + colo_record_bitmap(p->block, p->zero, p->zero_num); + } +} + +void multifd_colo_process_recv(MultiFDRecvParams *p) +{ + if (!migration_incoming_in_colo_state()) { + for (int i =3D 0; i < p->normal_num; i++) { + void *guest =3D p->block->host + p->normal[i]; + void *cache =3D p->host + p->normal[i]; + memcpy(guest, cache, multifd_ram_page_size()); + } + for (int i =3D 0; i < p->zero_num; i++) { + void *guest =3D p->block->host + p->zero[i]; + memset(guest, 0, multifd_ram_page_size()); + } + } +} diff --git a/migration/multifd-colo.h b/migration/multifd-colo.h new file mode 100644 index 0000000000..1011c5a1da --- /dev/null +++ b/migration/multifd-colo.h @@ -0,0 +1,23 @@ +/* + * SPDX-License-Identifier: GPL-2.0-or-later + * + * multifd colo header + * + * Copyright (c) Lukas Straub + */ + +#ifndef QEMU_MIGRATION_MULTIFD_COLO_H +#define QEMU_MIGRATION_MULTIFD_COLO_H + +#ifdef CONFIG_REPLICATION + +void multifd_colo_prepare_recv(MultiFDRecvParams *p); +void multifd_colo_process_recv(MultiFDRecvParams *p); + +#else + +static inline void multifd_colo_prepare_recv(MultiFDRecvParams *p) {} +static inline void multifd_colo_process_recv(MultiFDRecvParams *p) {} + +#endif +#endif diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index 9be79b3b8e..9f7a792fa7 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -16,6 +16,7 @@ #include "file.h" #include "migration-stats.h" #include "multifd.h" +#include "multifd-colo.h" #include "options.h" #include "migration.h" #include "qapi/error.h" @@ -269,7 +270,6 @@ int multifd_ram_unfill_packet(MultiFDRecvParams *p, Err= or **errp) return -1; } =20 - p->host =3D p->block->host; for (i =3D 0; i < p->normal_num; i++) { uint64_t offset =3D be64_to_cpu(packet->offset[i]); =20 @@ -294,6 +294,14 @@ int multifd_ram_unfill_packet(MultiFDRecvParams *p, Er= ror **errp) p->zero[i] =3D offset; } =20 + if (migrate_colo()) { + multifd_colo_prepare_recv(p); + assert(p->block->colo_cache); + p->host =3D p->block->colo_cache; + } else { + p->host =3D p->block->host; + } + return 0; } =20 diff --git a/migration/multifd.c b/migration/multifd.c index 4259ab2628..2193088996 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -29,6 +29,7 @@ #include "qemu-file.h" #include "trace.h" #include "multifd.h" +#include "multifd-colo.h" #include "options.h" #include "qemu/yank.h" #include "io/channel-file.h" @@ -1258,6 +1259,13 @@ static int multifd_ram_state_recv(MultiFDRecvParams = *p, Error **errp) int ret; =20 ret =3D multifd_recv_state->ops->recv(p, errp); + if (ret !=3D 0) { + return ret; + } + + if (migrate_colo()) { + multifd_colo_process_recv(p); + } =20 return ret; } diff --git a/migration/multifd.h b/migration/multifd.h index 89a395aef2..fbc35702b0 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -279,7 +279,10 @@ typedef struct { uint64_t packets_recved; /* ramblock */ RAMBlock *block; - /* ramblock host address */ + /* + * Normally, it points to ramblock's host address. When COLO + * is enabled, it points to the mirror cache for the ramblock. + */ uint8_t *host; /* buffers to recv */ struct iovec *iov; --=20 2.51.0