From nobody Thu Apr 3 10:10:53 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=suse.de ARC-Seal: i=1; a=rsa-sha256; t=1741646141; cv=none; d=zohomail.com; s=zohoarc; b=oBMIFk5W9KRQ87vCbbwLEzuSWQO7c8tFSvx0wkN4AkeDMd1XTu4X4gkPAkE+7JZksDP+K5tENZvnWXmZMGqbdtsXp5imJ10RYQWPCrdUej0UNYIY2W/oHQJP1c7oIYcbNoUs/7k3vUgzwauJrtnYDOHxYsAL1rxPvUMCP2mZ81A= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1741646141; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=sCRIoHALaSzJJSqDE4A39LXrEmHIJBJiF0Llh+bdjrE=; b=HN4plmTD6b9rV+DVBlp47Z43bdjIKBwedRbHauWou8gwYuVSkQocWPyXJEK510i6njDwAqOlWu1VklBLxF0/A3FMipQaVc81uGaK08cyl60Ysy0Y2ezSte/QfK4NfSZ2WsGADHuOrQX2RvRtZ/tZic+zdNV7NuD/MBRCkfaZicA= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1741646141948832.6909525322278; Mon, 10 Mar 2025 15:35:41 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1trli2-0003rr-Px; Mon, 10 Mar 2025 18:34:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1trli1-0003rh-DA for qemu-devel@nongnu.org; Mon, 10 Mar 2025 18:34:53 -0400 Received: from smtp-out2.suse.de ([195.135.223.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1trlhz-0001i3-9w for qemu-devel@nongnu.org; Mon, 10 Mar 2025 18:34:53 -0400 Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 1E3ED1F38A; Mon, 10 Mar 2025 22:34:46 +0000 (UTC) Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id BDB0F1399F; Mon, 10 Mar 2025 22:34:44 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id kKKkHgRpz2dzRAAAD6G6ig (envelope-from ); Mon, 10 Mar 2025 22:34:44 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1741646086; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sCRIoHALaSzJJSqDE4A39LXrEmHIJBJiF0Llh+bdjrE=; b=h5h6pW3vnjcYdHqJ/ABDeraSaS8fLHfg53jZlLaVUdY7XjIcKB9PdQPKIq9WEFhyY28P3L qotgXjsTedzlV3cIXgIFYbRfCrXMP1JRAxQ9nd4+mN9nMfEcePgxmRx1FloymtZr5XP8YG ySSC82oJ4ZGV4BgHaZR72Rp0abpNwE0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1741646086; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sCRIoHALaSzJJSqDE4A39LXrEmHIJBJiF0Llh+bdjrE=; b=1a1m78v7N5u4a7TEtU+2gVvEA9Bd6moz96UNBWCDcuTrpnh+z2eJR11IwBlBd/9EEn7WL0 2z8DAgmHkLOCAfCA== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=h5h6pW3v; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=1a1m78v7 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1741646086; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sCRIoHALaSzJJSqDE4A39LXrEmHIJBJiF0Llh+bdjrE=; b=h5h6pW3vnjcYdHqJ/ABDeraSaS8fLHfg53jZlLaVUdY7XjIcKB9PdQPKIq9WEFhyY28P3L qotgXjsTedzlV3cIXgIFYbRfCrXMP1JRAxQ9nd4+mN9nMfEcePgxmRx1FloymtZr5XP8YG ySSC82oJ4ZGV4BgHaZR72Rp0abpNwE0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1741646086; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sCRIoHALaSzJJSqDE4A39LXrEmHIJBJiF0Llh+bdjrE=; b=1a1m78v7N5u4a7TEtU+2gVvEA9Bd6moz96UNBWCDcuTrpnh+z2eJR11IwBlBd/9EEn7WL0 2z8DAgmHkLOCAfCA== From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Peter Xu , Yan Fu Subject: [PULL v2 1/3] migration: Fix UAF for incoming migration on MigrationState Date: Mon, 10 Mar 2025 19:34:38 -0300 Message-Id: <20250310223440.31328-2-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20250310223440.31328-1-farosas@suse.de> References: <20250310223440.31328-1-farosas@suse.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 1E3ED1F38A X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email,suse.de:dkim,suse.de:mid,imap1.dmz-prg2.suse.org:helo,imap1.dmz-prg2.suse.org:rdns]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[suse.de:+] X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action X-Spam-Score: -3.01 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=195.135.223.131; envelope-from=farosas@suse.de; helo=smtp-out2.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @suse.de) X-ZM-MESSAGEID: 1741646144666019100 Content-Type: text/plain; charset="utf-8" From: Peter Xu On the incoming migration side, QEMU uses a coroutine to load all the VM states. Inside, it may reference MigrationState on global states like migration capabilities, parameters, error state, shared mutexes and more. However there's nothing yet to make sure MigrationState won't get destroyed (e.g. after migration_shutdown()). Meanwhile there's also no API available to remove the incoming coroutine in migration_shutdown(), avoiding it to access the freed elements. There's a bug report showing this can happen and crash dest QEMU when migration is cancelled on source. When it happens, the dest main thread is trying to cleanup everything: #0 qemu_aio_coroutine_enter #1 aio_dispatch_handler #2 aio_poll #3 monitor_cleanup #4 qemu_cleanup #5 qemu_default_main Then it found the migration incoming coroutine, schedule it (even after migration_shutdown()), causing crash: #0 __pthread_kill_implementation #1 __pthread_kill_internal #2 __GI_raise #3 __GI_abort #4 __assert_fail_base #5 __assert_fail #6 qemu_mutex_lock_impl #7 qemu_lockable_mutex_lock #8 qemu_lockable_lock #9 qemu_lockable_auto_lock #10 migrate_set_error #11 process_incoming_migration_co #12 coroutine_trampoline To fix it, take a refcount after an incoming setup is properly done when qmp_migrate_incoming() succeeded the 1st time. As it's during a QMP handler which needs BQL, it means the main loop is still alive (without going into cleanups, which also needs BQL). Releasing the refcount now only until the incoming migration coroutine finished or failed. Hence the refcount is valid for both (1) setup phase of incoming ports, mostly IO watches (e.g. qio_channel_add_watch_full()), and (2) the incoming coroutine itself (process_incoming_migration_co()). Note that we can't unref in migration_incoming_state_destroy(), because both qmp_xen_load_devices_state() and load_snapshot() will use it without an incoming migration. Those hold BQL so they're not prone to this issue. PS: I suspect nobody uses Xen's command at all, as it didn't register yank, hence AFAIU the command should crash on master when trying to unregister yank in migration_incoming_state_destroy().. but that's another story. Also note that in some incoming failure cases we may not always unref the MigrationState refcount, which is a trade-off to keep things simple. We could make it accurate, but it can be an overkill. Some examples: - Unlike most of the rest protocols, socket_start_incoming_migration() may create net listener after incoming port setup sucessfully. It means we can't unref in migration_channel_process_incoming() as a generic path because socket protocol might keep using MigrationState. - For either socket or file, multiple IO watches might be created, it means logically each IO watch needs to take one refcount for MigrationState so as to be 100% accurate on ownership of refcount taken. In general, we at least need per-protocol handling to make it accurate, which can be an overkill if we know incoming failed after all. Add a short comment to explain that when taking the refcount in qmp_migrate_incoming(). Bugzilla: https://issues.redhat.com/browse/RHEL-69775 Tested-by: Yan Fu Signed-off-by: Peter Xu Reviewed-by: Fabiano Rosas Message-ID: <20250220132459.512610-1-peterx@redhat.com> Signed-off-by: Fabiano Rosas --- migration/migration.c | 40 ++++++++++++++++++++++++++++++++++++++-- 1 file changed, 38 insertions(+), 2 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 1833cfe358..d46e776e24 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -116,6 +116,27 @@ static void migration_downtime_start(MigrationState *s) s->downtime_start =3D qemu_clock_get_ms(QEMU_CLOCK_REALTIME); } =20 +/* + * This is unfortunate: incoming migration actually needs the outgoing + * migration state (MigrationState) to be there too, e.g. to query + * capabilities, parameters, using locks, setup errors, etc. + * + * NOTE: when calling this, making sure current_migration exists and not + * been freed yet! Otherwise trying to access the refcount is already + * an use-after-free itself.. + * + * TODO: Move shared part of incoming / outgoing out into separate object. + * Then this is not needed. + */ +static void migrate_incoming_ref_outgoing_state(void) +{ + object_ref(migrate_get_current()); +} +static void migrate_incoming_unref_outgoing_state(void) +{ + object_unref(migrate_get_current()); +} + static void migration_downtime_end(MigrationState *s) { int64_t now =3D qemu_clock_get_ms(QEMU_CLOCK_REALTIME); @@ -863,7 +884,7 @@ process_incoming_migration_co(void *opaque) * postcopy thread. */ trace_process_incoming_migration_co_postcopy_end_main(); - return; + goto out; } /* Else if something went wrong then just fall out of the normal e= xit */ } @@ -879,7 +900,8 @@ process_incoming_migration_co(void *opaque) } =20 migration_bh_schedule(process_incoming_migration_bh, mis); - return; + goto out; + fail: migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE, MIGRATION_STATUS_FAILED); @@ -896,6 +918,9 @@ fail: =20 exit(EXIT_FAILURE); } +out: + /* Pairs with the refcount taken in qmp_migrate_incoming() */ + migrate_incoming_unref_outgoing_state(); } =20 /** @@ -1901,6 +1926,17 @@ void qmp_migrate_incoming(const char *uri, bool has_= channels, return; } =20 + /* + * Making sure MigrationState is available until incoming migration + * completes. + * + * NOTE: QEMU _might_ leak this refcount in some failure paths, but + * that's OK. This is the minimum change we need to at least making + * sure success case is clean on the refcount. We can try harder to + * make it accurate for any kind of failures, but it might be an + * overkill and doesn't bring us much benefit. + */ + migrate_incoming_ref_outgoing_state(); once =3D false; } =20 --=20 2.35.3 From nobody Thu Apr 3 10:10:53 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=suse.de ARC-Seal: i=1; a=rsa-sha256; t=1741646143; cv=none; d=zohomail.com; s=zohoarc; b=AD5gfUWnsaXF5e8YWGbdOyTaMgrcqRAOKHexjhyhvjtfbVtpuu8yCiaPVtwf9l/ghK+hD5aLvyKeThphAeGu7vqchj9Vd3y0fP7eFjFhuPIEJRUdHesywhP/w5mPq7RFI3qv4KgEyaAFeguMU6hQ5bDUcy+fVJrm4PcPkHs2TNw= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1741646143; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=v4uFGsx80hZqWFWzwJZpTRsesyR37SbMYmMlBn4jo7k=; b=V12SXamPzhmRd/YeSZ7Gn9khhGj3uOL8PX7L6+xm1aze2FBWtT9EEcKixzTjccngV2IsQ74ZUurmcP/3mmKXMb7JljgMIqlx6fTYRebeso41vM3hDo6Io8pMEF1yIhf0dMq6+jBUZxg3jfam0t1d1sI9VygaxNq7qgjSZfLCAYM= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1741646143155513.2428137258488; Mon, 10 Mar 2025 15:35:43 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1trliB-0003sq-Kb; Mon, 10 Mar 2025 18:35:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1trli9-0003sf-F4 for qemu-devel@nongnu.org; Mon, 10 Mar 2025 18:35:01 -0400 Received: from smtp-out2.suse.de ([2a07:de40:b251:101:10:150:64:2]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1trli7-0001iR-Iq for qemu-devel@nongnu.org; Mon, 10 Mar 2025 18:35:01 -0400 Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 3995B1F387; Mon, 10 Mar 2025 22:34:48 +0000 (UTC) Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 9B2AD1399F; Mon, 10 Mar 2025 22:34:46 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id UNQLFgZpz2dzRAAAD6G6ig (envelope-from ); Mon, 10 Mar 2025 22:34:46 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1741646088; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v4uFGsx80hZqWFWzwJZpTRsesyR37SbMYmMlBn4jo7k=; b=1k3Mq1FyUyj3nwUcy0iwyOgdOt2guPclRUARkXYKepujaexaXIjjWOwaTMNwU5MFxvqrT0 LU9eg9dVp+KwncWZVbYZHL0PgLHm41/QcSCYlTKsDj3wxJSaJzTqNAedOe+ZoftbMaNpkc WYx/yFwnCybB7kEagrAlj5pmm/g7VO0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1741646088; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v4uFGsx80hZqWFWzwJZpTRsesyR37SbMYmMlBn4jo7k=; b=RZvWQMm5OKVKr3g9DvprYWNOa4Aa0F+Y1E1O8/cy27NeFlczye9KCYF8JqvM0/k5iMnCIY oG2o+SjlcZ7YHYDA== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=1k3Mq1Fy; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=RZvWQMm5 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1741646088; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v4uFGsx80hZqWFWzwJZpTRsesyR37SbMYmMlBn4jo7k=; b=1k3Mq1FyUyj3nwUcy0iwyOgdOt2guPclRUARkXYKepujaexaXIjjWOwaTMNwU5MFxvqrT0 LU9eg9dVp+KwncWZVbYZHL0PgLHm41/QcSCYlTKsDj3wxJSaJzTqNAedOe+ZoftbMaNpkc WYx/yFwnCybB7kEagrAlj5pmm/g7VO0= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1741646088; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v4uFGsx80hZqWFWzwJZpTRsesyR37SbMYmMlBn4jo7k=; b=RZvWQMm5OKVKr3g9DvprYWNOa4Aa0F+Y1E1O8/cy27NeFlczye9KCYF8JqvM0/k5iMnCIY oG2o+SjlcZ7YHYDA== From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Peter Xu , Steve Sistare , David Hildenbrand Subject: [PULL v2 2/3] migration: ram block cpr blockers Date: Mon, 10 Mar 2025 19:34:39 -0300 Message-Id: <20250310223440.31328-3-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20250310223440.31328-1-farosas@suse.de> References: <20250310223440.31328-1-farosas@suse.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 3995B1F387 X-Spam-Score: -3.01 X-Rspamd-Action: no action X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:dkim,suse.de:mid,suse.de:email,oracle.com:email,imap1.dmz-prg2.suse.org:rdns,imap1.dmz-prg2.suse.org:helo]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; FROM_EQ_ENVFROM(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; DKIM_TRACE(0.00)[suse.de:+] X-Rspamd-Server: rspamd1.dmz-prg2.suse.org Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a07:de40:b251:101:10:150:64:2; envelope-from=farosas@suse.de; helo=smtp-out2.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @suse.de) X-ZM-MESSAGEID: 1741646146495019100 Content-Type: text/plain; charset="utf-8" From: Steve Sistare Unlike cpr-reboot mode, cpr-transfer mode cannot save volatile ram blocks in the migration stream file and recreate them later, because the physical memory for the blocks is pinned and registered for vfio. Add a blocker for volatile ram blocks. Also add a blocker for RAM_GUEST_MEMFD. Preserving guest_memfd may be sufficient for CPR, but it has not been tested yet. Signed-off-by: Steve Sistare Reviewed-by: Fabiano Rosas Reviewed-by: Peter Xu Reviewed-by: David Hildenbrand Message-ID: <1740667681-257312-1-git-send-email-steven.sistare@oracle.com> Signed-off-by: Fabiano Rosas --- include/exec/memory.h | 3 ++ include/exec/ramblock.h | 1 + migration/savevm.c | 2 ++ system/physmem.c | 66 +++++++++++++++++++++++++++++++++++++++++ 4 files changed, 72 insertions(+) diff --git a/include/exec/memory.h b/include/exec/memory.h index 78c4e0aec8..d09af58c97 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -3203,6 +3203,9 @@ bool ram_block_discard_is_disabled(void); */ bool ram_block_discard_is_required(void); =20 +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp); +void ram_block_del_cpr_blocker(RAMBlock *rb); + #endif =20 #endif diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h index 0babd105c0..64484cd821 100644 --- a/include/exec/ramblock.h +++ b/include/exec/ramblock.h @@ -39,6 +39,7 @@ struct RAMBlock { /* RCU-enabled, writes protected by the ramlist lock */ QLIST_ENTRY(RAMBlock) next; QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers; + Error *cpr_blocker; int fd; uint64_t fd_offset; int guest_memfd; diff --git a/migration/savevm.c b/migration/savevm.c index 5c4fdfd95e..ce158c3512 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -3514,12 +3514,14 @@ void vmstate_register_ram(MemoryRegion *mr, DeviceS= tate *dev) qemu_ram_set_idstr(mr->ram_block, memory_region_name(mr), dev); qemu_ram_set_migratable(mr->ram_block); + ram_block_add_cpr_blocker(mr->ram_block, &error_fatal); } =20 void vmstate_unregister_ram(MemoryRegion *mr, DeviceState *dev) { qemu_ram_unset_idstr(mr->ram_block); qemu_ram_unset_migratable(mr->ram_block); + ram_block_del_cpr_blocker(mr->ram_block); } =20 void vmstate_register_ram_global(MemoryRegion *mr) diff --git a/system/physmem.c b/system/physmem.c index a6af555f4b..e97de3ef65 100644 --- a/system/physmem.c +++ b/system/physmem.c @@ -71,7 +71,10 @@ =20 #include "qemu/pmem.h" =20 +#include "qapi/qapi-types-migration.h" +#include "migration/blocker.h" #include "migration/cpr.h" +#include "migration/options.h" #include "migration/vmstate.h" =20 #include "qemu/range.h" @@ -1904,6 +1907,14 @@ static void ram_block_add(RAMBlock *new_block, Error= **errp) qemu_mutex_unlock_ramlist(); goto out_free; } + + error_setg(&new_block->cpr_blocker, + "Memory region %s uses guest_memfd, " + "which is not supported with CPR.", + memory_region_name(new_block->mr)); + migrate_add_blocker_modes(&new_block->cpr_blocker, errp, + MIG_MODE_CPR_TRANSFER, + -1); } =20 ram_size =3D (new_block->offset + new_block->max_length) >> TARGET_PAG= E_BITS; @@ -4095,3 +4106,58 @@ bool ram_block_discard_is_required(void) return qatomic_read(&ram_block_discard_required_cnt) || qatomic_read(&ram_block_coordinated_discard_required_cnt); } + +/* + * Return true if ram is compatible with CPR. Do not exclude rom, + * because the rom file could change in new QEMU. + */ +static bool ram_is_cpr_compatible(RAMBlock *rb) +{ + MemoryRegion *mr =3D rb->mr; + + if (!mr || !memory_region_is_ram(mr)) { + return true; + } + + /* Ram device is remapped in new QEMU */ + if (memory_region_is_ram_device(mr)) { + return true; + } + + /* + * A file descriptor is passed to new QEMU and remapped, or its backing + * file is reopened and mapped. It must be shared to avoid COW. + */ + if (rb->fd >=3D 0 && qemu_ram_is_shared(rb)) { + return true; + } + + return false; +} + +/* + * Add a blocker for each volatile ram block. This function should only be + * called after we know that the block is migratable. Non-migratable bloc= ks + * are either re-created in new QEMU, or are handled specially, or are cov= ered + * by a device-level CPR blocker. + */ +void ram_block_add_cpr_blocker(RAMBlock *rb, Error **errp) +{ + assert(qemu_ram_is_migratable(rb)); + + if (ram_is_cpr_compatible(rb)) { + return; + } + + error_setg(&rb->cpr_blocker, + "Memory region %s is not compatible with CPR. share=3Don is= " + "required for memory-backend objects, and aux-ram-share=3Do= n is " + "required.", memory_region_name(rb->mr)); + migrate_add_blocker_modes(&rb->cpr_blocker, errp, MIG_MODE_CPR_TRANSFE= R, + -1); +} + +void ram_block_del_cpr_blocker(RAMBlock *rb) +{ + migrate_del_blocker(&rb->cpr_blocker); +} --=20 2.35.3 From nobody Thu Apr 3 10:10:53 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=suse.de ARC-Seal: i=1; a=rsa-sha256; t=1741646144; cv=none; d=zohomail.com; s=zohoarc; b=PqG1j+VtXheb6g7VIdtWSifzrln9j80zqqJsNckIziI3ZRBEJa2FQ/nG8r8uZsrOt98SOJFeTOqvecR3ms+os/sBHwuVX+t9VOePvcdZNLUHLXSeR1Gt2Ut1ZLSIMRhoAAeVYPfI2CTsksxljL81LYnas9ArySOSjv1QK4qaIwU= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1741646144; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=D0wrgi6pxrkmOfIx5ZgU7+DOtdeSiALIzxKjJwUR95Y=; b=M6KYf9gU/xhT1JDsNi7Sft6bCPhrJ7PT0IqhYMWN6oErRUIkn8YJ2K7A+suO0rzUOOKSwsOuFb+7Htzp1Ioc4REOB8m1omXrIdoyobdROlKCecwSWi7Rs+M4gzbELqnbZlG5XfdXtyImFytwu7xbwlIOwVRDR6hI8lfJNEEnT6c= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1741646144210501.19347296332364; Mon, 10 Mar 2025 15:35:44 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1trliU-0004Bd-Vv; Mon, 10 Mar 2025 18:35:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1trliF-0003vN-En for qemu-devel@nongnu.org; Mon, 10 Mar 2025 18:35:07 -0400 Received: from smtp-out2.suse.de ([195.135.223.131]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1trliD-0001jI-Kl for qemu-devel@nongnu.org; Mon, 10 Mar 2025 18:35:07 -0400 Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 14F551F388; Mon, 10 Mar 2025 22:34:50 +0000 (UTC) Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id B60B21399F; Mon, 10 Mar 2025 22:34:48 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 8OrLHAhpz2dzRAAAD6G6ig (envelope-from ); Mon, 10 Mar 2025 22:34:48 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1741646090; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=D0wrgi6pxrkmOfIx5ZgU7+DOtdeSiALIzxKjJwUR95Y=; b=Zwgu5gfx9SjWVnsJxwugC90T4P5BraR3/US/osn9MKmbmkEEzZEtKDtbHjRfOA6hNBpAnI 80s7u33aQaJ1WKZOl+7z1JEFui7jOM3bulPtu7FUbZzWg6Bg5DaqrauaCmFWtBUcL+HZoj o5TZVja4UZaSDsNbdQCBkfUH3vuBuIE= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1741646090; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=D0wrgi6pxrkmOfIx5ZgU7+DOtdeSiALIzxKjJwUR95Y=; b=jw+nQg1iXVLc6dvX+twKa97S+9G8hbZ4V3KDqRnE4D+bwLYLXFDryijZF/w533JlHFGcn1 hQNYpk9/D5mYI0Cg== Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=Zwgu5gfx; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=jw+nQg1i DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1741646090; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=D0wrgi6pxrkmOfIx5ZgU7+DOtdeSiALIzxKjJwUR95Y=; b=Zwgu5gfx9SjWVnsJxwugC90T4P5BraR3/US/osn9MKmbmkEEzZEtKDtbHjRfOA6hNBpAnI 80s7u33aQaJ1WKZOl+7z1JEFui7jOM3bulPtu7FUbZzWg6Bg5DaqrauaCmFWtBUcL+HZoj o5TZVja4UZaSDsNbdQCBkfUH3vuBuIE= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1741646090; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=D0wrgi6pxrkmOfIx5ZgU7+DOtdeSiALIzxKjJwUR95Y=; b=jw+nQg1iXVLc6dvX+twKa97S+9G8hbZ4V3KDqRnE4D+bwLYLXFDryijZF/w533JlHFGcn1 hQNYpk9/D5mYI0Cg== From: Fabiano Rosas To: qemu-devel@nongnu.org Cc: Peter Xu , Li Zhijian Subject: [PULL v2 3/3] migration: Prioritize RDMA in ram_save_target_page() Date: Mon, 10 Mar 2025 19:34:40 -0300 Message-Id: <20250310223440.31328-4-farosas@suse.de> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20250310223440.31328-1-farosas@suse.de> References: <20250310223440.31328-1-farosas@suse.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 14F551F388 X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email,suse.de:dkim,suse.de:mid,imap1.dmz-prg2.suse.org:helo,imap1.dmz-prg2.suse.org:rdns]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RBL_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:104:10:150:64:97:from]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; RECEIVED_SPAMHAUS_BLOCKED_OPENRESOLVER(0.00)[2a07:de40:b281:106:10:150:64:167:received]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; DKIM_TRACE(0.00)[suse.de:+] X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action X-Spam-Score: -3.01 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=195.135.223.131; envelope-from=farosas@suse.de; helo=smtp-out2.suse.de X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @suse.de) X-ZM-MESSAGEID: 1741646146417019100 Content-Type: text/plain; charset="utf-8" From: Li Zhijian Address an error in RDMA-based migration by ensuring RDMA is prioritized when saving pages in `ram_save_target_page()`. Previously, the RDMA protocol's page-saving step was placed after other protocols due to a refactoring in commit bc38dc2f5f3. This led to migration failures characterized by unknown control messages and state loading errors destination: (qemu) qemu-system-x86_64: Unknown control message QEMU FILE qemu-system-x86_64: error while loading state section id 1(ram) qemu-system-x86_64: load of migration failed: Operation not permitted source: (qemu) qemu-system-x86_64: RDMA is in an error state waiting migration to a= bort! qemu-system-x86_64: failed to save SaveStateEntry with id(name): 1(ram): -1 qemu-system-x86_64: rdma migration: recv polling control error! qemu-system-x86_64: warning: Early error. Sending error. qemu-system-x86_64: warning: rdma migration: send polling control error RDMA migration implemented its own protocol/method to send pages to destination side, hand over to RDMA first to prevent pages being saved by other protocol. Fixes: bc38dc2f5f3 ("migration: refactor ram_save_target_page functions") Reviewed-by: Peter Xu Signed-off-by: Li Zhijian Message-ID: <20250305062825.772629-2-lizhijian@fujitsu.com> Signed-off-by: Fabiano Rosas --- migration/ram.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 589b6505eb..424df6d9f1 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -1964,6 +1964,11 @@ static int ram_save_target_page(RAMState *rs, PageSe= archStatus *pss) ram_addr_t offset =3D ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; int res; =20 + /* Hand over to RDMA first */ + if (control_save_page(pss, offset, &res)) { + return res; + } + if (!migrate_multifd() || migrate_zero_page_detection() =3D=3D ZERO_PAGE_DETECTION_LEGACY= ) { if (save_zero_page(rs, pss, offset)) { @@ -1976,10 +1981,6 @@ static int ram_save_target_page(RAMState *rs, PageSe= archStatus *pss) return ram_save_multifd_page(block, offset); } =20 - if (control_save_page(pss, offset, &res)) { - return res; - } - return ram_save_page(rs, pss); } =20 --=20 2.35.3