From nobody Fri Nov 14 23:31:34 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1762204445; cv=none; d=zohomail.com; s=zohoarc; b=ehrvG2fVMP6O9vN1XLuEYC04qGK7FMNtWYvjYWO2WqqAX9WyrCfRxeZF1WF8BvYYDARgIPaEQZIuTeSLcc0MEiQ5A+hkG0HUyXimHU67eyYDKThXeXDBwBZMNJwNUL2oX/x7kth/g73PBJ8Nrdw9hILAZ29rmW79qy7jWY6e0x4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1762204445; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=/7FjPA8GdfJf4kCb2i7TngezmlYaUsti3XGlUtftU7k=; b=QxAwAwitodxdC/dcApeDC/0mckY7Ic67nL07zXBprbDU9UHlJxnHcPKCpB73MEawg5EMBvZvR+pCnfJRMm/xlZ5kbFniE0tAGG2ljwyMhkBAfCwI7+GVLey6v8MX9C+0gGju1cAytky77RJr6yVb067cSK0N3/6bpxYNB/nlNTU= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1762204445497205.63390068338424; Mon, 3 Nov 2025 13:14:05 -0800 (PST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1vG1m0-00017H-VV; Mon, 03 Nov 2025 16:07:32 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1vG1lz-00016o-Mf for qemu-devel@nongnu.org; Mon, 03 Nov 2025 16:07:31 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1vG1li-0005RN-F4 for qemu-devel@nongnu.org; Mon, 03 Nov 2025 16:07:30 -0500 Received: from mail-qt1-f197.google.com (mail-qt1-f197.google.com [209.85.160.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-592-Anv7iClrO6eiQjgWriHVWg-1; Mon, 03 Nov 2025 16:07:11 -0500 Received: by mail-qt1-f197.google.com with SMTP id d75a77b69052e-4eceefa36d0so135203651cf.1 for ; Mon, 03 Nov 2025 13:07:11 -0800 (PST) Received: from x1.com ([142.188.210.50]) by smtp.gmail.com with ESMTPSA id af79cd13be357-8b0f5468968sm57428185a.19.2025.11.03.13.07.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 03 Nov 2025 13:07:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1762204032; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/7FjPA8GdfJf4kCb2i7TngezmlYaUsti3XGlUtftU7k=; b=M5RJUo5hQ4l6Qv7w9tcDI1WSmfnXEu+eH64xeWVx0R4gc8w13IL1i2x53MDB9Pqh097OHQ HlP5ZFAGunJiUc+6H3hOyyCwRtTKyHnJTWfeJZYmH6vlYWWnjFAAt40TgJC3/VYww/6tcb kUjCIDxwt45LdERRg9769Y26PaDu63E= X-MC-Unique: Anv7iClrO6eiQjgWriHVWg-1 X-Mimecast-MFC-AGG-ID: Anv7iClrO6eiQjgWriHVWg_1762204031 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=google; t=1762204031; x=1762808831; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/7FjPA8GdfJf4kCb2i7TngezmlYaUsti3XGlUtftU7k=; b=mtk+JyGZK1968tH8xtnsQhja9Ow3FIwoG6aVgpSk61BgX7PgCsXq6iB99rlUNYFXOH VD/I/JvajyJplCt1eCkbixL951+ZEcnfuyhEEXi9dvY4lW87x2gfaGx2q7wTTQ5CZ7dd fhh3ZsFnU3zbUMAc3GBxt7Tw8TpxPHA02Z1QQ0AJYOf3MmM7NKlrOYAPq7uLtjZYaXU2 /8OMz1grif0WLwiy2lAmu/QH8qewK8xO0zlMpqsyf2S4sRZzcfRLloeK/5TogBoUKVoX auFBMo+KyDypAHxFa6H+Xo4ke8cCdkGCYSWsKcIQtx5a273ihUugbXzCHTYqdl8LwpaF 9h7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1762204031; x=1762808831; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/7FjPA8GdfJf4kCb2i7TngezmlYaUsti3XGlUtftU7k=; b=NgClbZvciTX+iquVZgGxVfWpXcZT7kp94C4BKMn9gMyPV6MM/QSHGXydKB9V2Dz8u2 btmUda57xmEZG1+nmbnm28GivBaSswUMY8qDWvscIFtvkdJOlcehlkHQEySt7gvmsCHG KBN+QREp26QFURuj340EOaTFK3nsjWdGsr3L7N+jTr8fCIZqwWykdYrI7YWhBaJUQ8uj s3lB8EtGLCz7Nf1mFyoZ9bdIQxmT1DYV6Eb4LEVsppd7WjG3G8lwMjJOoPUABrRwgBAY hzoXUBqdA2Oiu6oESZuq7MSd24v/KlyXYO8gVIu259za6kVwWuf+VsHEHqsSgwT9FqG1 052Q== X-Gm-Message-State: AOJu0YwcXtju3Gr709//pGnoiUn2hmDcYKS1O4IPBJpAFsFNhaU6uEVv qTHNSS7ozgYFmKjwtmzR7bluWWRu9M2uGen0SDJZch7k26YA8YAyQvhiYCPUcTivgsvWuTIgVs/ njTkhJMUDK8kLDUbpJV6OQd2SR9o+IJVR8PvouAb5gyqvXn3/1RO4SSFMWId9O/L9osaoJj7pg4 AIdfpGnogDXB8bh70AGkk9QiYBISRX7KjjnrN7iQ== X-Gm-Gg: ASbGnctL1kwoAeakVLEmEpdQX5QJNhBlxYLCEJGfLSs3TkHxRtxKbPtyoevNkKFZYWK QoLgcqgRMCrnS8AftldBD6LFDcJPWv1TiYle1MTTh1RtjJP1kZtqQhabBGbCM338kIr/m0l9H3g 87fH5ftD8EddzptIntFORwD8TDPeudlwd6vJRhiaIUvgqjqfr7EuMDE44pg6/rK00hDDZYlU6VI pi8E/PipG+vNIN7Gwj2E4KDoPSwqq69MqCJkaZKLr/iUtLGXJMcmhyc8yGneKjYoSTzFFXTVTki aJC5iOAq+HmrcDrfG3HgQEwBZYnlNpCOVZ+sMDzs0Bmsudeu1pTlDjFXfWQW2EUi X-Received: by 2002:ac8:7d0f:0:b0:4e8:ad2a:b0cf with SMTP id d75a77b69052e-4ed30d512aamr182178991cf.9.1762204030525; Mon, 03 Nov 2025 13:07:10 -0800 (PST) X-Google-Smtp-Source: AGHT+IE2g7WbmHbwwQnJV5qAqgm7TSs/zKMtTn0/RSQC2RgaKHvU8HDoCv/7gw8vWFek+RI80cPURQ== X-Received: by 2002:ac8:7d0f:0:b0:4e8:ad2a:b0cf with SMTP id d75a77b69052e-4ed30d512aamr182178531cf.9.1762204029978; Mon, 03 Nov 2025 13:07:09 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Cc: Fabiano Rosas , David Hildenbrand , peterx@redhat.com, Paolo Bonzini Subject: [PULL 30/36] migration: Do not try to start VM if disk activation fails Date: Mon, 3 Nov 2025 16:06:19 -0500 Message-ID: <20251103210625.3689448-31-peterx@redhat.com> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20251103210625.3689448-1-peterx@redhat.com> References: <20251103210625.3689448-1-peterx@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_PASS=-0.001, T_SPF_HELO_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1762204452356158500 Content-Type: text/plain; charset="utf-8" If a rare split brain happens (e.g. dest QEMU started running somehow, taking shared drive locks), src QEMU may not be able to activate the drives anymore. In this case, src QEMU shouldn't start the VM or it might crash the block layer later with something like: Meanwhile, src QEMU cannot try to continue either even if dest QEMU can release the drive locks (e.g. by QMP "stop"). Because as long as dest QEMU started running, it means dest QEMU's RAM is the only version that is consistent with current status of the shared storage. Reviewed-by: Fabiano Rosas Link: https://lore.kernel.org/r/20251103183301.3840862-3-jmarcin@redhat.com Signed-off-by: Peter Xu --- migration/migration.c | 29 ++++++++++++++++++++++++----- 1 file changed, 24 insertions(+), 5 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 5e74993b46..6e647c7c4a 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3526,6 +3526,8 @@ static MigIterateState migration_iteration_run(Migrat= ionState *s) =20 static void migration_iteration_finish(MigrationState *s) { + Error *local_err =3D NULL; + bql_lock(); =20 /* @@ -3549,11 +3551,28 @@ static void migration_iteration_finish(MigrationSta= te *s) case MIGRATION_STATUS_FAILED: case MIGRATION_STATUS_CANCELLED: case MIGRATION_STATUS_CANCELLING: - /* - * Re-activate the block drives if they're inactivated. Note, COLO - * shouldn't use block_active at all, so it should be no-op there. - */ - migration_block_activate(NULL); + if (!migration_block_activate(&local_err)) { + /* + * Re-activate the block drives if they're inactivated. + * + * If it fails (e.g. in case of a split brain, where dest QEMU + * might have taken some of the drive locks and running!), do + * not start VM, instead wait for mgmt to decide the next step. + * + * If dest already started, it means dest QEMU should contain + * all the data it needs and it properly owns all the drive + * locks. Then even if src QEMU got a FAILED in migration, it + * normally should mean we should treat the migration as + * COMPLETED. + * + * NOTE: it's not safe anymore to start VM on src now even if + * dest would release the drive locks. It's because as long as + * dest started running then only dest QEMU's RAM is consistent + * with the shared storage. + */ + error_free(local_err); + break; + } if (runstate_is_live(s->vm_old_state)) { if (!runstate_check(RUN_STATE_SHUTDOWN)) { vm_start(); --=20 2.50.1