From nobody Tue Apr 23 19:40:17 2024
From: Jiri Denemark
To: libvir-list@redhat.com
Subject: [libvirt PATCH 1/2] qemu: Always restore post-copy migration job on reconnect
Date: Fri, 18 Nov 2022 16:37:21 +0100
Message-Id: <153692a137780e47246524345f91822158b9720a.1668785779.git.jdenemar@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

We need the restored job even when the migration has already finished.
Although we stop the job just a few lines below, the functions called in
between require an existing migration job.

This fixes a crash on reconnect when post-copy migration finished while
the daemon was not running.

Signed-off-by: Jiri Denemark
Reviewed-by: Peter Krempa
---
 src/qemu/qemu_process.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 0769f30d74..e1c18dde90 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3635,11 +3635,14 @@ qemuProcessRecoverMigration(virQEMUDriver *driver,
 
     if (rc > 0) {
         job->phase = QEMU_MIGRATION_PHASE_POSTCOPY_FAILED;
+        /* Even though we restore the migration async job here, the APIs below
+         * use VIR_ASYNC_JOB_NONE because we're already in a MODIFY job started
+         * before we reconnected to the domain. */
+        qemuProcessRestoreMigrationJob(vm, job);
 
         if (migStatus == VIR_DOMAIN_JOB_STATUS_POSTCOPY) {
             VIR_DEBUG("Post-copy migration of domain %s still running, it will be handled as unattended",
                       vm->def->name);
-            qemuProcessRestoreMigrationJob(vm, job);
             return 0;
         }
 
@@ -3648,17 +3651,19 @@ qemuProcessRecoverMigration(virQEMUDriver *driver,
             qemuMigrationSrcPostcopyFailed(vm);
         else
             qemuMigrationDstPostcopyFailed(vm);
-
-        qemuProcessRestoreMigrationJob(vm, job);
         return 0;
     }
 
     VIR_DEBUG("Post-copy migration of domain %s already finished",
               vm->def->name);
-    if (job->asyncJob == VIR_ASYNC_JOB_MIGRATION_OUT)
+    if (job->asyncJob == VIR_ASYNC_JOB_MIGRATION_OUT) {
         qemuMigrationSrcComplete(driver, vm, VIR_ASYNC_JOB_NONE);
-    else
+        /* No need to stop the restored job as the domain has just been
+         * destroyed. */
+    } else {
         qemuMigrationDstComplete(driver, vm, true, VIR_ASYNC_JOB_NONE, job);
+        virDomainObjEndAsyncJob(vm);
+    }
     return 0;
 }
 
-- 
2.38.1

From nobody Tue Apr 23 19:40:17 2024
From: Jiri Denemark
To: libvir-list@redhat.com
Subject: [libvirt PATCH 2/2] qemu: Ignore failure in post-copy migration when QEMU says completed
Date: Fri, 18 Nov 2022 16:37:22 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

When post-copy migration is running in Finish phase we already did
everything needed and we're just waiting for all the memory to transfer
to the destination. The domain is already running there at this point.
Once all data is transferred (QEMU sends a MIGRATION completed event)
we're done.

So in this specific post-copy case the source does not need to care
about the result of the Finish call as long as QEMU says migration
completed. The Finish call to the destination daemon may fail for
reasons that do not affect QEMU, e.g., libvirt daemon was restarted
there or the libvirt connection broke.

Currently we just mark the post-copy migration as failed on the source
and keep the domain paused there. But when libvirt daemon is restarted
at this point, it will detect migration finished successfully and kill
the domain as migrated.
It makes sense to do this even without having to restart the daemon.

Closes: https://gitlab.com/libvirt/libvirt/-/issues/338
Signed-off-by: Jiri Denemark
Reviewed-by: Peter Krempa
---
 src/qemu/qemu_migration.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index bba4e1dbf3..bef06f4caf 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -3901,6 +3901,7 @@ qemuMigrationSrcConfirmPhase(virQEMUDriver *driver,
     g_autoptr(qemuMigrationCookie) mig = NULL;
     qemuDomainObjPrivate *priv = vm->privateData;
     qemuDomainJobPrivate *jobPriv = vm->job->privateData;
+    qemuDomainJobDataPrivate *currentData = vm->job->current->privateData;
     virDomainJobData *jobData = NULL;
     qemuMigrationJobPhase phase;
 
@@ -3911,6 +3912,13 @@ qemuMigrationSrcConfirmPhase(virQEMUDriver *driver,
 
     virCheckFlags(QEMU_MIGRATION_FLAGS, -1);
 
+    if (retcode != 0 &&
+        virDomainObjIsPostcopy(vm, VIR_DOMAIN_JOB_OPERATION_MIGRATION_OUT) &&
+        currentData->stats.mig.status == QEMU_MONITOR_MIGRATION_STATUS_COMPLETED) {
+        VIR_DEBUG("Finish phase failed, but QEMU reports post-copy migration is completed; forcing success");
+        retcode = 0;
+    }
+
     if (flags & VIR_MIGRATE_POSTCOPY_RESUME) {
         phase = QEMU_MIGRATION_PHASE_CONFIRM_RESUME;
     } else if (virDomainObjIsFailedPostcopy(vm)) {
-- 
2.38.1
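
As a reading aid for patch 2/2, the check added to
qemuMigrationSrcConfirmPhase() can be modelled in isolation. This is a
minimal sketch, not libvirt code: the enum mig_status, the function
confirm_retcode() and the is_postcopy parameter are hypothetical
stand-ins for the status reported by QEMU, the adjusted retcode, and the
result of virDomainObjIsPostcopy().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the migration status reported by QEMU. */
enum mig_status {
    MIG_STATUS_ACTIVE,
    MIG_STATUS_POSTCOPY,
    MIG_STATUS_COMPLETED,
    MIG_STATUS_ERROR,
};

/*
 * Model of the new check in the Confirm phase: a failed Finish call
 * (retcode != 0) is treated as success only when the outgoing migration
 * is in post-copy mode and QEMU itself reports it as completed.
 * In every other case the original retcode is returned unchanged.
 */
int
confirm_retcode(int retcode, bool is_postcopy, enum mig_status status)
{
    if (retcode != 0 && is_postcopy && status == MIG_STATUS_COMPLETED)
        return 0;
    return retcode;
}

/* A few illustrative calls; each assertion holds for the model above. */
void
confirm_retcode_examples(void)
{
    /* Finish failed, but QEMU completed the post-copy migration. */
    assert(confirm_retcode(-1, true, MIG_STATUS_COMPLETED) == 0);
    /* Finish failed while memory is still being transferred. */
    assert(confirm_retcode(-1, true, MIG_STATUS_POSTCOPY) == -1);
    /* Finish failed in a migration that never entered post-copy. */
    assert(confirm_retcode(-1, false, MIG_STATUS_COMPLETED) == -1);
    /* Finish succeeded: nothing to override. */
    assert(confirm_retcode(0, true, MIG_STATUS_COMPLETED) == 0);
}
```

All three conditions must hold for the override, so a Finish failure
during pre-copy migration, or while QEMU has not yet reported
completion, is still handled as a failure.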