From: Jiri Denemark
To: libvir-list@redhat.com
Subject: [libvirt PATCH 2/3] qemu: Do not crash when canceling migration on reconnect
Date: Fri, 14 Oct 2022 14:58:07 +0200

When libvirtd is restarted during an active outgoing migration (or
snapshot, save, or dump, which are internally implemented as migration),
it wants to cancel the migration. But due to a mistake in commit
v8.7.0-57-g2d7b22b561, qemuMigrationSrcCancel is called with
wait == true, which leads to an instant crash from dereferencing a NULL
pointer stored in priv->job.current.

When canceling migration to a file (snapshot, save, dump), we don't need
to wait until it is really canceled, as no migration capabilities or
parameters need to be restored. On the other hand, we do need to wait
when canceling an outgoing migration, and since we don't have
virDomainJobData at this point, we have to temporarily restore the
migration job to make sure we can process MIGRATION events from QEMU.

Signed-off-by: Jiri Denemark
---
Two standalone sketches illustrating the crash path and the temporary
job restore are appended after the diff.

 src/qemu/qemu_migration.c | 28 +++++++++++++++++++++++-----
 src/qemu/qemu_migration.h |  3 ++-
 src/qemu/qemu_process.c   |  4 ++--
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 21c870334d..76e486fbc7 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -4633,8 +4633,7 @@ qemuMigrationSrcIsCanceled(virDomainObj *vm)
  * cancellation to complete.
  *
  * The thread (the caller itself in most cases) which is watching the migration
- * will do all the cleanup once migration is canceled. If no thread is watching
- * the migration, use qemuMigrationSrcCancelUnattended instead.
+ * will do all the cleanup once migration is canceled.
  */
 int
 qemuMigrationSrcCancel(virDomainObj *vm,
@@ -6979,11 +6978,12 @@ qemuMigrationSrcToFile(virQEMUDriver *driver, virDomainObj *vm,
 
 
 /**
- * This function is supposed to be used only when no other thread is watching
- * the migration.
+ * This function is supposed to be used only while reconnecting to a domain
+ * with an active migration job.
  */
 int
-qemuMigrationSrcCancelUnattended(virDomainObj *vm)
+qemuMigrationSrcCancelUnattended(virDomainObj *vm,
+                                 virDomainJobObj *oldJob)
 {
     bool storage = false;
     size_t i;
@@ -6991,8 +6991,26 @@ qemuMigrationSrcCancelUnattended(virDomainObj *vm)
     VIR_DEBUG("Canceling unfinished outgoing migration of domain %s",
               vm->def->name);
 
+    /* Make sure MIGRATION event handler can store the current migration state
+     * in the job.
+     */
+    if (!vm->job->current) {
+        qemuDomainObjRestoreAsyncJob(vm, VIR_ASYNC_JOB_MIGRATION_OUT,
+                                     oldJob->phase, oldJob->asyncStarted,
+                                     VIR_DOMAIN_JOB_OPERATION_MIGRATION_OUT,
+                                     QEMU_DOMAIN_JOB_STATS_TYPE_MIGRATION,
+                                     VIR_DOMAIN_JOB_STATUS_FAILED,
+                                     VIR_JOB_NONE);
+    }
+
+    /* We're inside a MODIFY job and the restored MIGRATION_OUT async job is
+     * used only for processing migration events from QEMU. Thus we don't want
+     * to start a nested job for talking to QEMU.
+     */
     qemuMigrationSrcCancel(vm, VIR_ASYNC_JOB_NONE, true);
 
+    virDomainObjEndAsyncJob(vm);
+
     for (i = 0; i < vm->def->ndisks; i++) {
         virDomainDiskDef *disk = vm->def->disks[i];
         qemuDomainDiskPrivate *diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
diff --git a/src/qemu/qemu_migration.h b/src/qemu/qemu_migration.h
index fbea45ad4e..3d7c2702aa 100644
--- a/src/qemu/qemu_migration.h
+++ b/src/qemu/qemu_migration.h
@@ -241,7 +241,8 @@ qemuMigrationSrcToFile(virQEMUDriver *driver,
     ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT;
 
 int
-qemuMigrationSrcCancelUnattended(virDomainObj *vm);
+qemuMigrationSrcCancelUnattended(virDomainObj *vm,
+                                 virDomainJobObj *oldJob);
 
 int
 qemuMigrationSrcCancel(virDomainObj *vm,
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e6fd9395de..a8101e1233 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3532,7 +3532,7 @@ qemuProcessRecoverMigrationOut(virQEMUDriver *driver,
      */
     VIR_DEBUG("Cancelling unfinished migration of domain %s",
               vm->def->name);
-    if (qemuMigrationSrcCancelUnattended(vm) < 0) {
+    if (qemuMigrationSrcCancelUnattended(vm, job) < 0) {
         VIR_WARN("Could not cancel ongoing migration of domain %s",
                  vm->def->name);
     }
@@ -3691,7 +3691,7 @@ qemuProcessRecoverJob(virQEMUDriver *driver,
     case VIR_ASYNC_JOB_SAVE:
     case VIR_ASYNC_JOB_DUMP:
     case VIR_ASYNC_JOB_SNAPSHOT:
-        qemuMigrationSrcCancel(vm, VIR_ASYNC_JOB_NONE, true);
+        qemuMigrationSrcCancel(vm, VIR_ASYNC_JOB_NONE, false);
         /* resume the domain but only if it was paused as a result of
          * running a migration-to-file operation. Although we are
          * recovering an async job, this function is run at startup
-- 
2.38.0
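
To see why the wait == true path is fatal after a daemon restart, the
following standalone C sketch mimics the shape of the code involved.
The types and function names (JobData, JobObj, is_canceled, cancel) are
simplified stand-ins, not libvirt code: the point is only that the
waiting path dereferences the per-job data that the old daemon's
migration thread allocated, and which no longer exists after reconnect.

#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for virDomainJobData / the domain job object. */
typedef struct { int status; } JobData;
typedef struct { JobData *current; } JobObj;

/* Mimics qemuMigrationSrcIsCanceled: it consults the current job data,
 * implicitly assuming the thread that allocated it is still around. */
static bool is_canceled(JobObj *job)
{
    return job->current->status != 0;   /* NULL deref if current is unset */
}

/* Mimics the wait parameter of qemuMigrationSrcCancel. */
static void cancel(JobObj *job, bool wait)
{
    /* ... ask QEMU to cancel the migration ... */
    if (!wait)
        return;
    while (!is_canceled(job)) {
        /* poll until QEMU reports the migration as canceled */
    }
}

int main(void)
{
    /* After libvirtd reconnects, current is NULL: the thread that was
     * watching the migration died with the old daemon. */
    JobObj job = { .current = NULL };

    cancel(&job, false);    /* safe: fire and forget, as the fix now does
                             * for migration-to-file jobs */
    /* cancel(&job, true) would crash here, which is the reported bug */
    printf("migration cancel requested without waiting\n");
    return 0;
}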
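
For the outgoing-migration path, where waiting is required, the patch
brackets the blocking cancel between restoring a temporary async job and
ending it again. This second sketch shows just that bracketing pattern;
restore_async_job and end_async_job are hypothetical placeholders
standing in for qemuDomainObjRestoreAsyncJob and virDomainObjEndAsyncJob,
not the real libvirt functions.

#include <stdbool.h>
#include <stdio.h>

typedef struct { bool active; } AsyncJob;

/* Placeholder: recreate just enough job state so that event handlers
 * (the MIGRATION event from QEMU) have somewhere to store progress. */
static void restore_async_job(AsyncJob *job)
{
    job->active = true;
}

/* Placeholder: drop the temporary job once the wait is over. */
static void end_async_job(AsyncJob *job)
{
    job->active = false;
}

static void cancel_and_wait(AsyncJob *job)
{
    printf("canceling; events arrive while the job is %s\n",
           job->active ? "active" : "inactive");
}

int main(void)
{
    AsyncJob job = { .active = false };

    /* Restore only if no job state survived the restart, wait for QEMU
     * to confirm the cancellation, then end the temporary job. */
    if (!job.active)
        restore_async_job(&job);
    cancel_and_wait(&job);
    end_async_job(&job);
    return 0;
}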