To: devel@lists.libvirt.org
Subject: [PATCH v2] qemu: fix potential hang in qemuMigrationSrcCancelUnattended during reconnect
Date: Fri, 20 Mar 2026 18:34:02 +0100
Message-ID: <20260320173402.836587-1-den@openvz.org>
X-Mailer: git-send-email 2.51.0
CC: den@openvz.org, Jiri Denemark, Peter Krempa, Michal Privoznik, Efim Shevrin
List-Id: Development discussions about the libvirt library & tools
From: "Denis V. Lunev via Devel"
Reply-To: "Denis V. Lunev"

When libvirtd reconnects to a running QEMU process that had an
in-progress migration, qemuProcessReconnect first connects the monitor
and only later recovers the migration job. During this window the async
job is VIR_ASYNC_JOB_NONE, so any MIGRATION status events from QEMU are
silently dropped by qemuProcessHandleMigrationStatus.

If the migration was already cancelled or completed by QEMU during this
window, no further events will be emitted. When
qemuMigrationSrcCancelUnattended later restores the async job and calls
qemuMigrationSrcCancel with wait=true, the wait loop calls
qemuDomainObjWait (virCondWait with no timeout) and blocks forever
waiting for an event that will never arrive.

Fix this by re-querying QEMU migration state with
qemuMigrationAnyRefreshStatus after restoring the async job but before
calling qemuMigrationSrcCancel. If QEMU has already reached a terminal
state, the cancel is skipped.

Signed-off-by: Denis V. Lunev
CC: Jiri Denemark
CC: Peter Krempa
CC: Michal Privoznik
CC: Efim Shevrin
---
v1 -> v2: Instead of querying QEMU with query-migrate inside
qemuMigrationSrcCancel, use qemuMigrationAnyRefreshStatus in
qemuMigrationSrcCancelUnattended after restoring the async job to
re-check migration state before the actual cancel.

 src/qemu/qemu_migration.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index fec808ccfb..a4bd7efa09 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -7330,6 +7330,7 @@ int
 qemuMigrationSrcCancelUnattended(virDomainObj *vm,
                                  virDomainJobObj *oldJob)
 {
+    virDomainJobStatus migStatus = VIR_DOMAIN_JOB_STATUS_NONE;
     bool storage = false;
     size_t i;
 
@@ -7348,11 +7349,20 @@ qemuMigrationSrcCancelUnattended(virDomainObj *vm,
                                      VIR_JOB_NONE);
     }
 
-    /* We're inside a MODIFY job and the restored MIGRATION_OUT async job is
-     * used only for processing migration events from QEMU. Thus we don't want
-     * to start a nested job for talking to QEMU.
+    /* Query the actual migration state from QEMU. The state passed to
+     * qemuProcessRecoverMigrationOut may be stale: QEMU could have
+     * reached a terminal state between that initial query and the async
+     * job restore above, with the corresponding event silently dropped.
      */
-    qemuMigrationSrcCancel(vm, VIR_ASYNC_JOB_NONE, true);
+    qemuMigrationAnyRefreshStatus(vm, VIR_ASYNC_JOB_NONE, &migStatus);
+
+    if (migStatus != VIR_DOMAIN_JOB_STATUS_CANCELED) {
+        /* We're inside a MODIFY job and the restored MIGRATION_OUT async
+         * job is used only for processing migration events from QEMU.
+         * Thus we don't want to start a nested job for talking to QEMU.
+         */
+        qemuMigrationSrcCancel(vm, VIR_ASYNC_JOB_NONE, true);
+    }
 
     virDomainObjEndAsyncJob(vm);
 
-- 
2.51.0
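
Illustration (not part of the patch): the race described in the commit
message is a lost wakeup, and it can be sketched outside libvirt with a
plain mutex and condition variable. Every name below (struct job,
deliver_event, cancel_unattended, the MIG_* states) is a hypothetical
stand-in and not a libvirt or QEMU API. The sketch follows the commit
message wording and treats any non-active state as terminal, which is an
assumption about intent rather than a copy of the condition in the diff.

```c
#include <pthread.h>
#include <stdbool.h>

/* All names here are hypothetical stand-ins, not libvirt or QEMU APIs. */
enum mig_state { MIG_ACTIVE, MIG_CANCELLED, MIG_COMPLETED };

struct job {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool tracking;          /* false while no async job is restored */
    enum mig_state cached;  /* last state delivered through an event */
    enum mig_state remote;  /* state held by the simulated hypervisor */
};

static bool is_terminal(enum mig_state s)
{
    return s != MIG_ACTIVE;
}

/* Event delivery; the caller must hold j->lock (or be single-threaded).
 * Returns false when the event is dropped because nothing is tracking
 * the migration yet, which models the window during reconnect. */
static bool deliver_event(struct job *j, enum mig_state s)
{
    j->remote = s;
    if (!j->tracking)
        return false;
    j->cached = s;
    pthread_cond_broadcast(&j->cond);
    return true;
}

/* Returns true if a cancel request was issued, false if the refreshed
 * state was already terminal and the cancel (and its wait) was skipped. */
static bool cancel_unattended(struct job *j)
{
    bool issued = false;

    pthread_mutex_lock(&j->lock);
    j->tracking = true;      /* restore the job so events are stored */
    j->cached = j->remote;   /* refresh: query instead of trusting events */

    if (!is_terminal(j->cached)) {
        issued = true;
        /* Simulated cancel: the state change and its event arrive at once. */
        deliver_event(j, MIG_CANCELLED);
        /* Without the refresh above, a stale MIG_ACTIVE with no pending
         * event would leave this loop waiting forever. */
        while (!is_terminal(j->cached))
            pthread_cond_wait(&j->cond, &j->lock);
    }
    pthread_mutex_unlock(&j->lock);
    return issued;
}
```

To exercise it, initialise a struct job with PTHREAD_MUTEX_INITIALIZER and
PTHREAD_COND_INITIALIZER, call deliver_event() while tracking is false to
model the dropped event, then call cancel_unattended(): it returns false
without waiting because the refreshed state is already terminal.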