From: Michal Privoznik
To: libvir-list@redhat.com
Date: Tue, 12 Mar 2019 16:13:20 +0100
Message-Id:
 <37940af3f160aba59fa6efed2b47d85ed333c37f.1552403532.git.mprivozn@redhat.com>
Subject: [libvirt] [PATCH 5/5] qemu_hotplug: Fix a rare race condition when detaching a device twice

https://bugzilla.redhat.com/show_bug.cgi?id=1623389

If a device is detached twice from the same domain, the following
race condition may happen:

1) The first DetachDevice() call issues "device_del" on the qemu
   monitor, but since the DEVICE_DELETED event does not arrive in
   time, the API ends claiming "Device detach request sent
   successfully".

2) The second DetachDevice() therefore still finds the device in
   the domain and thus proceeds to detach it again. It calls
   EnterMonitor() and qemuMonitorSend(), trying to issue the
   "device_del" command again. This releases both the domain
   lock and the monitor lock.

3) At this point, qemu sends us the DEVICE_DELETED event, which
   is handled by the event loop. The event loop ends up calling
   qemuDomainSignalDeviceRemoval() to determine who is going to
   remove the device from the domain definition: the caller that
   marked the device for removal, or the event processing
   thread.

4) Because the device was marked for removal,
   qemuDomainSignalDeviceRemoval() returns true, which means the
   event is to be processed by the thread that marked the device
   for removal (and is currently still trying to issue the
   "device_del" command).

5) The thread finally issues the "device_del" command, which
   fails (obviously), and therefore it calls
   qemuDomainResetDeviceRemoval() to reset the device marking
   and quits immediately after, NOT removing any device from
   the domain definition.

At this point, the device is still present in the domain
definition but doesn't exist in qemu anymore. Worse, there is no
way to remove it from the domain definition.

The solution is to note down that we've seen the event and, if
the second "device_del" fails, not take it as a failure but carry
on with the usual execution.

Signed-off-by: Michal Privoznik
---
 src/qemu/qemu_domain.h  |  1 +
 src/qemu/qemu_hotplug.c | 83 +++++++++++++++++++++++++++++------------
 2 files changed, 60 insertions(+), 24 deletions(-)

diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 9f468e5661..fb361515ba 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -218,6 +218,7 @@ typedef qemuDomainUnpluggingDevice *qemuDomainUnpluggingDevicePtr;
 struct _qemuDomainUnpluggingDevice {
     const char *alias;
     qemuDomainUnpluggingDeviceStatus status;
+    bool eventSeen; /* True if DEVICE_DELETED event arrived. */
 };
 
 
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index 574477e916..93c0e14adf 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -70,22 +70,47 @@ unsigned long long qemuDomainRemoveDeviceWaitTime = 1000ull * 5;
 /**
  * qemuDomainDeleteDevice:
  * @mon: qemu monitor
+ * @vm: domain object
  * @alias: device to remove
  *
  * A simple wrapper around qemuMonitorDelDevice().
- * @mon must be locked upon entry.
+ * @mon must be locked upon entry, @vm shan't.
  *
  * Returns: 0 on success,
  *          -1 otherwise.
  */
 static inline int
 qemuDomainDeleteDevice(qemuMonitorPtr mon,
+                       virDomainObjPtr vm,
                        const char *alias)
 {
-    if (qemuMonitorDelDevice(mon, alias) < 0)
-        return -1;
+    qemuDomainObjPrivatePtr priv;
+    int ret = 0;
 
-    return 0;
+    if (qemuMonitorDelDevice(mon, alias) < 0) {
+        if (vm) {
+            /* It is safe to lock and unlock both @mon and @vm
+             * here because:
+             * a) qemuDomainObjEnterMonitor() ensures @mon is
+             *    ref()'d
+             * b) The API that is calling us ensures that @vm is
+             *    ref()'d
+             */
+            virObjectUnlock(mon);
+            virObjectLock(vm);
+            priv = vm->privateData;
+            if (priv->unplug.eventSeen)
+                virResetLastError();
+            else
+                ret = -1;
+            virObjectLock(mon);
+            virObjectUnlock(vm);
+        } else {
+            ret = -1;
+        }
+    }
+
+    return ret;
 }
 
 
@@ -189,7 +214,11 @@ qemuDomainDetachZPCIDevice(qemuMonitorPtr mon,
     if (virAsprintf(&zpciAlias, "zpci%d", info->addr.pci.zpci.uid) < 0)
         goto cleanup;
 
-    if (qemuDomainDeleteDevice(mon, zpciAlias) < 0)
+    /* zPCI devices are not exposed in domain XML yet. Therefore,
+     * they are treated as collateral devices which can't be
+     * unplugged directly at user's will. Hence, it's safe to
+     * pass NULL here. */
+    if (qemuDomainDeleteDevice(mon, NULL, zpciAlias) < 0)
         goto cleanup;
 
     ret = 0;
@@ -5165,6 +5194,7 @@ qemuDomainResetDeviceRemoval(virDomainObjPtr vm)
 {
     qemuDomainObjPrivatePtr priv = vm->privateData;
     priv->unplug.alias = NULL;
+    priv->unplug.eventSeen = false;
 }
 
 /* Returns:
@@ -5187,7 +5217,8 @@ qemuDomainWaitForDeviceRemoval(virDomainObjPtr vm)
         return 1;
     until += qemuDomainRemoveDeviceWaitTime;
 
-    while (priv->unplug.alias) {
+    while (priv->unplug.alias &&
+           !priv->unplug.eventSeen) {
         if ((rc = virDomainObjWaitUntil(vm, until)) == 1)
             return 0;
 
@@ -5204,6 +5235,9 @@ qemuDomainWaitForDeviceRemoval(virDomainObjPtr vm)
         return -1;
     }
 
+    VIR_DEBUG("unplug.alias=%s unplug.eventSeen=%d",
+              NULLSTR(priv->unplug.alias), priv->unplug.eventSeen);
+
     return 1;
 }
 
@@ -5224,6 +5258,7 @@ qemuDomainSignalDeviceRemoval(virDomainObjPtr vm,
         VIR_DEBUG("Removal of device '%s' continues in waiting thread", devAlias);
         qemuDomainResetDeviceRemoval(vm);
         priv->unplug.status = status;
+        priv->unplug.eventSeen = true;
         virDomainObjBroadcast(vm);
         return true;
     }
@@ -5251,7 +5286,7 @@ qemuDomainDetachVirtioDiskDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, &detach->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    if (qemuDomainDeleteDevice(priv->mon, detach->info.alias) < 0) {
+    if (qemuDomainDeleteDevice(priv->mon, vm, detach->info.alias) < 0) {
         if (qemuDomainObjExitMonitor(driver, vm) < 0)
             goto cleanup;
         virDomainAuditDisk(vm, detach->src, NULL, "detach", false);
@@ -5289,7 +5324,7 @@ qemuDomainDetachDiskDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, &detach->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    if (qemuDomainDeleteDevice(priv->mon, detach->info.alias) < 0) {
+    if (qemuDomainDeleteDevice(priv->mon, vm, detach->info.alias) < 0) {
         if (qemuDomainObjExitMonitor(driver, vm) < 0)
             goto cleanup;
         virDomainAuditDisk(vm, detach->src, NULL, "detach", false);
@@ -5483,7 +5518,7 @@ int qemuDomainDetachControllerDevice(virQEMUDriverPtr driver,
         goto exit_monitor;
     }
 
-    if (qemuDomainDeleteDevice(priv->mon, detach->info.alias) < 0) {
+    if (qemuDomainDeleteDevice(priv->mon, vm, detach->info.alias) < 0) {
         ignore_value(qemuDomainObjExitMonitor(driver, vm));
         goto cleanup;
     }
@@ -5527,7 +5562,7 @@ qemuDomainDetachHostPCIDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, detach->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    ret = qemuDomainDeleteDevice(priv->mon, detach->info->alias);
+    ret = qemuDomainDeleteDevice(priv->mon, vm, detach->info->alias);
     if (qemuDomainObjExitMonitor(driver, vm) < 0)
         ret = -1;
 
@@ -5553,7 +5588,7 @@ qemuDomainDetachHostUSBDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, detach->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    ret = qemuDomainDeleteDevice(priv->mon, detach->info->alias);
+    ret = qemuDomainDeleteDevice(priv->mon, vm, detach->info->alias);
     if (qemuDomainObjExitMonitor(driver, vm) < 0)
         ret = -1;
 
@@ -5579,7 +5614,7 @@ qemuDomainDetachHostSCSIDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, detach->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    ret = qemuDomainDeleteDevice(priv->mon, detach->info->alias);
+    ret = qemuDomainDeleteDevice(priv->mon, vm, detach->info->alias);
 
     if (qemuDomainObjExitMonitor(driver, vm) < 0)
         return -1;
@@ -5606,7 +5641,7 @@ qemuDomainDetachSCSIVHostDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, detach->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    ret = qemuDomainDeleteDevice(priv->mon, detach->info->alias);
+    ret = qemuDomainDeleteDevice(priv->mon, vm, detach->info->alias);
 
     if (qemuDomainObjExitMonitor(driver, vm) < 0)
         return -1;
@@ -5634,7 +5669,7 @@ qemuDomainDetachMediatedDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, detach->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    ret = qemuDomainDeleteDevice(priv->mon, detach->info->alias);
+    ret = qemuDomainDeleteDevice(priv->mon, vm, detach->info->alias);
     if (qemuDomainObjExitMonitor(driver, vm) < 0)
         ret = -1;
 
@@ -5814,7 +5849,7 @@ qemuDomainDetachShmemDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, &shmem->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    if (qemuDomainDeleteDevice(priv->mon, shmem->info.alias) < 0) {
+    if (qemuDomainDeleteDevice(priv->mon, vm, shmem->info.alias) < 0) {
         ignore_value(qemuDomainObjExitMonitor(driver, vm));
         goto cleanup;
     }
@@ -5875,7 +5910,7 @@ qemuDomainDetachWatchdog(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, &watchdog->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    if (qemuDomainDeleteDevice(priv->mon, watchdog->info.alias) < 0) {
+    if (qemuDomainDeleteDevice(priv->mon, vm, watchdog->info.alias) < 0) {
         ignore_value(qemuDomainObjExitMonitor(driver, vm));
         goto cleanup;
     }
@@ -5925,7 +5960,7 @@ qemuDomainDetachRedirdevDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, &tmpRedirdevDef->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    if (qemuDomainDeleteDevice(priv->mon, tmpRedirdevDef->info.alias) < 0) {
+    if (qemuDomainDeleteDevice(priv->mon, vm, tmpRedirdevDef->info.alias) < 0) {
         ignore_value(qemuDomainObjExitMonitor(driver, vm));
         goto cleanup;
     }
@@ -5996,7 +6031,7 @@ qemuDomainDetachNetDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, &detach->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    if (qemuDomainDeleteDevice(priv->mon, detach->info.alias) < 0) {
+    if (qemuDomainDeleteDevice(priv->mon, vm, detach->info.alias) < 0) {
         if (qemuDomainObjExitMonitor(driver, vm) < 0)
             goto cleanup;
         virDomainAuditNet(vm, detach, NULL, "detach", false);
@@ -6173,7 +6208,7 @@ int qemuDomainDetachChrDevice(virQEMUDriverPtr driver,
             goto cleanup;
         }
     } else {
-        if (qemuDomainDeleteDevice(priv->mon, tmpChr->info.alias) < 0) {
+        if (qemuDomainDeleteDevice(priv->mon, vm, tmpChr->info.alias) < 0) {
             ignore_value(qemuDomainObjExitMonitor(driver, vm));
             goto cleanup;
         }
@@ -6229,7 +6264,7 @@ qemuDomainDetachRNGDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, &tmpRNG->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    rc = qemuDomainDeleteDevice(priv->mon, tmpRNG->info.alias);
+    rc = qemuDomainDeleteDevice(priv->mon, vm, tmpRNG->info.alias);
     if (qemuDomainObjExitMonitor(driver, vm) || rc < 0)
         goto cleanup;
 
@@ -6281,7 +6316,7 @@ qemuDomainDetachMemoryDevice(virQEMUDriverPtr driver,
     qemuDomainMarkDeviceForRemoval(vm, &mem->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    rc = qemuDomainDeleteDevice(priv->mon, mem->info.alias);
+    rc = qemuDomainDeleteDevice(priv->mon, vm, mem->info.alias);
     if (qemuDomainObjExitMonitor(driver, vm) < 0 || rc < 0)
         goto cleanup;
 
@@ -6389,7 +6424,7 @@ qemuDomainHotplugDelVcpu(virQEMUDriverPtr driver,
 
     qemuDomainObjEnterMonitor(driver, vm);
 
-    rc = qemuDomainDeleteDevice(qemuDomainGetMonitor(vm), vcpupriv->alias);
+    rc = qemuDomainDeleteDevice(qemuDomainGetMonitor(vm), vm, vcpupriv->alias);
 
     if (qemuDomainObjExitMonitor(driver, vm) < 0)
         goto cleanup;
@@ -6997,7 +7032,7 @@ qemuDomainDetachInputDevice(virDomainObjPtr vm,
     qemuDomainMarkDeviceForRemoval(vm, &input->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    if (qemuDomainDeleteDevice(priv->mon, input->info.alias) < 0) {
+    if (qemuDomainDeleteDevice(priv->mon, vm, input->info.alias) < 0) {
         ignore_value(qemuDomainObjExitMonitor(driver, vm));
         goto cleanup;
     }
@@ -7040,7 +7075,7 @@ qemuDomainDetachVsockDevice(virDomainObjPtr vm,
     qemuDomainMarkDeviceForRemoval(vm, &vsock->info);
 
     qemuDomainObjEnterMonitor(driver, vm);
-    if (qemuDomainDeleteDevice(priv->mon, vsock->info.alias) < 0) {
+    if (qemuDomainDeleteDevice(priv->mon, vm, vsock->info.alias) < 0) {
         ignore_value(qemuDomainObjExitMonitor(driver, vm));
         goto cleanup;
     }
-- 
2.19.2
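
The decision logic described in the commit message can be modelled in
isolation. What follows is a minimal, single-threaded sketch and not
libvirt code: struct unplug_state, signal_device_removal() and
delete_device() are hypothetical stand-ins for
qemuDomainUnpluggingDevice, qemuDomainSignalDeviceRemoval() and the
patched qemuDomainDeleteDevice(). All locking and the real monitor
call are omitted; monitor_rc is an assumed parameter standing in for
the result of "device_del".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for qemuDomainUnpluggingDevice. */
struct unplug_state {
    const char *alias;   /* device marked for removal, or NULL */
    bool event_seen;     /* true once DEVICE_DELETED was observed */
};

/* Models the event handler: if a thread has marked this alias for
 * removal, clear the mark, record that the event arrived, and report
 * that the marking thread owns the removal. */
static bool
signal_device_removal(struct unplug_state *u, const char *alias)
{
    if (u->alias && strcmp(u->alias, alias) == 0) {
        u->alias = NULL;
        u->event_seen = true;
        return true;
    }
    return false;
}

/* Models the patched wrapper: a failed "device_del" (monitor_rc < 0)
 * only counts as an error if the event has not been seen. */
static int
delete_device(const struct unplug_state *u, int monitor_rc)
{
    if (monitor_rc < 0 && !u->event_seen)
        return -1;
    return 0;
}
```

In this model, a failed "device_del" on a marked device is reported as
-1 while no event has been recorded. Once signal_device_removal() has
matched the alias, the same failure is treated as success, so the
caller can carry on and remove the device from the domain definition.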