From: Michael Roth
To: libvir-list@redhat.com
Date: Wed, 28 Jun 2017 19:24:59 -0500
Message-Id: <1498695900-1648-5-git-send-email-mdroth@linux.vnet.ibm.com>
In-Reply-To: <1498695900-1648-1-git-send-email-mdroth@linux.vnet.ibm.com>
References: <1498695900-1648-1-git-send-email-mdroth@linux.vnet.ibm.com>
Cc: pkrempa@redhat.com, aik@ozlabs.ru, mprivozn@redhat.com, qemu-devel@nongnu.org, abologna@redhat.com, qemu-ppc@nongnu.org, laine@laine.org
Subject: [libvirt] [RFC PATCH 4/5] qemu: hotplug: unbind VFIO devices as a group

Currently we bind a managed hostdev back to the host driver (or
"unbind" from the perspective of the stub driver) immediately upon
receiving a DEVICE_DELETED event from QEMU. In cases where we have
more than one device from the group attached to a guest, this runs
the risk of putting the group in a "non-viable" state where both a
guest and the host are using devices from the group simultaneously.

This patch addresses this by deferring the unbind step until all
hostdevs from a group have been detached from the guest. In the
meantime, they are left on the drvManager's inactiveList, in the same
state they would be in if they were unmanaged devices that had been
bound to VFIO via nodedev-detach but not yet plugged into a guest.

Signed-off-by: Michael Roth
---
 src/libvirt_private.syms |  3 ++
 src/qemu/qemu_hostdev.c  | 16 +++++++++
 src/qemu/qemu_hostdev.h  |  4 +++
 src/qemu/qemu_hotplug.c  | 16 ++++++++-
 src/util/virhostdev.c    | 90 ++++++++++++++++++++++++++++++++++++++++++++++++
 src/util/virhostdev.h    |  8 +++++
 src/util/virpci.c        | 27 +++++++++++++++
 src/util/virpci.h        |  1 +
 8 files changed, 164 insertions(+), 1 deletion(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 2bd3581..ba7fa39 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1783,6 +1783,8 @@ virHostCPUStatsAssign;
 virHostdevFindUSBDevice;
 virHostdevIsSCSIDevice;
 virHostdevManagerGetDefault;
+virHostdevPCIDeviceGroupUnbind;
+virHostdevPCIDeviceGroupUnbindable;
 virHostdevPCINodeDeviceDetach;
 virHostdevPCINodeDeviceReAttach;
 virHostdevPCINodeDeviceReset;
@@ -2342,6 +2344,7 @@ virPCIDeviceWaitForCleanup;
 virPCIEDeviceInfoFree;
 virPCIGetDeviceAddressFromSysfsLink;
 virPCIGetHeaderType;
+virPCIGetIOMMUGroupList;
 virPCIGetNetName;
 virPCIGetPhysicalFunction;
 virPCIGetVirtualFunctionIndex;
diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c
index 73d26f4..fdc52fe 100644
--- a/src/qemu/qemu_hostdev.c
+++ b/src/qemu/qemu_hostdev.c
@@ -384,6 +384,22 @@ qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver,
 }
 
 void
+qemuHostdevReleasePCIDevices(virQEMUDriverPtr driver,
+                             const char *name,
+                             virDomainHostdevDefPtr *hostdevs,
+                             int nhostdevs)
+{
+    virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
+    const char *oldStateDir = cfg->stateDir;
+    virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr;
+
+    virHostdevReleasePCIDevices(hostdev_mgr, QEMU_DRIVER_NAME, name,
+                                hostdevs, nhostdevs, oldStateDir);
+
+    virObjectUnref(cfg);
+}
+
+void
 qemuHostdevReAttachPCIDevices(virQEMUDriverPtr driver,
                               const char *name,
                               virDomainHostdevDefPtr *hostdevs,
diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h
index 9a7c7f1..b010085 100644
--- a/src/qemu/qemu_hostdev.h
+++ b/src/qemu/qemu_hostdev.h
@@ -74,6 +74,10 @@ void qemuHostdevReAttachPCIDevices(virQEMUDriverPtr driver,
                                    const char *name,
                                    virDomainHostdevDefPtr *hostdevs,
                                    int nhostdevs);
+void qemuHostdevReleasePCIDevices(virQEMUDriverPtr driver,
+                                  const char *name,
+                                  virDomainHostdevDefPtr *hostdevs,
+                                  int nhostdevs);
 void qemuHostdevReAttachUSBDevices(virQEMUDriverPtr driver,
                                    const char *name,
                                    virDomainHostdevDefPtr *hostdevs,
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index b557e82..af5ee6f 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -3896,7 +3896,10 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver,
 
     switch ((virDomainHostdevSubsysType) hostdev->source.subsys.type) {
     case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI:
-        qemuHostdevReAttachPCIDevices(driver, vm->def->name, &hostdev, 1);
+        if (is_vfio)
+            qemuHostdevReleasePCIDevices(driver, vm->def->name, &hostdev, 1);
+        else
+            qemuHostdevReAttachPCIDevices(driver, vm->def->name, &hostdev, 1);
         qemuDomainReleaseDeviceAddress(vm, hostdev->info, NULL);
         /* QEMU might no longer need to lock as much memory, eg. we just
          * detached the last VFIO device, so adjust the limit here */
@@ -3925,6 +3928,17 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver,
         virDomainNetDefFree(net);
     }
 
+    if (is_vfio) {
+        int iommu_group =
+            virPCIDeviceAddressGetIOMMUGroupNum(&hostdev->source.subsys.u.pci.addr);
+        if (virHostdevPCIDeviceGroupUnbindable(driver->hostdevMgr,
+                                               iommu_group)) {
+            virHostdevPCIDeviceGroupUnbind(driver->hostdevMgr,
+                                           iommu_group);
+        }
+    }
+
+
     ret = 0;
 
  cleanup:
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c
index 2cd3f34..a7f04fe 100644
--- a/src/util/virhostdev.c
+++ b/src/util/virhostdev.c
@@ -905,6 +905,96 @@ virHostdevPreparePCIDevices(virHostdevManagerPtr mgr,
     return ret;
 }
 
+static bool
+virHostdevPCIDeviceUnbindableInternal(virHostdevManagerPtr mgr,
+                                      int iommu_group)
+{
+    struct virHostdevIsPCINodeDeviceUsedData data = { mgr, NULL, true };
+
+    if (virPCIIOMMUGroupIterate(iommu_group,
+                                virHostdevIsPCINodeDeviceUsed,
+                                &data) < 0) {
+        VIR_DEBUG("IOMMU group %d is not unbindable", iommu_group);
+        return false;
+    }
+
+    VIR_DEBUG("IOMMU group %d is unbindable", iommu_group);
+    return true;
+}
+
+/*
+ * Check if devices within IOMMU group are in use by any domains
+ */
+bool
+virHostdevPCIDeviceGroupUnbindable(virHostdevManagerPtr mgr,
+                                   int iommu_group)
+{
+    bool result;
+
+    virObjectLock(mgr->activePCIHostdevs);
+    result = virHostdevPCIDeviceUnbindableInternal(mgr, iommu_group);
+    virObjectUnlock(mgr->activePCIHostdevs);
+
+    return result;
+}
+
+/*
+ * Confirm all devices in IOMMU group are in inactiveList
+ * before attempting to reattach to host driver. Devices in IOMMU
+ * group that aren't in either activeList or inactiveList are considered
+ * outside our control, so we treat them as inactive as well.
+ *
+ * Callers can check virHostdevPCIDeviceGroupUnbindable() beforehand
+ * for some indication that the group is ready for reattach to the
+ * host, but since it's possible for a hostdev from the group to get
+ * re-attached to a guest prior to subsequently calling this function
+ * there is no guarantee of this, which should be fine since it would
+ * only be immediately rebound to the stub driver anyway.
+ */
+void
+virHostdevPCIDeviceGroupUnbind(virHostdevManagerPtr mgr,
+                               int iommu_group)
+{
+    virPCIDeviceListPtr pcidevs = NULL;
+    size_t i;
+
+    virObjectLock(mgr->activePCIHostdevs);
+    virObjectLock(mgr->inactivePCIHostdevs);
+
+    if (!virHostdevPCIDeviceUnbindableInternal(mgr, iommu_group)) {
+        VIR_DEBUG("IOMMU group %d still in use, deferring reattach "
+                  "of PCI devices to host", iommu_group);
+        goto cleanup;
+    }
+
+    pcidevs = virPCIGetIOMMUGroupList(iommu_group);
+    for (i = 0; i < virPCIDeviceListCount(pcidevs); i++) {
+        virPCIDevicePtr actual, pci = virPCIDeviceListGet(pcidevs, i);
+        virPCIDeviceAddressPtr devAddr = virPCIDeviceGetAddress(pci);
+
+        actual = virPCIDeviceListFindByIDs(mgr->inactivePCIHostdevs,
+                                           devAddr->domain,
+                                           devAddr->bus,
+                                           devAddr->slot,
+                                           devAddr->function);
+        if (actual) {
+            VIR_DEBUG("Reattaching PCI device %s", virPCIDeviceGetName(actual));
+            if (virPCIDeviceGetManaged(actual))
+                if (virPCIDeviceReattach(actual, mgr->activePCIHostdevs,
+                                         mgr->inactivePCIHostdevs) < 0) {
+                    VIR_ERROR(_("Failed to re-attach PCI device: %s"),
+                              virGetLastErrorMessage());
+                    virResetLastError();
+                }
+        }
+    }
+
+ cleanup:
+    virObjectUnref(pcidevs);
+    virObjectUnlock(mgr->activePCIHostdevs);
+    virObjectUnlock(mgr->inactivePCIHostdevs);
+}
+
 /*
  * Pre-condition: inactivePCIHostdevs & activePCIHostdevs
  * are locked
diff --git a/src/util/virhostdev.h b/src/util/virhostdev.h
index fbc7fbd..2ab8101 100644
--- a/src/util/virhostdev.h
+++ b/src/util/virhostdev.h
@@ -122,6 +122,14 @@ virHostdevReleasePCIDevices(virHostdevManagerPtr mgr,
                             const char *oldStateDir)
     ATTRIBUTE_NONNULL(1);
 void
+virHostdevPCIDeviceGroupUnbind(virHostdevManagerPtr mgr,
+                               int iommu_group)
+    ATTRIBUTE_NONNULL(1);
+bool
+virHostdevPCIDeviceGroupUnbindable(virHostdevManagerPtr mgr,
+                                   int iommu_group)
+    ATTRIBUTE_NONNULL(1);
+void
 virHostdevReAttachUSBDevices(virHostdevManagerPtr hostdev_mgr,
                              const char *drv_name,
                              const char *dom_name,
diff --git a/src/util/virpci.c b/src/util/virpci.c
index b842f44..a8e5190 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -2298,6 +2298,33 @@ virPCIDeviceGetIOMMUGroupList(virPCIDevicePtr dev)
 }
 
 
+/*
+ * virPCIGetIOMMUGroupList - return a virPCIDeviceList containing
+ * all of the devices in @iommu_group.
+ *
+ * Return the new list, or NULL on failure
+ */
+virPCIDeviceListPtr
+virPCIGetIOMMUGroupList(int iommu_group)
+{
+    virPCIDeviceListPtr groupList = virPCIDeviceListNew();
+
+    if (!groupList)
+        goto error;
+
+    if (virPCIIOMMUGroupIterate(iommu_group,
+                                virPCIDeviceGetIOMMUGroupAddOne,
+                                groupList) < 0)
+        goto error;
+
+    return groupList;
+
+ error:
+    virObjectUnref(groupList);
+    return NULL;
+}
+
+
 typedef struct {
     virPCIDeviceAddressPtr **iommuGroupDevices;
     size_t *nIommuGroupDevices;
diff --git a/src/util/virpci.h b/src/util/virpci.h
index 5ec1306..5bcacb2 100644
--- a/src/util/virpci.h
+++ b/src/util/virpci.h
@@ -180,6 +180,7 @@ int virPCIIOMMUGroupIterate(int iommu_group,
                             virPCIDeviceAddressActor actor,
                             void *opaque);
 virPCIDeviceListPtr virPCIDeviceGetIOMMUGroupList(virPCIDevicePtr dev);
+virPCIDeviceListPtr virPCIGetIOMMUGroupList(int iommu_group);
 int virPCIDeviceAddressGetIOMMUGroupAddresses(virPCIDeviceAddressPtr devAddr,
                                               virPCIDeviceAddressPtr **iommuGroupDevices,
                                               size_t *nIommuGroupDevices);
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
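
For anyone following the thread without the libvirt tree at hand, the deferral rule from the commit message can be modelled in a few lines of standalone C. This is only a rough sketch under stated assumptions: `struct dev`, `enum dev_state`, `group_unbindable()` and `group_unbind()` are hypothetical stand-ins invented for illustration, not libvirt API, and the locking of the active/inactive lists that the patch performs is left out. It shows the two-step shape of the patch: a cheap "is anything in this group still attached to a guest?" check, and an unbind step that repeats that check before handing managed, inactive devices back to the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the hostdev manager's bookkeeping (not libvirt API). */
enum dev_state {
    DEV_UNTRACKED, /* on neither list: outside our control, treated as inactive */
    DEV_INACTIVE,  /* bound to the stub driver, not plugged into a guest */
    DEV_ACTIVE,    /* currently attached to a guest */
    DEV_HOST       /* handed back to the host driver */
};

struct dev {
    int iommu_group;      /* IOMMU group number the device belongs to */
    bool managed;         /* only managed devices are rebound automatically */
    enum dev_state state;
};

/* A group may be unbound only if none of its devices is attached to a guest. */
static bool
group_unbindable(const struct dev *devs, size_t n, int group)
{
    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (devs[i].iommu_group == group && devs[i].state == DEV_ACTIVE)
            return false;
    }
    return true;
}

/*
 * Repeat the check (a device may have been attached again since the caller
 * looked), then rebind every managed, inactive device of the group.
 * Returns the number of devices handed back to the host.
 */
static size_t
group_unbind(struct dev *devs, size_t n, int group)
{
    size_t rebound = 0;

    if (!group_unbindable(devs, n, group))
        return 0; /* still in use: defer, leave devices inactive */

    for (size_t i = 0; i < n; i++) {
        if (devs[i].iommu_group != group)
            continue;
        if (devs[i].state == DEV_INACTIVE && devs[i].managed) {
            devs[i].state = DEV_HOST;
            rebound++;
        }
    }
    return rebound;
}
```

In this model, detaching one of two devices in a group leaves the detached device in DEV_INACTIVE; only when the second device is detached as well does group_unbind() move the managed ones to DEV_HOST, and unmanaged devices are never touched.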