From: Michael Roth
To: libvir-list@redhat.com
Cc: pkrempa@redhat.com, aik@ozlabs.ru, mprivozn@redhat.com,
 qemu-devel@nongnu.org, abologna@redhat.com, qemu-ppc@nongnu.org,
 laine@laine.org
Date: Wed, 28 Jun 2017 19:25:00 -0500
Subject: [libvirt] [RFC PATCH 5/5] qemu: hotplug: wait for VFIO group FD close before unbind
Message-Id: <1498695900-1648-6-git-send-email-mdroth@linux.vnet.ibm.com>
In-Reply-To: <1498695900-1648-1-git-send-email-mdroth@linux.vnet.ibm.com>
References: <1498695900-1648-1-git-send-email-mdroth@linux.vnet.ibm.com>

QEMU emits DEVICE_DELETED events during a device's "unparent" callback,
but some additional cleanup occurs afterward via "finalize". In most
cases libvirt can ignore the latter, but in the case of VFIO the
closing of a device's group FD happens here, which is something
libvirt needs to wait for before attempting to bind a hostdev back
to a host driver. In the case of powernv, and possibly other host
archs as well, failing to do this can lead to the host device driver
crashing due to necessary setup (like restoring default DMA windows
for the IOMMU group) not being completed yet.

We attempt to avoid this here by polling the QEMU process for open FDs
referencing /dev/vfio/ and waiting for a certain period of time. In
practice the delay between the DEVICE_DELETED event and closing of the
group FD seems to be around 6 seconds, so we set the max wait time at
15 seconds. If we time out we leave the device in the inactiveList and
bound to VFIO.
We only attempt the wait if the last hostdev from an IOMMU group is
being detached and there's a reasonable expectation that the group FD
will be closed soon.

There are alternatives to this approach, like adding a specific group
delete event to QEMU and handling this cleanup via an asynchronous
event handler, but since we do a similar poll-wait for things like KVM
device passthrough this simple approach is hopefully a reasonable
starting point at least.

Signed-off-by: Michael Roth
---
 src/libvirt_private.syms |  1 +
 src/qemu/qemu_hotplug.c  | 34 +++++++++++++++++++++++++++++--
 src/util/virfile.c       | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
 src/util/virfile.h       |  1 +
 4 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index ba7fa39..787267c 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1657,6 +1657,7 @@ virFileIsDir;
 virFileIsExecutable;
 virFileIsLink;
 virFileIsMountPoint;
+virFileIsOpenByPid;
 virFileIsSharedFS;
 virFileIsSharedFSType;
 virFileLength;
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index af5ee6f..d200bab 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -68,6 +68,8 @@ VIR_LOG_INIT("qemu.qemu_hotplug");
 /* Wait up to 5 seconds for device removal to finish. */
 unsigned long long qemuDomainRemoveDeviceWaitTime = 1000ull * 5;
 
+/* Wait up to 15 seconds for iommu group close */
+unsigned long long qemuDomainRemoveDeviceGroupWaitTime = 1000ull * 15;
 
 /**
  * qemuDomainPrepareDisk:
@@ -3830,6 +3832,32 @@ qemuDomainRemoveSCSIVHostDevice(virQEMUDriverPtr driver,
 }
 
 static int
+qemuDomainWaitForDeviceGroupClose(virDomainObjPtr vm, int iommu_group)
+{
+    char *group_path;
+    unsigned long long remaining_ms = qemuDomainRemoveDeviceGroupWaitTime;
+    int rc = -1;
+
+    if (virAsprintf(&group_path, "/dev/vfio/%d", iommu_group) < 0)
+        return -1;
+
+    while ((rc = virFileIsOpenByPid(group_path, vm->pid)) == 1) {
+        if (remaining_ms <= 0)
+            break;
+        usleep(100*1000);
+        remaining_ms -= 100;
+    }
+
+    VIR_DEBUG("IOMMU group %d FD status: %d, wait time: %llu ms",
+              iommu_group, rc,
+              qemuDomainRemoveDeviceGroupWaitTime - remaining_ms);
+
+    VIR_FREE(group_path);
+    return rc;
+}
+
+
+static int
 qemuDomainRemoveHostDevice(virQEMUDriverPtr driver,
                            virDomainObjPtr vm,
                            virDomainHostdevDefPtr hostdev)
@@ -3933,8 +3961,10 @@ qemuDomainRemoveHostDevice(virQEMUDriverPtr driver,
             virPCIDeviceAddressGetIOMMUGroupNum(&hostdev->source.subsys.u.pci.addr);
         if (virHostdevPCIDeviceGroupUnbindable(driver->hostdevMgr,
                                                iommu_group)) {
-            virHostdevPCIDeviceGroupUnbind(driver->hostdevMgr,
-                                           iommu_group);
+            if (qemuDomainWaitForDeviceGroupClose(vm, iommu_group) == 0) {
+                virHostdevPCIDeviceGroupUnbind(driver->hostdevMgr,
+                                               iommu_group);
+            }
         }
     }
 
diff --git a/src/util/virfile.c b/src/util/virfile.c
index d444b32..29b762f 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -4162,3 +4162,55 @@ virFileReadValueString(char **value, const char *format, ...)
     VIR_FREE(str);
     return ret;
 }
+
+int
+virFileIsOpenByPid(const char *path, pid_t pid)
+{
+    struct dirent *ent;
+    DIR *filelist_dir;
+    char *filelist_path;
+    bool found = false;
+    int rc = -1;
+
+    if (!path || !IS_ABSOLUTE_FILE_NAME(path)) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("invalid path: %s"), path ? path : "null");
+        goto error;
+    }
+
+    if (virAsprintf(&filelist_path, "/proc/%d/fd", pid) < 0)
+        goto error;
+
+    if (virDirOpen(&filelist_dir, filelist_path) < 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("unable to open directory: %s"), filelist_path);
+        goto error;
+    }
+
+    while (!found &&
+           (rc = virDirRead(filelist_dir, &ent, filelist_path)) == 1) {
+        char *resolved_path = NULL;
+        char *link_path;
+        if ((rc = virAsprintf(&link_path, "%s/%s", filelist_path, ent->d_name)) < 0)
+            break;
+        if (virFileResolveLink(link_path, &resolved_path) == 0) {
+            if (resolved_path) {
+                VIR_DEBUG("checking absolute path for match (need: %s, got: %s)",
+                          path, resolved_path);
+                if (STREQ(resolved_path, path))
+                    found = true;
+                VIR_FREE(resolved_path);
+            }
+        }
+    }
+
+    VIR_DIR_CLOSE(filelist_dir);
+ error:
+    VIR_FREE(filelist_path);
+
+    VIR_DEBUG("returning, rc: %d, found: %d", rc, found);
+    if (rc < 0)
+        return rc;
+
+    return found ? 1 : 0;
+}
diff --git a/src/util/virfile.h b/src/util/virfile.h
index 57ceb80..fb86786 100644
--- a/src/util/virfile.h
+++ b/src/util/virfile.h
@@ -347,6 +347,7 @@ int virFileReadValueScaledInt(unsigned long long *value, const char *format, ...
 int virFileReadValueString(char **value, const char *format, ...)
  ATTRIBUTE_FMT_PRINTF(2, 3);
 
+int virFileIsOpenByPid(const char *path, pid_t pid);
 
 int virFileInData(int fd,
                   int *inData,
-- 
2.7.4
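
For anyone who wants to try the poll-wait idea outside of libvirt, here
is a minimal standalone sketch in plain POSIX/Linux C. It is not part of
the patch and does not use libvirt helpers: fd_open_by_pid() and
wait_for_fd_close() are hypothetical names, and the timeout and polling
interval are parameters rather than the fixed values used above. It
assumes a Linux /proc filesystem and enough privilege to read the target
process's /proc/<pid>/fd directory.

```c
#ifndef _XOPEN_SOURCE
# define _XOPEN_SOURCE 700  /* readlink(), nanosleep() under strict -std= modes */
#endif

#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper, not a libvirt API.
 * Returns 1 if `pid` holds an FD whose /proc symlink resolves to `path`,
 * 0 if it does not, -1 on error (non-absolute path, unreadable fd dir). */
static int fd_open_by_pid(const char *path, pid_t pid)
{
    char dir_path[64];
    struct dirent *ent;
    DIR *dir;
    int found = 0;

    if (!path || path[0] != '/')
        return -1;

    snprintf(dir_path, sizeof(dir_path), "/proc/%ld/fd", (long)pid);
    if (!(dir = opendir(dir_path)))
        return -1;

    while (!found && (ent = readdir(dir))) {
        char link_path[PATH_MAX];
        char target[PATH_MAX];
        ssize_t len;

        if (ent->d_name[0] == '.')
            continue;           /* skip "." and ".." */

        snprintf(link_path, sizeof(link_path), "%s/%s", dir_path, ent->d_name);
        len = readlink(link_path, target, sizeof(target) - 1);
        if (len < 0)
            continue;           /* FD may have been closed mid-scan */
        target[len] = '\0';     /* readlink() does not NUL-terminate */

        if (strcmp(target, path) == 0)
            found = 1;
    }

    closedir(dir);
    return found;
}

/* Hypothetical helper: poll until `path` is no longer open in `pid` or
 * roughly `timeout_ms` has elapsed. Returns the last fd_open_by_pid()
 * result: 0 = closed, 1 = still open at timeout, -1 = error. */
static int wait_for_fd_close(const char *path, pid_t pid,
                             unsigned int timeout_ms, unsigned int interval_ms)
{
    struct timespec ts;
    unsigned int waited_ms = 0;
    int rc;

    assert(interval_ms > 0);
    ts.tv_sec = interval_ms / 1000;
    ts.tv_nsec = (long)(interval_ms % 1000) * 1000000L;

    /* The FD check runs before the timeout test, so the state is
     * re-read once more after the final sleep. */
    while ((rc = fd_open_by_pid(path, pid)) == 1 && waited_ms < timeout_ms) {
        nanosleep(&ts, NULL);
        waited_ms += interval_ms;
    }
    return rc;
}
```

With the values from the patch, the equivalent call would pass the VFIO
group path, the QEMU pid, a 15000 ms timeout and a 100 ms interval, and
would only proceed with the unbind when the result is 0.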