From: Michal Privoznik
To: libvir-list@redhat.com
Date: Thu, 26 Sep 2019 18:12:19 +0200
Subject: [libvirt] [PATCH v2 23/39] virhostdev: Include virNVMeDevice module

Now that we have the virNVMeDevice module (introduced in the previous
commit), let's use it in virhostdev to track which NVMe devices are free
to be used by a domain and which are taken.

Signed-off-by: Michal Privoznik
---
A brief usage sketch of the new helpers follows after the diff; it is
illustrative only and not part of the patch.

 src/libvirt_private.syms |   5 +
 src/util/virhostdev.c    | 332 +++++++++++++++++++++++++++++++++++++++
 src/util/virhostdev.h    |  37 +++++
 3 files changed, 374 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 0c84b347db..ae8b41ce30 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2113,18 +2113,23 @@ virHostdevPCINodeDeviceReAttach;
 virHostdevPCINodeDeviceReset;
 virHostdevPrepareDomainDevices;
 virHostdevPrepareMediatedDevices;
+virHostdevPrepareNVMeDevices;
+virHostdevPrepareOneNVMeDevice;
 virHostdevPreparePCIDevices;
 virHostdevPrepareSCSIDevices;
 virHostdevPrepareSCSIVHostDevices;
 virHostdevPrepareUSBDevices;
 virHostdevReAttachDomainDevices;
 virHostdevReAttachMediatedDevices;
+virHostdevReAttachNVMeDevices;
+virHostdevReAttachOneNVMeDevice;
 virHostdevReAttachPCIDevices;
 virHostdevReAttachSCSIDevices;
 virHostdevReAttachSCSIVHostDevices;
 virHostdevReAttachUSBDevices;
 virHostdevUpdateActiveDomainDevices;
 virHostdevUpdateActiveMediatedDevices;
+virHostdevUpdateActiveNVMeDevices;
 virHostdevUpdateActivePCIDevices;
 virHostdevUpdateActiveSCSIDevices;
 virHostdevUpdateActiveUSBDevices;
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c
index f0c97ca887..9ccf79453c 100644
--- a/src/util/virhostdev.c
+++ b/src/util/virhostdev.c
@@ -138,6 +138,7 @@ virHostdevManagerDispose(void *obj)
     virObjectUnref(hostdevMgr->activeSCSIHostdevs);
     virObjectUnref(hostdevMgr->activeSCSIVHostHostdevs);
     virObjectUnref(hostdevMgr->activeMediatedHostdevs);
+    virObjectUnref(hostdevMgr->activeNVMeHostdevs);
     VIR_FREE(hostdevMgr->stateDir);
 }
 
@@ -168,6 +169,9 @@ virHostdevManagerNew(void)
     if (!(hostdevMgr->activeMediatedHostdevs = virMediatedDeviceListNew()))
         return NULL;
 
+    if (!(hostdevMgr->activeNVMeHostdevs = virNVMeDeviceListNew()))
+        return NULL;
+
     if (privileged) {
         if (VIR_STRDUP(hostdevMgr->stateDir, HOSTDEV_STATE_DIR) < 0)
             return NULL;
@@ -2240,3 +2244,331 @@ virHostdevUpdateActiveDomainDevices(virHostdevManagerPtr mgr,
 
     return 0;
 }
+
+
+static int
+virHostdevGetNVMeDeviceList(virNVMeDeviceListPtr nvmeDevices,
+                            virStorageSourcePtr src,
+                            const char *drv_name,
+                            const char *dom_name)
+{
+    virStorageSourcePtr n;
+
+    for (n = src; virStorageSourceIsBacking(n); n = n->backingStore) {
+        VIR_AUTOPTR(virNVMeDevice) dev = NULL;
+        const virStorageSourceNVMeDef *srcNVMe = n->nvme;
+
+        if (n->type != VIR_STORAGE_TYPE_NVME)
+            continue;
+
+        if (!(dev = virNVMeDeviceNew(&srcNVMe->pciAddr,
+                                     srcNVMe->namespace,
+                                     srcNVMe->managed)))
+            return -1;
+
+        if (virNVMeDeviceUsedBySet(dev, drv_name, dom_name) < 0)
+            return -1;
+
+        if (virNVMeDeviceListAdd(nvmeDevices, dev) < 0)
+            return -1;
+    }
+
+    return 0;
+}
+
+
+int
+virHostdevPrepareOneNVMeDevice(virHostdevManagerPtr hostdev_mgr,
+                               const char *drv_name,
+                               const char *dom_name,
+                               virStorageSourcePtr src)
+{
+    VIR_AUTOUNREF(virNVMeDeviceListPtr) nvmeDevices = NULL;
+    VIR_AUTOUNREF(virPCIDeviceListPtr) pciDevices = NULL;
+    const unsigned int pciFlags = 0;
+    virNVMeDevicePtr temp = NULL;
+    size_t i;
+    ssize_t lastGoodNVMeIdx = -1;
+    int ret = -1;
+
+    if (!(nvmeDevices = virNVMeDeviceListNew()))
+        return -1;
+
+    if (virHostdevGetNVMeDeviceList(nvmeDevices, src, drv_name, dom_name) < 0)
+        return -1;
+
+    if (virNVMeDeviceListCount(nvmeDevices) == 0)
+        return 0;
+
+    virObjectLock(hostdev_mgr->activeNVMeHostdevs);
+
+    /* Firstly, let's check if all devices are free */
+    for (i = 0; i < virNVMeDeviceListCount(nvmeDevices); i++) {
+        const virNVMeDevice *dev = virNVMeDeviceListGet(nvmeDevices, i);
+        const virPCIDeviceAddress *addr = NULL;
+        VIR_AUTOFREE(char *) addrStr = NULL;
+        const char *actual_drvname = NULL;
+        const char *actual_domname = NULL;
+
+        temp = virNVMeDeviceListLookup(hostdev_mgr->activeNVMeHostdevs, dev);
+
+        /* Not on the list means not used */
+        if (!temp)
+            continue;
+
+        virNVMeDeviceUsedByGet(temp, &actual_drvname, &actual_domname);
+        addr = virNVMeDeviceAddressGet(dev);
+        addrStr = virPCIDeviceAddressAsString(addr);
+
+        virReportError(VIR_ERR_OPERATION_INVALID,
+                       _("NVMe device %s already in use by driver %s domain %s"),
+                       NULLSTR(addrStr), actual_drvname, actual_domname);
+        goto cleanup;
+    }
+
+    if (!(pciDevices = virNVMeDeviceListCreateDetachList(hostdev_mgr->activeNVMeHostdevs,
+                                                         nvmeDevices)))
+        goto cleanup;
+
+    /* Let's check if all PCI devices are NVMe disks. */
+    for (i = 0; i < virPCIDeviceListCount(pciDevices); i++) {
+        virPCIDevicePtr pci = virPCIDeviceListGet(pciDevices, i);
+        VIR_AUTOFREE(char *) drvPath = NULL;
+        VIR_AUTOFREE(char *) drvName = NULL;
+        int stub = VIR_PCI_STUB_DRIVER_NONE;
+
+        if (virPCIDeviceGetDriverPathAndName(pci, &drvPath, &drvName) < 0)
+            goto cleanup;
+
+        if (drvName)
+            stub = virPCIStubDriverTypeFromString(drvName);
+
+        if (stub == VIR_PCI_STUB_DRIVER_VFIO ||
+            STREQ_NULLABLE(drvName, "nvme"))
+            continue;
+
+        VIR_WARN("Suspicious NVMe disk assignment. PCI device "
+                 "%s is not an NVMe disk, it has %s driver",
+                 virPCIDeviceGetName(pci), NULLSTR(drvName));
+    }
+
+    /* This looks like a good opportunity to merge inactive NVMe devices onto
+     * the active list. This, however, means that if something goes wrong we
+     * have to perform a rollback before returning.
+     */
+    for (i = 0; i < virNVMeDeviceListCount(nvmeDevices); i++) {
+        temp = virNVMeDeviceListGet(nvmeDevices, i);
+
+        if (virNVMeDeviceListAdd(hostdev_mgr->activeNVMeHostdevs, temp) < 0)
+            goto rollback;
+
+        lastGoodNVMeIdx = i;
+    }
+
+    if (virHostdevPreparePCIDevicesImpl(hostdev_mgr,
+                                        drv_name, dom_name, NULL,
+                                        pciDevices, NULL, 0, pciFlags) < 0)
+        goto rollback;
+
+    ret = 0;
+ cleanup:
+    virObjectUnlock(hostdev_mgr->activeNVMeHostdevs);
+    return ret;
+
+ rollback:
+    while (lastGoodNVMeIdx >= 0) {
+        temp = virNVMeDeviceListGet(nvmeDevices, lastGoodNVMeIdx);
+
+        virNVMeDeviceListDel(hostdev_mgr->activeNVMeHostdevs, temp);
+
+        lastGoodNVMeIdx--;
+    }
+    goto cleanup;
+}
+
+
+int
+virHostdevPrepareNVMeDevices(virHostdevManagerPtr hostdev_mgr,
+                             const char *drv_name,
+                             const char *dom_name,
+                             virDomainDiskDefPtr *disks,
+                             size_t ndisks)
+{
+    size_t i;
+    ssize_t lastGoodDiskIdx = -1;
+
+    for (i = 0; i < ndisks; i++) {
+        if (virHostdevPrepareOneNVMeDevice(hostdev_mgr, drv_name,
+                                           dom_name, disks[i]->src) < 0)
+            goto rollback;
+
+        lastGoodDiskIdx = i;
+    }
+
+    return 0;
+
+ rollback:
+    while (lastGoodDiskIdx >= 0) {
+        if (virHostdevReAttachOneNVMeDevice(hostdev_mgr, drv_name, dom_name,
+                                            disks[lastGoodDiskIdx]->src) < 0) {
+            VIR_ERROR(_("Failed to reattach NVMe for disk target: %s"),
+                      disks[lastGoodDiskIdx]->dst);
+        }
+
+        lastGoodDiskIdx--;
+    }
+
+    return -1;
+}
+
+
+int
+virHostdevReAttachOneNVMeDevice(virHostdevManagerPtr hostdev_mgr,
+                                const char *drv_name,
+                                const char *dom_name,
+                                virStorageSourcePtr src)
+{
+    VIR_AUTOUNREF(virNVMeDeviceListPtr) nvmeDevices = NULL;
+    VIR_AUTOUNREF(virPCIDeviceListPtr) pciDevices = NULL;
+    size_t i;
+    int ret = -1;
+
+    if (!(nvmeDevices = virNVMeDeviceListNew()))
+        return -1;
+
+    if (virHostdevGetNVMeDeviceList(nvmeDevices, src, drv_name, dom_name) < 0)
+        return -1;
+
+    if (virNVMeDeviceListCount(nvmeDevices) == 0)
+        return 0;
+
+    virObjectLock(hostdev_mgr->activeNVMeHostdevs);
+
+    if (!(pciDevices = virNVMeDeviceListCreateReAttachList(hostdev_mgr->activeNVMeHostdevs,
+                                                           nvmeDevices)))
+        goto cleanup;
+
+    virHostdevReAttachPCIDevicesImpl(hostdev_mgr,
+                                     drv_name, dom_name, pciDevices,
+                                     NULL, 0, NULL);
+
+    for (i = 0; i < virNVMeDeviceListCount(nvmeDevices); i++) {
+        virNVMeDevicePtr temp = virNVMeDeviceListGet(nvmeDevices, i);
+
+        if (virNVMeDeviceListDel(hostdev_mgr->activeNVMeHostdevs, temp) < 0)
+            goto cleanup;
+    }
+
+    ret = 0;
+ cleanup:
+    virObjectUnlock(hostdev_mgr->activeNVMeHostdevs);
+    return ret;
+}
+
+
+int
+virHostdevReAttachNVMeDevices(virHostdevManagerPtr hostdev_mgr,
+                              const char *drv_name,
+                              const char *dom_name,
+                              virDomainDiskDefPtr *disks,
+                              size_t ndisks)
+{
+    size_t i;
+    int ret = 0;
+
+    /* Contrary to virHostdevPrepareNVMeDevices, this is a best
+     * effort approach. Just iterate over all disks and try to
+     * reattach them. Don't stop at the first failure.
+     */
+    for (i = 0; i < ndisks; i++) {
+        if (virHostdevReAttachOneNVMeDevice(hostdev_mgr, drv_name,
+                                            dom_name, disks[i]->src) < 0) {
+            VIR_ERROR(_("Failed to reattach NVMe for disk target: %s"),
+                      disks[i]->dst);
+            ret = -1;
+        }
+    }
+
+    return ret;
+}
+
+
+int
+virHostdevUpdateActiveNVMeDevices(virHostdevManagerPtr hostdev_mgr,
+                                  const char *drv_name,
+                                  const char *dom_name,
+                                  virDomainDiskDefPtr *disks,
+                                  size_t ndisks)
+{
+    VIR_AUTOUNREF(virNVMeDeviceListPtr) nvmeDevices = NULL;
+    VIR_AUTOUNREF(virPCIDeviceListPtr) pciDevices = NULL;
+    virNVMeDevicePtr temp = NULL;
+    size_t i;
+    ssize_t lastGoodNVMeIdx = -1;
+    ssize_t lastGoodPCIIdx = -1;
+    int ret = -1;
+
+    if (!(nvmeDevices = virNVMeDeviceListNew()))
+        return -1;
+
+    for (i = 0; i < ndisks; i++) {
+        if (virHostdevGetNVMeDeviceList(nvmeDevices, disks[i]->src, drv_name, dom_name) < 0)
+            return -1;
+    }
+
+    if (virNVMeDeviceListCount(nvmeDevices) == 0)
+        return 0;
+
+    virObjectLock(hostdev_mgr->activeNVMeHostdevs);
+    virObjectLock(hostdev_mgr->activePCIHostdevs);
+    virObjectLock(hostdev_mgr->inactivePCIHostdevs);
+
+    if (!(pciDevices = virNVMeDeviceListCreateDetachList(hostdev_mgr->activeNVMeHostdevs,
+                                                         nvmeDevices)))
+        goto cleanup;
+
+    for (i = 0; i < virNVMeDeviceListCount(nvmeDevices); i++) {
+        temp = virNVMeDeviceListGet(nvmeDevices, i);
+
+        if (virNVMeDeviceListAdd(hostdev_mgr->activeNVMeHostdevs, temp) < 0)
+            goto rollback;
+
+        lastGoodNVMeIdx = i;
+    }
+
+    for (i = 0; i < virPCIDeviceListCount(pciDevices); i++) {
+        virPCIDevicePtr actual = virPCIDeviceListGet(pciDevices, i);
+
+        /* We must restore some attributes that were lost on daemon restart. */
+        virPCIDeviceSetUnbindFromStub(actual, true);
+        if (virPCIDeviceSetUsedBy(actual, drv_name, dom_name) < 0)
+            goto rollback;
+
+        if (virPCIDeviceListAddCopy(hostdev_mgr->activePCIHostdevs, actual) < 0)
+            goto rollback;
+
+        lastGoodPCIIdx = i;
+    }
+
+    ret = 0;
+ cleanup:
+    virObjectUnlock(hostdev_mgr->inactivePCIHostdevs);
+    virObjectUnlock(hostdev_mgr->activePCIHostdevs);
+    virObjectUnlock(hostdev_mgr->activeNVMeHostdevs);
+    return ret;
+
+ rollback:
+    while (lastGoodNVMeIdx >= 0) {
+        temp = virNVMeDeviceListGet(nvmeDevices, lastGoodNVMeIdx);
+
+        virNVMeDeviceListDel(hostdev_mgr->activeNVMeHostdevs, temp);
+
+        lastGoodNVMeIdx--;
+    }
+    while (lastGoodPCIIdx >= 0) {
+        virPCIDevicePtr actual = virPCIDeviceListGet(pciDevices, lastGoodPCIIdx);
+
+        virPCIDeviceListDel(hostdev_mgr->activePCIHostdevs, actual);
+
+        lastGoodPCIIdx--;
+    }
+    goto cleanup;
+}
diff --git a/src/util/virhostdev.h b/src/util/virhostdev.h
index c7ef2055c1..372eca2751 100644
--- a/src/util/virhostdev.h
+++ b/src/util/virhostdev.h
@@ -29,6 +29,7 @@
 #include "virscsivhost.h"
 #include "conf/domain_conf.h"
 #include "virmdev.h"
+#include "virnvme.h"
 
 typedef enum {
     VIR_HOSTDEV_STRICT_ACS_CHECK = (1 << 0), /* strict acs check */
@@ -53,6 +54,9 @@ struct _virHostdevManager {
     virSCSIDeviceListPtr activeSCSIHostdevs;
     virSCSIVHostDeviceListPtr activeSCSIVHostHostdevs;
     virMediatedDeviceListPtr activeMediatedHostdevs;
+    /* NVMe devices are PCI devices really, but one NVMe disk can
+     * have multiple namespaces.
+     */
+    virNVMeDeviceListPtr activeNVMeHostdevs;
 };
 
 virHostdevManagerPtr virHostdevManagerGetDefault(void);
@@ -204,3 +208,36 @@ int virHostdevPCINodeDeviceReAttach(virHostdevManagerPtr mgr,
 int virHostdevPCINodeDeviceReset(virHostdevManagerPtr mgr,
                                  virPCIDevicePtr pci)
     ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2);
+
+int
+virHostdevPrepareOneNVMeDevice(virHostdevManagerPtr hostdev_mgr,
+                               const char *drv_name,
+                               const char *dom_name,
+                               virStorageSourcePtr src);
+
+int
+virHostdevPrepareNVMeDevices(virHostdevManagerPtr hostdev_mgr,
+                             const char *drv_name,
+                             const char *dom_name,
+                             virDomainDiskDefPtr *disks,
+                             size_t ndisks);
+
+int
+virHostdevReAttachOneNVMeDevice(virHostdevManagerPtr hostdev_mgr,
+                                const char *drv_name,
+                                const char *dom_name,
+                                virStorageSourcePtr src);
+
+int
+virHostdevReAttachNVMeDevices(virHostdevManagerPtr hostdev_mgr,
+                              const char *drv_name,
+                              const char *dom_name,
+                              virDomainDiskDefPtr *disks,
+                              size_t ndisks);
+
+int
+virHostdevUpdateActiveNVMeDevices(virHostdevManagerPtr hostdev_mgr,
+                                  const char *drv_name,
+                                  const char *dom_name,
+                                  virDomainDiskDefPtr *disks,
+                                  size_t ndisks);
-- 
2.21.0

-- 
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
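
Usage sketch (illustrative only, not part of the patch): the commit message says the new helpers track which NVMe disks are free and which are taken, so a hypervisor driver would be expected to pair the prepare/reattach calls around domain start and stop roughly as below. The myDriverStartDomain/myDriverStopDomain wrappers and the "mydriver" driver name are invented for the example; only virHostdevManagerGetDefault() and the functions added by this patch are existing API.

#include "virhostdev.h"
#include "conf/domain_conf.h"

static int
myDriverStartDomain(virDomainDefPtr def)
{
    /* Real drivers keep the manager reference for the daemon's lifetime. */
    virHostdevManagerPtr mgr = virHostdevManagerGetDefault();

    if (!mgr)
        return -1;

    /* Detach every <disk type='nvme'/> source from the host and mark it
     * as used by this driver/domain pair; the helper rolls back any
     * partially prepared disks on failure. */
    if (virHostdevPrepareNVMeDevices(mgr, "mydriver", def->name,
                                     def->disks, def->ndisks) < 0)
        return -1;

    /* ... launch the guest ... */
    return 0;
}

static void
myDriverStopDomain(virDomainDefPtr def)
{
    virHostdevManagerPtr mgr = virHostdevManagerGetDefault();

    if (!mgr)
        return;

    /* Best effort: give the NVMe disks back to the host and drop them
     * from the active list. */
    virHostdevReAttachNVMeDevices(mgr, "mydriver", def->name,
                                  def->disks, def->ndisks);
}

After a daemon restart, a driver reconnecting to already running domains would instead call virHostdevUpdateActiveNVMeDevices(), which repopulates the active NVMe and PCI lists without detaching anything again.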