From: Dmytro Linkin
To: libvir-list@redhat.com
Cc: Dmytro Linkin
Subject: [PATCH] util: Add phys_port_name support on virPCIGetNetName
Date: Fri, 28 Aug 2020 13:53:21 +0300
Message-Id: <1598612001-5955-1-git-send-email-dlinkin@nvidia.com>

The current virPCIGetNetName() logic gets the net device name by checking
its phys_port_id, if the caller provides one, or by its index (i.e. its
position in the sysfs net directory). This approach worked fine up until
Linux kernel version 5.8, where the NVIDIA Mellanox driver implemented
linking of VFs' representors to the PCI device in switchdev mode.
This means that the device's sysfs net directory will hold multiple net
devices. Ex.:

$ ls '/sys/bus/pci/devices/0000:82:00.0/net'
ens1f0  eth0  eth1

Most switch devices support phys_port_name instead of phys_port_id, so
virPCIGetNetName() will try to get the PF name by its index - 0. The
problem here is that the PF netdev entry may not be the first.
To fix that, for switch devices, we introduce new logic to select
the PF uplink netdev according to the content of phys_port_name.
Extend virPCIGetNetName() with a physPortNameRegex variable to get the
proper device by its phys_port_name scheme, e.g. "p[0-9]+$" to get
the PF, "pf[0-9]+vf[0-9]+$" to get a VF, or "p1$" to get an exact net
device.
So virPCIGetNetName() now works in the following sequence:
 - filter by phys_port_id, if it is provided,
 or
 - filter by phys_port_name, if its regex is provided,
 or
 - get the net device by its index (position) in the sysfs net directory.
Also, make getting the content of iface sysfs files more generic.

Signed-off-by: Dmytro Linkin
Reviewed-by: Adrian Chiris
---
 src/hypervisor/virhostdev.c |  2 +-
 src/util/virnetdev.c        | 74 ++++++++++++++++++++++++++++++++++++---------
 src/util/virnetdev.h        |  4 +++
 src/util/virpci.c           | 63 ++++++++++++++++++++++++++++++++++++--
 src/util/virpci.h           |  6 ++++
 5 files changed, 130 insertions(+), 19 deletions(-)

diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
index 69102b8..1f5c347 100644
--- a/src/hypervisor/virhostdev.c
+++ b/src/hypervisor/virhostdev.c
@@ -333,7 +333,7 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev,
      * type='hostdev'>, and it is only those devices that should
      * end up calling this function.
      */
-    if (virPCIGetNetName(sysfs_path, 0, NULL, linkdev) < 0)
+    if (virPCIGetNetName(sysfs_path, 0, NULL, NULL, linkdev) < 0)
         return -1;
 
     if (!(*linkdev)) {
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
index b42fa86..99e3b35 100644
--- a/src/util/virnetdev.c
+++ b/src/util/virnetdev.c
@@ -1112,6 +1112,29 @@ virNetDevGetPCIDevice(const char *devName)
 }
 
 
+/* A wrapper to get content of file from ifname SYSFS_NET_DIR
+ */
+static int
+virNetDevGetSysfsFileValue(const char *ifname,
+                           const char *fileName,
+                           char **sysfsFileData)
+{
+    g_autofree char *sysfsFile = NULL;
+
+    *sysfsFileData = NULL;
+
+    if (virNetDevSysfsFile(&sysfsFile, ifname, fileName) < 0)
+        return -1;
+
+    /* a failure to read just means the driver doesn't support
+     * , so set success now and ignore the return from
+     * virFileReadAllQuiet().
+     */
+
+    ignore_value(virFileReadAllQuiet(sysfsFile, 1024, sysfsFileData));
+    return 0;
+}
+
 /**
  * virNetDevGetPhysPortID:
  *
@@ -1130,20 +1153,29 @@ int
 virNetDevGetPhysPortID(const char *ifname,
                        char **physPortID)
 {
-    g_autofree char *physPortIDFile = NULL;
-
-    *physPortID = NULL;
-
-    if (virNetDevSysfsFile(&physPortIDFile, ifname, "phys_port_id") < 0)
-        return -1;
+    return virNetDevGetSysfsFileValue(ifname, "phys_port_id", physPortID);
+}
 
-    /* a failure to read just means the driver doesn't support
-     * phys_port_id, so set success now and ignore the return from
-     * virFileReadAllQuiet().
-     */
 
-    ignore_value(virFileReadAllQuiet(physPortIDFile, 1024, physPortID));
-    return 0;
+/**
+ * virNetDevGetPhysPortName:
+ *
+ * @ifname: name of a netdev
+ *
+ * @physPortName: pointer to char* that will receive @ifname's
+ *                phys_port_name from sysfs (null terminated
+ *                string). Could be NULL if @ifname's net driver doesn't
+ *                support phys_port_name (most netdev drivers
+ *                don't). Caller is responsible for freeing the string
+ *                when finished.
+ *
+ * Returns 0 on success or -1 on failure.
+ */
+int
+virNetDevGetPhysPortName(const char *ifname,
+                         char **physPortName)
+{
+    return virNetDevGetSysfsFileValue(ifname, "phys_port_name", physPortName);
 }
 
 
@@ -1200,7 +1232,7 @@ virNetDevGetVirtualFunctions(const char *pfname,
         }
 
         if (virPCIGetNetName(pci_sysfs_device_link, 0,
-                             pfPhysPortID, &((*vfname)[i])) < 0) {
+                             pfPhysPortID, NULL, &((*vfname)[i])) < 0) {
             goto cleanup;
         }
 
@@ -1295,7 +1327,8 @@ virNetDevGetPhysicalFunction(const char *ifname, char **pfname)
         return -1;
 
     if (virPCIGetNetName(physfn_sysfs_path, 0,
-                         vfPhysPortID, pfname) < 0) {
+                         vfPhysPortID,
+                         VIR_PF_PHYS_PORT_NAME_REGEX, pfname) < 0) {
         return -1;
     }
 
@@ -1358,7 +1391,7 @@ virNetDevPFGetVF(const char *pfname, int vf, char **vfname)
      * isn't bound to a netdev driver, it won't have a netdev name,
      * and vfname will be NULL).
      */
-    return virPCIGetNetName(virtfnSysfsPath, 0, pfPhysPortID, vfname);
+    return virPCIGetNetName(virtfnSysfsPath, 0, pfPhysPortID, NULL, vfname);
 }
 
 
@@ -1403,6 +1436,17 @@ virNetDevGetPhysPortID(const char *ifname G_GNUC_UNUSED,
 }
 
 int
+virNetDevGetPhysPortName(const char *ifname G_GNUC_UNUSED,
+                         char **physPortName)
+{
+    /* this actually should never be called, and is just here to
+     * satisfy the linker.
+     */
+    *physPortName = NULL;
+    return 0;
+}
+
+int
 virNetDevGetVirtualFunctions(const char *pfname G_GNUC_UNUSED,
                              char ***vfname G_GNUC_UNUSED,
                              virPCIDeviceAddressPtr **virt_fns G_GNUC_UNUSED,
diff --git a/src/util/virnetdev.h b/src/util/virnetdev.h
index 55e3948..712421d 100644
--- a/src/util/virnetdev.h
+++ b/src/util/virnetdev.h
@@ -229,6 +229,10 @@ int virNetDevGetPhysPortID(const char *ifname,
                            char **physPortID)
     ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2)
     G_GNUC_WARN_UNUSED_RESULT;
+int virNetDevGetPhysPortName(const char *ifname,
+                             char **physPortName)
+    ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2)
+    G_GNUC_WARN_UNUSED_RESULT;
 
 int virNetDevGetVirtualFunctions(const char *pfname,
                                  char ***vfname,
diff --git a/src/util/virpci.c b/src/util/virpci.c
index 47c671d..18b3f66 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -2409,8 +2409,10 @@ virPCIDeviceAddressGetSysfsFile(virPCIDeviceAddressPtr addr,
  * virPCIGetNetName:
  * @device_link_sysfs_path: sysfs path to the PCI device
  * @idx: used to choose which netdev when there are several
- *       (ignored if physPortID is set)
+ *       (ignored if physPortID or physPortNameRegex is set)
  * @physPortID: match this string in the netdev's phys_port_id
+ *       (or NULL to ignore and use phys_port_name or idx instead)
+ * @physPortNameRegex: match this regex with netdev's phys_port_name
  *       (or NULL to ignore and use idx instead)
  * @netname: used to return the name of the netdev
  *       (set to NULL (but returns success) if there is no netdev)
@@ -2421,11 +2423,13 @@ int
 virPCIGetNetName(const char *device_link_sysfs_path,
                  size_t idx,
                  char *physPortID,
+                 char *physPortNameRegex,
                  char **netname)
 {
     g_autofree char *pcidev_sysfs_net_path = NULL;
     g_autofree char *firstEntryName = NULL;
     g_autofree char *thisPhysPortID = NULL;
+    g_autofree char *thisPhysPortName = NULL;
     int ret = -1;
     DIR *dir = NULL;
     struct dirent *entry = NULL;
@@ -2466,6 +2470,41 @@ virPCIGetNetName(const char *device_link_sysfs_path,
 
                 continue;
             }
+        } else if (physPortNameRegex) {
+            /* Most switch devices use phys_port_name instead of
+             * phys_port_id.
+             * NOTE: VFs' representors net devices can be linked to PF's PCI
+             * device, which mean that there'll be multiple net devices
+             * instances and to get a proper net device need to match on
+             * specific regex.
+             * To get PF netdev, for ex., used following regex:
+             * "(p[0-9]+$)|(p[0-9]+s[0-9]+$)"
+             * or to get exact VF's netdev next regex is used:
+             * "pf0vf1$"
+             */
+            if (virNetDevGetPhysPortName(entry->d_name, &thisPhysPortName) < 0)
+                goto cleanup;
+
+            if (thisPhysPortName) {
+                /* if this one doesn't match, keep looking */
+                if (!virStringMatch(thisPhysPortName, physPortNameRegex)) {
+                    VIR_FREE(thisPhysPortName);
+                    /* Save the first entry we find to use as a failsafe
+                     * in case we fail to match on regex.
+                     */
+                    if (!firstEntryName)
+                        firstEntryName = g_strdup(entry->d_name);
+
+                    continue;
+                }
+            } else {
+                /* Save the first entry we find to use as a failsafe in case
+                 * phys_port_name is not supported.
+                 */
+                if (!firstEntryName)
+                    firstEntryName = g_strdup(entry->d_name);
+                continue;
+            }
         } else {
             if (i++ < idx)
                 continue;
@@ -2494,6 +2533,22 @@ virPCIGetNetName(const char *device_link_sysfs_path,
                              "phys_port_id '%s' under PCI device at %s"),
                            physPortID, device_link_sysfs_path);
         }
+    } else if (physPortNameRegex) {
+        if (firstEntryName) {
+            /* We didn't match the provided phys_port_name regex, probably
+             * because kernel or NIC driver doesn't support it, so just
+             * return first netname we found.
+             */
+            *netname = firstEntryName;
+            firstEntryName = NULL;
+            ret = 0;
+        } else {
+            virReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("Could not find network device with "
+                             "phys_port_name matching regex '%s' "
+                             "under PCI device at %s"),
+                           physPortNameRegex, device_link_sysfs_path);
+        }
     } else {
         ret = 0; /* no netdev at the given index is *not* an error */
     }
@@ -2539,7 +2594,7 @@ virPCIGetVirtualFunctionInfo(const char *vf_sysfs_device_path,
      * correct.
      */
     if (pfNetDevIdx == -1) {
-        if (virPCIGetNetName(vf_sysfs_device_path, 0, NULL, &vfname) < 0)
+        if (virPCIGetNetName(vf_sysfs_device_path, 0, NULL, NULL, &vfname) < 0)
             goto cleanup;
 
         if (vfname) {
@@ -2550,7 +2605,8 @@ virPCIGetVirtualFunctionInfo(const char *vf_sysfs_device_path,
     }
 
     if (virPCIGetNetName(pf_sysfs_device_path,
-                         pfNetDevIdx, vfPhysPortID, pfname) < 0) {
+                         pfNetDevIdx, vfPhysPortID,
+                         VIR_PF_PHYS_PORT_NAME_REGEX, pfname) < 0) {
         goto cleanup;
     }
 
@@ -2688,6 +2744,7 @@ int
 virPCIGetNetName(const char *device_link_sysfs_path G_GNUC_UNUSED,
                  size_t idx G_GNUC_UNUSED,
                  char *physPortID G_GNUC_UNUSED,
+                 char *physPortNameScheme G_GNUC_UNUSED,
                  char **netname G_GNUC_UNUSED)
 {
     virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _(unsupported));
diff --git a/src/util/virpci.h b/src/util/virpci.h
index b3322ba..6ea0873 100644
--- a/src/util/virpci.h
+++ b/src/util/virpci.h
@@ -55,6 +55,11 @@ struct _virZPCIDeviceAddress {
 
 #define VIR_PCI_DEVICE_ADDRESS_FMT "%04x:%02x:%02x.%d"
 
+/* Represents format of PF's phys_port_name in switchdev mode:
+ * 'p%u' or 'p%us%u'. New line checked since value is readed from sysfs file.
+ */
+# define VIR_PF_PHYS_PORT_NAME_REGEX ((char *)"(p[0-9]+$)|(p[0-9]+s[0-9]+$)")
+
 struct _virPCIDeviceAddress {
     unsigned int domain;
     unsigned int bus;
@@ -232,6 +237,7 @@ int virPCIDeviceAddressGetSysfsFile(virPCIDeviceAddressPtr addr,
 int virPCIGetNetName(const char *device_link_sysfs_path,
                      size_t idx,
                      char *physPortID,
+                     char *physPortNameRegex,
                      char **netname);
 
 int virPCIGetSysfsFile(char *virPCIDeviceName,
-- 
1.8.3.1
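
Illustration only, not part of the patch: a minimal standalone sketch of
how the phys_port_name patterns named in the commit message separate a PF
uplink from a VF representor. It assumes POSIX regcomp()/regexec() as a
stand-in for virStringMatch(), whose implementation is not shown in this
patch. PF_PHYS_PORT_NAME_REGEX and phys_port_name_matches() are
hypothetical names that do not exist in libvirt. Because POSIX regexec()
without REG_NEWLINE only lets `$` match at the very end of the string, the
helper strips one trailing newline from the sysfs-style value first.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical copy of the pattern behind VIR_PF_PHYS_PORT_NAME_REGEX. */
#define PF_PHYS_PORT_NAME_REGEX "(p[0-9]+$)|(p[0-9]+s[0-9]+$)"

/* Hypothetical helper (not a libvirt function): returns true if the
 * phys_port_name value matches the extended regex in pattern.
 * A single trailing newline, as read from a sysfs file, is ignored. */
static bool
phys_port_name_matches(const char *value, const char *pattern)
{
    char buf[64];
    regex_t re;
    size_t len;
    bool matched;

    assert(value != NULL && pattern != NULL);

    len = strlen(value);
    if (len >= sizeof(buf))
        return false;               /* longer than any name we expect */
    memcpy(buf, value, len + 1);
    if (len > 0 && buf[len - 1] == '\n')
        buf[len - 1] = '\0';        /* drop the newline from sysfs */

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;               /* treat a bad pattern as no match */
    matched = regexec(&re, buf, 0, NULL, 0) == 0;
    regfree(&re);
    return matched;
}
```

With this sketch, "p0" and "p0s1" match the PF pattern while "pf0vf1"
does not, so a netdev whose phys_port_name is "pf0vf1" would be skipped
when looking for the PF uplink, whatever its position in the sysfs net
directory.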