From: Moshe Levi
To: libvir-list@redhat.com, adrianc@nvidia.com
Subject: [PATCH v2 2/2] util: Add phys_port_name support on virPCIGetNetName
Date: Thu, 21 Jan 2021 14:15:22 +0200
Message-Id: <20210121121522.71550-3-moshele@nvidia.com>
In-Reply-To: <20210121121522.71550-1-moshele@nvidia.com>
References: <20210121121522.71550-1-moshele@nvidia.com>
List-Id: Development discussions about the libvirt library & tools

From: Dmytro Linkin

The current virPCIGetNetName() logic gets the net device name by checking
its phys_port_id, if the caller provides one, or by its index (i.e. its
position in the sysfs net directory).
This approach worked fine until Linux kernel version 5.8, where the
NVIDIA Mellanox driver implemented linking of VF representors to the PF
PCI address [1]. This means that the device's sysfs net directory can
hold multiple net devices. Ex.:

$ ls '/sys/bus/pci/devices/0000:82:00.0/net'
ens1f0  eth0  eth1

In switchdev mode the PF and the VF representors support
phys_port_name. In that case there is only a single PF netdev on the PCI
device; the other net devices are the VF representors, which we want to
exclude. In the phys_port_id case, by contrast, there can be multiple
PFs, so we have to match on the phys_port_id of the VF.

When no phys_port_id is given, virPCIGetNetName() tries to get the PF
name by its index, 0. The problem is that the PF netdev entry may not be
the first one. To fix that, for switch devices, we introduce new logic
that selects the PF uplink netdev according to the content of
phys_port_name.

Extend the virPCIGetNetName() logic to work in the following sequence:
 - filter by phys_port_id, if it's provided (multi PF), or
 - filter by phys_port_name, if it exists (one PF), or
 - get the net device by its index (position) in the sysfs net directory.

[1] - https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=123f0f53dd64b67e34142485fe866a8a581f12f1

Co-Authored-by: Moshe Levi
Signed-off-by: Dmytro Linkin
Reviewed-by: Adrian Chiris
---
 src/util/virnetdev.c |  7 +++----
 src/util/virpci.c    | 38 +++++++++++++++++++++++++++++++++++++-
 src/util/virpci.h    |  5 +++++
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
index d41b967d6a..2485718b48 100644
--- a/src/util/virnetdev.c
+++ b/src/util/virnetdev.c
@@ -1263,7 +1263,7 @@ virNetDevGetVirtualFunctions(const char *pfname,
         }
 
         if (virPCIGetNetName(pci_sysfs_device_link, 0,
-                             pfPhysPortID, NULL, &((*vfname)[i])) < 0) {
+                             pfPhysPortID, &((*vfname)[i])) < 0) {
             goto cleanup;
         }
 
@@ -1358,8 +1358,7 @@ virNetDevGetPhysicalFunction(const char *ifname, char **pfname)
         return -1;
 
     if (virPCIGetNetName(physfn_sysfs_path, 0,
-                         vfPhysPortID,
-                         VIR_PF_PHYS_PORT_NAME_REGEX, pfname) < 0) {
+                         vfPhysPortID, pfname) < 0) {
         return -1;
     }
 
@@ -1422,7 +1421,7 @@ virNetDevPFGetVF(const char *pfname, int vf, char **vfname)
      * isn't bound to a netdev driver, it won't have a netdev name,
      * and vfname will be NULL).
      */
-    return virPCIGetNetName(virtfnSysfsPath, 0, pfPhysPortID, NULL, vfname);
+    return virPCIGetNetName(virtfnSysfsPath, 0, pfPhysPortID, vfname);
 }
 
 
diff --git a/src/util/virpci.c b/src/util/virpci.c
index 50fd5ef7ea..e20fb9e10b 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -2469,7 +2469,7 @@ virPCIDeviceAddressGetSysfsFile(virPCIDeviceAddressPtr addr,
  * @idx: used to choose which netdev when there are several
  *       (ignored if physPortID is set)
  * @physPortID: match this string in the netdev's phys_port_id
- *       (or NULL to ignore and use idx instead)
+ *       (or NULL to ignore and use phys_port_name or idx instead)
  * @netname: used to return the name of the netdev
  *       (set to NULL (but returns success) if there is no netdev)
  *
@@ -2483,6 +2483,7 @@ virPCIGetNetName(const char *device_link_sysfs_path,
 {
     g_autofree char *pcidev_sysfs_net_path = NULL;
     g_autofree char *firstEntryName = NULL;
+    g_autofree char *thisPhysPortName = NULL;
     g_autoptr(DIR) dir = NULL;
     struct dirent *entry = NULL;
     size_t i = 0;
@@ -2522,7 +2523,42 @@ virPCIGetNetName(const char *device_link_sysfs_path,
 
                 continue;
             }
+        } else {
+            /* Most switch devices use phys_port_name instead of
+             * phys_port_id.
+             * NOTE: VF representor net devices can be linked to the PF's
+             * PCI device, which means that there can be multiple net
+             * devices under it, and picking the proper one requires
+             * matching phys_port_name against a specific regex.
+             * For example, the PF netdev matches the following regex:
+             * "(p[0-9]+$)|(p[0-9]+s[0-9]+$)"
+             * while a specific VF's netdev would match a regex such as:
+             * "pf0vf1$"
+             */
+            if (virNetDevGetPhysPortName(entry->d_name, &thisPhysPortName) < 0)
+                return -1;
+
+            if (thisPhysPortName) {
+                /* if this one doesn't match, keep looking */
+                if (!virStringMatch(thisPhysPortName, VIR_PF_PHYS_PORT_NAME_REGEX)) {
+                    VIR_FREE(thisPhysPortName);
+                    /* Save the first entry we find to use as a failsafe
+                     * in case we fail to match on regex.
+                     */
+                    if (!firstEntryName)
+                        firstEntryName = g_strdup(entry->d_name);
+
+                    continue;
+                }
+            } else {
+                /* Save the first entry we find to use as a failsafe in case
+                 * phys_port_name is not supported.
+                 */
+                if (!firstEntryName)
+                    firstEntryName = g_strdup(entry->d_name);
+                continue;
+            }
             if (i++ < idx)
                 continue;
         }
diff --git a/src/util/virpci.h b/src/util/virpci.h
index 43828b0a8a..9e89ede1d5 100644
--- a/src/util/virpci.h
+++ b/src/util/virpci.h
@@ -55,6 +55,11 @@ struct _virZPCIDeviceAddress {
 
 #define VIR_PCI_DEVICE_ADDRESS_FMT "%04x:%02x:%02x.%d"
 
+/* Represents the format of the PF's phys_port_name in switchdev mode:
+ * 'p%u' or 'p%us%u'. New line checked since the value is read from a sysfs file.
+ */
+# define VIR_PF_PHYS_PORT_NAME_REGEX "(p[0-9]+$)|(p[0-9]+s[0-9]+$)"
+
 struct _virPCIDeviceAddress {
     unsigned int domain;
     unsigned int bus;
-- 
2.30.0