[libvirt] [PATCH V2] nodedev: Fix failing to parse PCI address for non-PCI network devices

Jim Fehlig posted 1 patch 6 years, 2 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20180108170859.19508-1-jfehlig@suse.com
There is a newer version of this series
src/libvirt_private.syms |  1 +
src/util/virnetdev.c     | 25 ++++++++++++++++++++++---
src/util/virpci.c        | 32 ++++++++++++++++++++++++++++++++
src/util/virpci.h        |  3 +++
4 files changed, 58 insertions(+), 3 deletions(-)
[libvirt] [PATCH V2] nodedev: Fix failing to parse PCI address for non-PCI network devices
Posted by Jim Fehlig 6 years, 2 months ago
Based loosely on a patch from Fei Li <fli@suse.com>.

Commit 8708ca01c added virNetDevSwitchdevFeature() to check if a network
device has Switchdev capabilities. virNetDevSwitchdevFeature() attempts
to retrieve the PCI device associated with the network device, ignoring
non-PCI devices. It does so via the following call chain

  virNetDevSwitchdevFeature()->virNetDevGetPCIDevice()->
  virPCIGetDeviceAddressFromSysfsLink()

For non-PCI network devices (qeth, Xen vif, etc),
virPCIGetDeviceAddressFromSysfsLink() will report an error when
virPCIDeviceAddressParse() fails. virPCIDeviceAddressParse() also
logs an error. After commit 8708ca01c there are now two errors reported
for each non-PCI network device even though the errors are harmless.

To avoid changing virPCIGetDeviceAddressFromSysfsLink(),
virPCIDeviceAddressParse() and related code that has quite a few
call sites, add a function to check if a network device is a
PCI device and use it in virNetDevSwitchdevFeature() before attempting
to retrieve the PCI device.
---

Suggestions welcome on a better name for virPCIIsPCIDevice, or even
a better approach to solving this issue. Currently virPCIIsPCIDevice
assumes the device is PCI if its subsystem starts with 'pci'. I'd be
happy to hear if there is a better way to determine if a network
device is PCI.

 src/libvirt_private.syms |  1 +
 src/util/virnetdev.c     | 25 ++++++++++++++++++++++---
 src/util/virpci.c        | 32 ++++++++++++++++++++++++++++++++
 src/util/virpci.h        |  3 +++
 4 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index a705fa846..bdf98ded1 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2458,6 +2458,7 @@ virPCIGetVirtualFunctionInfo;
 virPCIGetVirtualFunctions;
 virPCIHeaderTypeFromString;
 virPCIHeaderTypeToString;
+virPCIIsPCIDevice;
 virPCIIsVirtualFunction;
 virPCIStubDriverTypeFromString;
 virPCIStubDriverTypeToString;
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
index eb2d119bf..a9af08797 100644
--- a/src/util/virnetdev.c
+++ b/src/util/virnetdev.c
@@ -1148,6 +1148,21 @@ virNetDevSysfsDeviceFile(char **pf_sysfs_device_link, const char *ifname,
 }
 
 
+static bool
+virNetDevIsPCIDevice(const char *devName)
+{
+    char *vfSysfsDevicePath = NULL;
+    bool ret;
+
+    if (virNetDevSysfsFile(&vfSysfsDevicePath, devName, "device/subsystem") < 0)
+        return false;
+
+    ret = virPCIIsPCIDevice(vfSysfsDevicePath);
+    VIR_FREE(vfSysfsDevicePath);
+    return ret;
+}
+
+
 static virPCIDevicePtr
 virNetDevGetPCIDevice(const char *devName)
 {
@@ -3236,14 +3251,18 @@ virNetDevSwitchdevFeature(const char *ifname,
     if (is_vf == 1 && virNetDevGetPhysicalFunction(ifname, &pfname) < 0)
         goto cleanup;
 
-    pci_device_ptr = pfname ? virNetDevGetPCIDevice(pfname) :
-                              virNetDevGetPCIDevice(ifname);
+    if (pfname == NULL && VIR_STRDUP(pfname, ifname) < 0)
+        goto cleanup;
+
     /* No PCI device, then no feature bit to check/add */
-    if (pci_device_ptr == NULL) {
+    if (!virNetDevIsPCIDevice(pfname)) {
         ret = 0;
         goto cleanup;
     }
 
+    if ((pci_device_ptr = virNetDevGetPCIDevice(pfname)) == NULL)
+        goto cleanup;
+
     if (!(nl_msg = nlmsg_alloc_simple(family_id,
                                       NLM_F_REQUEST | NLM_F_ACK))) {
         virReportOOMError();
diff --git a/src/util/virpci.c b/src/util/virpci.c
index fe57bef32..f22d89cd7 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -2651,6 +2651,38 @@ virPCIGetDeviceAddressFromSysfsLink(const char *device_link)
     return bdf;
 }
 
+/**
+ * virPCIIsPCIDevice:
+ * @device_link: sysfs path for the device
+ *
+ * Returns true if the device specified in @device_link sysfs
+ * path is a PCI device, otherwise returns false.
+ */
+bool
+virPCIIsPCIDevice(const char *device_link)
+{
+    char *device_path = NULL;
+    char *subsys = NULL;
+    bool ret;
+
+    if (!virFileExists(device_link)) {
+        VIR_DEBUG("'%s' does not exist", device_link);
+        return false;
+    }
+
+    device_path = canonicalize_file_name(device_link);
+    if (device_path == NULL) {
+        VIR_DEBUG("Failed to resolve device link '%s'", device_link);
+        return false;
+    }
+
+    subsys = last_component(device_path);
+    ret = STRPREFIX(subsys, "pci");
+
+    VIR_FREE(device_path);
+    return ret;
+}
+
 /**
  * virPCIGetPhysicalFunction:
  * @vf_sysfs_path: sysfs path for the virtual function
diff --git a/src/util/virpci.h b/src/util/virpci.h
index f1fbe39e6..489d8a777 100644
--- a/src/util/virpci.h
+++ b/src/util/virpci.h
@@ -190,6 +190,9 @@ int virPCIDeviceWaitForCleanup(virPCIDevicePtr dev, const char *matcher);
 virPCIDeviceAddressPtr
 virPCIGetDeviceAddressFromSysfsLink(const char *device_link);
 
+bool
+virPCIIsPCIDevice(const char *device_link);
+
 int virPCIGetPhysicalFunction(const char *vf_sysfs_path,
                               virPCIDeviceAddressPtr *pf);
 
-- 
2.15.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH V2] nodedev: Fix failing to parse PCI address for non-PCI network devices
Posted by Erik Skultety 6 years, 2 months ago
On Mon, Jan 08, 2018 at 10:08:59AM -0700, Jim Fehlig wrote:
> Based loosely on a patch from Fei Li <fli@suse.com>.
>
> Commit 8708ca01c added virNetDevSwitchdevFeature() to check if a network
> device has Switchdev capabilities. virNetDevSwitchdevFeature() attempts
> to retrieve the PCI device associated with the network device, ignoring
> non-PCI devices. It does so via the following call chain
>
>   virNetDevSwitchdevFeature()->virNetDevGetPCIDevice()->
>   virPCIGetDeviceAddressFromSysfsLink()
>
> For non-PCI network devices (qeth, Xen vif, etc),
> virPCIGetDeviceAddressFromSysfsLink() will report an error when
> virPCIDeviceAddressParse() fails. virPCIDeviceAddressParse() also
> logs an error. After commit 8708ca01c there are now two errors reported
> for each non-PCI network device even though the errors are harmless.
>
> To avoid changing virPCIGetDeviceAddressFromSysfsLink(),
> virPCIDeviceAddressParse() and related code that has quite a few
> call sites, add a function to check if a network device is a
> PCI device and use it in virNetDevSwitchdevFeature() before attempting
> to retrieve the PCI device.
> ---
>
> Suggestions welcome on a better name for virPCIIsPCIDevice, or even
> a better approach to solving this issue. Currently virPCIIsPCIDevice
> assumes the device is PCI if its subsystem starts with 'pci'. I'd be
> happy to hear if there is a better way to determine if a network
> device is PCI.

Overall I like this approach, but see my comments below. I'm also curious about
some points I raised.

>
>  src/libvirt_private.syms |  1 +
>  src/util/virnetdev.c     | 25 ++++++++++++++++++++++---
>  src/util/virpci.c        | 32 ++++++++++++++++++++++++++++++++
>  src/util/virpci.h        |  3 +++
>  4 files changed, 58 insertions(+), 3 deletions(-)
>
> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index a705fa846..bdf98ded1 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -2458,6 +2458,7 @@ virPCIGetVirtualFunctionInfo;
>  virPCIGetVirtualFunctions;
>  virPCIHeaderTypeFromString;
>  virPCIHeaderTypeToString;
> +virPCIIsPCIDevice;
>  virPCIIsVirtualFunction;
>  virPCIStubDriverTypeFromString;
>  virPCIStubDriverTypeToString;
> diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
> index eb2d119bf..a9af08797 100644
> --- a/src/util/virnetdev.c
> +++ b/src/util/virnetdev.c
> @@ -1148,6 +1148,21 @@ virNetDevSysfsDeviceFile(char **pf_sysfs_device_link, const char *ifname,
>  }
>
>
> +static bool
> +virNetDevIsPCIDevice(const char *devName)
> +{
> +    char *vfSysfsDevicePath = NULL;
> +    bool ret;
> +
> +    if (virNetDevSysfsFile(&vfSysfsDevicePath, devName, "device/subsystem") < 0)
> +        return false;

See below for ^this.

> +
> +    ret = virPCIIsPCIDevice(vfSysfsDevicePath);
> +    VIR_FREE(vfSysfsDevicePath);
> +    return ret;
> +}
> +
> +
>  static virPCIDevicePtr
>  virNetDevGetPCIDevice(const char *devName)
>  {
> @@ -3236,14 +3251,18 @@ virNetDevSwitchdevFeature(const char *ifname,
>      if (is_vf == 1 && virNetDevGetPhysicalFunction(ifname, &pfname) < 0)
>          goto cleanup;
>
> -    pci_device_ptr = pfname ? virNetDevGetPCIDevice(pfname) :
> -                              virNetDevGetPCIDevice(ifname);

I'd probably leave this as is, virNetDevGetPCIDevice already calls the
virNetDevSysfsFile, then you'd have something like virNetDevIsPCIDevice instead
of virPCIisPCIDevice (since those methods already assume a PCI device which we
don't know at this point). Something like this:

virNetDevIsPCIDevice(const char *device)
{
    char *subsystem_link;

    virAsprintf(&subsystem_link, "%s/subsystem", device);

    <body of your virPCIIsPCIDevice>
}

virNetDevGetPCIDevice
{
    if (virNetDevSysfsFile(&vfSysfsDevicePath, devname, "device") < 0)
        goto cleanup;

    if (virNetDevIsPCIDevice(vfSysfsDevicePath) < 0)
        goto cleanup;

        ....
}


> +    if (pfname == NULL && VIR_STRDUP(pfname, ifname) < 0)
> +        goto cleanup;
> +

^This hunk would be unnecessary then.

>      /* No PCI device, then no feature bit to check/add */
> -    if (pci_device_ptr == NULL) {
> +    if (!virNetDevIsPCIDevice(pfname)) {

^This one too...

>          ret = 0;
>          goto cleanup;
>      }
>
> +    if ((pci_device_ptr = virNetDevGetPCIDevice(pfname)) == NULL)
> +        goto cleanup;
> +
>      if (!(nl_msg = nlmsg_alloc_simple(family_id,
>                                        NLM_F_REQUEST | NLM_F_ACK))) {
>          virReportOOMError();
> diff --git a/src/util/virpci.c b/src/util/virpci.c
> index fe57bef32..f22d89cd7 100644
> --- a/src/util/virpci.c
> +++ b/src/util/virpci.c
> @@ -2651,6 +2651,38 @@ virPCIGetDeviceAddressFromSysfsLink(const char *device_link)
>      return bdf;
>  }
>
> +/**
> + * virPCIIsPCIDevice:
> + * @device_link: sysfs path for the device
> + *
> + * Returns true if the device specified in @device_link sysfs
> + * path is a PCI device, otherwise returns false.
> + */
> +bool
> +virPCIIsPCIDevice(const char *device_link)

This could still be part of the virnetdev module and be called
virNetDevIsPCIDevice.

> +{
> +    char *device_path = NULL;
> +    char *subsys = NULL;
> +    bool ret;
> +
> +    if (!virFileExists(device_link)) {
> +        VIR_DEBUG("'%s' does not exist", device_link);

This indicates a failure so it should be and error, but we don't care about the
failure, changing the level of the message to debug looks odd, because you just
don't want it to show up in the log, yet when debugging is ON, this would hint
that we failed to do something for some reason, rather than indicate the
codeflow we took. I don't know, I wouldn't abuse the log levels like this, as I
said, we don't care about the existence of the link, so no message is fine.

> +        return false;
> +    }
> +
> +    device_path = canonicalize_file_name(device_link);
> +    if (device_path == NULL) {
> +        VIR_DEBUG("Failed to resolve device link '%s'", device_link);

^This is an error, we have a link, but we failed to resolve, we do it in a few
places already. Also, you should use virFileResolveLink for these purposes to
get the absolute path.

> +        return false;
> +    }
> +
> +    subsys = last_component(device_path);
> +    ret = STRPREFIX(subsys, "pci");

So, I see 2 options, one would be to somehow call back to the nodedev driver
and ask for the parent device and its capabilities, from which you'd check PCI.
I checked and we don't do anything like that yet and I do believe it would be a
significant non-trivial amount of work to do. Therefore I think that this
approach is acceptable, I'd do it as well, however, I also found this:
https://www.kernel.org/doc/html/v4.13/admin-guide/sysfs-rules.html

If I understand the "device"-link section correctly, no application should rely
on the 'device' link existence once the device itself can be accessed via
/sys/devices, which in my case and probably yours as well, you can. So if I'm
not mistaken, there is a chance that some platforms might have the pci device
under /sys/devices and so the 'device' link within <x>/net/<iface> wouldn't
exist which would cause the logic of your patch to be flawed. Now, if kernel
assumes that every application should build the whole directory chain starting
at /sys/devices ending with your device just to get to this device's parent is
IMHO madness and I do see a usecase in the backwards pointing symlinks.

> +
> +    VIR_FREE(device_path);
> +    return ret;
> +}
> +

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH V2] nodedev: Fix failing to parse PCI address for non-PCI network devices
Posted by Jim Fehlig 6 years, 2 months ago
On 01/09/2018 04:35 AM, Erik Skultety wrote:
> On Mon, Jan 08, 2018 at 10:08:59AM -0700, Jim Fehlig wrote:
>> Based loosely on a patch from Fei Li <fli@suse.com>.
>>
>> Commit 8708ca01c added virNetDevSwitchdevFeature() to check if a network
>> device has Switchdev capabilities. virNetDevSwitchdevFeature() attempts
>> to retrieve the PCI device associated with the network device, ignoring
>> non-PCI devices. It does so via the following call chain
>>
>>    virNetDevSwitchdevFeature()->virNetDevGetPCIDevice()->
>>    virPCIGetDeviceAddressFromSysfsLink()
>>
>> For non-PCI network devices (qeth, Xen vif, etc),
>> virPCIGetDeviceAddressFromSysfsLink() will report an error when
>> virPCIDeviceAddressParse() fails. virPCIDeviceAddressParse() also
>> logs an error. After commit 8708ca01c there are now two errors reported
>> for each non-PCI network device even though the errors are harmless.
>>
>> To avoid changing virPCIGetDeviceAddressFromSysfsLink(),
>> virPCIDeviceAddressParse() and related code that has quite a few
>> call sites, add a function to check if a network device is a
>> PCI device and use it in virNetDevSwitchdevFeature() before attempting
>> to retrieve the PCI device.
>> ---
>>
>> Suggestions welcome on a better name for virPCIIsPCIDevice, or even
>> a better approach to solving this issue. Currently virPCIIsPCIDevice
>> assumes the device is PCI if its subsystem starts with 'pci'. I'd be
>> happy to hear if there is a better way to determine if a network
>> device is PCI.
> 
> Overall I like this approach, but see my comments below. I'm also curious about
> some points I raised.
> 
>>
>>   src/libvirt_private.syms |  1 +
>>   src/util/virnetdev.c     | 25 ++++++++++++++++++++++---
>>   src/util/virpci.c        | 32 ++++++++++++++++++++++++++++++++
>>   src/util/virpci.h        |  3 +++
>>   4 files changed, 58 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
>> index a705fa846..bdf98ded1 100644
>> --- a/src/libvirt_private.syms
>> +++ b/src/libvirt_private.syms
>> @@ -2458,6 +2458,7 @@ virPCIGetVirtualFunctionInfo;
>>   virPCIGetVirtualFunctions;
>>   virPCIHeaderTypeFromString;
>>   virPCIHeaderTypeToString;
>> +virPCIIsPCIDevice;
>>   virPCIIsVirtualFunction;
>>   virPCIStubDriverTypeFromString;
>>   virPCIStubDriverTypeToString;
>> diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
>> index eb2d119bf..a9af08797 100644
>> --- a/src/util/virnetdev.c
>> +++ b/src/util/virnetdev.c
>> @@ -1148,6 +1148,21 @@ virNetDevSysfsDeviceFile(char **pf_sysfs_device_link, const char *ifname,
>>   }
>>
>>
>> +static bool
>> +virNetDevIsPCIDevice(const char *devName)
>> +{
>> +    char *vfSysfsDevicePath = NULL;
>> +    bool ret;
>> +
>> +    if (virNetDevSysfsFile(&vfSysfsDevicePath, devName, "device/subsystem") < 0)
>> +        return false;
> 
> See below for ^this.
> 
>> +
>> +    ret = virPCIIsPCIDevice(vfSysfsDevicePath);
>> +    VIR_FREE(vfSysfsDevicePath);
>> +    return ret;
>> +}
>> +
>> +
>>   static virPCIDevicePtr
>>   virNetDevGetPCIDevice(const char *devName)
>>   {
>> @@ -3236,14 +3251,18 @@ virNetDevSwitchdevFeature(const char *ifname,
>>       if (is_vf == 1 && virNetDevGetPhysicalFunction(ifname, &pfname) < 0)
>>           goto cleanup;
>>
>> -    pci_device_ptr = pfname ? virNetDevGetPCIDevice(pfname) :
>> -                              virNetDevGetPCIDevice(ifname);
> 
> I'd probably leave this as is, virNetDevGetPCIDevice already calls the
> virNetDevSysfsFile, then you'd have something like virNetDevIsPCIDevice instead
> of virPCIisPCIDevice (since those methods already assume a PCI device which we
> don't know at this point). Something like this:
> 
> virNetDevIsPCIDevice(const char *device)
> {
>      char *subsystem_link;
> 
>      virAsprintf(&subsystem_link, "%s/subsystem", device);
> 
>      <body of your virPCIIsPCIDevice>
> }
> 
> virNetDevGetPCIDevice
> {
>      if (virNetDevSysfsFile(&vfSysfsDevicePath, devname, "device") < 0)
>          goto cleanup;
> 
>      if (virNetDevIsPCIDevice(vfSysfsDevicePath) < 0)
>          goto cleanup;
> 
>          ....
> }

Yes, good point about functions in virpci.c already assuming a PCI device. I've 
changed the code as you suggested.

> 
>> +    if (pfname == NULL && VIR_STRDUP(pfname, ifname) < 0)
>> +        goto cleanup;
>> +
> 
> ^This hunk would be unnecessary then.
> 
>>       /* No PCI device, then no feature bit to check/add */
>> -    if (pci_device_ptr == NULL) {
>> +    if (!virNetDevIsPCIDevice(pfname)) {
> 
> ^This one too...
> 
>>           ret = 0;
>>           goto cleanup;
>>       }
>>
>> +    if ((pci_device_ptr = virNetDevGetPCIDevice(pfname)) == NULL)
>> +        goto cleanup;
>> +
>>       if (!(nl_msg = nlmsg_alloc_simple(family_id,
>>                                         NLM_F_REQUEST | NLM_F_ACK))) {
>>           virReportOOMError();
>> diff --git a/src/util/virpci.c b/src/util/virpci.c
>> index fe57bef32..f22d89cd7 100644
>> --- a/src/util/virpci.c
>> +++ b/src/util/virpci.c
>> @@ -2651,6 +2651,38 @@ virPCIGetDeviceAddressFromSysfsLink(const char *device_link)
>>       return bdf;
>>   }
>>
>> +/**
>> + * virPCIIsPCIDevice:
>> + * @device_link: sysfs path for the device
>> + *
>> + * Returns true if the device specified in @device_link sysfs
>> + * path is a PCI device, otherwise returns false.
>> + */
>> +bool
>> +virPCIIsPCIDevice(const char *device_link)
> 
> This could still be part of the virnetdev module and be called
> virNetDevIsPCIDevice.
> 
>> +{
>> +    char *device_path = NULL;
>> +    char *subsys = NULL;
>> +    bool ret;
>> +
>> +    if (!virFileExists(device_link)) {
>> +        VIR_DEBUG("'%s' does not exist", device_link);
> 
> This indicates a failure so it should be and error, but we don't care about the
> failure, changing the level of the message to debug looks odd, because you just
> don't want it to show up in the log, yet when debugging is ON, this would hint
> that we failed to do something for some reason, rather than indicate the
> codeflow we took. I don't know, I wouldn't abuse the log levels like this, as I
> said, we don't care about the existence of the link, so no message is fine.
> 
>> +        return false;
>> +    }
>> +
>> +    device_path = canonicalize_file_name(device_link);
>> +    if (device_path == NULL) {
>> +        VIR_DEBUG("Failed to resolve device link '%s'", device_link);
> 
> ^This is an error, we have a link, but we failed to resolve, we do it in a few
> places already. Also, you should use virFileResolveLink for these purposes to
> get the absolute path.
> 
>> +        return false;
>> +    }
>> +
>> +    subsys = last_component(device_path);
>> +    ret = STRPREFIX(subsys, "pci");
> 
> So, I see 2 options, one would be to somehow call back to the nodedev driver
> and ask for the parent device and its capabilities, from which you'd check PCI.
> I checked and we don't do anything like that yet and I do believe it would be a
> significant non-trivial amount of work to do. Therefore I think that this
> approach is acceptable, I'd do it as well, however, I also found this:
> https://www.kernel.org/doc/html/v4.13/admin-guide/sysfs-rules.html

Ah, thanks for the pointer to refresh my memory on interacting with sysfs. I 
think I got it right in V3

https://www.redhat.com/archives/libvir-list/2018-January/msg00562.html

Regards,
Jim

> If I understand the "device"-link section correctly, no application should rely
> on the 'device' link existence once the device itself can be accessed via
> /sys/devices, which in my case and probably yours as well, you can. So if I'm
> not mistaken, there is a chance that some platforms might have the pci device
> under /sys/devices and so the 'device' link within <x>/net/<iface> wouldn't
> exist which would cause the logic of your patch to be flawed. Now, if kernel
> assumes that every application should build the whole directory chain starting
> at /sys/devices ending with your device just to get to this device's parent is
> IMHO madness and I do see a usecase in the backwards pointing symlinks.
> 
>> +
>> +    VIR_FREE(device_path);
>> +    return ret;
>> +}
>> +

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list