[PATCH] util: basic support for vendor-specific vfio drivers

Laine Stump posted 1 patch 1 year, 8 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20220801040222.675916-1-laine@redhat.com
meson.build                 |  1 +
src/hypervisor/virhostdev.c | 26 ++++-------
src/util/virpci.c           | 90 ++++++++++++++++++++++++++++++++++---
src/util/virpci.h           |  3 ++
4 files changed, 97 insertions(+), 23 deletions(-)
[PATCH] util: basic support for vendor-specific vfio drivers
Posted by Laine Stump 1 year, 8 months ago
Before a PCI device can be assigned to a guest with VFIO, that device
must be bound to the vfio-pci driver rather than to the device's
normal driver. The vfio-pci driver provides APIs that permit QEMU to
perform all the necessary operations to make the device accessible to
the guest.

There has been kernel work recently to support vendor/device-specific
VFIO drivers that provide the basic vfio-pci driver functionality
while adding support for device-specific operations (for example these
device-specific drivers are planned to support live migration of
certain devices). All that will be needed to make this functionality
available will be to bind the new vendor-specific driver to the device
(rather than the generic vfio-pci driver, which will continue to work
just without the extra functionality).

But until now libvirt has required that all PCI devices being assigned
to a guest with VFIO specifically have the "vfio-pci" driver bound to
the device. So even if the user manually binds a shiny new
vendor-specific driver to the device (and puts "managed='no'" in the
config to prevent libvirt from changing that), libvirt will just fail
during startup of the guest (or during hotplug) because the driver
bound to the device isn't named exactly "vfio-pci".

Fortunately these new vendor/device-specific drivers can be easily
identified as being "vfio-pci + extra stuff" - all that's needed is to
look at the output of the "modinfo $driver_name" command to see if
"vfio_pci" is in the alias list for the driver.

That's what this patch does. When libvirt checks the driver bound to a
device (either to decide if it needs to bind to a different driver or
perform some other operation, or if the current driver is acceptable
as-is), if the driver isn't specifically "vfio-pci", then it will look
at the output of modinfo for the driver that *is* bound to the device;
if modinfo shows vfio_pci as an alias for that device, then we'll
behave as if the driver was exactly "vfio-pci".

The effect of this patch is that users will now be able to pre-setup a
device to be bound to a vendor-specific driver, then put
"managed='no'" in the config and libvirt will allow that driver.

What this patch does *not* do is handle automatically determining the
proper/best vendor-specific driver and binding to it in the case of
"managed='yes'". This will be implemented later when there is a widely
available driver / device combo we can use for testing. This initial
simple patch is just something simple that will permit initial testing
of the new drivers' functionality.

(I personally had to add an extra patch playing with driver names to
my build just to test that everything was working as expected; that's
okay for a patch as simple as this, but wouldn't be acceptable testing
for anything more complex.)

Signed-off-by: Laine Stump <laine@redhat.com>
---
 meson.build                 |  1 +
 src/hypervisor/virhostdev.c | 26 ++++-------
 src/util/virpci.c           | 90 ++++++++++++++++++++++++++++++++++---
 src/util/virpci.h           |  3 ++
 4 files changed, 97 insertions(+), 23 deletions(-)

diff --git a/meson.build b/meson.build
index de59b1be9c..9d96eb3ee3 100644
--- a/meson.build
+++ b/meson.build
@@ -822,6 +822,7 @@ optional_programs = [
   'iscsiadm',
   'mdevctl',
   'mm-ctl',
+  'modinfo',
   'modprobe',
   'ovs-vsctl',
   'pdwtags',
diff --git a/src/hypervisor/virhostdev.c b/src/hypervisor/virhostdev.c
index c0ce867596..15b35fa75e 100644
--- a/src/hypervisor/virhostdev.c
+++ b/src/hypervisor/virhostdev.c
@@ -747,9 +747,8 @@ virHostdevPreparePCIDevicesImpl(virHostdevManager *mgr,
                                    mgr->inactivePCIHostdevs) < 0)
                 goto reattachdevs;
         } else {
-            g_autofree char *driverPath = NULL;
-            g_autofree char *driverName = NULL;
-            int stub;
+            g_autofree char *drvName = NULL;
+            virPCIStubDriver drvType;
 
             /* Unmanaged devices should already have been marked as
              * inactive: if that's the case, we can simply move on */
@@ -769,18 +768,14 @@ virHostdevPreparePCIDevicesImpl(virHostdevManager *mgr,
              *       information about active / inactive device across
              *       daemon restarts has been implemented */
 
-            if (virPCIDeviceGetDriverPathAndName(pci,
-                                                 &driverPath, &driverName) < 0)
+            if (virPCIDeviceGetDriverNameAndType(pci, &drvName, &drvType) < 0)
                 goto reattachdevs;
 
-            stub = virPCIStubDriverTypeFromString(driverName);
-
-            if (stub > VIR_PCI_STUB_DRIVER_NONE &&
-                stub < VIR_PCI_STUB_DRIVER_LAST) {
+            if (drvType > VIR_PCI_STUB_DRIVER_NONE) {
 
                 /* The device is bound to a known stub driver: store this
                  * information and add a copy to the inactive list */
-                virPCIDeviceSetStubDriver(pci, stub);
+                virPCIDeviceSetStubDriver(pci, drvType);
 
                 VIR_DEBUG("Adding PCI device %s to inactive list",
                           virPCIDeviceGetName(pci));
@@ -2292,18 +2287,13 @@ virHostdevPrepareOneNVMeDevice(virHostdevManager *hostdev_mgr,
     /* Let's check if all PCI devices are NVMe disks. */
     for (i = 0; i < virPCIDeviceListCount(pciDevices); i++) {
         virPCIDevice *pci = virPCIDeviceListGet(pciDevices, i);
-        g_autofree char *drvPath = NULL;
         g_autofree char *drvName = NULL;
-        int stub = VIR_PCI_STUB_DRIVER_NONE;
+        virPCIStubDriver drvType;
 
-        if (virPCIDeviceGetDriverPathAndName(pci, &drvPath, &drvName) < 0)
+        if (virPCIDeviceGetDriverNameAndType(pci, &drvName, &drvType) < 0)
             goto cleanup;
 
-        if (drvName)
-            stub = virPCIStubDriverTypeFromString(drvName);
-
-        if (stub == VIR_PCI_STUB_DRIVER_VFIO ||
-            STREQ_NULLABLE(drvName, "nvme"))
+        if (drvType == VIR_PCI_STUB_DRIVER_VFIO || STREQ_NULLABLE(drvName, "nvme"))
             continue;
 
         VIR_WARN("Suspicious NVMe disk assignment. PCI device "
diff --git a/src/util/virpci.c b/src/util/virpci.c
index 7800966963..8b714d3ddc 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -37,6 +37,7 @@
 #include "virstring.h"
 #include "viralloc.h"
 #include "virpcivpd.h"
+#include "vircommand.h"
 
 VIR_LOG_INIT("util.pci");
 
@@ -277,6 +278,84 @@ virPCIDeviceGetDriverPathAndName(virPCIDevice *dev, char **path, char **name)
 }
 
 
+/**
+ * virPCIDeviceGetDriverNameAndType:
+ * @dev: virPCIDevice object to examine
+ * @drvName: returns name of driver bound to this device (if any)
+ * @drvType: returns type of driver if it is a known stub driver type
+ *
+ * Find the name of the driver bound to @dev (if any) and the type of
+ * the driver if it is a known/recognized "stub" driver (based on the
+ * name). If the name doesn't match one of the virPCIStubDrivers
+ * exactly, check the output of "modinfo vfio-pci" to see if
+ * "vfio_pci" is included in the driver's list of aliases; if so, then
+ * this driver has all the functionality of the basic vfio_pci driver,
+ * so it should be considered of the type VIR_PCI_STUB_DRIVER_VFIO.
+ *
+ * Return 0 on success, -1 on failure. If -1 is returned, then an error
+ * message has been logged.
+ */
+int
+virPCIDeviceGetDriverNameAndType(virPCIDevice *dev,
+                                 char **drvName,
+                                 virPCIStubDriver *drvType)
+{
+    g_autofree char *drvPath = NULL;
+    g_autoptr(virCommand) cmd = NULL;
+    g_autofree char *output = NULL;
+    g_autoptr(GRegex) regex = NULL;
+    g_autoptr(GError) err = NULL;
+    g_autoptr(GMatchInfo) info = NULL;
+    int exit;
+    int tmpType;
+
+    if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0)
+        return -1;
+
+    if (!*drvName) {
+        *drvType = VIR_PCI_STUB_DRIVER_NONE;
+        return 0;
+    }
+
+    tmpType = virPCIStubDriverTypeFromString(*drvName);
+
+    if (tmpType > VIR_PCI_STUB_DRIVER_NONE) {
+        *drvType = tmpType;
+        return 0; /* exact match of a known driver name (or no name) */
+    }
+
+    /* Check the output of "modinfo $drvName" to see if it has
+     * "vfio_pci" as an alias. If it does, then this driver should
+     * also be considered as a vfio-pci driver, because it implements
+     * all the functionality of the basic vfio-pci (plus additional
+     * device-specific stuff).
+     */
+
+    cmd = virCommandNewArgList(MODINFO, *drvName, NULL);
+    virCommandSetOutputBuffer(cmd, &output);
+    if (virCommandRun(cmd, &exit) < 0)
+        return -1;
+
+    regex = g_regex_new("^alias: +vfio_pci:", G_REGEX_MULTILINE, 0, &err);
+    if (!regex) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Failed to compile regex %s"), err->message);
+        return -1;
+    }
+
+    g_regex_match(regex, output, 0, &info);
+    if (g_match_info_matches(info)) {
+        VIR_DEBUG("Driver %s is a vfio_pci driver", *drvName);
+        *drvType = VIR_PCI_STUB_DRIVER_VFIO;
+    } else {
+        VIR_DEBUG("Driver %s is NOT a vfio_pci driver", *drvName);
+        *drvType = VIR_PCI_STUB_DRIVER_NONE;
+    }
+
+    return 0;
+}
+
+
 static int
 virPCIDeviceConfigOpenInternal(virPCIDevice *dev, bool readonly, bool fatal)
 {
@@ -1004,8 +1083,8 @@ virPCIDeviceReset(virPCIDevice *dev,
                   virPCIDeviceList *activeDevs,
                   virPCIDeviceList *inactiveDevs)
 {
-    g_autofree char *drvPath = NULL;
     g_autofree char *drvName = NULL;
+    virPCIStubDriver drvType;
     int ret = -1;
     int fd = -1;
     int hdrType = -1;
@@ -1032,15 +1111,16 @@ virPCIDeviceReset(virPCIDevice *dev,
      * reset it whenever appropriate, so doing it ourselves would just
      * be redundant.
      */
-    if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, &drvName) < 0)
+    if (virPCIDeviceGetDriverNameAndType(dev, &drvName, &drvType) < 0)
         goto cleanup;
 
-    if (virPCIStubDriverTypeFromString(drvName) == VIR_PCI_STUB_DRIVER_VFIO) {
-        VIR_DEBUG("Device %s is bound to vfio-pci - skip reset",
-                  dev->name);
+    if (drvType == VIR_PCI_STUB_DRIVER_VFIO) {
+
+        VIR_DEBUG("Device %s is bound to %s - skip reset", dev->name, drvName);
         ret = 0;
         goto cleanup;
     }
+
     VIR_DEBUG("Resetting device %s", dev->name);
 
     if ((fd = virPCIDeviceConfigOpenWrite(dev)) < 0)
diff --git a/src/util/virpci.h b/src/util/virpci.h
index 4d9193f24e..0532b90f90 100644
--- a/src/util/virpci.h
+++ b/src/util/virpci.h
@@ -280,6 +280,9 @@ int virPCIDeviceRebind(virPCIDevice *dev);
 int virPCIDeviceGetDriverPathAndName(virPCIDevice *dev,
                                      char **path,
                                      char **name);
+int virPCIDeviceGetDriverNameAndType(virPCIDevice *dev,
+                                     char **drvName,
+                                     virPCIStubDriver *drvType);
 
 int virPCIDeviceIsPCIExpress(virPCIDevice *dev);
 int virPCIDeviceHasPCIExpressLink(virPCIDevice *dev);
-- 
2.35.3
Re: [PATCH] util: basic support for vendor-specific vfio drivers
Posted by Erik Skultety 1 year, 8 months ago
On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote:
> Before a PCI device can be assigned to a guest with VFIO, that device
> must be bound to the vfio-pci driver rather than to the device's
> normal driver. The vfio-pci driver provides APIs that permit QEMU to
> perform all the necessary operations to make the device accessible to
> the guest.
> 
> There has been kernel work recently to support vendor/device-specific
> VFIO drivers that provide the basic vfio-pci driver functionality
> while adding support for device-specific operations (for example these
> device-specific drivers are planned to support live migration of
> certain devices). All that will be needed to make this functionality
> available will be to bind the new vendor-specific driver to the device
> (rather than the generic vfio-pci driver, which will continue to work
> just without the extra functionality).
> 
> But until now libvirt has required that all PCI devices being assigned
> to a guest with VFIO specifically have the "vfio-pci" driver bound to
> the device. So even if the user manually binds a shiny new
> vendor-specific driver to the device (and puts "managed='no'" in the
> config to prevent libvirt from changing that), libvirt will just fail
> during startup of the guest (or during hotplug) because the driver
> bound to the device isn't named exactly "vfio-pci".
> 
> Fortunately these new vendor/device-specific drivers can be easily
> identified as being "vfio-pci + extra stuff" - all that's needed is to
> look at the output of the "modinfo $driver_name" command to see if
> "vfio_pci" is in the alias list for the driver.
> 
> That's what this patch does. When libvirt checks the driver bound to a
> device (either to decide if it needs to bind to a different driver or
> perform some other operation, or if the current driver is acceptable
> as-is), if the driver isn't specifically "vfio-pci", then it will look
> at the output of modinfo for the driver that *is* bound to the device;
> if modinfo shows vfio_pci as an alias for that device, then we'll
> behave as if the driver was exactly "vfio-pci".

Since you say that the vendor/device-specific drivers does each of such drivers
implement the base vfio-pci functionality or they simply call into the base
driver? The reason why I'm asking is that if each of the vendor-specific
drivers depend on the vfio-pci module to be loaded as well, then reading
/proc/modules should suffice as vfio-pci should be listed right next to the
vendor-specific one. What am I missing?

The 'alias' field is optional so do we have any support guarantees from the
vendors that the it will always be filled in correctly? I mean you surely
handle that case in the code, but once we start supporting this there's no way
back and we already know how painful it can be to convince the vendors to
follow some kind of standard so that we don't need to maintain several code
paths based on a vendor-matrix.

...

> +int
> +virPCIDeviceGetDriverNameAndType(virPCIDevice *dev,
> +                                 char **drvName,
> +                                 virPCIStubDriver *drvType)
> +{
> +    g_autofree char *drvPath = NULL;
> +    g_autoptr(virCommand) cmd = NULL;
> +    g_autofree char *output = NULL;
> +    g_autoptr(GRegex) regex = NULL;
> +    g_autoptr(GError) err = NULL;
> +    g_autoptr(GMatchInfo) info = NULL;
> +    int exit;
> +    int tmpType;
> +
> +    if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0)
> +        return -1;
> +
> +    if (!*drvName) {
> +        *drvType = VIR_PCI_STUB_DRIVER_NONE;
> +        return 0;
> +    }
> +
> +    tmpType = virPCIStubDriverTypeFromString(*drvName);
> +
> +    if (tmpType > VIR_PCI_STUB_DRIVER_NONE) {
> +        *drvType = tmpType;
> +        return 0; /* exact match of a known driver name (or no name) */
> +    }
> +
> +    /* Check the output of "modinfo $drvName" to see if it has
> +     * "vfio_pci" as an alias. If it does, then this driver should
> +     * also be considered as a vfio-pci driver, because it implements
> +     * all the functionality of the basic vfio-pci (plus additional
> +     * device-specific stuff).
> +     */

Instead of calling an external program and then grepping its output which
technically could change in the future, wouldn't it be better if we read
/lib/modules/`uname -r`/modules.alias and filtered whatever line had the
vfio-pci' substring and compared the module name with the user-provided device
driver?
If not then I think you should pass '-F alias' to the command to speed up the
regex just a tiny bit.

Regards,
Erik
Re: [PATCH] util: basic support for vendor-specific vfio drivers
Posted by Laine Stump 1 year, 8 months ago
On 8/1/22 7:58 AM, Erik Skultety wrote:
> On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote:
>> Before a PCI device can be assigned to a guest with VFIO, that device
>> must be bound to the vfio-pci driver rather than to the device's
>> normal driver. The vfio-pci driver provides APIs that permit QEMU to
>> perform all the necessary operations to make the device accessible to
>> the guest.
>>
>> There has been kernel work recently to support vendor/device-specific
>> VFIO drivers that provide the basic vfio-pci driver functionality
>> while adding support for device-specific operations (for example these
>> device-specific drivers are planned to support live migration of
>> certain devices). All that will be needed to make this functionality
>> available will be to bind the new vendor-specific driver to the device
>> (rather than the generic vfio-pci driver, which will continue to work
>> just without the extra functionality).
>>
>> But until now libvirt has required that all PCI devices being assigned
>> to a guest with VFIO specifically have the "vfio-pci" driver bound to
>> the device. So even if the user manually binds a shiny new
>> vendor-specific driver to the device (and puts "managed='no'" in the
>> config to prevent libvirt from changing that), libvirt will just fail
>> during startup of the guest (or during hotplug) because the driver
>> bound to the device isn't named exactly "vfio-pci".
>>
>> Fortunately these new vendor/device-specific drivers can be easily
>> identified as being "vfio-pci + extra stuff" - all that's needed is to
>> look at the output of the "modinfo $driver_name" command to see if
>> "vfio_pci" is in the alias list for the driver.
>>
>> That's what this patch does. When libvirt checks the driver bound to a
>> device (either to decide if it needs to bind to a different driver or
>> perform some other operation, or if the current driver is acceptable
>> as-is), if the driver isn't specifically "vfio-pci", then it will look
>> at the output of modinfo for the driver that *is* bound to the device;
>> if modinfo shows vfio_pci as an alias for that device, then we'll
>> behave as if the driver was exactly "vfio-pci".
> 
> Since you say that the vendor/device-specific drivers does each of such drivers
> implement the base vfio-pci functionality or they simply call into the base
> driver? The reason why I'm asking is that if each of the vendor-specific
> drivers depend on the vfio-pci module to be loaded as well, then reading
> /proc/modules should suffice as vfio-pci should be listed right next to the
> vendor-specific one. What am I missing?
I don't know the definitive answer to that, as I have no example of a 
working vendor-specific driver to look at and only know about the kernel 
work going on second-hand from Alex. It looks like even the vfio_pci 
driver itself depends on other presumably lower level vfio-* modules (it 
directly uses vfio_pci_core, which in turn uses vfio and vfio_vifqfd), 
so possibly these new drivers would be depending on one or more of those 
lower level modules rather than vfio_pci. Also I would imagine it would 
be possible for other drivers to also depend on the vfio-pci driver 
while not themselves being a vfio driver.

> 
> The 'alias' field is optional so do we have any support guarantees from the
> vendors that the it will always be filled in correctly? I mean you surely
> handle that case in the code, but once we start supporting this there's no way
> back and we already know how painful it can be to convince the vendors to
> follow some kind of standard so that we don't need to maintain several code
> paths based on a vendor-matrix.

The aliases are what is used to determine the "best" vfio driver for a 
particular device, so I don't think it would be possible for a driver to 
not implement it, and the method I've used here to determine if a driver 
is a vfio driver was recommended by Alex after a couple of discussions 
on the subject.

> 
> ...
> 
>> +int
>> +virPCIDeviceGetDriverNameAndType(virPCIDevice *dev,
>> +                                 char **drvName,
>> +                                 virPCIStubDriver *drvType)
>> +{
>> +    g_autofree char *drvPath = NULL;
>> +    g_autoptr(virCommand) cmd = NULL;
>> +    g_autofree char *output = NULL;
>> +    g_autoptr(GRegex) regex = NULL;
>> +    g_autoptr(GError) err = NULL;
>> +    g_autoptr(GMatchInfo) info = NULL;
>> +    int exit;
>> +    int tmpType;
>> +
>> +    if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0)
>> +        return -1;
>> +
>> +    if (!*drvName) {
>> +        *drvType = VIR_PCI_STUB_DRIVER_NONE;
>> +        return 0;
>> +    }
>> +
>> +    tmpType = virPCIStubDriverTypeFromString(*drvName);
>> +
>> +    if (tmpType > VIR_PCI_STUB_DRIVER_NONE) {
>> +        *drvType = tmpType;
>> +        return 0; /* exact match of a known driver name (or no name) */
>> +    }
>> +
>> +    /* Check the output of "modinfo $drvName" to see if it has
>> +     * "vfio_pci" as an alias. If it does, then this driver should
>> +     * also be considered as a vfio-pci driver, because it implements
>> +     * all the functionality of the basic vfio-pci (plus additional
>> +     * device-specific stuff).
>> +     */
> 
> Instead of calling an external program and then grepping its output which
> technically could change in the future, wouldn't it be better if we read
> /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
> vfio-pci' substring and compared the module name with the user-provided device
> driver?

Again, although I was hesistant about calling an external command, and 
asked if there was something simpler, Alex still suggested modinfo, so 
I'll let him answer that. Alex?

(Also, although the format of the output of "uname -r" is pretty much 
written in stone, you're still running an external command :-))

> If not then I think you should pass '-F alias' to the command to speed up the
> regex just a tiny bit.

Sure. That sounds fine to me.
Re: [PATCH] util: basic support for vendor-specific vfio drivers
Posted by Daniel P. Berrangé 1 year, 8 months ago
On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote:
> On 8/1/22 7:58 AM, Erik Skultety wrote:
> > 
> > Instead of calling an external program and then grepping its output which
> > technically could change in the future, wouldn't it be better if we read
> > /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
> > vfio-pci' substring and compared the module name with the user-provided device
> > driver?
> 
> Again, although I was hesistant about calling an external command, and asked
> if there was something simpler, Alex still suggested modinfo, so I'll let
> him answer that. Alex?
> 
> (Also, although the format of the output of "uname -r" is pretty much
> written in stone, you're still running an external command :-))

You wouldn't  actually call 'uname -r', you'd invoke uname(2) function
and use the 'release' field in 'struct utsname'.

I'd favour reading modules.alias directly over invoking modinfo for
sure, though I'd be even more in favour of the kernel just exposing
the sysfs attribute and in the meanwhile just hardcoding the only 2
driver names that exist so far.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [PATCH] util: basic support for vendor-specific vfio drivers
Posted by Laine Stump 1 year, 8 months ago
On 8/5/22 5:40 AM, Daniel P. Berrangé wrote:
> On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote:
>> On 8/1/22 7:58 AM, Erik Skultety wrote:
>>>
>>> Instead of calling an external program and then grepping its output which
>>> technically could change in the future, wouldn't it be better if we read
>>> /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
>>> vfio-pci' substring and compared the module name with the user-provided device
>>> driver?
>>
>> Again, although I was hesistant about calling an external command, and asked
>> if there was something simpler, Alex still suggested modinfo, so I'll let
>> him answer that. Alex?
>>
>> (Also, although the format of the output of "uname -r" is pretty much
>> written in stone, you're still running an external command :-))
> 
> You wouldn't  actually call 'uname -r', you'd invoke uname(2) function
> and use the 'release' field in 'struct utsname'.

Yeah, I wasn't thinking clearly when I said that :-P

> 
> I'd favour reading modules.alias directly over invoking modinfo for
> sure, though I'd be even more in favour of the kernel just exposing
> the sysfs attribute and in the meanwhile just hardcoding the only 2
> driver names that exist so far.

The problem with hardcoding the 2 existing driver names is that it 
wouldn't do any good to anyone developing a new driver, and part of the 
aim of doing this is to make it possible for developers to test their 
new drivers using libvirt (and management systems based on libvirt).

Re: [PATCH] util: basic support for vendor-specific vfio drivers
Posted by Daniel P. Berrangé 1 year, 8 months ago
On Fri, Aug 05, 2022 at 11:46:17AM -0400, Laine Stump wrote:
> On 8/5/22 5:40 AM, Daniel P. Berrangé wrote:
> > On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote:
> > > On 8/1/22 7:58 AM, Erik Skultety wrote:
> > > > 
> > > > Instead of calling an external program and then grepping its output which
> > > > technically could change in the future, wouldn't it be better if we read
> > > > /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
> > > > vfio-pci' substring and compared the module name with the user-provided device
> > > > driver?
> > > 
> > > Again, although I was hesistant about calling an external command, and asked
> > > if there was something simpler, Alex still suggested modinfo, so I'll let
> > > him answer that. Alex?
> > > 
> > > (Also, although the format of the output of "uname -r" is pretty much
> > > written in stone, you're still running an external command :-))
> > 
> > You wouldn't  actually call 'uname -r', you'd invoke uname(2) function
> > and use the 'release' field in 'struct utsname'.
> 
> Yeah, I wasn't thinking clearly when I said that :-P
> 
> > 
> > I'd favour reading modules.alias directly over invoking modinfo for
> > sure, though I'd be even more in favour of the kernel just exposing
> > the sysfs attribute and in the meanwhile just hardcoding the only 2
> > driver names that exist so far.
> 
> The problem with hardcoding the 2 existing driver names is that it wouldn't
> do any good to anyone developing a new driver, and part of the aim of doing
> this is to make it possible for developers to test their new drivers using
> libvirt (and management systems based on libvirt).

I'm only suggesting hardcoding the driver names, *if* the kernel folks
agree to expose the sysfs directory.  Anyone developing new drivers
is unlikely to have their drivers merged before this new sysfs dir
is added, so it is not a significant enough real world blocker for them
for us to worry about. Best to focus on what the best long term
approach is, and not worry about problems that will only exist for
a couple of kernel releases today.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH] util: basic support for vendor-specific vfio drivers
Posted by Erik Skultety 1 year, 8 months ago
Putting Alex on CC since I don't see him there:
+alex.williamson@redhat.com

On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote:
> On 8/1/22 7:58 AM, Erik Skultety wrote:
> > On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote:
> > > Before a PCI device can be assigned to a guest with VFIO, that device
> > > must be bound to the vfio-pci driver rather than to the device's
> > > normal driver. The vfio-pci driver provides APIs that permit QEMU to
> > > perform all the necessary operations to make the device accessible to
> > > the guest.
> > > 
> > > There has been kernel work recently to support vendor/device-specific
> > > VFIO drivers that provide the basic vfio-pci driver functionality
> > > while adding support for device-specific operations (for example these
> > > device-specific drivers are planned to support live migration of
> > > certain devices). All that will be needed to make this functionality
> > > available will be to bind the new vendor-specific driver to the device
> > > (rather than the generic vfio-pci driver, which will continue to work
> > > just without the extra functionality).
> > > 
> > > But until now libvirt has required that all PCI devices being assigned
> > > to a guest with VFIO specifically have the "vfio-pci" driver bound to
> > > the device. So even if the user manually binds a shiny new
> > > vendor-specific driver to the device (and puts "managed='no'" in the
> > > config to prevent libvirt from changing that), libvirt will just fail
> > > during startup of the guest (or during hotplug) because the driver
> > > bound to the device isn't named exactly "vfio-pci".
> > > 
> > > Fortunately these new vendor/device-specific drivers can be easily
> > > identified as being "vfio-pci + extra stuff" - all that's needed is to
> > > look at the output of the "modinfo $driver_name" command to see if
> > > "vfio_pci" is in the alias list for the driver.
> > > 
> > > That's what this patch does. When libvirt checks the driver bound to a
> > > device (either to decide if it needs to bind to a different driver or
> > > perform some other operation, or if the current driver is acceptable
> > > as-is), if the driver isn't specifically "vfio-pci", then it will look
> > > at the output of modinfo for the driver that *is* bound to the device;
> > > if modinfo shows vfio_pci as an alias for that device, then we'll
> > > behave as if the driver was exactly "vfio-pci".
> > 
> > Since you say that the vendor/device-specific drivers does each of such drivers
> > implement the base vfio-pci functionality or they simply call into the base
> > driver? The reason why I'm asking is that if each of the vendor-specific
> > drivers depend on the vfio-pci module to be loaded as well, then reading
> > /proc/modules should suffice as vfio-pci should be listed right next to the
> > vendor-specific one. What am I missing?
> I don't know the definitive answer to that, as I have no example of a
> working vendor-specific driver to look at and only know about the kernel
> work going on second-hand from Alex. It looks like even the vfio_pci driver
> itself depends on other presumably lower level vfio-* modules (it directly
> uses vfio_pci_core, which in turn uses vfio and vfio_vifqfd), so possibly
> these new drivers would be depending on one or more of those lower level
> modules rather than vfio_pci. Also I would imagine it would be possible for
> other drivers to also depend on the vfio-pci driver while not themselves
> being a vfio driver.
> 
> > 
> > The 'alias' field is optional so do we have any support guarantees from the
> > vendors that the it will always be filled in correctly? I mean you surely
> > handle that case in the code, but once we start supporting this there's no way
> > back and we already know how painful it can be to convince the vendors to
> > follow some kind of standard so that we don't need to maintain several code
> > paths based on a vendor-matrix.
> 
> The aliases are what is used to determine the "best" vfio driver for a
> particular device, so I don't think it would be possible for a driver to not
> implement it, and the method I've used here to determine if a driver is a
> vfio driver was recommended by Alex after a couple of discussions on the
> subject.
> 
> > 
> > ...
> > 
> > > +int
> > > +virPCIDeviceGetDriverNameAndType(virPCIDevice *dev,
> > > +                                 char **drvName,
> > > +                                 virPCIStubDriver *drvType)
> > > +{
> > > +    g_autofree char *drvPath = NULL;
> > > +    g_autoptr(virCommand) cmd = NULL;
> > > +    g_autofree char *output = NULL;
> > > +    g_autoptr(GRegex) regex = NULL;
> > > +    g_autoptr(GError) err = NULL;
> > > +    g_autoptr(GMatchInfo) info = NULL;
> > > +    int exit;
> > > +    int tmpType;
> > > +
> > > +    if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0)
> > > +        return -1;
> > > +
> > > +    if (!*drvName) {
> > > +        *drvType = VIR_PCI_STUB_DRIVER_NONE;
> > > +        return 0;
> > > +    }
> > > +
> > > +    tmpType = virPCIStubDriverTypeFromString(*drvName);
> > > +
> > > +    if (tmpType > VIR_PCI_STUB_DRIVER_NONE) {
> > > +        *drvType = tmpType;
> > > +        return 0; /* exact match of a known driver name (or no name) */
> > > +    }
> > > +
> > > +    /* Check the output of "modinfo $drvName" to see if it has
> > > +     * "vfio_pci" as an alias. If it does, then this driver should
> > > +     * also be considered as a vfio-pci driver, because it implements
> > > +     * all the functionality of the basic vfio-pci (plus additional
> > > +     * device-specific stuff).
> > > +     */
> > 
> > Instead of calling an external program and then grepping its output which
> > technically could change in the future, wouldn't it be better if we read
> > /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
> > vfio-pci' substring and compared the module name with the user-provided device
> > driver?
> 
> Again, although I was hesistant about calling an external command, and asked
> if there was something simpler, Alex still suggested modinfo, so I'll let
> him answer that. Alex?
> 
> (Also, although the format of the output of "uname -r" is pretty much
> written in stone, you're still running an external command :-))
> 
> > If not then I think you should pass '-F alias' to the command to speed up the
> > regex just a tiny bit.
> 
> Sure. That sounds fine to me.
>
Re: [PATCH] util: basic support for vendor-specific vfio drivers
Posted by Alex Williamson 1 year, 8 months ago
On Mon, 1 Aug 2022 16:02:05 +0200
Erik Skultety <eskultet@redhat.com> wrote:

> Putting Alex on CC since I don't see him there:
> +alex.williamson@redhat.com

Hmm, Laine cc'd me on the initial post but it seems it got dropped
somewhere.
 
> On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote:
> > On 8/1/22 7:58 AM, Erik Skultety wrote:  
> > > On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote:  
> > > > Before a PCI device can be assigned to a guest with VFIO, that device
> > > > must be bound to the vfio-pci driver rather than to the device's
> > > > normal driver. The vfio-pci driver provides APIs that permit QEMU to
> > > > perform all the necessary operations to make the device accessible to
> > > > the guest.
> > > > 
> > > > There has been kernel work recently to support vendor/device-specific
> > > > VFIO drivers that provide the basic vfio-pci driver functionality
> > > > while adding support for device-specific operations (for example these
> > > > device-specific drivers are planned to support live migration of
> > > > certain devices). All that will be needed to make this functionality
> > > > available will be to bind the new vendor-specific driver to the device
> > > > (rather than the generic vfio-pci driver, which will continue to work
> > > > just without the extra functionality).
> > > > 
> > > > But until now libvirt has required that all PCI devices being assigned
> > > > to a guest with VFIO specifically have the "vfio-pci" driver bound to
> > > > the device. So even if the user manually binds a shiny new
> > > > vendor-specific driver to the device (and puts "managed='no'" in the
> > > > config to prevent libvirt from changing that), libvirt will just fail
> > > > during startup of the guest (or during hotplug) because the driver
> > > > bound to the device isn't named exactly "vfio-pci".
> > > > 
> > > > Fortunately these new vendor/device-specific drivers can be easily
> > > > identified as being "vfio-pci + extra stuff" - all that's needed is to
> > > > look at the output of the "modinfo $driver_name" command to see if
> > > > "vfio_pci" is in the alias list for the driver.
> > > > 
> > > > That's what this patch does. When libvirt checks the driver bound to a
> > > > device (either to decide if it needs to bind to a different driver or
> > > > perform some other operation, or if the current driver is acceptable
> > > > as-is), if the driver isn't specifically "vfio-pci", then it will look
> > > > at the output of modinfo for the driver that *is* bound to the device;
> > > > if modinfo shows vfio_pci as an alias for that device, then we'll
> > > > behave as if the driver was exactly "vfio-pci".  
> > > 
> > > Since you say that the vendor/device-specific drivers does each of such drivers
> > > implement the base vfio-pci functionality or they simply call into the base
> > > driver? The reason why I'm asking is that if each of the vendor-specific
> > > drivers depend on the vfio-pci module to be loaded as well, then reading
> > > /proc/modules should suffice as vfio-pci should be listed right next to the
> > > vendor-specific one. What am I missing?  
> > I don't know the definitive answer to that, as I have no example of a
> > working vendor-specific driver to look at and only know about the kernel
> > work going on second-hand from Alex. It looks like even the vfio_pci driver
> > itself depends on other presumably lower level vfio-* modules (it directly
> > uses vfio_pci_core, which in turn uses vfio and vfio_vifqfd), so possibly
> > these new drivers would be depending on one or more of those lower level
> > modules rather than vfio_pci. Also I would imagine it would be possible for
> > other drivers to also depend on the vfio-pci driver while not themselves
> > being a vfio driver.

A module dependency on vfio-pci (actually vfio-pci-core) is a pretty
loose requirement, *any* symbol dependency generates such a linkage,
without necessarily exposing a vfio-pci uAPI.  The alias support
introduced to the kernel is intended to allow userspace to determine
the most appropriate vfio-pci driver for a device, whether that's
vfio-pci itself or a variant driver that augments device specific
features.  See the upstream commit here:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cc6711b0bf36de068b10490198d05ac168377989

All we're doing here is extending libvirt to say that if the driver is
vfio-pci or the modalias for the driver is prefixed with vfio-pci, then
the driver exposes a vfio-pci compatible uAPI.  I expect in the future
libvirt, or some other utility, may take on the role as described in
the above commit log to not only detect that a driver supports a
vfio-pci uAPI, but also to identify the most appropriate driver for the
device which exposes a vfio-uAPI.

> > > The 'alias' field is optional so do we have any support guarantees from the
> > > vendors that the it will always be filled in correctly? I mean you surely
> > > handle that case in the code, but once we start supporting this there's no way
> > > back and we already know how painful it can be to convince the vendors to
> > > follow some kind of standard so that we don't need to maintain several code
> > > paths based on a vendor-matrix.  
> > 
> > The aliases are what is used to determine the "best" vfio driver for a
> > particular device, so I don't think it would be possible for a driver to not
> > implement it, and the method I've used here to determine if a driver is a
> > vfio driver was recommended by Alex after a couple of discussions on the
> > subject.
> >   
> > > 
> > > ...
> > >   
> > > > +int
> > > > +virPCIDeviceGetDriverNameAndType(virPCIDevice *dev,
> > > > +                                 char **drvName,
> > > > +                                 virPCIStubDriver *drvType)
> > > > +{
> > > > +    g_autofree char *drvPath = NULL;
> > > > +    g_autoptr(virCommand) cmd = NULL;
> > > > +    g_autofree char *output = NULL;
> > > > +    g_autoptr(GRegex) regex = NULL;
> > > > +    g_autoptr(GError) err = NULL;
> > > > +    g_autoptr(GMatchInfo) info = NULL;
> > > > +    int exit;
> > > > +    int tmpType;
> > > > +
> > > > +    if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0)
> > > > +        return -1;
> > > > +
> > > > +    if (!*drvName) {
> > > > +        *drvType = VIR_PCI_STUB_DRIVER_NONE;
> > > > +        return 0;
> > > > +    }
> > > > +
> > > > +    tmpType = virPCIStubDriverTypeFromString(*drvName);
> > > > +
> > > > +    if (tmpType > VIR_PCI_STUB_DRIVER_NONE) {
> > > > +        *drvType = tmpType;
> > > > +        return 0; /* exact match of a known driver name (or no name) */
> > > > +    }
> > > > +
> > > > +    /* Check the output of "modinfo $drvName" to see if it has
> > > > +     * "vfio_pci" as an alias. If it does, then this driver should
> > > > +     * also be considered as a vfio-pci driver, because it implements
> > > > +     * all the functionality of the basic vfio-pci (plus additional
> > > > +     * device-specific stuff).
> > > > +     */  
> > > 
> > > Instead of calling an external program and then grepping its output which
> > > technically could change in the future, wouldn't it be better if we read
> > > /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
> > > vfio-pci' substring and compared the module name with the user-provided device
> > > driver?  
> > 
> > Again, although I was hesistant about calling an external command, and asked
> > if there was something simpler, Alex still suggested modinfo, so I'll let
> > him answer that. Alex?
> > 
> > (Also, although the format of the output of "uname -r" is pretty much
> > written in stone, you're still running an external command :-))

The above commit log actually suggests modules.alias, so if you'd
rather rely on that than modinfo, sure, that's fine.  Thanks,

Alex
Re: [PATCH] util: basic support for vendor-specific vfio drivers
Posted by Erik Skultety 1 year, 8 months ago
On Mon, Aug 01, 2022 at 09:30:38AM -0400, Laine Stump wrote:
> On 8/1/22 7:58 AM, Erik Skultety wrote:
> > On Mon, Aug 01, 2022 at 12:02:22AM -0400, Laine Stump wrote:
> > > Before a PCI device can be assigned to a guest with VFIO, that device
> > > must be bound to the vfio-pci driver rather than to the device's
> > > normal driver. The vfio-pci driver provides APIs that permit QEMU to
> > > perform all the necessary operations to make the device accessible to
> > > the guest.
> > > 
> > > There has been kernel work recently to support vendor/device-specific
> > > VFIO drivers that provide the basic vfio-pci driver functionality
> > > while adding support for device-specific operations (for example these
> > > device-specific drivers are planned to support live migration of
> > > certain devices). All that will be needed to make this functionality
> > > available will be to bind the new vendor-specific driver to the device
> > > (rather than the generic vfio-pci driver, which will continue to work
> > > just without the extra functionality).
> > > 
> > > But until now libvirt has required that all PCI devices being assigned
> > > to a guest with VFIO specifically have the "vfio-pci" driver bound to
> > > the device. So even if the user manually binds a shiny new
> > > vendor-specific driver to the device (and puts "managed='no'" in the
> > > config to prevent libvirt from changing that), libvirt will just fail
> > > during startup of the guest (or during hotplug) because the driver
> > > bound to the device isn't named exactly "vfio-pci".
> > > 
> > > Fortunately these new vendor/device-specific drivers can be easily
> > > identified as being "vfio-pci + extra stuff" - all that's needed is to
> > > look at the output of the "modinfo $driver_name" command to see if
> > > "vfio_pci" is in the alias list for the driver.
> > > 
> > > That's what this patch does. When libvirt checks the driver bound to a
> > > device (either to decide if it needs to bind to a different driver or
> > > perform some other operation, or if the current driver is acceptable
> > > as-is), if the driver isn't specifically "vfio-pci", then it will look
> > > at the output of modinfo for the driver that *is* bound to the device;
> > > if modinfo shows vfio_pci as an alias for that device, then we'll
> > > behave as if the driver was exactly "vfio-pci".
> > 
> > Since you say that the vendor/device-specific drivers does each of such drivers
> > implement the base vfio-pci functionality or they simply call into the base
> > driver? The reason why I'm asking is that if each of the vendor-specific
> > drivers depend on the vfio-pci module to be loaded as well, then reading
> > /proc/modules should suffice as vfio-pci should be listed right next to the
> > vendor-specific one. What am I missing?
> I don't know the definitive answer to that, as I have no example of a
> working vendor-specific driver to look at and only know about the kernel
> work going on second-hand from Alex. It looks like even the vfio_pci driver
> itself depends on other presumably lower level vfio-* modules (it directly
> uses vfio_pci_core, which in turn uses vfio and vfio_vifqfd), so possibly
> these new drivers would be depending on one or more of those lower level
> modules rather than vfio_pci. Also I would imagine it would be possible for
> other drivers to also depend on the vfio-pci driver while not themselves
> being a vfio driver.
> 
> > 
> > The 'alias' field is optional so do we have any support guarantees from the
> > vendors that the it will always be filled in correctly? I mean you surely
> > handle that case in the code, but once we start supporting this there's no way
> > back and we already know how painful it can be to convince the vendors to
> > follow some kind of standard so that we don't need to maintain several code
> > paths based on a vendor-matrix.
> 
> The aliases are what is used to determine the "best" vfio driver for a
> particular device, so I don't think it would be possible for a driver to not
> implement it, and the method I've used here to determine if a driver is a
> vfio driver was recommended by Alex after a couple of discussions on the
> subject.
> 
> > 
> > ...
> > 
> > > +int
> > > +virPCIDeviceGetDriverNameAndType(virPCIDevice *dev,
> > > +                                 char **drvName,
> > > +                                 virPCIStubDriver *drvType)
> > > +{
> > > +    g_autofree char *drvPath = NULL;
> > > +    g_autoptr(virCommand) cmd = NULL;
> > > +    g_autofree char *output = NULL;
> > > +    g_autoptr(GRegex) regex = NULL;
> > > +    g_autoptr(GError) err = NULL;
> > > +    g_autoptr(GMatchInfo) info = NULL;
> > > +    int exit;
> > > +    int tmpType;
> > > +
> > > +    if (virPCIDeviceGetDriverPathAndName(dev, &drvPath, drvName) < 0)
> > > +        return -1;
> > > +
> > > +    if (!*drvName) {
> > > +        *drvType = VIR_PCI_STUB_DRIVER_NONE;
> > > +        return 0;
> > > +    }
> > > +
> > > +    tmpType = virPCIStubDriverTypeFromString(*drvName);
> > > +
> > > +    if (tmpType > VIR_PCI_STUB_DRIVER_NONE) {
> > > +        *drvType = tmpType;
> > > +        return 0; /* exact match of a known driver name (or no name) */
> > > +    }
> > > +
> > > +    /* Check the output of "modinfo $drvName" to see if it has
> > > +     * "vfio_pci" as an alias. If it does, then this driver should
> > > +     * also be considered as a vfio-pci driver, because it implements
> > > +     * all the functionality of the basic vfio-pci (plus additional
> > > +     * device-specific stuff).
> > > +     */
> > 
> > Instead of calling an external program and then grepping its output which
> > technically could change in the future, wouldn't it be better if we read
> > /lib/modules/`uname -r`/modules.alias and filtered whatever line had the
> > vfio-pci' substring and compared the module name with the user-provided device
> > driver?
> 
> Again, although I was hesistant about calling an external command, and asked
> if there was something simpler, Alex still suggested modinfo, so I'll let
> him answer that. Alex?
> 
> (Also, although the format of the output of "uname -r" is pretty much
> written in stone, you're still running an external command :-))

Well, that was just my attempt to express whatever is the running kernel in
that path, but of course I didn't mean to run the command verbatim as it
contradicts what I'm trying to suggest. We can get the version from e.g. QEMU
caps (<kernelVersion>), it's also available in /proc/version. Grepping the
codebase we don't seem to be using the kernelVersion attribute from QEMU caps
at the moment.

Erik