[libvirt PATCH] RFC: Add support for vDPA network devices

Jonathon Jongsma posted 1 patch 3 years, 8 months ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20200818183717.165048-1-jjongsma@redhat.com
[libvirt PATCH] RFC: Add support for vDPA network devices
Posted by Jonathon Jongsma 3 years, 8 months ago
vDPA network devices allow high-performance networking in a virtual
machine by providing a wire-speed data path. These devices require a
vendor-specific host driver but the data path follows the virtio
specification.

Support for vDPA devices was recently added to QEMU, so libvirt can now
support them as well. The device must first be configured on the host
with the appropriate vendor-specific driver, which creates a chardev on
the host (e.g. /dev/vhost-vdpa-0). That chardev path can then be used to
define a new interface with type='vdpa'.
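
For reviewers less familiar with the fd-passing flow, here is a minimal
standalone sketch of the idea (not the libvirt code in this patch). It
assumes the chardev is opened read/write by the management process, handed
to QEMU with -add-fd, and then referenced through /dev/fdset/N in the
-netdev argument. The helper names open_vdpa_device(), format_add_fd_arg()
and format_vdpa_netdev_arg() are hypothetical and exist only for
illustration.

```c
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: open the vDPA chardev read/write.
 * Returns the fd on success, or -1 on failure (errno is set by open()). */
int
open_vdpa_device(const char *path)
{
    return open(path, O_RDWR);
}

/* Hypothetical helper: format the value of QEMU's -add-fd option for an
 * fd that the child process will inherit.
 * Returns 0 on success, -1 if the buffer is too small. */
int
format_add_fd_arg(char *buf, size_t buflen, int fdset, int fd)
{
    int n = snprintf(buf, buflen, "set=%d,fd=%d", fdset, fd);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Hypothetical helper: format the value of -netdev for a vhost-vdpa
 * backend that refers to the fd via its fdset path.
 * Returns 0 on success, -1 if the buffer is too small. */
int
format_vdpa_netdev_arg(char *buf, size_t buflen, int fdset, const char *id)
{
    int n = snprintf(buf, buflen,
                     "vhost-vdpa,vhostdev=/dev/fdset/%d,id=%s", fdset, id);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

With fdset 0 and fd 1732, the two formatting helpers produce
"set=0,fd=1732" and "vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0", which
matches the arguments in the net-vdpa.x86_64-latest.args test data added
by this patch.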
---
 docs/formatdomain.rst                         | 20 +++++++++
 docs/schemas/domaincommon.rng                 | 15 +++++++
 src/conf/domain_conf.c                        | 41 +++++++++++++++++++
 src/conf/domain_conf.h                        |  4 ++
 src/conf/netdev_bandwidth_conf.c              |  1 +
 src/libxl/libxl_conf.c                        |  1 +
 src/libxl/xen_common.c                        |  1 +
 src/lxc/lxc_controller.c                      |  1 +
 src/lxc/lxc_driver.c                          |  3 ++
 src/lxc/lxc_process.c                         |  1 +
 src/qemu/qemu_command.c                       | 29 ++++++++++++-
 src/qemu/qemu_command.h                       |  3 +-
 src/qemu/qemu_domain.c                        |  6 ++-
 src/qemu/qemu_hotplug.c                       | 15 ++++---
 src/qemu/qemu_interface.c                     | 25 +++++++++++
 src/qemu/qemu_interface.h                     |  2 +
 src/qemu/qemu_process.c                       |  1 +
 src/qemu/qemu_validate.c                      |  1 +
 src/vmx/vmx.c                                 |  1 +
 .../net-vdpa.x86_64-latest.args               | 37 +++++++++++++++++
 tests/qemuxml2argvdata/net-vdpa.xml           | 28 +++++++++++++
 tests/qemuxml2argvmock.c                      | 11 ++++-
 tests/qemuxml2argvtest.c                      |  1 +
 tests/qemuxml2xmloutdata/net-vdpa.xml         | 34 +++++++++++++++
 tests/qemuxml2xmltest.c                       |  1 +
 tools/virsh-domain.c                          |  1 +
 26 files changed, 274 insertions(+), 10 deletions(-)
 create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
 create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml
 create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml

diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 8365fc8bbb..1356485504 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -4632,6 +4632,26 @@ or stopping the guest.
    </devices>
    ...
 
+:anchor:`<a id="elementsNICSVDPA"/>`
+
+vDPA devices
+^^^^^^^^^^^^
+
+A vDPA device can be used to provide wire-speed network performance within a
+domain. The host device must already be configured with the appropriate
+device-specific vDPA driver. This creates a vDPA char device (e.g.
+/dev/vhost-vdpa-0) that can be used to assign the device to a libvirt domain.
+
+::
+
+   ...
+   <devices>
+     <interface type='vdpa'>
+       <source dev='/dev/vhost-vdpa-0'/>
+     </interface>
+   </devices>
+   ...
+
 :anchor:`<a id="elementsTeaming"/>`
 
 Teaming a virtio/hostdev NIC pair
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 0d0dcbc5ce..17f74490f4 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -3108,6 +3108,21 @@
             <ref name="interface-options"/>
           </interleave>
         </group>
+
+        <group>
+          <attribute name="type">
+            <value>vdpa</value>
+          </attribute>
+          <interleave>
+            <element name="source">
+              <attribute name="dev">
+                <ref name="deviceName"/>
+              </attribute>
+            </element>
+            <ref name="interface-options"/>
+          </interleave>
+        </group>
+
       </choice>
       <optional>
         <attribute name="trustGuestRxFilters">
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 8e7981bf25..74f2c2f3e3 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -549,6 +549,7 @@ VIR_ENUM_IMPL(virDomainNet,
               "direct",
               "hostdev",
               "udp",
+              "vdpa",
 );
 
 VIR_ENUM_IMPL(virDomainNetModel,
@@ -2495,6 +2496,10 @@ virDomainNetDefClear(virDomainNetDefPtr def)
         def->data.vhostuser = NULL;
         break;
 
+    case VIR_DOMAIN_NET_TYPE_VDPA:
+        VIR_FREE(def->data.vdpa.devicepath);
+        break;
+
     case VIR_DOMAIN_NET_TYPE_SERVER:
     case VIR_DOMAIN_NET_TYPE_CLIENT:
     case VIR_DOMAIN_NET_TYPE_MCAST:
@@ -6489,6 +6494,15 @@ virDomainNetDefValidate(const virDomainNetDef *net)
         return -1;
     }
 
+    if (net->type == VIR_DOMAIN_NET_TYPE_VDPA &&
+        net->model != VIR_DOMAIN_NET_MODEL_VIRTIO) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                       _("invalid model for interface of type '%s': '%s'"),
+                       virDomainNetTypeToString(net->type),
+                       virDomainNetModelTypeToString(net->model));
+        return -1;
+    }
+
     return 0;
 }
 
@@ -11982,6 +11996,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
     g_autofree char *vhost_path = NULL;
     g_autofree char *teamingType = NULL;
     g_autofree char *teamingPersistent = NULL;
+    g_autofree char *vdpa_dev = NULL;
     const char *prefix = xmlopt ? xmlopt->config.netPrefix : NULL;
 
     if (!(def = virDomainNetDefNew(xmlopt)))
@@ -12075,6 +12090,10 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
                 if (virDomainChrSourceReconnectDefParseXML(&reconnect, cur, ctxt) < 0)
                     goto error;
 
+            } else if (!vdpa_dev
+                       && def->type == VIR_DOMAIN_NET_TYPE_VDPA
+                       && virXMLNodeNameEqual(cur, "source")) {
+                vdpa_dev = virXMLPropString(cur, "dev");
             } else if (!def->virtPortProfile
                        && virXMLNodeNameEqual(cur, "virtualport")) {
                 if (def->type == VIR_DOMAIN_NET_TYPE_NETWORK) {
@@ -12332,6 +12351,16 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
         }
         break;
 
+    case VIR_DOMAIN_NET_TYPE_VDPA:
+        if (vdpa_dev == NULL) {
+            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                           _("No <source> 'dev' attribute "
+                             "specified with <interface type='vdpa'/>"));
+            goto error;
+        }
+        def->data.vdpa.devicepath = g_steal_pointer(&vdpa_dev);
+        break;
+
     case VIR_DOMAIN_NET_TYPE_BRIDGE:
         if (bridge == NULL) {
             virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
@@ -12727,6 +12756,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
         case VIR_DOMAIN_NET_TYPE_DIRECT:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
         case VIR_DOMAIN_NET_TYPE_UDP:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
             break;
         case VIR_DOMAIN_NET_TYPE_LAST:
         default:
@@ -26737,6 +26767,14 @@ virDomainNetDefFormat(virBufferPtr buf,
             }
             break;
 
+        case VIR_DOMAIN_NET_TYPE_VDPA:
+            if (def->data.vdpa.devicepath) {
+                virBufferEscapeString(buf, "<source dev='%s'",
+                                      def->data.vdpa.devicepath);
+                sourceLines++;
+            }
+            break;
+
         case VIR_DOMAIN_NET_TYPE_USER:
         case VIR_DOMAIN_NET_TYPE_LAST:
             break;
@@ -30902,6 +30940,7 @@ virDomainNetGetActualVirtPortProfile(const virDomainNetDef *iface)
     case VIR_DOMAIN_NET_TYPE_MCAST:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
     default:
         return NULL;
@@ -31718,6 +31757,7 @@ virDomainNetTypeSharesHostView(const virDomainNetDef *net)
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         break;
     }
@@ -31982,6 +32022,7 @@ virDomainNetDefActualToNetworkPort(virDomainDefPtr dom,
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_USER:
     case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                        _("Unexpected network port type %s"),
                        virDomainNetTypeToString(virDomainNetGetActualType(iface)));
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 68be32614c..4f63a3eef4 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -872,6 +872,7 @@ typedef enum {
     VIR_DOMAIN_NET_TYPE_DIRECT,
     VIR_DOMAIN_NET_TYPE_HOSTDEV,
     VIR_DOMAIN_NET_TYPE_UDP,
+    VIR_DOMAIN_NET_TYPE_VDPA,
 
     VIR_DOMAIN_NET_TYPE_LAST
 } virDomainNetType;
@@ -1045,6 +1046,9 @@ struct _virDomainNetDef {
              */
             virDomainActualNetDefPtr actual;
         } network;
+        struct {
+            char *devicepath;
+        } vdpa;
         struct {
             char *brname;
         } bridge;
diff --git a/src/conf/netdev_bandwidth_conf.c b/src/conf/netdev_bandwidth_conf.c
index 396ac62019..4eb12e2951 100644
--- a/src/conf/netdev_bandwidth_conf.c
+++ b/src/conf/netdev_bandwidth_conf.c
@@ -315,6 +315,7 @@ bool virNetDevSupportsBandwidth(virDomainNetType type)
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         break;
     }
diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 7c2c015015..709cdc8719 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -1371,6 +1371,7 @@ libxlMakeNic(virDomainDefPtr def,
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_DIRECT:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
         case VIR_DOMAIN_NET_TYPE_LAST:
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                     _("unsupported interface type %s"),
diff --git a/src/libxl/xen_common.c b/src/libxl/xen_common.c
index 75fe7e0644..b1ec34bf11 100644
--- a/src/libxl/xen_common.c
+++ b/src/libxl/xen_common.c
@@ -1776,6 +1776,7 @@ xenFormatNet(virConnectPtr conn,
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_USER:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type '%s'"),
                        virDomainNetTypeToString(net->type));
         return -1;
diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c
index ae6b737b60..cb573d6c01 100644
--- a/src/lxc/lxc_controller.c
+++ b/src/lxc/lxc_controller.c
@@ -422,6 +422,7 @@ static int virLXCControllerGetNICIndexes(virLXCControllerPtr ctrl)
         case VIR_DOMAIN_NET_TYPE_UDP:
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
             virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                            _("Unsupported net type %s"),
                            virDomainNetTypeToString(actualType));
diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
index 1cdd6ee455..a36f83a588 100644
--- a/src/lxc/lxc_driver.c
+++ b/src/lxc/lxc_driver.c
@@ -3503,6 +3503,7 @@ lxcDomainAttachDeviceNetLive(virLXCDriverPtr driver,
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                        _("Network device type is not supported"));
         goto cleanup;
@@ -3557,6 +3558,7 @@ lxcDomainAttachDeviceNetLive(virLXCDriverPtr driver,
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
         case VIR_DOMAIN_NET_TYPE_UDP:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
         case VIR_DOMAIN_NET_TYPE_LAST:
         default:
             /* no-op */
@@ -3998,6 +4000,7 @@ lxcDomainDetachDeviceNetLive(virDomainObjPtr vm,
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                        _("Only bridged veth devices can be detached"));
         goto cleanup;
diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c
index fc59c2e5af..90e9790cea 100644
--- a/src/lxc/lxc_process.c
+++ b/src/lxc/lxc_process.c
@@ -606,6 +606,7 @@ virLXCProcessSetupInterfaces(virLXCDriverPtr driver,
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_LAST:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
             virReportError(VIR_ERR_INTERNAL_ERROR,
                            _("Unsupported network type %s"),
                            virDomainNetTypeToString(type));
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 01812cd39b..9c5265ccdf 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -3552,7 +3552,8 @@ qemuBuildHostNetStr(virDomainNetDefPtr net,
                     size_t tapfdSize,
                     char **vhostfd,
                     size_t vhostfdSize,
-                    const char *slirpfd)
+                    const char *slirpfd,
+                    const char *vdpafd)
 {
     bool is_tap = false;
     virDomainNetType netType = virDomainNetGetActualType(net);
@@ -3690,6 +3691,13 @@ qemuBuildHostNetStr(virDomainNetDefPtr net,
             return NULL;
         break;
 
+    case VIR_DOMAIN_NET_TYPE_VDPA:
+        /* Caller will pass the fd to qemu with add-fd */
+        if (virJSONValueObjectCreate(&netprops, "s:type", "vhost-vdpa", NULL) < 0 ||
+            virJSONValueObjectAppendString(netprops, "vhostdev", vdpafd) < 0)
+            return NULL;
+        break;
+
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
         /* Should have been handled earlier via PCI/USB hotplug code. */
     case VIR_DOMAIN_NET_TYPE_LAST:
@@ -8013,6 +8021,8 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
     char **tapfdName = NULL;
     char **vhostfdName = NULL;
     g_autofree char *slirpfdName = NULL;
+    g_autofree char *vdpafdName = NULL;
+    int vdpafd = -1;
     virDomainNetType actualType = virDomainNetGetActualType(net);
     const virNetDevBandwidth *actualBandwidth;
     bool requireNicdev = false;
@@ -8098,6 +8108,11 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
 
         break;
 
+    case VIR_DOMAIN_NET_TYPE_VDPA:
+        if ((vdpafd = qemuInterfaceVDPAConnect(net)) < 0)
+            goto cleanup;
+        break;
+
     case VIR_DOMAIN_NET_TYPE_USER:
     case VIR_DOMAIN_NET_TYPE_SERVER:
     case VIR_DOMAIN_NET_TYPE_CLIENT:
@@ -8140,6 +8155,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
        /* These types don't use a network device on the host, but
         * instead use some other type of connection to the emulated
@@ -8219,13 +8235,22 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
         vhostfd[i] = -1;
     }
 
+    if (vdpafd >= 0) {
+        virCommandPassFD(cmd, vdpafd, VIR_COMMAND_PASS_FD_CLOSE_PARENT);
+        g_autofree char *fdset = qemuVirCommandGetFDSet(cmd, vdpafd);
+        if (!fdset)
+            goto cleanup;
+        virCommandAddArgList(cmd, "-add-fd", fdset, NULL);
+        vdpafdName = qemuVirCommandGetDevSet(cmd, vdpafd);
+    }
+
     if (chardev)
         virCommandAddArgList(cmd, "-chardev", chardev, NULL);
 
     if (!(hostnetprops = qemuBuildHostNetStr(net,
                                              tapfdName, tapfdSize,
                                              vhostfdName, vhostfdSize,
-                                             slirpfdName)))
+                                             slirpfdName, vdpafdName)))
         goto cleanup;
 
     if (!(host = virQEMUBuildNetdevCommandlineFromJSON(hostnetprops,
diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h
index 89d99b111f..e8b4f4785a 100644
--- a/src/qemu/qemu_command.h
+++ b/src/qemu/qemu_command.h
@@ -99,7 +99,8 @@ virJSONValuePtr qemuBuildHostNetStr(virDomainNetDefPtr net,
                                     size_t tapfdSize,
                                     char **vhostfd,
                                     size_t vhostfdSize,
-                                    const char *slirpfd);
+                                    const char *slirpfd,
+                                    const char *vdpafd);
 
 /* Current, best practice */
 char *qemuBuildNicDevStr(virDomainDefPtr def,
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index c440c79e1d..daae5a1b03 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -5027,7 +5027,10 @@ qemuDomainDeviceNetDefPostParse(virDomainNetDefPtr net,
                                 const virDomainDef *def,
                                 virQEMUCapsPtr qemuCaps)
 {
-    if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV &&
+    if (net->type == VIR_DOMAIN_NET_TYPE_VDPA &&
+        !virDomainNetGetModelString(net))
+        net->model = VIR_DOMAIN_NET_MODEL_VIRTIO;
+    else if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV &&
         !virDomainNetGetModelString(net) &&
         virDomainNetResolveActualType(net) != VIR_DOMAIN_NET_TYPE_HOSTDEV)
         net->model = qemuDomainDefaultNetModel(def, qemuCaps);
@@ -9201,6 +9204,7 @@ qemuDomainNetSupportsMTU(virDomainNetType type)
     case VIR_DOMAIN_NET_TYPE_DIRECT:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         break;
     }
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index 2c6c30ce03..23ae2310a2 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -1340,6 +1340,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver,
     case VIR_DOMAIN_NET_TYPE_MCAST:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
                        _("hotplug of interface type of %s is not implemented yet"),
@@ -1388,7 +1389,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver,
     if (!(netprops = qemuBuildHostNetStr(net,
                                          tapfdName, tapfdSize,
                                          vhostfdName, vhostfdSize,
-                                         slirpfdName)))
+                                         slirpfdName, NULL)))
         goto cleanup;
 
     qemuDomainObjEnterMonitor(driver, vm);
@@ -3390,6 +3391,7 @@ qemuDomainChangeNetFilter(virDomainObjPtr vm,
     case VIR_DOMAIN_NET_TYPE_DIRECT:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                        _("filters not supported on interfaces of type %s"),
                        virDomainNetTypeToString(virDomainNetGetActualType(newdev)));
@@ -3483,8 +3485,9 @@ qemuDomainChangeNet(virQEMUDriverPtr driver,
     olddev = *devslot;
 
     oldType = virDomainNetGetActualType(olddev);
-    if (oldType == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
-        /* no changes are possible to a type='hostdev' interface */
+    if (oldType == VIR_DOMAIN_NET_TYPE_HOSTDEV ||
+        oldType == VIR_DOMAIN_NET_TYPE_VDPA) {
+        /* no changes are possible to a type='hostdev' or type='vdpa' interface */
         virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
                        _("cannot change config of '%s' network type"),
                        virDomainNetTypeToString(oldType));
@@ -3671,8 +3674,9 @@ qemuDomainChangeNet(virQEMUDriverPtr driver,
 
     newType = virDomainNetGetActualType(newdev);
 
-    if (newType == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
-        /* can't turn it into a type='hostdev' interface */
+    if (newType == VIR_DOMAIN_NET_TYPE_HOSTDEV ||
+        newType == VIR_DOMAIN_NET_TYPE_VDPA) {
+        /* can't turn it into a type='hostdev' or type='vdpa' interface */
         virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
                        _("cannot change network interface type to '%s'"),
                        virDomainNetTypeToString(newType));
@@ -3726,6 +3730,7 @@ qemuDomainChangeNet(virQEMUDriverPtr driver,
             break;
 
         case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
             virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
                            _("unable to change config on '%s' network type"),
diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c
index ffec992596..676648ebab 100644
--- a/src/qemu/qemu_interface.c
+++ b/src/qemu/qemu_interface.c
@@ -118,6 +118,7 @@ qemuInterfaceStartDevice(virDomainNetDefPtr net)
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         /* these types all require no action */
         break;
@@ -203,6 +204,7 @@ qemuInterfaceStopDevice(virDomainNetDefPtr net)
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         /* these types all require no action */
         break;
@@ -630,6 +632,29 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def,
 }
 
 
+/* qemuInterfaceVDPAConnect:
+ * @net: pointer to the VM's interface description
+ *
+ * returns: file descriptor of the vdpa device
+ *
+ * Called *only* if actualType is VIR_DOMAIN_NET_TYPE_VDPA
+ */
+int
+qemuInterfaceVDPAConnect(virDomainNetDefPtr net)
+{
+    int fd;
+
+    if ((fd = open(net->data.vdpa.devicepath, O_RDWR)) < 0) {
+        virReportSystemError(errno,
+                             _("Unable to open '%s' for vdpa device"),
+                             net->data.vdpa.devicepath);
+        return -1;
+    }
+
+    return fd;
+}
+
+
 qemuSlirpPtr
 qemuInterfacePrepareSlirp(virQEMUDriverPtr driver,
                           virDomainNetDefPtr net)
diff --git a/src/qemu/qemu_interface.h b/src/qemu/qemu_interface.h
index 3dcefc6a12..1ba24f0a6f 100644
--- a/src/qemu/qemu_interface.h
+++ b/src/qemu/qemu_interface.h
@@ -58,3 +58,5 @@ int qemuInterfaceOpenVhostNet(virDomainDefPtr def,
 
 qemuSlirpPtr qemuInterfacePrepareSlirp(virQEMUDriverPtr driver,
                                        virDomainNetDefPtr net);
+
+int qemuInterfaceVDPAConnect(virDomainNetDefPtr net) G_GNUC_NO_INLINE;
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 126fabf5ef..70c3b9b46d 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7517,6 +7517,7 @@ void qemuProcessStop(virQEMUDriverPtr driver,
         case VIR_DOMAIN_NET_TYPE_INTERNAL:
         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
         case VIR_DOMAIN_NET_TYPE_UDP:
+        case VIR_DOMAIN_NET_TYPE_VDPA:
         case VIR_DOMAIN_NET_TYPE_LAST:
             /* No special cleanup procedure for these types. */
             break;
diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c
index 488f258d00..623f998463 100644
--- a/src/qemu/qemu_validate.c
+++ b/src/qemu/qemu_validate.c
@@ -1130,6 +1130,7 @@ qemuValidateNetSupportsCoalesce(virDomainNetType type)
     case VIR_DOMAIN_NET_TYPE_MCAST:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
     case VIR_DOMAIN_NET_TYPE_UDP:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         break;
     }
diff --git a/src/vmx/vmx.c b/src/vmx/vmx.c
index a123a8807c..f6f6efb322 100644
--- a/src/vmx/vmx.c
+++ b/src/vmx/vmx.c
@@ -3833,6 +3833,7 @@ virVMXFormatEthernet(virDomainNetDefPtr def, int controller,
       case VIR_DOMAIN_NET_TYPE_DIRECT:
       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
       case VIR_DOMAIN_NET_TYPE_UDP:
+      case VIR_DOMAIN_NET_TYPE_VDPA:
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type '%s'"),
                        virDomainNetTypeToString(def->type));
         return -1;
diff --git a/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
new file mode 100644
index 0000000000..8e76ac7794
--- /dev/null
+++ b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
@@ -0,0 +1,37 @@
+LC_ALL=C \
+PATH=/bin \
+HOME=/tmp/lib/domain--1-QEMUGuest1 \
+USER=test \
+LOGNAME=test \
+XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \
+XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \
+XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \
+QEMU_AUDIO_DRV=none \
+/usr/bin/qemu-system-i386 \
+-name guest=QEMUGuest1,debug-threads=on \
+-S \
+-object secret,id=masterKey0,format=raw,\
+file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \
+-machine pc,accel=tcg,usb=off,dump-guest-core=off \
+-cpu qemu64 \
+-m 214 \
+-overcommit mem-lock=off \
+-smp 1,sockets=1,cores=1,threads=1 \
+-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
+-display none \
+-no-user-config \
+-nodefaults \
+-chardev socket,id=charmonitor,fd=1729,server,nowait \
+-mon chardev=charmonitor,id=monitor,mode=control \
+-rtc base=utc \
+-no-shutdown \
+-no-acpi \
+-boot strict=on \
+-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
+-add-fd set=0,fd=1732 \
+-netdev vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0 \
+-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:95:db:c0,bus=pci.0,\
+addr=0x2 \
+-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,\
+resourcecontrol=deny \
+-msg timestamp=on
diff --git a/tests/qemuxml2argvdata/net-vdpa.xml b/tests/qemuxml2argvdata/net-vdpa.xml
new file mode 100644
index 0000000000..30cca7eb6e
--- /dev/null
+++ b/tests/qemuxml2argvdata/net-vdpa.xml
@@ -0,0 +1,28 @@
+<domain type='qemu'>
+  <name>QEMUGuest1</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <os>
+    <type arch='i686' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+    <emulator>/usr/bin/qemu-system-i386</emulator>
+    <controller type='usb' index='0'/>
+    <controller type='ide' index='0'/>
+    <controller type='pci' index='0' model='pci-root'/>
+    <interface type='vdpa'>
+      <mac address='52:54:00:95:db:c0'/>
+      <source dev='/dev/vhost-vdpa-0'/>
+    </interface>
+    <input type='mouse' bus='ps2'/>
+    <input type='keyboard' bus='ps2'/>
+    <memballoon model='none'/>
+  </devices>
+</domain>
diff --git a/tests/qemuxml2argvmock.c b/tests/qemuxml2argvmock.c
index e5841bc8e3..516776697f 100644
--- a/tests/qemuxml2argvmock.c
+++ b/tests/qemuxml2argvmock.c
@@ -205,7 +205,7 @@ virHostGetDRMRenderNode(void)
 
 static void (*real_virCommandPassFD)(virCommandPtr cmd, int fd, unsigned int flags);
 
-static const int testCommandPassSafeFDs[] = { 1730, 1731 };
+static const int testCommandPassSafeFDs[] = { 1730, 1731, 1732 };
 
 void
 virCommandPassFD(virCommandPtr cmd,
@@ -283,3 +283,12 @@ qemuBuildTPMOpenBackendFDs(const char *tpmdev G_GNUC_UNUSED,
     *cancelfd = 1731;
     return 0;
 }
+
+
+int
+qemuInterfaceVDPAConnect(virDomainNetDefPtr net G_GNUC_UNUSED)
+{
+    if (fcntl(1732, F_GETFD) != -1)
+        abort();
+    return 1732;
+}
diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c
index 01839cb88c..9587e1f2f2 100644
--- a/tests/qemuxml2argvtest.c
+++ b/tests/qemuxml2argvtest.c
@@ -1446,6 +1446,7 @@ mymain(void)
             QEMU_CAPS_DEVICE_VFIO_PCI);
     DO_TEST_FAILURE("net-hostdev-fail",
                     QEMU_CAPS_DEVICE_VFIO_PCI);
+    DO_TEST_CAPS_LATEST("net-vdpa");
 
     DO_TEST("hostdev-pci-multifunction",
             QEMU_CAPS_KVM,
diff --git a/tests/qemuxml2xmloutdata/net-vdpa.xml b/tests/qemuxml2xmloutdata/net-vdpa.xml
new file mode 100644
index 0000000000..b362405c14
--- /dev/null
+++ b/tests/qemuxml2xmloutdata/net-vdpa.xml
@@ -0,0 +1,34 @@
+<domain type='qemu'>
+  <name>QEMUGuest1</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <os>
+    <type arch='i686' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+    <emulator>/usr/bin/qemu-system-i386</emulator>
+    <controller type='usb' index='0'>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
+    </controller>
+    <controller type='ide' index='0'>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
+    </controller>
+    <controller type='pci' index='0' model='pci-root'/>
+    <interface type='vdpa'>
+      <mac address='52:54:00:95:db:c0'/>
+      <source dev='/dev/vhost-vdpa-0'/>
+      <model type='virtio'/>
+      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
+    </interface>
+    <input type='mouse' bus='ps2'/>
+    <input type='keyboard' bus='ps2'/>
+    <memballoon model='none'/>
+  </devices>
+</domain>
diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c
index a07e2b7553..978babb110 100644
--- a/tests/qemuxml2xmltest.c
+++ b/tests/qemuxml2xmltest.c
@@ -494,6 +494,7 @@ mymain(void)
     DO_TEST("net-mtu", NONE);
     DO_TEST("net-coalesce", NONE);
     DO_TEST("net-many-models", NONE);
+    DO_TEST("net-vdpa", NONE);
 
     DO_TEST("serial-tcp-tlsx509-chardev", NONE);
     DO_TEST("serial-tcp-tlsx509-chardev-notls", NONE);
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index 286cf79671..10b396bcf0 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -1007,6 +1007,7 @@ cmdAttachInterface(vshControl *ctl, const vshCmd *cmd)
     case VIR_DOMAIN_NET_TYPE_MCAST:
     case VIR_DOMAIN_NET_TYPE_UDP:
     case VIR_DOMAIN_NET_TYPE_INTERNAL:
+    case VIR_DOMAIN_NET_TYPE_VDPA:
     case VIR_DOMAIN_NET_TYPE_LAST:
         vshError(ctl, _("No support for %s in command 'attach-interface'"),
                  type);
-- 
2.26.2

Re: [libvirt PATCH] RFC: Add support for vDPA network devices
Posted by Laine Stump 3 years, 8 months ago
On 8/18/20 2:37 PM, Jonathon Jongsma wrote:
> vDPA network devices allow high-performance networking in a virtual
> machine by providing a wire-speed data path. These devices require a
> vendor-specific host driver but the data path follows the virtio
> specification.
>
> The support for vDPA devices was recently added to qemu. This allows
> libvirt to support these devices. It requires that the device is
> configured on the host with the appropriate vendor-specific driver.
> This will create a chardev on the host at e.g. /dev/vhost-vdpa-0. That
> chardev path can then be used to define a new interface with
> type='vdpa'.
> ---
>   docs/formatdomain.rst                         | 20 +++++++++
>   docs/schemas/domaincommon.rng                 | 15 +++++++
>   src/conf/domain_conf.c                        | 41 +++++++++++++++++++
>   src/conf/domain_conf.h                        |  4 ++
>   src/conf/netdev_bandwidth_conf.c              |  1 +
>   src/libxl/libxl_conf.c                        |  1 +
>   src/libxl/xen_common.c                        |  1 +
>   src/lxc/lxc_controller.c                      |  1 +
>   src/lxc/lxc_driver.c                          |  3 ++
>   src/lxc/lxc_process.c                         |  1 +
>   src/qemu/qemu_command.c                       | 29 ++++++++++++-
>   src/qemu/qemu_command.h                       |  3 +-
>   src/qemu/qemu_domain.c                        |  6 ++-
>   src/qemu/qemu_hotplug.c                       | 15 ++++---
>   src/qemu/qemu_interface.c                     | 25 +++++++++++
>   src/qemu/qemu_interface.h                     |  2 +
>   src/qemu/qemu_process.c                       |  1 +
>   src/qemu/qemu_validate.c                      |  1 +
>   src/vmx/vmx.c                                 |  1 +
>   .../net-vdpa.x86_64-latest.args               | 37 +++++++++++++++++
>   tests/qemuxml2argvdata/net-vdpa.xml           | 28 +++++++++++++
>   tests/qemuxml2argvmock.c                      | 11 ++++-
>   tests/qemuxml2argvtest.c                      |  1 +
>   tests/qemuxml2xmloutdata/net-vdpa.xml         | 34 +++++++++++++++
>   tests/qemuxml2xmltest.c                       |  1 +
>   tools/virsh-domain.c                          |  1 +
>   26 files changed, 274 insertions(+), 10 deletions(-)
>   create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
>   create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml
>   create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml


I would have had fewer excuses to procrastinate in looking at this if it 
were broken up into smaller patches: at least one patch for the change to 
the XML schema/parser/formatter, the xml2xml test case, and a bit of 
docs in formatdomain, then another adding the qemu support for 
that bit of config.


> diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
> index 8365fc8bbb..1356485504 100644
> --- a/docs/formatdomain.rst
> +++ b/docs/formatdomain.rst
> @@ -4632,6 +4632,26 @@ or stopping the guest.
>      </devices>
>      ...
>   
> +:anchor:`<a id="elementsNICSVDPA"/>`
> +
> +vDPA devices
> +^^^^^^^^^^^^
> +
> +A vDPA device can be used to provide wire speed network performance within a
> +domain. The host device must already be configured with the appropriate
> +device-specific vDPA driver. This creates a vDPA char device (e.g.
> +/dev/vhost-vdpa-0) that can be used to assign the device to a libvirt domain.


Maybe at least mention here that this only works with certain models of 
SR-IOV NICs, and that each guest vdpa uses up one SR-IOV VF on the host. 
Otherwise we'll get people seeing the "wirespeed performance" part, then 
trying to figure out how to set it up using their ISA bus NE2000 NIC or 
something :-)


> +
> +::
> +
> +   ...
> +   <devices>
> +     <interface type='vdpa'>
> +       <source dev='/dev/vhost-vdpa-0'/>


(As I just learned from you on IRC, the above device is created by 
unbinding a VF from its NIC driver on the host and re-binding it to a 
special VDPA-VF driver.)
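
For anyone who hasn't done that unbind/rebind dance by hand, here is a 
rough sketch of the sysfs files involved. The function names are made up 
for illustration, the vDPA driver name is vendor-specific (so it is just 
a parameter here), and a real setup may also need driver_override or 
new_id before the bind succeeds:

```c
#include <stdio.h>
#include <string.h>

/* Build the sysfs control file that detaches the PCI device at
 * 'pciaddr' (e.g. "0000:65:00.2") from whatever driver owns it now.
 * Returns 0 on success, -1 if 'buf' is too small. */
int
vfUnbindPath(char *buf, size_t buflen, const char *pciaddr)
{
    int n = snprintf(buf, buflen,
                     "/sys/bus/pci/devices/%s/driver/unbind", pciaddr);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Build the sysfs control file that attaches a device to 'driver'.
 * The driver name is an assumption supplied by the caller; it depends
 * on the NIC vendor. */
int
vfBindPath(char *buf, size_t buflen, const char *driver)
{
    int n = snprintf(buf, buflen, "/sys/bus/pci/drivers/%s/bind", driver);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Write the PCI address into one of the control files above.
 * Returns 0 on success, -1 if the file can't be opened or written. */
int
sysfsWriteAddr(const char *path, const char *pciaddr)
{
    FILE *fp = fopen(path, "w");
    int rc;

    if (!fp)
        return -1;

    rc = (fputs(pciaddr, fp) < 0) ? -1 : 0;
    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}
```

The idea would be to call sysfsWriteAddr() once with the unbind path and 
once with the bind path, as root, for each VF.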


As we were just discussing online, on one hand it could be nice if 
libvirt could automatically handle rebinding the VF to the vdpa host 
driver (given the PCI address of the VF), to make it easier to use 
(because no advance setup would be needed), similar to what's already 
done with hostdev devices (and <interface type='hostdev'>) when 
managed='yes' (which is the default setting).


On the other hand, it is exactly that managed='yes' functionality that 
has created more "libvirt-but-not-really-libvirt" bug reports than any 
other aspect of vfio device assignment, because the process of unbinding 
and rebinding drivers is timing-sensitive and causes code that's usually 
run only once at host boot-time to be run hundreds of times, thus making 
it more likely to expose infrequently-hit bugs.


I just bring this up in advance of someone suggesting the addition of 
managed='yes' here to put in my vote for *not* doing it, and instead 
using that same effort to provide some sort of API in the node-device 
driver for easily creating one or more VDPA devices from VFs, which 
could be done once at host boot time, and thus avoid the level of 
"libvirt-not-libvirt" bug reports for VDPA. (and after that maybe even 
an API to allocate a device from that pool to be used by a guest). But 
that's for later.


> +     </interface>
> +   </devices>
> +   ...
> +
>   :anchor:`<a id="elementsTeaming"/>`
>   
>   Teaming a virtio/hostdev NIC pair
> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
> index 0d0dcbc5ce..17f74490f4 100644
> --- a/docs/schemas/domaincommon.rng
> +++ b/docs/schemas/domaincommon.rng
> @@ -3108,6 +3108,21 @@
>               <ref name="interface-options"/>
>             </interleave>
>           </group>
> +
> +        <group>
> +          <attribute name="type">
> +            <value>vdpa</value>
> +          </attribute>
> +          <interleave>
> +            <element name="source">
> +              <attribute name="dev">
> +                <ref name="deviceName"/>
> +              </attribute>
> +            </element>
> +            <ref name="interface-options"/>
> +          </interleave>
> +        </group>
> +
>         </choice>
>         <optional>
>           <attribute name="trustGuestRxFilters">
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 8e7981bf25..74f2c2f3e3 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -549,6 +549,7 @@ VIR_ENUM_IMPL(virDomainNet,
>                 "direct",
>                 "hostdev",
>                 "udp",
> +              "vdpa",
>   );
>   
>   VIR_ENUM_IMPL(virDomainNetModel,
> @@ -2495,6 +2496,10 @@ virDomainNetDefClear(virDomainNetDefPtr def)
>           def->data.vhostuser = NULL;
>           break;
>   
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
> +        VIR_FREE(def->data.vdpa.devicepath);
> +        break;
> +
>       case VIR_DOMAIN_NET_TYPE_SERVER:
>       case VIR_DOMAIN_NET_TYPE_CLIENT:
>       case VIR_DOMAIN_NET_TYPE_MCAST:
> @@ -6489,6 +6494,15 @@ virDomainNetDefValidate(const virDomainNetDef *net)
>           return -1;
>       }
>   
> +    if (net->type == VIR_DOMAIN_NET_TYPE_VDPA &&
> +        net->model != VIR_DOMAIN_NET_MODEL_VIRTIO) {
> +            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> +                           _("invalid model for interface of type '%s': '%s'"),
> +                           virDomainNetTypeToString(net->type),
> +                           virDomainNetModelTypeToString(net->model));
> +            return -1;
> +    }
> +


I see that in qemuDomainDeviceNetDefPostParse() you set this to virtio if 
it isn't specified. It seems a bit odd to set the default in the 
qemu-specific post-parse, but check that the model actually *is* 
virtio in the generic domain validate. Since device models tend to be 
hypervisor-specific, I'm thinking maybe we should keep setting an 
unspecified model to virtio where you currently have that 
(qemuDomainDeviceNetDefPostParse()) but move the above validation check 
from here over to qemuValidateDomainDeviceDefNetwork().
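
A rough sketch of the split I have in mind, with made-up stand-in types 
and function names rather than the real libvirt ones (the real code would 
use virDomainNetDef, virReportError(), and so on):

```c
#include <stdio.h>

/* Hypothetical, cut-down stand-ins for the libvirt enums and struct. */
typedef enum { NET_TYPE_BRIDGE, NET_TYPE_VDPA } NetType;
typedef enum { NET_MODEL_UNKNOWN, NET_MODEL_E1000, NET_MODEL_VIRTIO } NetModel;

typedef struct {
    NetType type;
    NetModel model;
} NetDef;

/* Hypervisor-specific post-parse step: fill in the default model
 * when the user didn't specify one. */
void
qemuPostParseSketch(NetDef *net)
{
    if (net->type == NET_TYPE_VDPA && net->model == NET_MODEL_UNKNOWN)
        net->model = NET_MODEL_VIRTIO;
}

/* Hypervisor-specific validation step: reject any vdpa interface
 * whose model is not virtio. Returns 0 if valid, -1 otherwise. */
int
qemuValidateSketch(const NetDef *net)
{
    if (net->type == NET_TYPE_VDPA && net->model != NET_MODEL_VIRTIO) {
        fprintf(stderr, "invalid model for interface of type 'vdpa'\n");
        return -1;
    }
    return 0;
}
```

That way the default and the check that depends on it both live in the 
qemu driver, and the generic validate code stays model-agnostic.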


(Wow. This whole thing of having 4 (and even more in the case of NetDef, 
since there is a difference between define-time and runtime validation) 
separate places to check settings makes it really complicated to decide 
on the correct place to put one tiny check. It's tough to even remember 
where they are and what they're called - I have to do a chain of cscope 
searches every single time I get into the subject!)


(P.S. I just noticed that vhost-user, which also uses the virtio-net 
backend, just has a check directly at the end of 
virDomainNetDefParseXML() that checks whether model = virtio was 
specified, and if not it logs an error and fails. Which points out 
*yet another* place that inputs are validated. Sigh.)


>       return 0;
>   }
>   
> @@ -11982,6 +11996,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
>       g_autofree char *vhost_path = NULL;
>       g_autofree char *teamingType = NULL;
>       g_autofree char *teamingPersistent = NULL;
> +    g_autofree char *vdpa_dev = NULL;
>       const char *prefix = xmlopt ? xmlopt->config.netPrefix : NULL;
>   
>       if (!(def = virDomainNetDefNew(xmlopt)))
> @@ -12075,6 +12090,10 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
>                   if (virDomainChrSourceReconnectDefParseXML(&reconnect, cur, ctxt) < 0)
>                       goto error;
>   
> +            } else if (!vdpa_dev
> +                       && def->type == VIR_DOMAIN_NET_TYPE_VDPA
> +                       && virXMLNodeNameEqual(cur, "source")) {
> +                vdpa_dev = virXMLPropString(cur, "dev");


(it's always kind of bugged me that in so many places we just ignore 
multiple definitions of the same element in our parsing, rather than 
logging an error. But this pattern has so much precedent that I'm not 
going to say anything about it. Oops, already did. Forget I said that.)
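
Just to illustrate what "logging an error" on a duplicate could look 
like, here is a toy sketch. It is not a suggestion for this patch; the 
ChildElem type and function name are invented, and the real parser walks 
xmlNode children rather than an array:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one child element of <interface>: its tag
 * name and the value of its 'dev' attribute (NULL if absent). */
typedef struct {
    const char *name;
    const char *dev;
} ChildElem;

/* Find the <source> child and hand back its 'dev' attribute via 'dev'.
 * Returns 0 on success (with *dev == NULL if there is no <source>),
 * or -1 with *dev == NULL if <source> appears more than once. */
int
pickSingleSource(const ChildElem *children, size_t nchildren,
                 const char **dev)
{
    int seen = 0;
    size_t i;

    *dev = NULL;
    for (i = 0; i < nchildren; i++) {
        if (strcmp(children[i].name, "source") != 0)
            continue;

        if (seen) {
            fprintf(stderr, "multiple <source> elements in <interface>\n");
            *dev = NULL;
            return -1;
        }
        seen = 1;
        *dev = children[i].dev;
    }
    return 0;
}
```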


>               } else if (!def->virtPortProfile
>                          && virXMLNodeNameEqual(cur, "virtualport")) {
>                   if (def->type == VIR_DOMAIN_NET_TYPE_NETWORK) {
> @@ -12332,6 +12351,16 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
>           }
>           break;
>   
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
> +        if (vdpa_dev == NULL) {
> +            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                           _("No <source> 'dev' attribute "
> +                             "specified with <interface type='vdpa'/>"));
> +            goto error;
> +        }
> +        def->data.vdpa.devicepath = g_steal_pointer(&vdpa_dev);
> +        break;
> +


Yeah, this is the place I was talking about before. It used to be that 
this was the place to check for anything that *must* be there no matter 
what the hypervisor. I still don't get exactly what the status of 
these checks at the end of the parse functions is; do we want to deprecate 
them? Or should we still add more stuff as long as it's okay to log an 
error even when we're reading existing XML from disk? Should someone be 
moving the entire switch statement containing this chunk into 
virDomainNetDefValidate()?


>       case VIR_DOMAIN_NET_TYPE_BRIDGE:
>           if (bridge == NULL) {
>               virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> @@ -12727,6 +12756,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
>           case VIR_DOMAIN_NET_TYPE_DIRECT:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>           case VIR_DOMAIN_NET_TYPE_UDP:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>               break;
>           case VIR_DOMAIN_NET_TYPE_LAST:
>           default:
> @@ -26737,6 +26767,14 @@ virDomainNetDefFormat(virBufferPtr buf,
>               }
>               break;
>   
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
> +           if (def->data.vdpa.devicepath) {
> +               virBufferEscapeString(buf, "<source dev='%s'",
> +                                     def->data.vdpa.devicepath);
> +               sourceLines++;
> +           }
> +            break;
> +
>           case VIR_DOMAIN_NET_TYPE_USER:
>           case VIR_DOMAIN_NET_TYPE_LAST:
>               break;
> @@ -30902,6 +30940,7 @@ virDomainNetGetActualVirtPortProfile(const virDomainNetDef *iface)
>       case VIR_DOMAIN_NET_TYPE_MCAST:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>       default:
>           return NULL;
> @@ -31718,6 +31757,7 @@ virDomainNetTypeSharesHostView(const virDomainNetDef *net)
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           break;
>       }
> @@ -31982,6 +32022,7 @@ virDomainNetDefActualToNetworkPort(virDomainDefPtr dom,
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_USER:
>       case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
>                          _("Unexpected network port type %s"),
>                          virDomainNetTypeToString(virDomainNetGetActualType(iface)));
> diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
> index 68be32614c..4f63a3eef4 100644
> --- a/src/conf/domain_conf.h
> +++ b/src/conf/domain_conf.h
> @@ -872,6 +872,7 @@ typedef enum {
>       VIR_DOMAIN_NET_TYPE_DIRECT,
>       VIR_DOMAIN_NET_TYPE_HOSTDEV,
>       VIR_DOMAIN_NET_TYPE_UDP,
> +    VIR_DOMAIN_NET_TYPE_VDPA,
>   
>       VIR_DOMAIN_NET_TYPE_LAST
>   } virDomainNetType;
> @@ -1045,6 +1046,9 @@ struct _virDomainNetDef {
>                */
>               virDomainActualNetDefPtr actual;
>           } network;
> +        struct {
> +            char *devicepath;
> +        } vdpa;
>           struct {
>               char *brname;
>           } bridge;
> diff --git a/src/conf/netdev_bandwidth_conf.c b/src/conf/netdev_bandwidth_conf.c
> index 396ac62019..4eb12e2951 100644
> --- a/src/conf/netdev_bandwidth_conf.c
> +++ b/src/conf/netdev_bandwidth_conf.c
> @@ -315,6 +315,7 @@ bool virNetDevSupportsBandwidth(virDomainNetType type)
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           break;
>       }
> diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
> index 7c2c015015..709cdc8719 100644
> --- a/src/libxl/libxl_conf.c
> +++ b/src/libxl/libxl_conf.c
> @@ -1371,6 +1371,7 @@ libxlMakeNic(virDomainDefPtr def,
>           case VIR_DOMAIN_NET_TYPE_INTERNAL:
>           case VIR_DOMAIN_NET_TYPE_DIRECT:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>           case VIR_DOMAIN_NET_TYPE_LAST:
>               virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
>                       _("unsupported interface type %s"),
> diff --git a/src/libxl/xen_common.c b/src/libxl/xen_common.c
> index 75fe7e0644..b1ec34bf11 100644
> --- a/src/libxl/xen_common.c
> +++ b/src/libxl/xen_common.c
> @@ -1776,6 +1776,7 @@ xenFormatNet(virConnectPtr conn,
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_USER:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type '%s'"),
>                          virDomainNetTypeToString(net->type));
>           return -1;
> diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c
> index ae6b737b60..cb573d6c01 100644
> --- a/src/lxc/lxc_controller.c
> +++ b/src/lxc/lxc_controller.c
> @@ -422,6 +422,7 @@ static int virLXCControllerGetNICIndexes(virLXCControllerPtr ctrl)
>           case VIR_DOMAIN_NET_TYPE_UDP:
>           case VIR_DOMAIN_NET_TYPE_INTERNAL:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>               virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
>                              _("Unsupported net type %s"),
>                              virDomainNetTypeToString(actualType));
> diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c
> index 1cdd6ee455..a36f83a588 100644
> --- a/src/lxc/lxc_driver.c
> +++ b/src/lxc/lxc_driver.c
> @@ -3503,6 +3503,7 @@ lxcDomainAttachDeviceNetLive(virLXCDriverPtr driver,
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>                          _("Network device type is not supported"));
>           goto cleanup;
> @@ -3557,6 +3558,7 @@ lxcDomainAttachDeviceNetLive(virLXCDriverPtr driver,
>           case VIR_DOMAIN_NET_TYPE_INTERNAL:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>           case VIR_DOMAIN_NET_TYPE_UDP:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>           case VIR_DOMAIN_NET_TYPE_LAST:
>           default:
>               /* no-op */
> @@ -3998,6 +4000,7 @@ lxcDomainDetachDeviceNetLive(virDomainObjPtr vm,
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
>                          _("Only bridged veth devices can be detached"));
>           goto cleanup;
> diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c
> index fc59c2e5af..90e9790cea 100644
> --- a/src/lxc/lxc_process.c
> +++ b/src/lxc/lxc_process.c
> @@ -606,6 +606,7 @@ virLXCProcessSetupInterfaces(virLXCDriverPtr driver,
>           case VIR_DOMAIN_NET_TYPE_INTERNAL:
>           case VIR_DOMAIN_NET_TYPE_LAST:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>               virReportError(VIR_ERR_INTERNAL_ERROR,
>                              _("Unsupported network type %s"),
>                              virDomainNetTypeToString(type));
> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 01812cd39b..9c5265ccdf 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -3552,7 +3552,8 @@ qemuBuildHostNetStr(virDomainNetDefPtr net,
>                       size_t tapfdSize,
>                       char **vhostfd,
>                       size_t vhostfdSize,
> -                    const char *slirpfd)
> +                    const char *slirpfd,
> +                    const char *vdpafd)
>   {
>       bool is_tap = false;
>       virDomainNetType netType = virDomainNetGetActualType(net);
> @@ -3690,6 +3691,13 @@ qemuBuildHostNetStr(virDomainNetDefPtr net,
>               return NULL;
>           break;
>   
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
> +        /* Caller will pass the fd to qemu with add-fd */
> +        if (virJSONValueObjectCreate(&netprops, "s:type", "vhost-vdpa", NULL) < 0 ||
> +            virJSONValueObjectAppendString(netprops, "vhostdev", vdpafd) < 0)
> +            return NULL;
> +        break;
> +
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>           /* Should have been handled earlier via PCI/USB hotplug code. */
>       case VIR_DOMAIN_NET_TYPE_LAST:
> @@ -8013,6 +8021,8 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
>       char **tapfdName = NULL;
>       char **vhostfdName = NULL;
>       g_autofree char *slirpfdName = NULL;
> +    g_autofree char *vdpafdName = NULL;
> +    int vdpafd = -1;
>       virDomainNetType actualType = virDomainNetGetActualType(net);
>       const virNetDevBandwidth *actualBandwidth;
>       bool requireNicdev = false;
> @@ -8098,6 +8108,11 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
>   
>           break;
>   
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
> +        if ((vdpafd = qemuInterfaceVDPAConnect(net)) < 0)
> +            goto cleanup;
> +        break;
> +
>       case VIR_DOMAIN_NET_TYPE_USER:
>       case VIR_DOMAIN_NET_TYPE_SERVER:
>       case VIR_DOMAIN_NET_TYPE_CLIENT:
> @@ -8140,6 +8155,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>          /* These types don't use a network device on the host, but
>           * instead use some other type of connection to the emulated
> @@ -8219,13 +8235,22 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
>           vhostfd[i] = -1;
>       }
>   
> +    if (vdpafd > 0) {
> +        virCommandPassFD(cmd, vdpafd, VIR_COMMAND_PASS_FD_CLOSE_PARENT);
> +        g_autofree char *fdset = qemuVirCommandGetFDSet(cmd, vdpafd);
> +        if (!fdset)
> +            goto cleanup;
> +        virCommandAddArgList(cmd, "-add-fd", fdset, NULL);
> +        vdpafdName = qemuVirCommandGetDevSet(cmd, vdpafd);
> +    }
> +
>       if (chardev)
>           virCommandAddArgList(cmd, "-chardev", chardev, NULL);
>   
>       if (!(hostnetprops = qemuBuildHostNetStr(net,
>                                                tapfdName, tapfdSize,
>                                                vhostfdName, vhostfdSize,
> -                                             slirpfdName)))
> +                                             slirpfdName, vdpafdName)))
>           goto cleanup;
>   
>       if (!(host = virQEMUBuildNetdevCommandlineFromJSON(hostnetprops,
> diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h
> index 89d99b111f..e8b4f4785a 100644
> --- a/src/qemu/qemu_command.h
> +++ b/src/qemu/qemu_command.h
> @@ -99,7 +99,8 @@ virJSONValuePtr qemuBuildHostNetStr(virDomainNetDefPtr net,
>                                       size_t tapfdSize,
>                                       char **vhostfd,
>                                       size_t vhostfdSize,
> -                                    const char *slirpfd);
> +                                    const char *slirpfd,
> +                                    const char *vdpafd);
>   
>   /* Current, best practice */
>   char *qemuBuildNicDevStr(virDomainDefPtr def,
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index c440c79e1d..daae5a1b03 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -5027,7 +5027,10 @@ qemuDomainDeviceNetDefPostParse(virDomainNetDefPtr net,
>                                   const virDomainDef *def,
>                                   virQEMUCapsPtr qemuCaps)
>   {
> -    if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV &&
> +    if (net->type == VIR_DOMAIN_NET_TYPE_VDPA &&
> +        !virDomainNetGetModelString(net))
> +        net->model = VIR_DOMAIN_NET_MODEL_VIRTIO;
> +    else if (net->type != VIR_DOMAIN_NET_TYPE_HOSTDEV &&
>           !virDomainNetGetModelString(net) &&
>           virDomainNetResolveActualType(net) != VIR_DOMAIN_NET_TYPE_HOSTDEV)
>           net->model = qemuDomainDefaultNetModel(def, qemuCaps);
> @@ -9201,6 +9204,7 @@ qemuDomainNetSupportsMTU(virDomainNetType type)
>       case VIR_DOMAIN_NET_TYPE_DIRECT:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           break;
>       }
> diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
> index 2c6c30ce03..23ae2310a2 100644
> --- a/src/qemu/qemu_hotplug.c
> +++ b/src/qemu/qemu_hotplug.c
> @@ -1340,6 +1340,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver,
>       case VIR_DOMAIN_NET_TYPE_MCAST:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
>                          _("hotplug of interface type of %s is not implemented yet"),
> @@ -1388,7 +1389,7 @@ qemuDomainAttachNetDevice(virQEMUDriverPtr driver,
>       if (!(netprops = qemuBuildHostNetStr(net,
>                                            tapfdName, tapfdSize,
>                                            vhostfdName, vhostfdSize,
> -                                         slirpfdName)))
> +                                         slirpfdName, NULL)))
>           goto cleanup;
>   
>       qemuDomainObjEnterMonitor(driver, vm);
> @@ -3390,6 +3391,7 @@ qemuDomainChangeNetFilter(virDomainObjPtr vm,
>       case VIR_DOMAIN_NET_TYPE_DIRECT:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
>                          _("filters not supported on interfaces of type %s"),
>                          virDomainNetTypeToString(virDomainNetGetActualType(newdev)));
> @@ -3483,8 +3485,9 @@ qemuDomainChangeNet(virQEMUDriverPtr driver,
>       olddev = *devslot;
>   
>       oldType = virDomainNetGetActualType(olddev);
> -    if (oldType == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> -        /* no changes are possible to a type='hostdev' interface */
> +    if (oldType == VIR_DOMAIN_NET_TYPE_HOSTDEV ||
> +        oldType == VIR_DOMAIN_NET_TYPE_VDPA) {
> +        /* no changes are possible to a type='hostdev' or type='vdpa' interface */
>           virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
>                          _("cannot change config of '%s' network type"),
>                          virDomainNetTypeToString(oldType));
> @@ -3671,8 +3674,9 @@ qemuDomainChangeNet(virQEMUDriverPtr driver,
>   
>       newType = virDomainNetGetActualType(newdev);
>   
> -    if (newType == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
> -        /* can't turn it into a type='hostdev' interface */
> +    if (newType == VIR_DOMAIN_NET_TYPE_HOSTDEV ||
> +        newType == VIR_DOMAIN_NET_TYPE_VDPA) {
> +        /* can't turn it into a type='hostdev' or type='vdpa' interface */
>           virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
>                          _("cannot change network interface type to '%s'"),
>                          virDomainNetTypeToString(newType));
> @@ -3726,6 +3730,7 @@ qemuDomainChangeNet(virQEMUDriverPtr driver,
>               break;
>   
>           case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>               virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
>                              _("unable to change config on '%s' network type"),
> diff --git a/src/qemu/qemu_interface.c b/src/qemu/qemu_interface.c
> index ffec992596..676648ebab 100644
> --- a/src/qemu/qemu_interface.c
> +++ b/src/qemu/qemu_interface.c
> @@ -118,6 +118,7 @@ qemuInterfaceStartDevice(virDomainNetDefPtr net)
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           /* these types all require no action */
>           break;
> @@ -203,6 +204,7 @@ qemuInterfaceStopDevice(virDomainNetDefPtr net)
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_HOSTDEV:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           /* these types all require no action */
>           break;
> @@ -630,6 +632,29 @@ qemuInterfaceBridgeConnect(virDomainDefPtr def,
>   }
>   
>   
> +/* qemuInterfaceVDPAConnect:
> + * @net: pointer to the VM's interface description
> + *
> + * returns: file descriptor of the vdpa device
> + *
> + * Called *only* called if actualType is VIR_DOMAIN_NET_TYPE_VDPA
> + */
> +int
> +qemuInterfaceVDPAConnect(virDomainNetDefPtr net)
> +{
> +    int fd;
> +
> +    if ((fd = open(net->data.vdpa.devicepath, O_RDWR)) < 0) {
> +        virReportSystemError(errno,
> +                             _("Unable to open '%s' for vdpa device"),
> +                             net->data.vdpa.devicepath);
> +        return -1;
> +    }
> +
> +    return fd;
> +}
> +
> +
>   qemuSlirpPtr
>   qemuInterfacePrepareSlirp(virQEMUDriverPtr driver,
>                             virDomainNetDefPtr net)
> diff --git a/src/qemu/qemu_interface.h b/src/qemu/qemu_interface.h
> index 3dcefc6a12..1ba24f0a6f 100644
> --- a/src/qemu/qemu_interface.h
> +++ b/src/qemu/qemu_interface.h
> @@ -58,3 +58,5 @@ int qemuInterfaceOpenVhostNet(virDomainDefPtr def,
>   
>   qemuSlirpPtr qemuInterfacePrepareSlirp(virQEMUDriverPtr driver,
>                                          virDomainNetDefPtr net);
> +
> +int qemuInterfaceVDPAConnect(virDomainNetDefPtr net) G_GNUC_NO_INLINE;
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index 126fabf5ef..70c3b9b46d 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -7517,6 +7517,7 @@ void qemuProcessStop(virQEMUDriverPtr driver,
>           case VIR_DOMAIN_NET_TYPE_INTERNAL:
>           case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>           case VIR_DOMAIN_NET_TYPE_UDP:
> +        case VIR_DOMAIN_NET_TYPE_VDPA:
>           case VIR_DOMAIN_NET_TYPE_LAST:
>               /* No special cleanup procedure for these types. */
>               break;
> diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c
> index 488f258d00..623f998463 100644
> --- a/src/qemu/qemu_validate.c
> +++ b/src/qemu/qemu_validate.c
> @@ -1130,6 +1130,7 @@ qemuValidateNetSupportsCoalesce(virDomainNetType type)
>       case VIR_DOMAIN_NET_TYPE_MCAST:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
>       case VIR_DOMAIN_NET_TYPE_UDP:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           break;
>       }
> diff --git a/src/vmx/vmx.c b/src/vmx/vmx.c
> index a123a8807c..f6f6efb322 100644
> --- a/src/vmx/vmx.c
> +++ b/src/vmx/vmx.c
> @@ -3833,6 +3833,7 @@ virVMXFormatEthernet(virDomainNetDefPtr def, int controller,
>         case VIR_DOMAIN_NET_TYPE_DIRECT:
>         case VIR_DOMAIN_NET_TYPE_HOSTDEV:
>         case VIR_DOMAIN_NET_TYPE_UDP:
> +      case VIR_DOMAIN_NET_TYPE_VDPA:
>           virReportError(VIR_ERR_CONFIG_UNSUPPORTED, _("Unsupported net type '%s'"),
>                          virDomainNetTypeToString(def->type));
>           return -1;
> diff --git a/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
> new file mode 100644
> index 0000000000..8e76ac7794
> --- /dev/null
> +++ b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
> @@ -0,0 +1,37 @@
> +LC_ALL=C \
> +PATH=/bin \
> +HOME=/tmp/lib/domain--1-QEMUGuest1 \
> +USER=test \
> +LOGNAME=test \
> +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \
> +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \
> +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \
> +QEMU_AUDIO_DRV=none \
> +/usr/bin/qemu-system-i386 \
> +-name guest=QEMUGuest1,debug-threads=on \
> +-S \
> +-object secret,id=masterKey0,format=raw,\
> +file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \
> +-machine pc,accel=tcg,usb=off,dump-guest-core=off \
> +-cpu qemu64 \
> +-m 214 \
> +-overcommit mem-lock=off \
> +-smp 1,sockets=1,cores=1,threads=1 \
> +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
> +-display none \
> +-no-user-config \
> +-nodefaults \
> +-chardev socket,id=charmonitor,fd=1729,server,nowait \
> +-mon chardev=charmonitor,id=monitor,mode=control \
> +-rtc base=utc \
> +-no-shutdown \
> +-no-acpi \
> +-boot strict=on \
> +-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
> +-add-fd set=0,fd=1732 \
> +-netdev vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0 \


Okay, I'm feeling too lazy to parse through the code above and see how 
you arrived at "vhostdev='/dev/fdset/0'", but that doesn't look right. 
Shouldn't you be ending up with "-netdev vhost-vdpa,fd=NN,..."? The 
document I have shows that syntax is supported, so there shouldn't be 
any need to do the add-fd stuff in this case.


I think the next step should be to find some hardware and give this a 
smoke test! :-)


> +-device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:95:db:c0,bus=pci.0,\
> +addr=0x2 \
> +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,\
> +resourcecontrol=deny \
> +-msg timestamp=on
> diff --git a/tests/qemuxml2argvdata/net-vdpa.xml b/tests/qemuxml2argvdata/net-vdpa.xml
> new file mode 100644
> index 0000000000..30cca7eb6e
> --- /dev/null
> +++ b/tests/qemuxml2argvdata/net-vdpa.xml
> @@ -0,0 +1,28 @@
> +<domain type='qemu'>
> +  <name>QEMUGuest1</name>
> +  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
> +  <memory unit='KiB'>219136</memory>
> +  <currentMemory unit='KiB'>219136</currentMemory>
> +  <vcpu placement='static'>1</vcpu>
> +  <os>
> +    <type arch='i686' machine='pc'>hvm</type>
> +    <boot dev='hd'/>
> +  </os>
> +  <clock offset='utc'/>
> +  <on_poweroff>destroy</on_poweroff>
> +  <on_reboot>restart</on_reboot>
> +  <on_crash>destroy</on_crash>
> +  <devices>
> +    <emulator>/usr/bin/qemu-system-i386</emulator>
> +    <controller type='usb' index='0'/>
> +    <controller type='ide' index='0'/>
> +    <controller type='pci' index='0' model='pci-root'/>
> +    <interface type='vdpa'>
> +      <mac address='52:54:00:95:db:c0'/>
> +      <source dev='/dev/vhost-vdpa-0'/>
> +    </interface>
> +    <input type='mouse' bus='ps2'/>
> +    <input type='keyboard' bus='ps2'/>
> +    <memballoon model='none'/>
> +  </devices>
> +</domain>
> diff --git a/tests/qemuxml2argvmock.c b/tests/qemuxml2argvmock.c
> index e5841bc8e3..516776697f 100644
> --- a/tests/qemuxml2argvmock.c
> +++ b/tests/qemuxml2argvmock.c
> @@ -205,7 +205,7 @@ virHostGetDRMRenderNode(void)
>   
>   static void (*real_virCommandPassFD)(virCommandPtr cmd, int fd, unsigned int flags);
>   
> -static const int testCommandPassSafeFDs[] = { 1730, 1731 };
> +static const int testCommandPassSafeFDs[] = { 1730, 1731, 1732 };
>   
>   void
>   virCommandPassFD(virCommandPtr cmd,
> @@ -283,3 +283,12 @@ qemuBuildTPMOpenBackendFDs(const char *tpmdev G_GNUC_UNUSED,
>       *cancelfd = 1731;
>       return 0;
>   }
> +
> +
> +int
> +qemuInterfaceVDPAConnect(virDomainNetDefPtr net G_GNUC_UNUSED)
> +{
> +    if (fcntl(1732, F_GETFD) != -1)
> +        abort();
> +    return 1732;
> +}
> diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c
> index 01839cb88c..9587e1f2f2 100644
> --- a/tests/qemuxml2argvtest.c
> +++ b/tests/qemuxml2argvtest.c
> @@ -1446,6 +1446,7 @@ mymain(void)
>               QEMU_CAPS_DEVICE_VFIO_PCI);
>       DO_TEST_FAILURE("net-hostdev-fail",
>                       QEMU_CAPS_DEVICE_VFIO_PCI);
> +    DO_TEST_CAPS_LATEST("net-vdpa");
>   
>       DO_TEST("hostdev-pci-multifunction",
>               QEMU_CAPS_KVM,
> diff --git a/tests/qemuxml2xmloutdata/net-vdpa.xml b/tests/qemuxml2xmloutdata/net-vdpa.xml
> new file mode 100644
> index 0000000000..b362405c14
> --- /dev/null
> +++ b/tests/qemuxml2xmloutdata/net-vdpa.xml
> @@ -0,0 +1,34 @@
> +<domain type='qemu'>
> +  <name>QEMUGuest1</name>
> +  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
> +  <memory unit='KiB'>219136</memory>
> +  <currentMemory unit='KiB'>219136</currentMemory>
> +  <vcpu placement='static'>1</vcpu>
> +  <os>
> +    <type arch='i686' machine='pc'>hvm</type>
> +    <boot dev='hd'/>
> +  </os>
> +  <clock offset='utc'/>
> +  <on_poweroff>destroy</on_poweroff>
> +  <on_reboot>restart</on_reboot>
> +  <on_crash>destroy</on_crash>
> +  <devices>
> +    <emulator>/usr/bin/qemu-system-i386</emulator>
> +    <controller type='usb' index='0'>
> +      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
> +    </controller>
> +    <controller type='ide' index='0'>
> +      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
> +    </controller>
> +    <controller type='pci' index='0' model='pci-root'/>
> +    <interface type='vdpa'>
> +      <mac address='52:54:00:95:db:c0'/>
> +      <source dev='/dev/vhost-vdpa-0'/>
> +      <model type='virtio'/>
> +      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
> +    </interface>
> +    <input type='mouse' bus='ps2'/>
> +    <input type='keyboard' bus='ps2'/>
> +    <memballoon model='none'/>
> +  </devices>
> +</domain>
> diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c
> index a07e2b7553..978babb110 100644
> --- a/tests/qemuxml2xmltest.c
> +++ b/tests/qemuxml2xmltest.c
> @@ -494,6 +494,7 @@ mymain(void)
>       DO_TEST("net-mtu", NONE);
>       DO_TEST("net-coalesce", NONE);
>       DO_TEST("net-many-models", NONE);
> +    DO_TEST("net-vdpa", NONE);
>   
>       DO_TEST("serial-tcp-tlsx509-chardev", NONE);
>       DO_TEST("serial-tcp-tlsx509-chardev-notls", NONE);
> diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
> index 286cf79671..10b396bcf0 100644
> --- a/tools/virsh-domain.c
> +++ b/tools/virsh-domain.c
> @@ -1007,6 +1007,7 @@ cmdAttachInterface(vshControl *ctl, const vshCmd *cmd)
>       case VIR_DOMAIN_NET_TYPE_MCAST:
>       case VIR_DOMAIN_NET_TYPE_UDP:
>       case VIR_DOMAIN_NET_TYPE_INTERNAL:
> +    case VIR_DOMAIN_NET_TYPE_VDPA:
>       case VIR_DOMAIN_NET_TYPE_LAST:
>           vshError(ctl, _("No support for %s in command 'attach-interface'"),
>                    type);


Re: [libvirt PATCH] RFC: Add support for vDPA network devices
Posted by Jonathon Jongsma 3 years, 8 months ago
On Thu, 2020-08-20 at 18:56 -0400, Laine Stump wrote:
> On 8/18/20 2:37 PM, Jonathon Jongsma wrote:
> > vDPA network devices allow high-performance networking in a virtual
> > machine by providing a wire-speed data path. These devices require
> > a
> > vendor-specific host driver but the data path follows the virtio
> > specification.
> > 
> > The support for vDPA devices was recently added to qemu. This
> > allows
> > libvirt to support these devices. It requires that the device is
> > configured on the host with the appropriate vendor-specific driver.
> > This will create a chardev on the host at e.g. /dev/vhost-vdpa-0.
> > That
> > chardev path can then be used to define a new interface with
> > type='vdpa'.
> > ---


[snip]


> 
> 
> > +
> > +::
> > +
> > +   ...
> > +   <devices>
> > +     <interface type='vdpa'>
> > +       <source dev='/dev/vhost-vdpa-0'/>
> 
> (The above device is created (I just learned this from you in IRC!)
> by 
> unbinding a VF from its NIC driver on the host, and re-binding it to
> a 
> special VDPA-VF driver.)
> 
> 
> As we were just discussing online, on one hand it could be nice if 
> libvirt could automatically handle rebinding the VF to the vdpa host 
> driver (given the PCI address of the VF), to make it easier to use 
> (because no advance setup would be needed), similar to what's
> already 
> done with hostdev devices (and <interface type='hostdev'>) when 
> managed='yes' (which is the default setting).
> 
> 
> On the other hand, it is exactly that managed='yes' functionality
> that 
> has created more "libvirt-but-not-really-libvirt" bug reports than
> any 
> other aspect of vfio device assignment, because the process of
> unbinding 
> and rebinding drivers is timing-sensitive and causes code that's
> usually 
> run only once at host boot-time to be run hundreds of times thus
> making 
> it more likely to expose infrequently-hit bugs.
> 
> 
> I just bring this up in advance of someone suggesting the addition
> of 
> managed='yes' here to put in my vote for *not* doing it, and instead 
> using that same effort to provide some sort of API in the node-
> device 
> driver for easily creating one or more VDPA devices from VFs, which 
> could be done once at host boot time, and thus avoid the level of 
> "libvirt-not-libvirt" bug reports for VDPA. (and after that maybe
> even 
> an API to allocate a device from that pool to be used by a guest).
> But 
> that's for later.

I'd really love to hear some additional opinions on this topic from
some more of the libvirt "old-timers". I intentionally kept the initial
vdpa support simple by requiring that the device is already set up and
bound to the appropriate driver. But I want to make sure that we can
add additional capabilities later (as Laine suggested) with this same
API/schema.

Anybody else have thoughts on this topic?


> 
> 
> > +     </interface>
> > +   </devices>
> > +   ...
> > +
> >   :anchor:`<a id="elementsTeaming"/>`
> >   
> >   Teaming a virtio/hostdev NIC pair
> > diff --git a/docs/schemas/domaincommon.rng
> > b/docs/schemas/domaincommon.rng
> > index 0d0dcbc5ce..17f74490f4 100644
> > --- a/docs/schemas/domaincommon.rng
> > +++ b/docs/schemas/domaincommon.rng
> > @@ -3108,6 +3108,21 @@
> >               <ref name="interface-options"/>
> >             </interleave>
> >           </group>
> > +
> > +        <group>
> > +          <attribute name="type">
> > +            <value>vdpa</value>
> > +          </attribute>
> > +          <interleave>
> > +            <element name="source">
> > +              <attribute name="dev">
> > +                <ref name="deviceName"/>
> > +              </attribute>
> > +            </element>
> > +            <ref name="interface-options"/>
> > +          </interleave>
> > +        </group>
> > +
> >         </choice>
> >         <optional>
> >           <attribute name="trustGuestRxFilters">
> > 


[snip]

> > diff --git a/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
> > b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
> > new file mode 100644
> > index 0000000000..8e76ac7794
> > --- /dev/null
> > +++ b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
> > @@ -0,0 +1,37 @@
> > +LC_ALL=C \
> > +PATH=/bin \
> > +HOME=/tmp/lib/domain--1-QEMUGuest1 \
> > +USER=test \
> > +LOGNAME=test \
> > +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \
> > +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \
> > +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \
> > +QEMU_AUDIO_DRV=none \
> > +/usr/bin/qemu-system-i386 \
> > +-name guest=QEMUGuest1,debug-threads=on \
> > +-S \
> > +-object secret,id=masterKey0,format=raw,\
> > +file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \
> > +-machine pc,accel=tcg,usb=off,dump-guest-core=off \
> > +-cpu qemu64 \
> > +-m 214 \
> > +-overcommit mem-lock=off \
> > +-smp 1,sockets=1,cores=1,threads=1 \
> > +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
> > +-display none \
> > +-no-user-config \
> > +-nodefaults \
> > +-chardev socket,id=charmonitor,fd=1729,server,nowait \
> > +-mon chardev=charmonitor,id=monitor,mode=control \
> > +-rtc base=utc \
> > +-no-shutdown \
> > +-no-acpi \
> > +-boot strict=on \
> > +-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
> > +-add-fd set=0,fd=1732 \
> > +-netdev vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0 \
> 
> Okay, I'm feeling too lazy to parse through the code above and see
> how 
> you arrived at "vhostdev='/dev/fdset/0'", but that doesn't look
> right. 
> Shouldn't you be ending up with "-netdev vhost-vdpa,fd=NN,..."? The 
> document I have shows that syntax is supported, so there shouldn't
> be 
> any need to do the add-fd stuff in this case.


The initial proposal for vdpa in qemu supported both vhostdev= and fd=
parameters, but the final implementation does not actually support an
fd= parameter here. The recommended way to pass a pre-opened fd to
qemu in this scenario is to use the 'add-fd' and /dev/fdset/ method as
shown above.
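
As a rough illustration of that pairing, here is a minimal sketch of how the
two argument strings could be derived from an already-open fd. The helper name
format_vdpa_fd_args() and its signature are hypothetical and are not taken
from the patch or from libvirt; only the "set=N,fd=M" and
"vhost-vdpa,vhostdev=/dev/fdset/N,id=..." shapes come from the .args file
quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not libvirt code): format the values for
 * "-add-fd" and "-netdev" so that an already-open vhost-vdpa fd is
 * handed to QEMU through fd set `set_id`.
 * Returns 0 on success, -1 on invalid input or if a buffer is too small.
 */
int
format_vdpa_fd_args(int fd, unsigned int set_id, const char *netdev_id,
                    char *addfd, size_t addfd_len,
                    char *netdev, size_t netdev_len)
{
    int n;

    assert(netdev_id != NULL && addfd != NULL && netdev != NULL);

    /* A comma separates options on the QEMU command line, so this
     * sketch simply rejects ids that contain one. */
    if (fd < 0 || netdev_id[0] == '\0' || strchr(netdev_id, ',') != NULL)
        return -1;

    /* Value for: -add-fd set=N,fd=M */
    n = snprintf(addfd, addfd_len, "set=%u,fd=%d", set_id, fd);
    if (n < 0 || (size_t)n >= addfd_len)
        return -1;

    /* Value for: -netdev vhost-vdpa,vhostdev=/dev/fdset/N,id=ID */
    n = snprintf(netdev, netdev_len,
                 "vhost-vdpa,vhostdev=/dev/fdset/%u,id=%s", set_id, netdev_id);
    if (n < 0 || (size_t)n >= netdev_len)
        return -1;

    return 0;
}
```

Called with fd 1732, set 0 and id "hostnet0", this produces the same two
values that appear in the expected test output above.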

As far as I understand from reading qemu code, /dev/fdset/N is not
actually a file that exists in the filesystem. Instead, when you call
'add-fd', qemu adds that fd to an internal mapping that maps from the
fdset specified by set=N to the passed fd. Whenever qemu tries to open
a file, qemu_open() has special handling for filenames that start with
"/dev/fdset/": it looks up the fd associated with that fdset id and
returns that instead of attempting to open the file path. 

So I think the above should be correct.
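
The lookup described above can be modelled with a toy table. This is only a
sketch of the idea, not QEMU's implementation: the names toy_add_fd() and
toy_open(), the fixed-size table, and the one-fd-per-set limit are all
assumptions made for illustration, and the model ignores any access-mode
matching or fd duplication that QEMU itself may perform.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define FDSET_PREFIX "/dev/fdset/"
#define MAX_FDSETS 16
#define NOT_AN_FDSET_PATH (-2)   /* caller would fall back to a normal open */

/* Toy mapping from fd set id to fd; -1 marks an empty slot. */
static int fdset_table[MAX_FDSETS];
static int fdset_table_ready;

static void
fdset_table_init(void)
{
    if (fdset_table_ready)
        return;
    for (int i = 0; i < MAX_FDSETS; i++)
        fdset_table[i] = -1;
    fdset_table_ready = 1;
}

/* Model of "-add-fd set=N,fd=M": remember that set N refers to fd M. */
int
toy_add_fd(unsigned int set_id, int fd)
{
    fdset_table_init();
    if (set_id >= MAX_FDSETS || fd < 0)
        return -1;
    fdset_table[set_id] = fd;
    return 0;
}

/* Model of the special case in the open path: a name starting with
 * "/dev/fdset/" is resolved through the table instead of the filesystem. */
int
toy_open(const char *path)
{
    const size_t prefix_len = strlen(FDSET_PREFIX);
    const char *id_str;
    char *end = NULL;
    unsigned long set_id;

    assert(path != NULL);
    fdset_table_init();

    if (strncmp(path, FDSET_PREFIX, prefix_len) != 0)
        return NOT_AN_FDSET_PATH;

    id_str = path + prefix_len;
    if (*id_str < '0' || *id_str > '9')
        return -1;               /* the set id must be a plain number */

    errno = 0;
    set_id = strtoul(id_str, &end, 10);
    if (errno != 0 || *end != '\0' || set_id >= MAX_FDSETS)
        return -1;

    return fdset_table[set_id];  /* -1 if nothing was added to this set */
}
```

In this model, toy_add_fd(0, 1732) followed by toy_open("/dev/fdset/0")
yields 1732, while a regular path such as "/dev/vhost-vdpa-0" is reported as
not being an fdset path at all.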

> 
> 
> I think the next step should be to find some hardware and give this
> a 
> smoke test! :-)

Indeed, I'm working with Cindy Lu (who implemented the qemu vdpa
support) to try to get this tested properly. Unfortunately I've been
unable to get vdpa_sim working and I don't have access to hardware at
the moment.

Thanks for the review!

Jonathon

Re: [libvirt PATCH] RFC: Add support for vDPA network devices
Posted by Jonathon Jongsma 3 years, 8 months ago
I got some feedback from John Ferlan on a different forum about missing
handling of migration and qemu capabilities. I'm adding this to my
patch, but I'd appreciate any additional feedback, particularly on the
xml format and the managed=yes/node-device questions below.


On Mon, 2020-08-24 at 16:33 -0500, Jonathon Jongsma wrote:
> On Thu, 2020-08-20 at 18:56 -0400, Laine Stump wrote:
> > On 8/18/20 2:37 PM, Jonathon Jongsma wrote:
> > > vDPA network devices allow high-performance networking in a
> > > virtual
> > > machine by providing a wire-speed data path. These devices
> > > require
> > > a
> > > vendor-specific host driver but the data path follows the virtio
> > > specification.
> > > 
> > > The support for vDPA devices was recently added to qemu. This
> > > allows
> > > libvirt to support these devices. It requires that the device is
> > > configured on the host with the appropriate vendor-specific
> > > driver.
> > > This will create a chardev on the host at e.g. /dev/vhost-vdpa-0.
> > > That
> > > chardev path can then be used to define a new interface with
> > > type='vdpa'.
> > > ---
> 
> [snip]
> 
> 
> > 
> > > +
> > > +::
> > > +
> > > +   ...
> > > +   <devices>
> > > +     <interface type='vdpa'>
> > > +       <source dev='/dev/vhost-vdpa-0'/>
> > 
> > (The above device is created (I just learned this from you in IRC!)
> > by 
> > unbinding a VF from its NIC driver on the host, and re-binding it
> > to
> > a 
> > special VDPA-VF driver.)
> > 
> > 
> > As we were just discussing online, on one hand it could be nice if 
> > libvirt could automatically handle rebinding the VF to the vdpa
> > host 
> > driver (given the PCI address of the VF), to make it easier to use 
> > (because no advance setup would be needed), similar to what's
> > already 
> > done with hostdev devices (and <interface type='hostdev'>) when 
> > managed='yes' (which is the default setting).
> > 
> > 
> > On the other hand, it is exactly that managed='yes' functionality
> > that 
> > has created more "libvirt-but-not-really-libvirt" bug reports than
> > any 
> > other aspect of vfio device assignment, because the process of
> > unbinding 
> > and rebinding drivers is timing-sensitive and causes code that's
> > usually 
> > run only once at host boot-time to be run hundreds of times thus
> > making 
> > it more likely to expose infrequently-hit bugs.
> > 
> > 
> > I just bring this up in advance of someone suggesting the addition
> > of 
> > managed='yes' here to put in my vote for *not* doing it, and
> > instead 
> > using that same effort to provide some sort of API in the node-
> > device 
> > driver for easily creating one or more VDPA devices from VFs,
> > which 
> > could be done once at host boot time, and thus avoid the level of 
> > "libvirt-not-libvirt" bug reports for VDPA. (and after that maybe
> > even 
> > an API to allocate a device from that pool to be used by a guest).
> > But 
> > that's for later.
> 
> I'd really love to hear some additional opinions on this topic from
> some more of the libvirt "old-timers". I intentionally kept the
> initial
> vdpa support simple by requiring that the device is already set up and
> bound to the appropriate driver. But I want to make sure that we can
> add additional capabilities later (as Laine suggested) with this same
> API/schema.
> 
> Anybody else have thoughts on this topic?
> 
> 
> > 
> > > +     </interface>
> > > +   </devices>
> > > +   ...
> > > +
> > >   :anchor:`<a id="elementsTeaming"/>`
> > >   
> > >   Teaming a virtio/hostdev NIC pair
> > > diff --git a/docs/schemas/domaincommon.rng
> > > b/docs/schemas/domaincommon.rng
> > > index 0d0dcbc5ce..17f74490f4 100644
> > > --- a/docs/schemas/domaincommon.rng
> > > +++ b/docs/schemas/domaincommon.rng
> > > @@ -3108,6 +3108,21 @@
> > >               <ref name="interface-options"/>
> > >             </interleave>
> > >           </group>
> > > +
> > > +        <group>
> > > +          <attribute name="type">
> > > +            <value>vdpa</value>
> > > +          </attribute>
> > > +          <interleave>
> > > +            <element name="source">
> > > +              <attribute name="dev">
> > > +                <ref name="deviceName"/>
> > > +              </attribute>
> > > +            </element>
> > > +            <ref name="interface-options"/>
> > > +          </interleave>
> > > +        </group>
> > > +
> > >         </choice>
> > >         <optional>
> > >           <attribute name="trustGuestRxFilters">
> > > 
> 
> [snip]
> 
> > > diff --git a/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
> > > b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
> > > new file mode 100644
> > > index 0000000000..8e76ac7794
> > > --- /dev/null
> > > +++ b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args
> > > @@ -0,0 +1,37 @@
> > > +LC_ALL=C \
> > > +PATH=/bin \
> > > +HOME=/tmp/lib/domain--1-QEMUGuest1 \
> > > +USER=test \
> > > +LOGNAME=test \
> > > +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \
> > > +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \
> > > +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \
> > > +QEMU_AUDIO_DRV=none \
> > > +/usr/bin/qemu-system-i386 \
> > > +-name guest=QEMUGuest1,debug-threads=on \
> > > +-S \
> > > +-object secret,id=masterKey0,format=raw,\
> > > +file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \
> > > +-machine pc,accel=tcg,usb=off,dump-guest-core=off \
> > > +-cpu qemu64 \
> > > +-m 214 \
> > > +-overcommit mem-lock=off \
> > > +-smp 1,sockets=1,cores=1,threads=1 \
> > > +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
> > > +-display none \
> > > +-no-user-config \
> > > +-nodefaults \
> > > +-chardev socket,id=charmonitor,fd=1729,server,nowait \
> > > +-mon chardev=charmonitor,id=monitor,mode=control \
> > > +-rtc base=utc \
> > > +-no-shutdown \
> > > +-no-acpi \
> > > +-boot strict=on \
> > > +-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
> > > +-add-fd set=0,fd=1732 \
> > > +-netdev vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0 \
> > 
> > Okay, I'm feeling too lazy to parse through the code above and see
> > how 
> > you arrived at "vhostdev='/dev/fdset/0'", but that doesn't look
> > right. 
> > Shouldn't you be ending up with "-netdev vhost-vdpa,fd=NN,..."?
> > The 
> > document I have shows that syntax is supported, so there shouldn't
> > be 
> > any need to do the add-fd stuff in this case.
> 
> The initial proposal for vdpa in qemu supported both vhostdev= and
> fd=
> parameters, but the final implementation does not actually support an
> fd= parameter here. The recommended way to pass a pre-opened fd to
> qemu in this scenario is to use the 'add-fd' and /dev/fdset/ method
> as
> shown above.
> 
> As far as I understand from reading qemu code, /dev/fdset/N is not
> actually a file that exists in the filesystem. Instead, when you call
> 'add-fd', qemu adds that fd to an internal mapping that maps from the
> fdset specified by set=N to the passed fd. Whenever qemu tries to
> open
> a file, qemu_open() has special handling for filenames that start
> with
> "/dev/fdset/": it looks up the fd associated with that fdset id and
> returns that instead of attempting to open the file path. 
> 
> So I think the above should be correct.
> 
> > 
> > I think the next step should be to find some hardware and give this
> > a 
> > smoke test! :-)
> 
> Indeed, I'm working with Cindy Lu (who implemented the qemu vdpa
> support) to try to get this tested properly. Unfortunately I've been
> unable to get vdpa_sim working and I don't have access to hardware at
> the moment.
> 
> Thanks for the review!
> 
> Jonathon
>