[PATCH v4 00/14] vDPA shadow virtqueue

This series enables shadow virtqueue (SVQ) for vhost-vdpa devices. It
is intended as a new method of tracking the memory the devices touch
during a migration process: instead of relying on the vhost device's
dirty logging capability, SVQ intercepts the VQ dataplane, forwarding
the descriptors between VM and device. This way qemu is the effective
writer of the guest's memory, just as in qemu's emulated virtio device
operation.
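As a rough illustration of why the relay makes qemu the writer of
guest memory, here is a minimal standalone model of a used-ring relay.
It is only a sketch: none of these names exist in the series, and the
real rings use the virtio split-vring layout, not plain arrays.

    /* Toy model: the device completes buffers in a shadow ring that
     * lives in qemu's memory; qemu then copies the used entries into
     * the guest's ring, so the stores into guest memory are qemu's. */
    #include <stdint.h>
    #include <stdio.h>

    #define QSZ 4

    struct used_elem { uint32_t id; uint32_t len; };

    struct vring_model {
        struct used_elem used[QSZ];
        uint16_t used_idx;
    };

    static struct vring_model guest_vring;  /* in guest memory */
    static struct vring_model shadow_vring; /* in qemu memory */

    /* The device only ever completes buffers in the shadow ring. */
    static void device_mark_used(uint32_t id, uint32_t len)
    {
        shadow_vring.used[shadow_vring.used_idx % QSZ] =
            (struct used_elem){ .id = id, .len = len };
        shadow_vring.used_idx++;
    }

    /* SVQ relays the completion: qemu performs the store into guest
     * memory, so tracking qemu's writes is enough for migration. */
    static void svq_flush_used(void)
    {
        while (guest_vring.used_idx != shadow_vring.used_idx) {
            uint16_t i = guest_vring.used_idx % QSZ;
            guest_vring.used[i] = shadow_vring.used[i];
            guest_vring.used_idx++;
        }
    }

    int main(void)
    {
        device_mark_used(0, 1500);
        svq_flush_used();
        printf("guest sees id=%u len=%u (used_idx=%u)\n",
               guest_vring.used[0].id, guest_vring.used[0].len,
               guest_vring.used_idx);
        return 0;
    }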

When SVQ is enabled, qemu offers a new virtual address space to the
device to read from and write into, and it maps the new vrings and the
guest memory in that space. SVQ also intercepts the kicks and calls
between the device and the guest. Relaying used buffers causes dirty
memory to be tracked, but in this version SVQ is not yet enabled
automatically on migration.
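The kick/call interception can be pictured with eventfds. Below is a
tiny Linux-only sketch of relaying a guest kick to the device; the fd
names are illustrative and do not match the series' code.

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    int main(void)
    {
        int guest_kick = eventfd(0, 0);   /* guest -> SVQ */
        int device_kick = eventfd(0, 0);  /* SVQ -> device */
        uint64_t n = 1;

        /* The guest driver kicks; SVQ owns the fd the guest writes. */
        if (write(guest_kick, &n, sizeof(n)) != sizeof(n)) {
            return 1;
        }

        /* SVQ wakes up: here it would forward the pending avail
         * descriptors, then notify the real device. */
        if (read(guest_kick, &n, sizeof(n)) != sizeof(n) ||
            write(device_kick, &n, sizeof(n)) != sizeof(n) ||
            read(device_kick, &n, sizeof(n)) != sizeof(n)) {
            return 1;
        }
        printf("device received %" PRIu64 " kick(s)\n", n);

        close(guest_kick);
        close(device_kick);
        return 0;
    }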

Since SVQ is a buffer relay system, it can also be used to connect
devices and drivers with different capabilities, like a device that
only supports the packed vring layout and not the split one with an
old guest whose driver has no packed support.

It is based on the ideas of DPDK's SW-assisted live migration, in the
DPDK series at https://patchwork.dpdk.org/cover/48370/ . However, this
series does not map the shadow vq in the guest's VA, but in qemu's.

For qemu to use shadow virtqueues, the guest virtio driver must not
use features like event_idx.
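A minimal sketch of such a feature filter follows; the function name
is made up, but VIRTIO_RING_F_EVENT_IDX really is bit 29 in the virtio
specification.

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define VIRTIO_RING_F_EVENT_IDX 29  /* virtio spec bit number */

    /* Drop features that SVQ cannot relay yet (event_idx here). */
    static uint64_t svq_filter_features(uint64_t dev_features)
    {
        return dev_features & ~(1ULL << VIRTIO_RING_F_EVENT_IDX);
    }

    int main(void)
    {
        uint64_t f = (1ULL << VIRTIO_RING_F_EVENT_IDX) | 0x1;
        printf("0x%" PRIx64 " -> 0x%" PRIx64 "\n",
               f, svq_filter_features(f));
        return 0;
    }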

SVQ needs to be enabled with the QMP command:

{ "execute": "x-vhost-set-shadow-vq",
      "arguments": { "name": "vhost-vdpa0", "enable": true } }

This series includes some patches, to be deleted in the final
version, that help with its testing. The first two of the series have
been sent separately, but they have not been included in the qemu main
branch yet.

The two after them add the ability to stop the device and to set and
get its status. They are intended to be used with the vp_vdpa driver
in a nested environment, so they are also external to this series. The
vp_vdpa driver also needs modifications to forward the new status bit;
those will be proposed separately.

The next group of patches prepares the SVQ and the QMP command to
support guest-to-host notification forwarding. If SVQ is enabled with
these applied and the device supports it, that part can be tested in
isolation (for example, with networking), hopping through SVQ.

The same is true of the following group, but for device-to-guest
notifications.

Based on them, the next patches implement the actual buffer
forwarding, using some features already introduced previously.
However, they need a host device with no iommu, something that is not
available at the moment.

The last part of the series makes proper use of the host iommu, so
the driver can access the new virtual address space created for it.
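
To picture that address space, here is a toy, self-contained model of
allocating IOVA ranges and translating qemu VAs into them. The series
instead extends qemu's iova-tree (patch "util: Add iova_tree_alloc_map");
everything below is made up for illustration.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define MAX_MAPS 16

    struct map { uint64_t iova, size; void *hva; };

    static struct map maps[MAX_MAPS]; /* kept sorted by iova */
    static size_t nmaps;

    /* First-fit: find a free [iova, iova + size) hole and record the
     * hva that backs it. No overflow/full checks in this toy. */
    static uint64_t iova_alloc_map(void *hva, uint64_t size)
    {
        uint64_t iova = 0;
        size_t pos = nmaps;

        for (size_t i = 0; i < nmaps; i++) {
            if (iova + size <= maps[i].iova) {
                pos = i; /* the hole before maps[i] fits */
                break;
            }
            iova = maps[i].iova + maps[i].size;
        }
        memmove(&maps[pos + 1], &maps[pos],
                (nmaps - pos) * sizeof(maps[0]));
        maps[pos] = (struct map){ iova, size, hva };
        nmaps++;
        return iova;
    }

    /* Translate a qemu VA to the device-visible IOVA. */
    static uint64_t hva_to_iova(void *hva)
    {
        for (size_t i = 0; i < nmaps; i++) {
            char *base = maps[i].hva;
            if ((char *)hva >= base && (char *)hva < base + maps[i].size) {
                return maps[i].iova + (uint64_t)((char *)hva - base);
            }
        }
        return UINT64_MAX; /* not mapped */
    }

    int main(void)
    {
        static char guest_mem[0x4000], shadow_vring[0x1000];
        uint64_t g = iova_alloc_map(guest_mem, sizeof(guest_mem));
        uint64_t s = iova_alloc_map(shadow_vring, sizeof(shadow_vring));

        printf("guest mem iova=0x%llx shadow vring iova=0x%llx\n",
               (unsigned long long)g, (unsigned long long)s);
        printf("hva %p -> iova 0x%llx\n",
               (void *)(guest_mem + 0x123),
               (unsigned long long)hva_to_iova(guest_mem + 0x123));
        return 0;
    }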

Comments are welcome.

TODO:
* Event, indirect, packed, and other features of virtio.
* To support a different set of features between the device<->SVQ and
  the SVQ<->guest communication.
* Support of device host notifier memory regions.
* To separate buffer forwarding into its own AIO context, so we can
  throw more threads at that task and do not need to stop the main
  event loop.
* Support multiqueue virtio-net vdpa.
* Proper documentation.

Changes from v3:
* Add @unstable feature to NetdevVhostVDPAOptions.x-svq.
* Fix incomplete mapping (by 1 byte) of memory regions if SVQ is enabled.
v3 link:
https://lore.kernel.org/qemu-devel/20220302203012.3476835-1-eperezma@redhat.com/

Changes from v2:
* Fewer assertions and more error handling in the iova tree code.
* Better documentation, both fixing errors and using the @param: format.
* Homogenize SVQ avail_idx_shadow and shadow_used_idx so that shadow is
  a prefix in both names.
* Fix: Do not use the VirtQueueElement->len field; track it separately.
* Split vhost_svq_{enable,disable}_notification, so the code looks more
  like the kernel driver code.
* Small improvements.
v2 link:
https://lore.kernel.org/all/CAJaqyWfXHE0C54R_-OiwJzjC0gPpkE3eX0L8BeeZXGm1ERYPtA@mail.gmail.com/

Changes from v1:
* Feature set at device->SVQ is now the same as SVQ->guest.
* Size of SVQ is no longer the maximum available device size, but the
  guest's negotiated size.
* Add VHOST_FILE_UNBIND kick and call fd treatment.
* Make SVQ a public struct
* Come back to previous approach to iova-tree
* Some assertions are now fail paths. Some errors are now log_guest.
* Only mask _F_LOG feature at vdpa_set_features svq enable path.
* Refactor some errors and messages. Add missing error unwindings.
* Add memory barrier at _F_NO_NOTIFY set.
* Stop checking for feature flags out of the transport range.
v1 link:
https://lore.kernel.org/virtualization/7d86c715-6d71-8a27-91f5-8d47b71e3201@redhat.com/

Changes from v4 RFC:
* Support for allocating / freeing iova ranges in the IOVA tree,
  extending the already present iova-tree for that.
* Proper validation of guest features. Now SVQ can negotiate a
  different set of features with the device when enabled.
* Support for host notifier memory regions.
* Handling of a full SVQ in case the guest's descriptors span across
  different memory regions (qemu's VA chunks).
* Flush pending used buffers at end of SVQ operation.
* The QMP command now looks up devices by NetClientState name. Other
  devices will need to implement their own way to enable vdpa.
* Rename the QMP command to "set", so it reads more like a mode of
  operation.
* Better use of qemu error system
* Make a few assertions proper error-handling paths.
* Add more documentation
* Less coupling of virtio / vhost, since that coupling could cause
  friction on changes.
* Addressed many other small comments and small fixes.

Changes from v3 RFC:
  * Move everything to vhost-vdpa backend. A big change, this allowed
    some cleanup but more code has been added in other places.
  * More use of glib utilities, especially to manage memory.
v3 link:
https://lists.nongnu.org/archive/html/qemu-devel/2021-05/msg06032.html

Changes from v2 RFC:
  * Adding vhost-vdpa devices support
  * Fixed some memory leaks pointed out in different comments
v2 link:
https://lists.nongnu.org/archive/html/qemu-devel/2021-03/msg05600.html

Changes from v1 RFC:
  * Use QMP instead of migration to start SVQ mode.
  * Only accepting IOMMU devices, for behavior closer to the target
    devices (vDPA)
  * Fix invalid masking/unmasking of vhost call fd.
  * Use of proper methods for synchronization.
  * No need to modify VirtIO device code, all of the changes are
    contained in vhost code.
  * Delete superfluous code.
  * An intermediate RFC was sent with only the notifications forwarding
    changes. It can be seen in
    https://patchew.org/QEMU/20210129205415.876290-1-eperezma@redhat.com/
v1 link:
https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg05372.html

Eugenio Pérez (14):
  vhost: Add VhostShadowVirtqueue
  vhost: Add Shadow VirtQueue kick forwarding capabilities
  vhost: Add Shadow VirtQueue call forwarding capabilities
  vhost: Add vhost_svq_valid_features to shadow vq
  virtio: Add vhost_svq_get_vring_addr
  vdpa: adapt vhost_ops callbacks to svq
  vhost: Shadow virtqueue buffers forwarding
  util: Add iova_tree_alloc_map
  vhost: Add VhostIOVATree
  vdpa: Add custom IOTLB translations to SVQ
  vdpa: Adapt vhost_vdpa_get_vring_base to SVQ
  vdpa: Never set log_base addr if SVQ is enabled
  vdpa: Expose VHOST_F_LOG_ALL on SVQ
  vdpa: Add x-svq to NetdevVhostVDPAOptions

 qapi/net.json                      |   8 +-
 hw/virtio/vhost-iova-tree.h        |  27 ++
 hw/virtio/vhost-shadow-virtqueue.h |  87 ++++
 include/hw/virtio/vhost-vdpa.h     |   8 +
 include/qemu/iova-tree.h           |  18 +
 hw/virtio/vhost-iova-tree.c        | 155 +++++++
 hw/virtio/vhost-shadow-virtqueue.c | 637 +++++++++++++++++++++++++++++
 hw/virtio/vhost-vdpa.c             | 525 +++++++++++++++++++++++-
 net/vhost-vdpa.c                   |  48 ++-
 util/iova-tree.c                   | 135 ++++++
 hw/virtio/meson.build              |   2 +-
 11 files changed, 1625 insertions(+), 25 deletions(-)
 create mode 100644 hw/virtio/vhost-iova-tree.h
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.h
 create mode 100644 hw/virtio/vhost-iova-tree.c
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.c

--
2.27.0