[RFC v2 00/13] vDPA software assisted live migration

Eugenio Pérez posted 13 patches 3 years, 1 month ago
[RFC v2 00/13] vDPA software assisted live migration
Posted by Eugenio Pérez 3 years, 1 month ago
This series enables shadow virtqueue for vhost-net devices. This is a
new method of vhost device migration: instead of relying on the vhost
device's dirty logging capability, SW assisted LM intercepts the
dataplane, forwarding the descriptors between VM and device. It is
intended for vDPA devices with no logging capability, but this series
provides the basic platform to build that support on.

In this migration mode, qemu offers a new vring to the device to read
from and write into, and disables the vhost notifiers, processing guest
and vhost notifications in qemu. When relaying used buffers, qemu marks
the dirty memory just as a plain virtio-net device would. This way,
devices do not need to have dirty page logging capability.
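
As a rough sketch of that relay (the helper name below is hypothetical,
not the series' exact code), completing the buffer through qemu's
regular virtqueue API is what implicitly logs the dirty pages, so the
vhost device itself needs no logging support:

#include "qemu/osdep.h"
#include "hw/virtio/virtio.h"

/* Hypothetical sketch: relay one used buffer back to the guest's vring */
static void svq_relay_used_buffer(VirtIODevice *vdev, VirtQueue *vq,
                                  VirtQueueElement *elem, unsigned int len)
{
    /* Writes the used ring entry through the memory API, marking it dirty */
    virtqueue_push(vq, elem, len);
    /* Interrupt the guest as a plain emulated virtio device would */
    virtio_notify(vdev, vq);
    /* elem was allocated by virtqueue_pop(), so release it here */
    g_free(elem);
}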

This series is a POC doing SW LM for vhost-net devices, which already
have dirty page logging capabilities. For qemu to use shadow virtqueues
these vhost-net devices need to be instantiated:
* With IOMMU (iommu_platform=on,ats=on)
* Without event_idx (event_idx=off)

And the shadow virtqueue needs to be enabled for them with a QMP
command like:

{
  "execute": "x-vhost-enable-shadow-vq",
  "arguments": {
    "name": "virtio-net",
    "enable": false
  }
}

Notification forwarding alone (with no descriptor relay) can be
achieved with patches 5 and 6, plus starting SVQ. The previous commits
are cleanups and the declaration of the QMP command.
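
In notification-only mode, the guest->host path boils down to something
like the following sketch (the struct and function names are
hypothetical, not the ones used in patches 5 and 6): qemu handles the
guest kick and simply relays it to the device's kick eventfd.

#include "qemu/osdep.h"
#include "qemu/event_notifier.h"

typedef struct SVQNotify {
    EventNotifier host_notifier;    /* guest kicks land here */
    EventNotifier *device_kick_fd;  /* the real vhost kick fd */
} SVQNotify;

/* Called from qemu's event loop when the guest kicks the virtqueue */
static void svq_handle_guest_kick(EventNotifier *n)
{
    SVQNotify *svq = container_of(n, SVQNotify, host_notifier);

    if (!event_notifier_test_and_clear(n)) {
        return;    /* spurious wakeup */
    }
    /* No descriptor relay yet: just forward the kick to the device */
    event_notifier_set(svq->device_kick_fd);
}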

Commit 11 introduces the buffer forwarding. The previous ones are again
preparations, and the later ones enable some obvious optimizations.

It is based on the ideas of DPDK SW assisted LM, in DPDK's series
https://patchwork.dpdk.org/cover/48370/ . However, this series does not
map the shadow vq in the guest's VA, but in qemu's.
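
A minimal sketch of that idea (not the series' exact code): since the
shadow rings are allocated by qemu, the addresses handed to the vhost
device are plain qemu VAs of those rings.

#include "qemu/osdep.h"
#include <linux/vhost.h>

/* Hypothetical helper: expose the shadow vring's qemu VAs to the device */
static void svq_get_vring_addr(const void *desc, const void *avail,
                               const void *used,
                               struct vhost_vring_addr *addr)
{
    addr->desc_user_addr  = (uint64_t)(uintptr_t)desc;
    addr->avail_user_addr = (uint64_t)(uintptr_t)avail;
    addr->used_user_addr  = (uint64_t)(uintptr_t)used;
}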

Comments are welcome! Especially on:
* Different/improved ways of synchronization, particularly regarding
  the masking race.

TODO:
* Event idx, indirect descriptors, packed ring, and other virtio
  features - waiting for confirmation of the big picture.
* vDPA devices: developing solutions for tracking the available IOVA
  space for all devices. A small POC is available, skipping the get/set
  status (since vDPA does not support it) and just allocating more and
  more IOVA addresses in a hardcoded range available to the device.
* Separate buffer forwarding into its own AIO context, so we can throw
  more threads at that task and do not need to stop the main event
  loop.
* IOMMU optimizations, so batching and bigger chunks of IOVA can be
  sent to the device.
* Automatic kick-in on live migration.
* Proper documentation.

Thanks!

Changes from v1 RFC:
  * Use QMP instead of migration to start SVQ mode.
  * Only accepting IOMMU devices, for closer behavior to the target
    devices (vDPA)
  * Fix invalid masking/unmasking of vhost call fd.
  * Use of proper methods for synchronization.
  * No need to modify VirtIO device code, all of the changes are
    contained in vhost code.
  * Delete superfluous code.
  * An intermediate RFC was sent with only the notification forwarding
    changes. It can be seen in
    https://patchew.org/QEMU/20210129205415.876290-1-eperezma@redhat.com/
  * v1 at
    https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg05372.html

Eugenio Pérez (13):
  virtio: Add virtio_queue_is_host_notifier_enabled
  vhost: Save masked_notifier state
  vhost: Add VhostShadowVirtqueue
  vhost: Add x-vhost-enable-shadow-vq qmp
  vhost: Route guest->host notification through shadow virtqueue
  vhost: Route host->guest notification through shadow virtqueue
  vhost: Avoid re-set masked notifier in shadow vq
  virtio: Add vhost_shadow_vq_get_vring_addr
  virtio: Add virtio_queue_full
  vhost: add vhost_kernel_set_vring_enable
  vhost: Shadow virtqueue buffers forwarding
  vhost: Check for device VRING_USED_F_NO_NOTIFY at shadow virtqueue
    kick
  vhost: Use VRING_AVAIL_F_NO_INTERRUPT at device call on shadow
    virtqueue

 qapi/net.json                      |  22 ++
 hw/virtio/vhost-shadow-virtqueue.h |  36 ++
 include/hw/virtio/vhost.h          |   6 +
 include/hw/virtio/virtio.h         |   3 +
 hw/virtio/vhost-backend.c          |  29 ++
 hw/virtio/vhost-shadow-virtqueue.c | 551 +++++++++++++++++++++++++++++
 hw/virtio/vhost.c                  | 283 +++++++++++++++
 hw/virtio/virtio.c                 |  23 +-
 hw/virtio/meson.build              |   2 +-
 9 files changed, 952 insertions(+), 3 deletions(-)
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.h
 create mode 100644 hw/virtio/vhost-shadow-virtqueue.c

-- 
2.27.0



Re: [RFC v2 00/13] vDPA software assisted live migration
Posted by Jason Wang 3 years, 1 month ago
On 2021/3/16 3:48 AM, Eugenio Pérez wrote:
> This series enable shadow virtqueue for vhost-net devices. This is a
> new method of vhost devices migration: Instead of relay on vhost
> device's dirty logging capability, SW assisted LM intercepts dataplane,
> forwarding the descriptors between VM and device. Is intended for vDPA
> devices with no logging, but this address the basic platform to build
> that support on.
>
> In this migration mode, qemu offers a new vring to the device to
> read and write into, and disable vhost notifiers, processing guest and
> vhost notifications in qemu. On used buffer relay, qemu will mark the
> dirty memory as with plain virtio-net devices. This way, devices does
> not need to have dirty page logging capability.
>
> This series is a POC doing SW LM for vhost-net devices, which already
> have dirty page logging capabilities. For qemu to use shadow virtqueues
> these vhost-net devices need to be instantiated:
> * With IOMMU (iommu_platform=on,ats=on)
> * Without event_idx (event_idx=off)
>
> And shadow virtqueue needs to be enabled for them with QMP command
> like:
>
> {
>    "execute": "x-vhost-enable-shadow-vq",
>    "arguments": {
>      "name": "virtio-net",
>      "enable": false
>    }
> }
>
> Just the notification forwarding (with no descriptor relay) can be
> achieved with patches 5 and 6, and starting SVQ. Previous commits
> are cleanup ones and declaration of QMP command.
>
> Commit 11 introduces the buffer forwarding. Previous one are for
> preparations again, and laters are for enabling some obvious
> optimizations.
>
> It is based on the ideas of DPDK SW assisted LM, in the series of
> DPDK's https://patchwork.dpdk.org/cover/48370/ . However, these does
> not map the shadow vq in guest's VA, but in qemu's.
>
> Comments are welcome! Especially on:
> * Different/improved way of synchronization, particularly on the race
>    of masking.
>
> TODO:
> * Event, indirect, packed, and others features of virtio - Waiting for
>    confirmation of the big picture.


So two things in my mind after reviewing the series:

1) At which layer should we implement the shadow virtqueue? E.g. if you
want to do that at the virtio level, you need to deal with a lot of
synchronization. I prefer to do it in vhost-vDPA.
2) Using VA as IOVA, which cannot work for vhost-vDPA.


> * vDPA devices:


So I think we can start from a vhost-vDPA specific shadow virtqueue
first, then extend it to a general one, which might be much easier.


> Developing solutions for tracking the available IOVA
>    space for all devices.


For vhost-net, you can assume that [0, ULLONG_MAX] is valid so you can 
simply use VA as IOVA.


> Small POC available, skipping the get/set
>    status (since vDPA does not support it) and just allocating more and
>    more IOVA addresses in a hardcoded range available for the device.


I'm not sure this can work, but you need to make sure that the range
can fit the size of all the memory regions, and you need to deal with
memory region add and del.

I guess you probably need a fully functional tree-based IOVA allocator.

Thanks


> * To sepparate buffers forwarding in its own AIO context, so we can
>    throw more threads to that task and we don't need to stop the main
>    event loop.
> * IOMMU optimizations, so bacthing and bigger chunks of IOVA can be
>    sent to device.
> * Automatic kick-in on live-migration.
> * Proper documentation.
>
> Thanks!
>
> Changes from v1 RFC:
>    * Use QMP instead of migration to start SVQ mode.
>    * Only accepting IOMMU devices, closer behavior with target devices
>      (vDPA)
>    * Fix invalid masking/unmasking of vhost call fd.
>    * Use of proper methods for synchronization.
>    * No need to modify VirtIO device code, all of the changes are
>      contained in vhost code.
>    * Delete superfluous code.
>    * An intermediate RFC was sent with only the notifications forwarding
>      changes. It can be seen in
>      https://patchew.org/QEMU/20210129205415.876290-1-eperezma@redhat.com/
>    * v1 at
>      https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg05372.html
>
> Eugenio Pérez (13):
>    virtio: Add virtio_queue_is_host_notifier_enabled
>    vhost: Save masked_notifier state
>    vhost: Add VhostShadowVirtqueue
>    vhost: Add x-vhost-enable-shadow-vq qmp
>    vhost: Route guest->host notification through shadow virtqueue
>    vhost: Route host->guest notification through shadow virtqueue
>    vhost: Avoid re-set masked notifier in shadow vq
>    virtio: Add vhost_shadow_vq_get_vring_addr
>    virtio: Add virtio_queue_full
>    vhost: add vhost_kernel_set_vring_enable
>    vhost: Shadow virtqueue buffers forwarding
>    vhost: Check for device VRING_USED_F_NO_NOTIFY at shadow virtqueue
>      kick
>    vhost: Use VRING_AVAIL_F_NO_INTERRUPT at device call on shadow
>      virtqueue
>
>   qapi/net.json                      |  22 ++
>   hw/virtio/vhost-shadow-virtqueue.h |  36 ++
>   include/hw/virtio/vhost.h          |   6 +
>   include/hw/virtio/virtio.h         |   3 +
>   hw/virtio/vhost-backend.c          |  29 ++
>   hw/virtio/vhost-shadow-virtqueue.c | 551 +++++++++++++++++++++++++++++
>   hw/virtio/vhost.c                  | 283 +++++++++++++++
>   hw/virtio/virtio.c                 |  23 +-
>   hw/virtio/meson.build              |   2 +-
>   9 files changed, 952 insertions(+), 3 deletions(-)
>   create mode 100644 hw/virtio/vhost-shadow-virtqueue.h
>   create mode 100644 hw/virtio/vhost-shadow-virtqueue.c
>


Re: [RFC v2 00/13] vDPA software assisted live migration
Posted by Eugenio Perez Martin 3 years, 1 month ago
On Tue, Mar 16, 2021 at 9:28 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2021/3/16 3:48 AM, Eugenio Pérez wrote:
> > This series enable shadow virtqueue for vhost-net devices. This is a
> > new method of vhost devices migration: Instead of relay on vhost
> > device's dirty logging capability, SW assisted LM intercepts dataplane,
> > forwarding the descriptors between VM and device. Is intended for vDPA
> > devices with no logging, but this address the basic platform to build
> > that support on.
> >
> > In this migration mode, qemu offers a new vring to the device to
> > read and write into, and disable vhost notifiers, processing guest and
> > vhost notifications in qemu. On used buffer relay, qemu will mark the
> > dirty memory as with plain virtio-net devices. This way, devices does
> > not need to have dirty page logging capability.
> >
> > This series is a POC doing SW LM for vhost-net devices, which already
> > have dirty page logging capabilities. For qemu to use shadow virtqueues
> > these vhost-net devices need to be instantiated:
> > * With IOMMU (iommu_platform=on,ats=on)
> > * Without event_idx (event_idx=off)
> >
> > And shadow virtqueue needs to be enabled for them with QMP command
> > like:
> >
> > {
> >    "execute": "x-vhost-enable-shadow-vq",
> >    "arguments": {
> >      "name": "virtio-net",
> >      "enable": false
> >    }
> > }
> >
> > Just the notification forwarding (with no descriptor relay) can be
> > achieved with patches 5 and 6, and starting SVQ. Previous commits
> > are cleanup ones and declaration of QMP command.
> >
> > Commit 11 introduces the buffer forwarding. Previous one are for
> > preparations again, and laters are for enabling some obvious
> > optimizations.
> >
> > It is based on the ideas of DPDK SW assisted LM, in the series of
> > DPDK's https://patchwork.dpdk.org/cover/48370/ . However, these does
> > not map the shadow vq in guest's VA, but in qemu's.
> >
> > Comments are welcome! Especially on:
> > * Different/improved way of synchronization, particularly on the race
> >    of masking.
> >
> > TODO:
> > * Event, indirect, packed, and others features of virtio - Waiting for
> >    confirmation of the big picture.
>
>
> So two things in my mind after reviewing the seires:
>
> 1) which layer should we implement the shadow virtqueue. E.g if you want
> to do that at virtio level, you need to deal with a lot of
> synchronizations. I prefer to do it in vhost-vDPA.

I'm not sure how to do that and avoid the synchronization. Could you
expand on that point?

> 2) Using VA as IOVA which can not work for vhost-vDPA
>
>
> > * vDPA devices:
>
>
> So I think we can start from a vhost-vDPA specific shadow virtqueue
> first, then extending it to be a general one which might be much easier.
>
>
> > Developing solutions for tracking the available IOVA
> >    space for all devices.
>
>
> For vhost-net, you can assume that [0, ULLONG_MAX] is valid so you can
> simply use VA as IOVA.
>

In a future revision it will be that way, unless the vDPA device
reports limits on the range of addresses it can translate.

>
> > Small POC available, skipping the get/set
> >    status (since vDPA does not support it) and just allocating more and
> >    more IOVA addresses in a hardcoded range available for the device.
>
>
> I'm not sure this can work but you need make sure that range can fit the
> size of the all memory regions and need to deal with memory region add
> and del.
>
> I guess you probably need a full functional tree based IOVA allocator.
>

The vDPA POC I'm testing with does not free the used memory regions at all.

For future development I'm reusing qemu's iova-tree. I'm not sure if I
will keep it until the end of development, but I'm open to better
suggestions, of course.
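
For reference, a minimal sketch of how qemu's iova-tree
(include/qemu/iova-tree.h) could be used here; the bump-pointer
allocation policy and the variable/function names are only an
illustration, not what this POC actually does:

#include "qemu/osdep.h"
#include "qemu/iova-tree.h"

static IOVATree *svq_iova_map;
static hwaddr svq_next_iova;   /* hypothetical start of the usable range */

/* Hypothetical helper: map a qemu VA range to a freshly allocated IOVA */
static int svq_map_region(hwaddr translated_addr, hwaddr len)
{
    DMAMap map = {
        .iova = svq_next_iova,
        .translated_addr = translated_addr,
        .size = len - 1,       /* DMAMap.size is inclusive */
        .perm = IOMMU_RW,
    };

    if (iova_tree_insert(svq_iova_map, &map) != IOVA_OK) {
        return -1;
    }
    svq_next_iova += len;
    return 0;
}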


> Thanks
>
>
> > * To sepparate buffers forwarding in its own AIO context, so we can
> >    throw more threads to that task and we don't need to stop the main
> >    event loop.
> > * IOMMU optimizations, so bacthing and bigger chunks of IOVA can be
> >    sent to device.
> > * Automatic kick-in on live-migration.
> > * Proper documentation.
> >
> > Thanks!
> >
> > Changes from v1 RFC:
> >    * Use QMP instead of migration to start SVQ mode.
> >    * Only accepting IOMMU devices, closer behavior with target devices
> >      (vDPA)
> >    * Fix invalid masking/unmasking of vhost call fd.
> >    * Use of proper methods for synchronization.
> >    * No need to modify VirtIO device code, all of the changes are
> >      contained in vhost code.
> >    * Delete superfluous code.
> >    * An intermediate RFC was sent with only the notifications forwarding
> >      changes. It can be seen in
> >      https://patchew.org/QEMU/20210129205415.876290-1-eperezma@redhat.com/
> >    * v1 at
> >      https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg05372.html
> >
> > Eugenio Pérez (13):
> >    virtio: Add virtio_queue_is_host_notifier_enabled
> >    vhost: Save masked_notifier state
> >    vhost: Add VhostShadowVirtqueue
> >    vhost: Add x-vhost-enable-shadow-vq qmp
> >    vhost: Route guest->host notification through shadow virtqueue
> >    vhost: Route host->guest notification through shadow virtqueue
> >    vhost: Avoid re-set masked notifier in shadow vq
> >    virtio: Add vhost_shadow_vq_get_vring_addr
> >    virtio: Add virtio_queue_full
> >    vhost: add vhost_kernel_set_vring_enable
> >    vhost: Shadow virtqueue buffers forwarding
> >    vhost: Check for device VRING_USED_F_NO_NOTIFY at shadow virtqueue
> >      kick
> >    vhost: Use VRING_AVAIL_F_NO_INTERRUPT at device call on shadow
> >      virtqueue
> >
> >   qapi/net.json                      |  22 ++
> >   hw/virtio/vhost-shadow-virtqueue.h |  36 ++
> >   include/hw/virtio/vhost.h          |   6 +
> >   include/hw/virtio/virtio.h         |   3 +
> >   hw/virtio/vhost-backend.c          |  29 ++
> >   hw/virtio/vhost-shadow-virtqueue.c | 551 +++++++++++++++++++++++++++++
> >   hw/virtio/vhost.c                  | 283 +++++++++++++++
> >   hw/virtio/virtio.c                 |  23 +-
> >   hw/virtio/meson.build              |   2 +-
> >   9 files changed, 952 insertions(+), 3 deletions(-)
> >   create mode 100644 hw/virtio/vhost-shadow-virtqueue.h
> >   create mode 100644 hw/virtio/vhost-shadow-virtqueue.c
> >
>