[PATCH v5 00/14] Dynamically switch to vhost shadow virtqueues at vdpa net migration

Eugenio Pérez posted 14 patches 1 year, 1 month ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20230303172445.1089785-1-eperezma@redhat.com
Maintainers: "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>

It's possible to migrate vdpa net devices if they are shadowed from the
start.  But always shadowing the dataplane effectively breaks host
passthrough, so it's not efficient in vDPA scenarios.

This series enables dynamically switching to shadow mode only at
migration time.  This allows full passthrough of the data virtqueues
whenever qemu is not migrating.
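
The switch is driven by a migration state notifier registered by the net
backend.  A minimal sketch of the idea; vdpa_net_start_svq() and
vdpa_net_start_passthrough() are hypothetical stand-ins for the actual
restart logic in net/vhost-vdpa.c:

    #include "migration/misc.h"

    static void vdpa_net_migration_state_notifier(Notifier *notifier,
                                                  void *data)
    {
        MigrationState *migration = data;
        VhostVDPAState *s = container_of(notifier, VhostVDPAState,
                                         migration_state);

        if (migration_in_setup(migration)) {
            /* Migration starts: move the datapath to SVQ so writes go
             * through qemu and dirty memory can be logged. */
            vdpa_net_start_svq(s);
        } else if (migration_has_failed(migration)) {
            /* Migration failed: go back to full passthrough. */
            vdpa_net_start_passthrough(s);
        }
    }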

In this series only net devices with no CVQ are migratable.  CVQ adds
additional state that would make the series bigger and was still somewhat
controversial in the previous RFC, so let's split it out.
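
For devices that do expose a CVQ, migration is simply blocked for now.
Roughly, using qemu's migrate_add_blocker() API (the has_cvq flag and the
migration_blocker field are illustrative names):

    #include "qapi/error.h"
    #include "migration/blocker.h"

    if (has_cvq) {
        error_setg(&s->migration_blocker,
                   "vdpa net with control virtqueue is not migratable yet");
        /* Makes migration attempts fail cleanly with the message above. */
        migrate_add_blocker(s->migration_blocker, errp);
    }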

Successfully tested with vdpa_sim_net with patch [1] applied, and with the
qemu-emulated device through vp_vdpa, with some restrictions:
* No CVQ, and no feature that didn't work with SVQ previously (packed, ...).
* The VIRTIO_RING_F_STATE patches implementing [2] applied.

Previous versions were tested by many vendors.  Tested-by tags are not
carried over because of code changes, so re-testing would be appreciated.

A ready-to-clone tag named vdpa.d/stop-nocvq-v5 with this version of the
series is available at https://gitlab.com/eperezmartin/qemu-kvm.git, with
commit 863d54ff00c558ffe54ed2c7ee148ab7f89d4864 ("vdpa: return
VHOST_F_LOG_ALL in vhost-vdpa devices") at its head.

Comments are welcome.

v5:
- Reverse SUSPEND polarity check, as qemu was never suspending devices with
  suspend capability (see the sketch after this list).
- Reorder the suspend patch so it comes before the patch that moves the
  device reset after get_vring_base.
- Remove the patch to stop SVQ at vdpa stop; it is already present in
  staging.
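
The polarity fix in the first item boils down to suspending only when the
backend actually advertises the capability.  A sketch, using the
vhost_vdpa_suspend() helper added in this series:

    /* Suspend only if VHOST_BACKEND_F_SUSPEND was negotiated. */
    if (dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) {
        vhost_vdpa_suspend(dev);
    } else {
        /* No suspend support: the vring state is recovered from the
         * guest's vring after the device reset (see v4 notes below). */
    }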

v4:
- Recover used_idx from the guest's vring if the device cannot suspend, as
  sketched after this list.
- Fix starting device in the middle of a migration.  Removed some
  duplication in setting / clearing enable_shadow_vqs and shadow_data
  members of vhost_vdpa.
- Fix (again) "Check for SUSPEND in vhost_dev.backend_cap, as
  .backend_features is empty at the check moment.". It was reverted by
  mistake in v3.
- Fix memory leak of iova tree.
- Properly rewind SVQ, as in-flight descriptors were still being accounted
  in the vq base.
- Expand documentation.
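
For a device that cannot suspend, the used_idx recovery above reads the
index back from the guest's vring instead of the (already reset) device.
A sketch of that path (vhost_vdpa_call() is the existing ioctl wrapper in
hw/virtio/vhost-vdpa.c):

    static int vhost_vdpa_get_vring_base(struct vhost_dev *dev,
                                         struct vhost_vring_state *ring)
    {
        struct vhost_vdpa *v = dev->opaque;

        if (!v->suspended) {
            /*
             * The kernel resets a device that was not suspended on
             * VHOST_GET_VRING_BASE, so the returned index would be 0.
             * Recover the last avail idx from the guest's vring instead.
             */
            ring->num = virtio_queue_get_last_avail_idx(dev->vdev,
                                                        ring->index);
            return 0;
        }

        return vhost_vdpa_call(dev, VHOST_GET_VRING_BASE, ring);
    }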

v3:
- Start the datapath in SVQ if the device is started while migrating.
- Properly register migration blockers if the device presents unsupported
  features.
- Fix race condition because of not stopping the SVQ until device cleanup.
- Explain purpose of iova tree in the first patch message.
- s/dynamycally/dynamically/ in cover letter.
- at lore.kernel.org/qemu-devel/20230215173850.298832-14-eperezma@redhat.com

v2:
- Check for SUSPEND in vhost_dev.backend_cap, as .backend_features is empty at
  the check moment.
- at https://lore.kernel.org/all/20230208094253.702672-12-eperezma@redhat.com/T/

v1:
- Omit all code working with CVQ and block migration if the device supports
  CVQ.
- Remove spurious kick.
- Move all possible checks for migration to vhost-vdpa instead of the net
  backend. Move them to init code from start code.
- Suspend on vhost_vdpa_dev_start(false) instead of in vhost-vdpa net backend.
- Properly split "suspend after getting base" and the addition of the
  status_reset patches.
- Add possible TODOs to points where this series can improve in the future.
- Check the state of migration using migration_in_setup and
  migration_has_failed instead of checking all the possible migration
  statuses in a switch.
- Add TODO with possible low-hanging fruit using RESUME ops.
- Always offer _F_LOG from virtio/vhost-vdpa and let migration blockers do
  their thing instead of adding a variable (sketched after this list).
- RFC v2 at https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02574.html
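
The _F_LOG change above amounts to advertising the feature unconditionally
from the SVQ-capable backend.  A sketch, with vhost_vdpa_get_dev_features()
standing in for the raw VHOST_GET_FEATURES ioctl:

    static int vhost_vdpa_get_features(struct vhost_dev *dev,
                                       uint64_t *features)
    {
        int ret = vhost_vdpa_get_dev_features(dev, features);

        if (ret == 0) {
            /* SVQ forwards all writes through qemu, so it can emulate
             * dirty memory logging even if the device cannot. */
            *features |= BIT_ULL(VHOST_F_LOG_ALL);
        }
        return ret;
    }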

RFC v2:
- Use a migration listener instead of a memory listener to know when
  the migration starts.
- Add stuff not picked up with the ASID patches, like enabling rings after
  driver_ok.
- Add rewinding on the migration src, not the dst.
- RFC v1 at https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01664.html

[1] https://lore.kernel.org/lkml/20230203142501.300125-1-eperezma@redhat.com/T/
[2] https://lists.oasis-open.org/archives/virtio-comment/202103/msg00036.html

Eugenio Pérez (14):
  vdpa net: move iova tree creation from init to start
  vdpa: Remember last call fd set
  vdpa: Negotiate _F_SUSPEND feature
  vdpa: rewind at get_base, not set_base
  vdpa: add vhost_vdpa->suspended parameter
  vdpa: add vhost_vdpa_suspend
  vdpa: move vhost reset after get vring base
  vdpa: add vdpa net migration state notifier
  vdpa: disable RAM block discard only for the first device
  vdpa net: block migration if the device has CVQ
  vdpa: block migration if device has unsupported features
  vdpa: block migration if SVQ does not admit a feature
  vdpa net: allow VHOST_F_LOG_ALL
  vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices

 include/hw/virtio/vhost-backend.h  |   4 +
 include/hw/virtio/vhost-vdpa.h     |   3 +
 hw/virtio/vhost-shadow-virtqueue.c |   8 +-
 hw/virtio/vhost-vdpa.c             | 126 ++++++++++++------
 hw/virtio/vhost.c                  |   3 +
 net/vhost-vdpa.c                   | 198 ++++++++++++++++++++++++-----
 hw/virtio/trace-events             |   1 +
 7 files changed, 272 insertions(+), 71 deletions(-)

-- 
2.31.1

Re: [PATCH v5 00/14] Dynamically switch to vhost shadow virtqueues at vdpa net migration
Posted by Lei Yang 1 year, 1 month ago
QE tested v5 of this series again: created two vdpa_sim devices and booted
two VMs without shadow virtqueues.  The migration was successful and
everything worked fine.

Tested-by: Lei Yang <leiyang@redhat.com>
