[RFC 0/7] migration patches for VFIO

Juan Quintela posted 7 patches 1 year, 6 months ago
Maintainers: Juan Quintela <quintela@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Alex Williamson <alex.williamson@redhat.com>, Cornelia Huck <cohuck@redhat.com>, Thomas Huth <thuth@redhat.com>, Halil Pasic <pasic@linux.ibm.com>, Christian Borntraeger <borntraeger@linux.ibm.com>, Eric Farman <farman@linux.ibm.com>, Richard Henderson <richard.henderson@linaro.org>, David Hildenbrand <david@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>, Eric Blake <eblake@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>, John Snow <jsnow@redhat.com>, Laurent Vivier <lvivier@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
There is a newer version of this series
Posted by Juan Quintela 1 year, 6 months ago
Hi

VFIO migration has several requirements:
- the size of the state is only known when the guest is stopped
- the devices may need to send a lot of data.

This series only addresses the first of these problems.

What the patches do:
- The res_compatible parameter was not used anywhere, so remove it and add that information to res_postcopy.
- Remove the QEMUFile parameter from save_live_pending().
- Split save_live_pending() into:
  * state_pending_estimate(): the pending state size, computed without trying too hard
  * state_pending_exact(): the real pending state size; it is called with the guest stopped.
- The state_pending_*() methods no longer need the threshold parameter.
- HACK: add a way to stop the guest before calling qemu_savevm_state_pending_exact().
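
To make the estimate/exact split concrete, here is a minimal standalone sketch of the idea. It is not the code from the patches: PendingOps, Device, pending_total(), should_complete() and the demo device are hypothetical names invented for illustration, and the real handler table and callers in QEMU look different. It only shows the shape of the flow: ask for the cheap estimate first, and ask for the exact value only when the estimate already fits under the threshold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified handler table (not QEMU's real SaveVMHandlers).
 * Note that neither callback takes a QEMUFile or a threshold argument. */
typedef struct PendingOps {
    /* Cheap guess at the pending size; callable while the guest runs. */
    void (*state_pending_estimate)(void *opaque, uint64_t *res_precopy,
                                   uint64_t *res_postcopy);
    /* Real pending size; in the series it is called with the guest stopped. */
    void (*state_pending_exact)(void *opaque, uint64_t *res_precopy,
                                uint64_t *res_postcopy);
} PendingOps;

typedef struct Device {
    const PendingOps *ops;
    void *opaque;
} Device;

/* Sum the pending bytes over all devices, using either callback. */
static uint64_t pending_total(const Device *devs, size_t n, bool exact)
{
    uint64_t pre = 0, post = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const PendingOps *ops = devs[i].ops;

        if (exact && ops->state_pending_exact) {
            ops->state_pending_exact(devs[i].opaque, &pre, &post);
        } else if (!exact && ops->state_pending_estimate) {
            ops->state_pending_estimate(devs[i].opaque, &pre, &post);
        }
    }
    return pre + post;
}

/* The threshold check lives in the caller: skip the expensive exact
 * query while the estimate is still above the threshold. */
static bool should_complete(const Device *devs, size_t n, uint64_t threshold)
{
    if (pending_total(devs, n, false) > threshold) {
        return false;
    }
    return pending_total(devs, n, true) <= threshold;
}

/* Demo device (illustration only): fixed sizes, counts exact queries. */
typedef struct DemoState {
    uint64_t estimate;
    uint64_t exact;
    unsigned exact_calls;
} DemoState;

static void demo_estimate(void *opaque, uint64_t *res_precopy,
                          uint64_t *res_postcopy)
{
    DemoState *s = opaque;

    (void)res_postcopy;
    *res_precopy += s->estimate;
}

static void demo_exact(void *opaque, uint64_t *res_precopy,
                       uint64_t *res_postcopy)
{
    DemoState *s = opaque;

    (void)res_postcopy;
    s->exact_calls++;
    *res_precopy += s->exact;
}

static const PendingOps demo_ops = {
    .state_pending_estimate = demo_estimate,
    .state_pending_exact = demo_exact,
};
```

With a demo device whose estimate is 100 bytes and whose exact size is 150 bytes, should_complete() with a threshold of 50 never calls the exact callback, while a threshold of 120 calls it once and still refuses to complete.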

ToDo:
- The autoconverge test is broken. I have no real clue why, but it is possible that the test itself is wrong.

- Add a mechanism to send massive amounts of data in the save state stage (probably more multifd channels).

- Avoid having to restart the guest between checking the pending state size and migration_completion().

Please review.

Thanks, Juan.

Juan Quintela (7):
  migration: Remove res_compatible parameter
  migration: No save_live_pending() method uses the QEMUFile parameter
  migration: Block migration comment or code is wrong
  migration: Split save_live_pending() into state_pending_*
  migration: Remove unused threshold_size parameter
  migration: simplify migration_iteration_run()
  migration: call qemu_savevm_state_pending_exact() with the guest
    stopped

 docs/devel/migration.rst       | 18 ++++++------
 docs/devel/vfio-migration.rst  |  4 +--
 include/migration/register.h   | 29 ++++++++++---------
 migration/savevm.h             |  8 +++---
 hw/s390x/s390-stattrib.c       | 11 ++++---
 hw/vfio/migration.c            | 17 +++++------
 migration/block-dirty-bitmap.c | 14 ++++-----
 migration/block.c              | 17 ++++++-----
 migration/migration.c          | 52 ++++++++++++++++++++++------------
 migration/ram.c                | 35 ++++++++++++++++-------
 migration/savevm.c             | 37 +++++++++++++++++-------
 tests/qtest/migration-test.c   |  3 +-
 hw/vfio/trace-events           |  2 +-
 migration/trace-events         |  7 +++--
 14 files changed, 148 insertions(+), 106 deletions(-)

-- 
2.37.2
RE: [RFC 0/7] migration patches for VFIO
Posted by Yishai Hadas 1 year, 6 months ago
> From: Qemu-devel <qemu-devel-bounces+yishaih=nvidia.com@nongnu.org> On Behalf Of Juan Quintela
> Sent: Monday, 3 October 2022 6:16
> To: qemu-devel@nongnu.org
> Cc: Alex Williamson <alex.williamson@redhat.com>; Eric Blake
> <eblake@redhat.com>; Stefan Hajnoczi <stefanha@redhat.com>; Fam
> Zheng <fam@euphon.net>; qemu-s390x@nongnu.org; Cornelia Huck
> <cohuck@redhat.com>; Thomas Huth <thuth@redhat.com>; Vladimir
> Sementsov-Ogievskiy <vsementsov@yandex-team.ru>; Laurent Vivier
> <lvivier@redhat.com>; John Snow <jsnow@redhat.com>; Dr. David Alan
> Gilbert <dgilbert@redhat.com>; Christian Borntraeger
> <borntraeger@linux.ibm.com>; Halil Pasic <pasic@linux.ibm.com>; Juan
> Quintela <quintela@redhat.com>; Paolo Bonzini <pbonzini@redhat.com>;
> qemu-block@nongnu.org; Eric Farman <farman@linux.ibm.com>; Richard
> Henderson <richard.henderson@linaro.org>; David Hildenbrand
> <david@redhat.com>
> Subject: [RFC 0/7] migration patches for VFIO
> 
> Hi
> 
> VFIO migration has several requirements:
> - the size of the state is only known when the guest is stopped

As discussed in the conference call, I just sent a patch to the kernel mailing list that makes it possible to get the state size in each state.

See:
https://patchwork.kernel.org/project/kvm/patch/20221020132109.112708-1-yishaih@nvidia.com/

This removes the need to stop the guest in order to ask for that data.

So, I assume that you can drop some complexity and hacks from your RFC when you send the next series.

Specifically, there is no need to stop the VM and restart it when the SLA can't be met. Just read, while in RUNNING, the estimated data length that will be required to complete STOP_COPY, and use it.
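
As a rough illustration of that flow (a sketch only, not the kernel or QEMU interface): the query callback type, can_meet_downtime(), fixed_len_query() and the bandwidth and downtime parameters below are all hypothetical names, and "SLA" is interpreted here as a downtime limit in milliseconds. The callback stands in for whatever call returns the estimated STOP_COPY length while the device is still RUNNING.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical query: returns 0 on success and stores in *len the
 * estimated number of bytes needed to complete STOP_COPY. It is assumed
 * to be callable while the device is still RUNNING. */
typedef int (*query_stop_copy_len_fn)(void *dev, uint64_t *len);

/* Decide, without stopping the VM, whether the remaining device state
 * could be sent within the downtime limit at the given bandwidth. */
static bool can_meet_downtime(query_stop_copy_len_fn query, void *dev,
                              uint64_t bandwidth_bytes_per_ms,
                              uint64_t downtime_limit_ms)
{
    uint64_t len, expected_ms;

    assert(query != NULL);
    if (bandwidth_bytes_per_ms == 0 || query(dev, &len) != 0) {
        return false; /* no usable estimate: keep the guest running */
    }
    /* Round up: a partial millisecond still counts against the limit. */
    expected_ms = len / bandwidth_bytes_per_ms +
                  (len % bandwidth_bytes_per_ms != 0);
    return expected_ms <= downtime_limit_ms;
}

/* Demo query (illustration only): dev points at a fixed length, and a
 * NULL dev simulates a failing query. */
static int fixed_len_query(void *dev, uint64_t *len)
{
    if (dev == NULL) {
        return -1;
    }
    *len = *(uint64_t *)dev;
    return 0;
}
```

If the check fails, the migration simply keeps iterating with the guest running, so no stop and restart cycle is needed.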

Yishai
