[PATCH 00/20] migration: Postcopy Preemption

Peter Xu posted 20 patches 2 years, 2 months ago
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20220216062809.57179-1-peterx@redhat.com
Test checkpatch passed
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Eric Blake <eblake@redhat.com>, Laurent Vivier <lvivier@redhat.com>, Juan Quintela <quintela@redhat.com>, Thomas Huth <thuth@redhat.com>, Markus Armbruster <armbru@redhat.com>
There is a newer version of this series
[PATCH 00/20] migration: Postcopy Preemption
Posted by Peter Xu 2 years, 2 months ago
This is v1 of postcopy preempt series.  It can also be found here:

  https://github.com/xzpeter/qemu/tree/postcopy-preempt

This series adds a new migration capability called "postcopy-preempt".  It can
be enabled when postcopy is enabled, and it simply (but greatly) speeds up the
handling of postcopy page requests.

  |----------------+--------------+-----------------------|
  | Host page size | Vanilla (ms) | Postcopy Preempt (ms) |
  |----------------+--------------+-----------------------|
  | 2M             |        10.58 |                  4.96 |
  | 4K             |        10.68 |                  0.57 |
  |----------------+--------------+-----------------------|
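
To put the table in relative terms, the short sketch below computes the speedup
factor and the percentage reduction for each host page size.  The numbers are
copied from the table; the helper names (speedup, reduction_percent) are made up
for illustration only and are not part of the series.

```python
# Values in milliseconds, copied from the table above.
latencies_ms = {
    "2M": {"vanilla": 10.58, "preempt": 4.96},
    "4K": {"vanilla": 10.68, "preempt": 0.57},
}


def speedup(vanilla_ms, preempt_ms):
    """Return how many times smaller the preempt value is than vanilla."""
    return vanilla_ms / preempt_ms


def reduction_percent(vanilla_ms, preempt_ms):
    """Return the relative reduction versus vanilla, in percent."""
    return 100.0 * (vanilla_ms - preempt_ms) / vanilla_ms


for page_size, row in latencies_ms.items():
    s = speedup(row["vanilla"], row["preempt"])
    r = reduction_percent(row["vanilla"], row["preempt"])
    print(f"{page_size}: {s:.2f}x faster, {r:.1f}% lower")
```

That works out to roughly 2.1x for 2M host pages and roughly 18.7x for 4K host
pages.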

The major changes since the RFC are:

  - The very large patch is split into smaller ones
  - Added postcopy recovery support, and its unit test

The RFC series actually broke postcopy recovery on huge pages; this version
fixes that issue as well.

Just a quick note: this series also partly prepares for future doublemap
support.  The channel separation speedup will benefit both current postcopy and
postcopy with doublemap once that is ready.  The huge page preemption part may
only benefit current postcopy; it won't be enabled with doublemap, because in
that world no huge pages will be mapped at all.

The new patch layout:

Patch 1-3: Three leftover patches from patchset "[PATCH v3 0/8] migration:
Postcopy cleanup on ram disgard" that I picked up here too.

  https://lore.kernel.org/qemu-devel/20211224065000.97572-1-peterx@redhat.com/

  migration: Dump sub-cmd name in loadvm_process_command tp
  migration: Finer grained tracepoints for POSTCOPY_LISTEN
  migration: Tracepoint change in postcopy-run bottom half

Patch 4-9: Original postcopy preempt RFC preparation patches (with slight
modifications).

  migration: Introduce postcopy channels on dest node
  migration: Dump ramblock and offset too when non-same-page detected
  migration: Add postcopy_thread_create()
  migration: Move static var in ram_block_from_stream() into global
  migration: Add pss.postcopy_requested status
  migration: Move migrate_allow_multifd and helpers into migration.c

Patch 10-15: Patches newly added while working on postcopy recovery support.
After these patches the migrate-recover command allows re-entrance, which is a
very nice side effect.

  migration: Enlarge postcopy recovery to capture !-EIO too
  migration: postcopy_pause_fault_thread() never fails
  migration: Export ram_load_postcopy()
  migration: Move channel setup out of postcopy_try_recover()
  migration: Add migration_incoming_transport_cleanup()
  migration: Allow migrate-recover to run multiple times
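
As an illustration of what re-entrance means for a management tool, here is a
minimal sketch that builds two consecutive migrate-recover QMP messages, as a
tool might do when the first recovery attempt does not succeed.  The helper name
and both URIs are hypothetical; only the command name and its "uri" argument
come from QMP itself, and the sketch only builds the JSON without sending it to
a QEMU instance.

```python
import json


def build_migrate_recover(uri):
    """Build a QMP migrate-recover message for the given listening URI."""
    return {"execute": "migrate-recover", "arguments": {"uri": uri}}


# Hypothetical addresses: with re-entrance, a second migrate-recover can be
# issued on the destination if the first attempt did not work out.
first_attempt = build_migrate_recover("tcp:0.0.0.0:4444")
second_attempt = build_migrate_recover("tcp:0.0.0.0:5555")

for message in (first_attempt, second_attempt):
    print(json.dumps(message))
```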

Patch 16-19: The major work of postcopy preemption implementation is split into
four patches as suggested by Dave.

  migration: Add postcopy-preempt capability
  migration: Postcopy preemption preparation on channel creation
  migration: Postcopy preemption enablement
  migration: Postcopy recover with preempt enabled
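
For reference, a minimal sketch of how the new capability might be turned on
over QMP, assuming it is set through the existing migrate-set-capabilities
command alongside postcopy-ram.  The capability name is the one given in this
cover letter; the helper name is hypothetical, and the sketch only builds the
JSON message rather than talking to a running QEMU.

```python
import json


def build_enable_preempt_command():
    """Build a QMP message enabling postcopy together with postcopy-preempt."""
    return {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [
                # Postcopy itself has to be enabled for preempt to apply.
                {"capability": "postcopy-ram", "state": True},
                # Capability name as given in this cover letter.
                {"capability": "postcopy-preempt", "state": True},
            ]
        },
    }


print(json.dumps(build_enable_preempt_command(), indent=2))
```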

Patch 20: the test case.

  tests: Add postcopy preempt test

For more information, feel free to refer to the RFC series cover letter:

  https://lore.kernel.org/qemu-devel/20220119080929.39485-1-peterx@redhat.com/

Please review, thanks.

Peter Xu (20):
  migration: Dump sub-cmd name in loadvm_process_command tp
  migration: Finer grained tracepoints for POSTCOPY_LISTEN
  migration: Tracepoint change in postcopy-run bottom half
  migration: Introduce postcopy channels on dest node
  migration: Dump ramblock and offset too when non-same-page detected
  migration: Add postcopy_thread_create()
  migration: Move static var in ram_block_from_stream() into global
  migration: Add pss.postcopy_requested status
  migration: Move migrate_allow_multifd and helpers into migration.c
  migration: Enlarge postcopy recovery to capture !-EIO too
  migration: postcopy_pause_fault_thread() never fails
  migration: Export ram_load_postcopy()
  migration: Move channel setup out of postcopy_try_recover()
  migration: Add migration_incoming_transport_cleanup()
  migration: Allow migrate-recover to run multiple times
  migration: Add postcopy-preempt capability
  migration: Postcopy preemption preparation on channel creation
  migration: Postcopy preemption enablement
  migration: Postcopy recover with preempt enabled
  tests: Add postcopy preempt test

 migration/migration.c        | 184 +++++++++++++++-----
 migration/migration.h        |  64 ++++++-
 migration/multifd.c          |  19 +--
 migration/multifd.h          |   2 -
 migration/postcopy-ram.c     | 208 ++++++++++++++++++-----
 migration/postcopy-ram.h     |  14 ++
 migration/ram.c              | 320 +++++++++++++++++++++++++++++++----
 migration/ram.h              |   3 +
 migration/savevm.c           |  66 ++++++--
 migration/socket.c           |  22 ++-
 migration/socket.h           |   1 +
 migration/trace-events       |  19 ++-
 qapi/migration.json          |   8 +-
 tests/qtest/migration-test.c |  39 ++++-
 14 files changed, 803 insertions(+), 166 deletions(-)

-- 
2.32.0


Re: [PATCH 00/20] migration: Postcopy Preemption
Posted by Peter Xu 2 years, 2 months ago
On Wed, Feb 16, 2022 at 02:27:49PM +0800, Peter Xu wrote:
> The new patch layout:
> 
> Patch 1-3: Three leftover patches from patchset "[PATCH v3 0/8] migration:
> Postcopy cleanup on ram disgard" that I picked up here too.
> 
>   https://lore.kernel.org/qemu-devel/20211224065000.97572-1-peterx@redhat.com/
> 
>   migration: Dump sub-cmd name in loadvm_process_command tp
>   migration: Finer grained tracepoints for POSTCOPY_LISTEN
>   migration: Tracepoint change in postcopy-run bottom half
> 
> Patch 4-9: Original postcopy preempt RFC preparation patches (with slight
> modifications).
> 
>   migration: Introduce postcopy channels on dest node
>   migration: Dump ramblock and offset too when non-same-page detected
>   migration: Add postcopy_thread_create()
>   migration: Move static var in ram_block_from_stream() into global
>   migration: Add pss.postcopy_requested status
>   migration: Move migrate_allow_multifd and helpers into migration.c
> 
> Patch 10-15: Patches newly added while working on postcopy recovery support.
> After these patches the migrate-recover command allows re-entrance, which is a
> very nice side effect.
> 
>   migration: Enlarge postcopy recovery to capture !-EIO too
>   migration: postcopy_pause_fault_thread() never fails
>   migration: Export ram_load_postcopy()
>   migration: Move channel setup out of postcopy_try_recover()
>   migration: Add migration_incoming_transport_cleanup()
>   migration: Allow migrate-recover to run multiple times

Patches up to and including 15 are IMHO worthwhile on their own, with or
without the new preemption, so they can be considered for review earlier.

Especially:

    migration: Enlarge postcopy recovery to capture !-EIO too
    migration: Add migration_incoming_transport_cleanup()
    migration: Allow migrate-recover to run multiple times

Thanks,

-- 
Peter Xu