v2:
- Collected R-bs
- Avoid using "\b" in HMP dumps [Markus, Dave]
The series is based on a small patch from Yanfei Xu here:
Based-on: <20250514115827.3216082-1-yanfei.xu@bytedance.com>
https://lore.kernel.org/r/20250514115827.3216082-1-yanfei.xu@bytedance.com
This series collects a number of enhancements and cleanups for QEMU 10.1,
most of which came out of working on the last patch.
The last patch, which is a one-liner, can further reduce postcopy page
fault latency by 10% with preempt mode enabled.
Before: 268.00us (+-1.87%)
After: 232.67us (+-2.01%)
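As a quick sanity check, the relative improvement implied by these two numbers can be recomputed (a back-of-envelope sketch; the variable names are mine):

```python
# Mean postcopy page fault latencies quoted above (microseconds).
before_us = 268.00  # before the last patch
after_us = 232.67   # after the last patch

# Relative reduction; slightly above the 10% quoted in the text.
reduction = (before_us - after_us) / before_us
print(f"latency reduction: {reduction:.1%}")  # prints "latency reduction: 13.2%"
```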
The patch layout is as follows:

Patch 1: A follow-up to the HMP change for "info migrate", per
         suggestion from Dave
Patch 2: Yet another HMP fix for blocktime displays
Patches 3-10: Cleanups everywhere; please take a special look at
         patch 10, which changes the core switchover decision logic
Patch 11: The one-liner optimization
Comments welcomed, thanks.
Peter Xu (11):
migration/hmp: Reorg "info migrate" once more
migration/hmp: Fix postcopy-blocktime per-vCPU results
migration/docs: Move docs for postcopy blocktime feature
migration/bg-snapshot: Do not check for SKIP in iterator
migration: Drop save_live_complete_postcopy hook
migration: Rename save_live_complete_precopy to save_complete
migration: qemu_savevm_complete*() helpers
migration/ram: One less indent for ram_find_and_save_block()
migration/ram: Add tracepoints for ram_save_complete()
migration: Rewrite the migration complete detect logic
migration/postcopy: Avoid clearing dirty bitmap for postcopy too
docs/devel/migration/postcopy.rst | 36 +++++++-------
include/migration/register.h | 26 ++++------
hw/ppc/spapr.c | 2 +-
hw/s390x/s390-stattrib.c | 2 +-
hw/vfio/migration.c | 2 +-
migration/block-dirty-bitmap.c | 3 +-
migration/migration-hmp-cmds.c | 81 ++++++++++++++++--------------
migration/migration.c | 61 ++++++++++++++++-------
migration/ram.c | 32 +++++++-----
migration/savevm.c | 83 +++++++++++++++++--------------
migration/trace-events | 1 +
11 files changed, 184 insertions(+), 145 deletions(-)
--
2.49.0
On Mon, Jun 09, 2025 at 12:18:44PM -0400, Peter Xu wrote:
> v2:
> - Collected R-bs
> - Avoid using "\b" in HMP dumps [Markus, Dave]
>
> The series is based on a small patch from Yanfei Xu here:
>
> Based-on: <20250514115827.3216082-1-yanfei.xu@bytedance.com>
> https://lore.kernel.org/r/20250514115827.3216082-1-yanfei.xu@bytedance.com
>
> This is a series that collected many of either enhancements or cleanups I
> got for QEMU 10.1, which almost came from when working on the last patch.
>
> The last patch, which is a oneliner, can further reduce 10% postcopy page
> fault latency with preempt mode enabled.
>
> Before: 268.00us (+-1.87%)
> After: 232.67us (+-2.01%)
>
> The patch layout is as following:
>
> Patch 1: A follow up of HMP change for "info migrate", per
> suggestion from Dave
> Patch 2: Yet another HMP fix for blocktime displays
> Patch 3-10: Cleanups everywhere, especially please take a look at
> patch 10 which changes the core switchover decision logic
> Patch 11: The one-liner optimization
>
> Comments welcomed, thanks.
>
> Peter Xu (11):
> migration/hmp: Reorg "info migrate" once more
> migration/hmp: Fix postcopy-blocktime per-vCPU results
> migration/docs: Move docs for postcopy blocktime feature
> migration/bg-snapshot: Do not check for SKIP in iterator
> migration: Drop save_live_complete_postcopy hook
> migration: Rename save_live_complete_precopy to save_complete
> migration: qemu_savevm_complete*() helpers
> migration/ram: One less indent for ram_find_and_save_block()
> migration/ram: Add tracepoints for ram_save_complete()
> migration: Rewrite the migration complete detect logic
> migration/postcopy: Avoid clearing dirty bitmap for postcopy too
There are two checkpatch issues that need fixing. Two fixups are needed,
as below: one removes a space, the other fixes an over-80-character line.
I'll squash them if I send new versions.
Sorry for the noise.
===8<===
From 25356e1262006fd668ba4e29b01325b5e784e19a Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Wed, 11 Jun 2025 17:23:00 -0400
Subject: [PATCH] fixup! migration/hmp: Fix postcopy-blocktime per-vCPU results
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration-hmp-cmds.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 6c36e202a0..867e017b32 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -212,7 +212,7 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
const char *sep = "";
int count = 0;
- monitor_printf(mon, "Postcopy vCPU Blocktime (ms): \n [");
+ monitor_printf(mon, "Postcopy vCPU Blocktime (ms):\n [");
while (item) {
monitor_printf(mon, "%s%"PRIu32, sep, item->value);
--
2.49.0
From 58dfb3e311fb477732d0f109886d02adcb439e14 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Wed, 11 Jun 2025 17:23:38 -0400
Subject: [PATCH] fixup! migration: Rewrite the migration complete detect logic
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/migration/migration.c b/migration/migration.c
index 1a26a4bfef..923400f801 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3460,7 +3460,8 @@ static MigIterateState migration_iteration_run(MigrationState *s)
if (pending_size < s->threshold_size) {
qemu_savevm_state_pending_exact(&must_precopy, &can_postcopy);
pending_size = must_precopy + can_postcopy;
- trace_migrate_pending_exact(pending_size, must_precopy, can_postcopy);
+ trace_migrate_pending_exact(pending_size, must_precopy,
+ can_postcopy);
}
/* Should we switch to postcopy now? */
--
2.49.0
--
Peter Xu
This series has been successfully tested. The information displayed
from the HMP info migrate command is more user-friendly, with the
possibility of displaying the globals with info migrate -a.

(qemu) info migrate -a
Status: active
Sockets: [
    tcp::::8888
]
Globals:
   store-global-state: on
   only-migratable: off
   send-configuration: on
   send-section-footer: on
   send-switchover-start: on
   clear-bitmap-shift: 18

Tested-by: Mario Casquero <mcasquer@redhat.com>

On Mon, Jun 9, 2025 at 6:20 PM Peter Xu <peterx@redhat.com> wrote:
>
> v2:
> - Collected R-bs
> - Avoid using "\b" in HMP dumps [Markus, Dave]
>
> The series is based on a small patch from Yanfei Xu here:
>
> Based-on: <20250514115827.3216082-1-yanfei.xu@bytedance.com>
> https://lore.kernel.org/r/20250514115827.3216082-1-yanfei.xu@bytedance.com
>
> This is a series that collected many of either enhancements or cleanups I
> got for QEMU 10.1, which almost came from when working on the last patch.
>
> The last patch, which is a oneliner, can further reduce 10% postcopy page
> fault latency with preempt mode enabled.
>
> Before: 268.00us (+-1.87%)
> After: 232.67us (+-2.01%)
>
> The patch layout is as following:
>
> Patch 1: A follow up of HMP change for "info migrate", per
>          suggestion from Dave
> Patch 2: Yet another HMP fix for blocktime displays
> Patch 3-10: Cleanups everywhere, especially please take a look at
>          patch 10 which changes the core switchover decision logic
> Patch 11: The one-liner optimization
>
> Comments welcomed, thanks.
>
> Peter Xu (11):
>   migration/hmp: Reorg "info migrate" once more
>   migration/hmp: Fix postcopy-blocktime per-vCPU results
>   migration/docs: Move docs for postcopy blocktime feature
>   migration/bg-snapshot: Do not check for SKIP in iterator
>   migration: Drop save_live_complete_postcopy hook
>   migration: Rename save_live_complete_precopy to save_complete
>   migration: qemu_savevm_complete*() helpers
>   migration/ram: One less indent for ram_find_and_save_block()
>   migration/ram: Add tracepoints for ram_save_complete()
>   migration: Rewrite the migration complete detect logic
>   migration/postcopy: Avoid clearing dirty bitmap for postcopy too
>
>  docs/devel/migration/postcopy.rst | 36 +++++++-------
>  include/migration/register.h      | 26 ++++------
>  hw/ppc/spapr.c                    |  2 +-
>  hw/s390x/s390-stattrib.c          |  2 +-
>  hw/vfio/migration.c               |  2 +-
>  migration/block-dirty-bitmap.c    |  3 +-
>  migration/migration-hmp-cmds.c    | 81 ++++++++++++++++--------------
>  migration/migration.c             | 61 ++++++++++++++++-------
>  migration/ram.c                   | 32 +++++++-----
>  migration/savevm.c                | 83 +++++++++++++++++--------------
>  migration/trace-events            |  1 +
>  11 files changed, 184 insertions(+), 145 deletions(-)
>
> --
> 2.49.0
>
On Wed, Jun 11, 2025 at 08:15:55AM +0200, Mario Casquero wrote:
> This series has been successfully tested. The information displayed
> from the HMP info migrate command is more user-friendly, with the
> possibility of displaying the globals with info migrate -a.
> (qemu) info migrate -a
> Status: active
> Sockets: [
>     tcp::::8888
> ]
> Globals:
>    store-global-state: on
>    only-migratable: off
>    send-configuration: on
>    send-section-footer: on
>    send-switchover-start: on
>    clear-bitmap-shift: 18
>
> Tested-by: Mario Casquero <mcasquer@redhat.com>

Hey, Mario,

Thanks for doing this!

This is a specific HMP dump test on recv side, just to mention the major
change will be on the src side, so feel free to try that too. That's what
patch 1 does.

Patch 2 changed recv side report for blocktime, but in your case you didn't
enable it, to cover tests on patch 2, you can enable postcopy-blocktime
feature and kickoff a postcopy migration.

And just to mention, the real meat in this series is actually the last
patch. :) If you want to test that, you'd likely want to apply another of
my series:

https://lore.kernel.org/r/20250609191259.9053-1-peterx@redhat.com

Then invoke postcopy test with some loads, then check the blocktime reports
again. The other series added latency tracking to blocktime. With that
extra series applied, you should be able to observe average page fault
latency reduction after the last patch, aka, the meat.

Note that this is not a request to have you test everything! Just to
mention the bits from test perspective, so just take it as FYI. I
appreciate your help already to test on the recv side!

Thanks,

--
Peter Xu
Hello Peter,
Thanks for pointing this out! I retested it with the series you
mentioned and everything works fine.
Booted up 2 VMs as usual, one in source and one in destination with
-incoming defer. Set the postcopy-blocktime and postcopy-ram
capabilities and query them to verify that they are enabled.
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate_set_capability postcopy-blocktime on
(qemu) info migrate_capabilities
...
postcopy-ram: on
...
postcopy-blocktime: on
...
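For anyone scripting this rather than typing at the HMP console, the same capability toggles map to the QMP `migrate-set-capabilities` command. A minimal sketch of the JSON payload (the socket/transport code is omitted; only the command structure is shown):

```python
import json

# QMP equivalent of the two migrate_set_capability HMP commands above.
cmd = {
    "execute": "migrate-set-capabilities",
    "arguments": {
        "capabilities": [
            {"capability": "postcopy-ram", "state": True},
            {"capability": "postcopy-blocktime", "state": True},
        ]
    },
}

# QMP commands are sent as single JSON objects over the monitor socket.
payload = json.dumps(cmd)
print(payload)
```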
Do migration with postcopy, this time check the full info migrate in source.
(qemu) info migrate -a
Status: postcopy-active
Time (ms): total=6522, setup=33, down=16
RAM info:
Throughput (Mbps): 949.60
Sizes: pagesize=4 KiB, total=16 GiB
Transfers: transferred=703 MiB, remain=5.4 GiB
Channels: precopy=111 MiB, multifd=0 B, postcopy=592 MiB
Page Types: normal=178447, zero=508031
Page Rates (pps): transfer=167581
Others: dirty_syncs=2, postcopy_req=1652
Globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
send-switchover-start: on
clear-bitmap-shift: 18
Once migration is completed compare the differences in destination
about the postcopy blocktime.
(qemu) info migrate -a
Status: completed
Globals:
...
Postcopy Blocktime (ms): 712
Postcopy vCPU Blocktime (ms):
[1633, 1635, 1710, 2097, 2595, 1993, 1958, 1214]
With all the series applied and same VM:
(qemu) info migrate -a
Status: completed
Globals:
...
Postcopy Blocktime (ms): 134
Postcopy vCPU Blocktime (ms):
[1310, 1064, 1112, 1400, 1334, 756, 1216, 1420]
Postcopy Latency (us): 16075
Postcopy non-vCPU Latencies (us): 14743
Postcopy vCPU Latencies (us):
[24730, 25350, 27125, 25930, 23825, 29110, 22960, 26304]
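Averaging the per-vCPU blocktime arrays from the two runs above gives a rough feel for the change (a back-of-envelope comparison, not a controlled benchmark):

```python
# Per-vCPU blocktime (ms) copied from the two "info migrate -a" dumps above.
before_ms = [1633, 1635, 1710, 2097, 2595, 1993, 1958, 1214]
after_ms = [1310, 1064, 1112, 1400, 1334, 756, 1216, 1420]

mean_before = sum(before_ms) / len(before_ms)  # 1854.375 ms
mean_after = sum(after_ms) / len(after_ms)     # 1201.5 ms
drop = (mean_before - mean_after) / mean_before
print(f"mean vCPU blocktime: {mean_before:.0f} -> {mean_after:.0f} ms "
      f"({drop:.0%} lower)")
```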
Indeed the Postcopy Blocktime has been reduced a lot :)
Thanks,
Mario
On Wed, Jun 11, 2025 at 3:06 PM Peter Xu <peterx@redhat.com> wrote:
>
> On Wed, Jun 11, 2025 at 08:15:55AM +0200, Mario Casquero wrote:
> > This series has been successfully tested. The information displayed
> > from the HMP info migrate command is more user-friendly, with the
> > possibility of displaying the globals with info migrate -a.
> > (qemu) info migrate -a
> > Status: active
> > Sockets: [
> > tcp::::8888
> > ]
> > Globals:
> > store-global-state: on
> > only-migratable: off
> > send-configuration: on
> > send-section-footer: on
> > send-switchover-start: on
> > clear-bitmap-shift: 18
> >
> > Tested-by: Mario Casquero <mcasquer@redhat.com>
>
> Hey, Mario,
>
> Thanks for doing this!
>
> This is a specific HMP dump test on recv side, just to mention the major
> change will be on the src side, so feel free to try that too. That's what
> patch 1 does.
>
> Patch 2 changed recv side report for blocktime, but in your case you didn't
> enable it, to cover tests on patch 2, you can enable postcopy-blocktime
> feature and kickoff a postcopy migration.
>
> And just to mention, the real meat in this series is actually the last
> patch. :) If you want to test that, you'd likely want to apply another of
> my series:
>
> https://lore.kernel.org/r/20250609191259.9053-1-peterx@redhat.com
>
> Then invoke postcopy test with some loads, then check the blocktime reports
> again. The other series added latency tracking to blocktime. With that
> extra series applied, you should be able to observe average page fault
> latency reduction after the last patch, aka, the meat.
>
> Note that this is not a request to have you test everything! Just to
> mention the bits from test perspective, so just take it as FYI. I
> appreciate your help already to test on the recv side!
>
> Thanks,
>
> --
> Peter Xu
>
On Thu, Jun 12, 2025 at 12:35:46PM +0200, Mario Casquero wrote:
> Hello Peter,

Hi, Mario,

> Thanks for pointing this out! I retested it with the series you
> mentioned and everything works fine.
>
> Booted up 2 VMs as usual, one in source and one in destination with
> -incoming defer. Set the postcopy-blocktime and postcopy-ram
> capabilities and query them to verify that they are enabled.
>
> (qemu) migrate_set_capability postcopy-ram on
> (qemu) migrate_set_capability postcopy-blocktime on
> (qemu) info migrate_capabilities
> ...
> postcopy-ram: on
> ...
> postcopy-blocktime: on
> ...
>
> Do migration with postcopy, this time check the full info migrate in source.
> (qemu) info migrate -a
> Status: postcopy-active
> Time (ms): total=6522, setup=33, down=16
> RAM info:
>   Throughput (Mbps): 949.60
>   Sizes: pagesize=4 KiB, total=16 GiB
>   Transfers: transferred=703 MiB, remain=5.4 GiB
>   Channels: precopy=111 MiB, multifd=0 B, postcopy=592 MiB
>   Page Types: normal=178447, zero=508031
>   Page Rates (pps): transfer=167581
>   Others: dirty_syncs=2, postcopy_req=1652
> Globals:
>   store-global-state: on
>   only-migratable: off
>   send-configuration: on
>   send-section-footer: on
>   send-switchover-start: on
>   clear-bitmap-shift: 18
>
> Once migration is completed compare the differences in destination
> about the postcopy blocktime.
>
> (qemu) info migrate -a
> Status: completed
> Globals:
> ...
> Postcopy Blocktime (ms): 712
> Postcopy vCPU Blocktime (ms):
>  [1633, 1635, 1710, 2097, 2595, 1993, 1958, 1214]
>
> With all the series applied and same VM:
>
> (qemu) info migrate -a
> Status: completed
> Globals:
> ...
> Postcopy Blocktime (ms): 134
> Postcopy vCPU Blocktime (ms):
>  [1310, 1064, 1112, 1400, 1334, 756, 1216, 1420]
> Postcopy Latency (us): 16075

Here the latency is 16ms, my fault here - I forgot to let you enable
postcopy-preempt as well, sorry. The optimization won't help much without
preempt, because the optimization is in tens of microseconds level.

So logically the optimization might be buried in the noise if without
preempt mode. It's suggested to enable preempt mode always for a postcopy
migration whenever available.

> Postcopy non-vCPU Latencies (us): 14743
> Postcopy vCPU Latencies (us):
>  [24730, 25350, 27125, 25930, 23825, 29110, 22960, 26304]
>
> Indeed the Postcopy Blocktime has been reduced a lot :)

I haven't compared with blocktime before, I'm surprised it changed that
much. Though maybe you didn't really run any workload inside? In that
case the results can be unpredictable.

The perf test would make more sense if you run some loads so the majority
of the faults triggered will not be adhoc system probes but more
predictable. I normally use mig_mon [1] with something like this:

[1] https://github.com/xzpeter/mig_mon

$ ./mig_mon mm_dirty -m 13G -p random

This first write pre-fault the whole memory using all the CPUs, then
dirties the 13G memory single threaded as fast as possible in random
fashion.

What I did with the test was applying both series then revert the last
patch of 1st series, as "postcopy-latency" metrics wasn't around before
applying the 2nd series, or you'll need to use some kernel tracepoints.

This is definitely an awkward series to test when having the two mangled.
Again, feel free to skip that, just FYI!

Thanks,

--
Peter Xu