[PATCH 13/14] migration/qapi: Introduce system-wise "remaining" reports

Peter Xu posted 14 patches 3 days ago
Maintainers: Pierrick Bouvier <pierrick.bouvier@linaro.org>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Alex Williamson <alex@shazbot.org>, "Cédric Le Goater" <clg@redhat.com>, Halil Pasic <pasic@linux.ibm.com>, Christian Borntraeger <borntraeger@linux.ibm.com>, Jason Herne <jjherne@linux.ibm.com>, Richard Henderson <richard.henderson@linaro.org>, Ilya Leoshkevich <iii@linux.ibm.com>, David Hildenbrand <david@kernel.org>, Cornelia Huck <cohuck@redhat.com>, Eric Farman <farman@linux.ibm.com>, Matthew Rosato <mjrosato@linux.ibm.com>, Eric Blake <eblake@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>, John Snow <jsnow@redhat.com>, Markus Armbruster <armbru@redhat.com>
[PATCH 13/14] migration/qapi: Introduce system-wise "remaining" reports
Posted by Peter Xu 3 days ago
Currently, mgmt can only query for remaining RAM, not system-wise remaining
data.  It was not a problem before, because for a very long time RAM was
the only part that mattered.

After VFIO migration landed upstream, this may no longer be true,
especially considering that GPU devices can hold GBs of device state.

Add a new "remaining" field in query-migrate results, reflecting
system-wise remaining data, which will include everything (e.g. VFIO).

This information will be useful for mgmt to implement a generic way of
stall detection that covers all system resources.  For example, if the
system-wise remaining data stops decreasing for a relatively long period of
time, the migration may be having trouble converging, so mgmt can act based
on how this value changes over time (especially if sampled after each
migration iteration).
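
A minimal sketch of how a management application might use such samples
for stall detection.  Everything here is hypothetical and not part of this
patch or of QEMU: the function name, the window size, and the assumption
that one sample of "remaining" is recorded per migration iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mgmt-side helper, not part of QEMU.
 *
 * samples[] holds values of the "remaining" field, oldest first, e.g. one
 * sample per migration iteration.
 *
 * Returns true when none of the last `window` samples is lower than the
 * sample taken just before them, i.e. no progress was made over the window.
 */
bool migration_stalled(const uint64_t *samples, size_t count, size_t window)
{
    assert(window > 0);

    if (count <= window) {
        return false;   /* not enough history to judge */
    }

    uint64_t baseline = samples[count - window - 1];

    for (size_t i = count - window; i < count; i++) {
        if (samples[i] < baseline) {
            return false;   /* remaining data went down: still converging */
        }
    }
    return true;
}
```

What to do once a stall is reported (for example, switching to postcopy or
raising the downtime limit) is a policy decision left to mgmt.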

Before this patch, "expected_downtime" almost played this role.  For
example, monitoring "expected_downtime" at the beginning of each iteration
can in most cases also reflect the progress of migration system-wise.  That
said, "expected_downtime" has always been calculated based on a bandwidth
value that can fluctuate a lot if avail-switchover-bandwidth is not used.
The new "remaining" field removes that source of uncertainty for mgmt.
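
To illustrate the bandwidth dependency, the sketch below models a downtime
estimate as pending bytes divided by measured bandwidth.  This model is an
assumption made for the example only; the function name is hypothetical and
the actual QEMU computation may differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustration only: a downtime estimate modelled as pending bytes divided
 * by the measured bandwidth (in bytes per millisecond).  Not QEMU code.
 */
uint64_t estimate_downtime_ms(uint64_t pending_bytes, uint64_t bw_bytes_per_ms)
{
    assert(bw_bytes_per_ms > 0);
    return pending_bytes / bw_bytes_per_ms;
}
```

With the same pending_bytes, halving the measured bandwidth doubles the
estimate, even though the amount of data left to migrate is unchanged.  A
raw byte count does not have this problem.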

With the new field, HMP "info migrate" now reports this:

(qemu) info migrate
Status:                 active
Time (ms):              total=12080, setup=14, exp_down=300
Remaining (bytes):      1.36 GiB        <------------------- new line
RAM info:
  Throughput (Mbps):    840.50
  Sizes:                pagesize=4 KiB, total=4.02 GiB
  Transfers:            transferred=1.18 GiB, remain=1.36 GiB
    Channels:           precopy=1.18 GiB, multifd=0 B, postcopy=0 B
    Page Types:         normal=307923, zero=388148
  Page Rates (pps):     transfer=25660
  Others:               dirty_syncs=1

It should be the same value as RAM's remaining report when VFIO is not
involved, and it should report more than that when VFIO is involved.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 qapi/migration.json            |  4 ++++
 migration/migration-hmp-cmds.c |  5 +++++
 migration/migration.c          | 11 +++++++++++
 3 files changed, 20 insertions(+)

diff --git a/qapi/migration.json b/qapi/migration.json
index e3ad3f0604..a6e24b5685 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -300,6 +300,9 @@
 #     average memory load of the virtual CPU indirectly.  Note that
 #     zero means guest doesn't dirty memory.  (Since 8.1)
 #
+# @remaining: amount of bytes remaining to be migrated system-wise,
+#     includes both RAM and all devices (like VFIO).  (Since 11.1)
+#
 # Features:
 #
 # @unstable: Members @postcopy-latency, @postcopy-vcpu-latency,
@@ -310,6 +313,7 @@
 ##
 { 'struct': 'MigrationInfo',
   'data': {'*status': 'MigrationStatus', '*ram': 'MigrationRAMStats',
+           '*remaining': 'uint64',
            '*vfio': 'VfioStats',
            '*xbzrle-cache': 'XBZRLECacheStats',
            '*total-time': 'int',
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 0a193b8f54..721c211086 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -178,6 +178,11 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
         }
     }
 
+    if (info->has_remaining) {
+        g_autofree char *remaining = size_to_str(info->remaining);
+        monitor_printf(mon, "Remaining (bytes): \t%s\n", remaining);
+    }
+
     if (info->has_socket_address) {
         SocketAddressList *addr;
 
diff --git a/migration/migration.c b/migration/migration.c
index 4010e5dcf5..c2aa145106 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1076,6 +1076,16 @@ static void populate_time_info(MigrationInfo *info, MigrationState *s)
     }
 }
 
+static void populate_global_info(MigrationInfo *info, MigrationState *s)
+{
+    MigPendingData data = { };
+
+    qemu_savevm_query_pending(&data, false);
+
+    info->has_remaining = true;
+    info->remaining = data.total_bytes;
+}
+
 static void populate_ram_info(MigrationInfo *info, MigrationState *s)
 {
     size_t page_size = qemu_target_page_size();
@@ -1177,6 +1187,7 @@ static void fill_source_migration_info(MigrationInfo *info)
         /* TODO add some postcopy stats */
         populate_time_info(info, s);
         populate_ram_info(info, s);
+        populate_global_info(info, s);
         migration_populate_vfio_info(info);
         break;
     case MIGRATION_STATUS_COLO:
-- 
2.53.0
Re: [PATCH 13/14] migration/qapi: Introduce system-wise "remaining" reports
Posted by Fabiano Rosas 1 day, 18 hours ago
Peter Xu <peterx@redhat.com> writes:

> Currently, mgmt can only query for remaining RAM, not system-wise remaining
> data.  It was not a problem before, because for a very long time RAM was
> the only part that matters.
>
> After VFIO migrations landed upstream, it may not be true anymore
> especially considering that there can be GPU devices that contain GBs of
> device states.
>
> Add a new "remaining" field in query-migrate results, reflecting
> system-wise remaining data, which will include everything (e.g. VFIO).
>
> This information will be useful for mgmt to implement generic way of stall
> detection that covers all system resources.  Say, when system remaining
> data does not decrease anymore for a relatively long period of time, then
> it may mean that there is a challenge of converging, so mgmt can act based
> on how this value changes over time (especially if sampled after each
> migration iteration).
>
> Before this patch, "expected_downtime" almost played this role. For
> example, by monitoring "expected_downtime" at the beginning of each
> iteration can in most cases also reflect the progress of migration
> system-wise.  Said that, "expected_downtime" was always calculated based on
> a bandwidth value that can fluctuate a lot if avail-switchover-bandwidth is
> not used. This new "remaining" field will remove that part of uncertainty
> for mgmt.
>
> With the new field, HMP "info migrate" now reports this:
>
> (qemu) info migrate
> Status:                 active
> Time (ms):              total=12080, setup=14, exp_down=300
> Remaining (bytes):      1.36 GiB        <------------------- newline

Either bytes or GiB. Better to simply remove the "(bytes)" string.

> RAM info:
>   Throughput (Mbps):    840.50
>   Sizes:                pagesize=4 KiB, total=4.02 GiB
>   Transfers:            transferred=1.18 GiB, remain=1.36 GiB
>     Channels:           precopy=1.18 GiB, multifd=0 B, postcopy=0 B
>     Page Types:         normal=307923, zero=388148
>   Page Rates (pps):     transfer=25660
>   Others:               dirty_syncs=1
>
> It should be the same value as RAM's remaining report when VFIO is not
> involved, and it should report more than that when VFIO is involved.
>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Dr. David Alan Gilbert <dave@treblig.org>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qapi/migration.json            |  4 ++++
>  migration/migration-hmp-cmds.c |  5 +++++
>  migration/migration.c          | 11 +++++++++++
>  3 files changed, 20 insertions(+)
>
> diff --git a/qapi/migration.json b/qapi/migration.json
> index e3ad3f0604..a6e24b5685 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -300,6 +300,9 @@
>  #     average memory load of the virtual CPU indirectly.  Note that
>  #     zero means guest doesn't dirty memory.  (Since 8.1)
>  #
> +# @remaining: amount of bytes remaining to be migrated system-wise,
> +#     includes both RAM and all devices (like VFIO).  (Since 11.1)
> +#
>  # Features:
>  #
>  # @unstable: Members @postcopy-latency, @postcopy-vcpu-latency,
> @@ -310,6 +313,7 @@
>  ##
>  { 'struct': 'MigrationInfo',
>    'data': {'*status': 'MigrationStatus', '*ram': 'MigrationRAMStats',
> +           '*remaining': 'uint64',
>             '*vfio': 'VfioStats',
>             '*xbzrle-cache': 'XBZRLECacheStats',
>             '*total-time': 'int',
> diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
> index 0a193b8f54..721c211086 100644
> --- a/migration/migration-hmp-cmds.c
> +++ b/migration/migration-hmp-cmds.c
> @@ -178,6 +178,11 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
>          }
>      }
>  
> +    if (info->has_remaining) {
> +        g_autofree char *remaining = size_to_str(info->remaining);
> +        monitor_printf(mon, "Remaining (bytes): \t%s\n", remaining);
> +    }
> +
>      if (info->has_socket_address) {
>          SocketAddressList *addr;
>  
> diff --git a/migration/migration.c b/migration/migration.c
> index 4010e5dcf5..c2aa145106 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1076,6 +1076,16 @@ static void populate_time_info(MigrationInfo *info, MigrationState *s)
>      }
>  }
>  
> +static void populate_global_info(MigrationInfo *info, MigrationState *s)
> +{
> +    MigPendingData data = { };
> +
> +    qemu_savevm_query_pending(&data, false);
> +
> +    info->has_remaining = true;
> +    info->remaining = data.total_bytes;
> +}
> +
>  static void populate_ram_info(MigrationInfo *info, MigrationState *s)
>  {
>      size_t page_size = qemu_target_page_size();
> @@ -1177,6 +1187,7 @@ static void fill_source_migration_info(MigrationInfo *info)
>          /* TODO add some postcopy stats */
>          populate_time_info(info, s);
>          populate_ram_info(info, s);
> +        populate_global_info(info, s);
>          migration_populate_vfio_info(info);
>          break;
>      case MIGRATION_STATUS_COLO:
Re: [PATCH 13/14] migration/qapi: Introduce system-wise "remaining" reports
Posted by Dr. David Alan Gilbert 1 day, 19 hours ago
* Peter Xu (peterx@redhat.com) wrote:
> Currently, mgmt can only query for remaining RAM, not system-wise remaining
> data.  It was not a problem before, because for a very long time RAM was
> the only part that matters.
> 
> After VFIO migrations landed upstream, it may not be true anymore
> especially considering that there can be GPU devices that contain GBs of
> device states.
> 
> Add a new "remaining" field in query-migrate results, reflecting
> system-wise remaining data, which will include everything (e.g. VFIO).

Of course you realise the next thing people will ask for is being able
to ask *which* vfio device is the one that's busy.

Reviewed-by: Dr. David Alan Gilbert <dave@treblig.org>

> This information will be useful for mgmt to implement generic way of stall
> detection that covers all system resources.  Say, when system remaining
> data does not decrease anymore for a relatively long period of time, then
> it may mean that there is a challenge of converging, so mgmt can act based
> on how this value changes over time (especially if sampled after each
> migration iteration).
> 
> Before this patch, "expected_downtime" almost played this role. For
> example, by monitoring "expected_downtime" at the beginning of each
> iteration can in most cases also reflect the progress of migration
> system-wise.  Said that, "expected_downtime" was always calculated based on
> a bandwidth value that can fluctuate a lot if avail-switchover-bandwidth is
> not used. This new "remaining" field will remove that part of uncertainty
> for mgmt.
> 
> With the new field, HMP "info migrate" now reports this:
> 
> (qemu) info migrate
> Status:                 active
> Time (ms):              total=12080, setup=14, exp_down=300
> Remaining (bytes):      1.36 GiB        <------------------- newline
> RAM info:
>   Throughput (Mbps):    840.50
>   Sizes:                pagesize=4 KiB, total=4.02 GiB
>   Transfers:            transferred=1.18 GiB, remain=1.36 GiB
>     Channels:           precopy=1.18 GiB, multifd=0 B, postcopy=0 B
>     Page Types:         normal=307923, zero=388148
>   Page Rates (pps):     transfer=25660
>   Others:               dirty_syncs=1
> 
> It should be the same value as RAM's remaining report when VFIO is not
> involved, and it should report more than that when VFIO is involved.
> 
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Dr. David Alan Gilbert <dave@treblig.org>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qapi/migration.json            |  4 ++++
>  migration/migration-hmp-cmds.c |  5 +++++
>  migration/migration.c          | 11 +++++++++++
>  3 files changed, 20 insertions(+)
> 
> diff --git a/qapi/migration.json b/qapi/migration.json
> index e3ad3f0604..a6e24b5685 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -300,6 +300,9 @@
>  #     average memory load of the virtual CPU indirectly.  Note that
>  #     zero means guest doesn't dirty memory.  (Since 8.1)
>  #
> +# @remaining: amount of bytes remaining to be migrated system-wise,
> +#     includes both RAM and all devices (like VFIO).  (Since 11.1)
> +#
>  # Features:
>  #
>  # @unstable: Members @postcopy-latency, @postcopy-vcpu-latency,
> @@ -310,6 +313,7 @@
>  ##
>  { 'struct': 'MigrationInfo',
>    'data': {'*status': 'MigrationStatus', '*ram': 'MigrationRAMStats',
> +           '*remaining': 'uint64',
>             '*vfio': 'VfioStats',
>             '*xbzrle-cache': 'XBZRLECacheStats',
>             '*total-time': 'int',
> diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
> index 0a193b8f54..721c211086 100644
> --- a/migration/migration-hmp-cmds.c
> +++ b/migration/migration-hmp-cmds.c
> @@ -178,6 +178,11 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
>          }
>      }
>  
> +    if (info->has_remaining) {
> +        g_autofree char *remaining = size_to_str(info->remaining);
> +        monitor_printf(mon, "Remaining (bytes): \t%s\n", remaining);
> +    }
> +
>      if (info->has_socket_address) {
>          SocketAddressList *addr;
>  
> diff --git a/migration/migration.c b/migration/migration.c
> index 4010e5dcf5..c2aa145106 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1076,6 +1076,16 @@ static void populate_time_info(MigrationInfo *info, MigrationState *s)
>      }
>  }
>  
> +static void populate_global_info(MigrationInfo *info, MigrationState *s)
> +{
> +    MigPendingData data = { };
> +
> +    qemu_savevm_query_pending(&data, false);
> +
> +    info->has_remaining = true;
> +    info->remaining = data.total_bytes;
> +}
> +
>  static void populate_ram_info(MigrationInfo *info, MigrationState *s)
>  {
>      size_t page_size = qemu_target_page_size();
> @@ -1177,6 +1187,7 @@ static void fill_source_migration_info(MigrationInfo *info)
>          /* TODO add some postcopy stats */
>          populate_time_info(info, s);
>          populate_ram_info(info, s);
> +        populate_global_info(info, s);
>          migration_populate_vfio_info(info);
>          break;
>      case MIGRATION_STATUS_COLO:
> -- 
> 2.53.0
> 
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/
Re: [PATCH 13/14] migration/qapi: Introduce system-wise "remaining" reports
Posted by Juraj Marcin 1 day, 23 hours ago
On 2026-04-08 12:55, Peter Xu wrote:
> Currently, mgmt can only query for remaining RAM, not system-wise remaining
> data.  It was not a problem before, because for a very long time RAM was
> the only part that matters.
> 
> After VFIO migrations landed upstream, it may not be true anymore
> especially considering that there can be GPU devices that contain GBs of
> device states.
> 
> Add a new "remaining" field in query-migrate results, reflecting
> system-wise remaining data, which will include everything (e.g. VFIO).
> 
> This information will be useful for mgmt to implement generic way of stall
> detection that covers all system resources.  Say, when system remaining
> data does not decrease anymore for a relatively long period of time, then
> it may mean that there is a challenge of converging, so mgmt can act based
> on how this value changes over time (especially if sampled after each
> migration iteration).
> 
> Before this patch, "expected_downtime" almost played this role. For
> example, by monitoring "expected_downtime" at the beginning of each
> iteration can in most cases also reflect the progress of migration
> system-wise.  Said that, "expected_downtime" was always calculated based on
> a bandwidth value that can fluctuate a lot if avail-switchover-bandwidth is
> not used. This new "remaining" field will remove that part of uncertainty
> for mgmt.
> 
> With the new field, HMP "info migrate" now reports this:
> 
> (qemu) info migrate
> Status:                 active
> Time (ms):              total=12080, setup=14, exp_down=300
> Remaining (bytes):      1.36 GiB        <------------------- newline
> RAM info:
>   Throughput (Mbps):    840.50
>   Sizes:                pagesize=4 KiB, total=4.02 GiB
>   Transfers:            transferred=1.18 GiB, remain=1.36 GiB
>     Channels:           precopy=1.18 GiB, multifd=0 B, postcopy=0 B
>     Page Types:         normal=307923, zero=388148
>   Page Rates (pps):     transfer=25660
>   Others:               dirty_syncs=1
> 
> It should be the same value as RAM's remaining report when VFIO is not
> involved, and it should report more than that when VFIO is involved.
> 
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Dr. David Alan Gilbert <dave@treblig.org>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qapi/migration.json            |  4 ++++
>  migration/migration-hmp-cmds.c |  5 +++++
>  migration/migration.c          | 11 +++++++++++
>  3 files changed, 20 insertions(+)
> 

Reviewed-by: Juraj Marcin <jmarcin@redhat.com>