[v2] migration/postcopy: Blocktime tracking overhaul

[PATCH v2 08/13] migration/postcopy: Report fault latencies in blocktime

Posted by Peter Xu 8 months ago

Blocktime so far only cares about the time one vcpu (or the whole system)
got blocked.  It would be also be helpful if it can also report the latency
of page requests, which could be very sensitive during postcopy.

Blocktime itself is sometimes not very important, especially when one
thinks about KVM async PF support, which means vCPUs are literally almost
not blocked at all because the guest OS is smart enough to switch to
another task when a remote fault is needed.

However, latency is still sensitive and important because even if the guest
vCPU is running on threads that do not need a remote fault, the workload
that accesses some missing page is still affected.

Add two entries to the report, showing how long it takes to resolve a
remote fault.  Mention in the QAPI doc that this is not the real average
fault latency, but only the ones that was requested for a remote fault.

Unwrap get_vcpu_blocktime_list() so we don't need to walk the list twice,
meanwhile add the entry checks in qtests for all postcopy tests.

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dave@treblig.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 qapi/migration.json                   | 13 +++++
 migration/migration-hmp-cmds.c        | 68 ++++++++++++++++++---------
 migration/postcopy-ram.c              | 48 +++++++++++++------
 tests/qtest/migration/migration-qmp.c |  3 ++
 4 files changed, 96 insertions(+), 36 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 4963f6ca12..e95b7402cb 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -236,6 +236,17 @@
 #     This is only present when the postcopy-blocktime migration
 #     capability is enabled.  (Since 3.0)
 #
+# @postcopy-latency: average remote page fault latency (in us).  Note that
+#     this doesn't include all faults, but only the ones that require a
+#     remote page request.  So it should be always bigger than the real
+#     average page fault latency. This is only present when the
+#     postcopy-blocktime migration capability is enabled.  (Since 10.1)
+#
+# @postcopy-vcpu-latency: average remote page fault latency per vCPU (in
+#     us).  It has the same definition of @postcopy-latency, but instead
+#     this is the per-vCPU statistics.  This is only present when the
+#     postcopy-blocktime migration capability is enabled.  (Since 10.1)
+#
 # @socket-address: Only used for tcp, to know what the real port is
 #     (Since 4.0)
 #
@@ -275,6 +286,8 @@
            '*blocked-reasons': ['str'],
            '*postcopy-blocktime': 'uint32',
            '*postcopy-vcpu-blocktime': ['uint32'],
+           '*postcopy-latency': 'uint64',
+           '*postcopy-vcpu-latency': ['uint64'],
            '*socket-address': ['SocketAddress'],
            '*dirty-limit-throttle-time-per-round': 'uint64',
            '*dirty-limit-ring-full-time': 'uint64'} }
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 6c36e202a0..600b0f8071 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -52,6 +52,52 @@ static void migration_global_dump(Monitor *mon)
                    ms->clear_bitmap_shift);
 }
 
+static void migration_dump_blocktime(Monitor *mon, MigrationInfo *info)
+{
+    if (info->has_postcopy_blocktime) {
+        monitor_printf(mon, "Postcopy Blocktime (ms): %" PRIu32 "\n",
+                       info->postcopy_blocktime);
+    }
+
+    if (info->has_postcopy_vcpu_blocktime) {
+        uint32List *item = info->postcopy_vcpu_blocktime;
+        const char *sep = "";
+        int count = 0;
+
+        monitor_printf(mon, "Postcopy vCPU Blocktime (ms): \n [");
+
+        while (item) {
+            monitor_printf(mon, "%s%"PRIu32, sep, item->value);
+            item = item->next;
+            /* Each line 10 vcpu results, newline if there's more */
+            sep = ((++count % 10 == 0) && item) ? ",\n  " : ", ";
+        }
+        monitor_printf(mon, "]\n");
+    }
+
+    if (info->has_postcopy_latency) {
+        monitor_printf(mon, "Postcopy Latency (us): %" PRIu64 "\n",
+                       info->postcopy_latency);
+    }
+
+    if (info->has_postcopy_vcpu_latency) {
+        uint64List *item = info->postcopy_vcpu_latency;
+        int count = 0;
+
+        monitor_printf(mon, "Postcopy vCPU Latencies (us): \n [");
+
+        while (item) {
+            monitor_printf(mon, "%"PRIu64", ", item->value);
+            item = item->next;
+            /* Each line 10 vcpu results, newline if there's more */
+            if ((++count % 10 == 0) && item) {
+                monitor_printf(mon, "\n  ");
+            }
+        }
+        monitor_printf(mon, "\b\b]\n");
+    }
+}
+
 void hmp_info_migrate(Monitor *mon, const QDict *qdict)
 {
     bool show_all = qdict_get_try_bool(qdict, "all", false);
@@ -202,27 +248,7 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
                        info->dirty_limit_ring_full_time);
     }
 
-    if (info->has_postcopy_blocktime) {
-        monitor_printf(mon, "Postcopy Blocktime (ms): %" PRIu32 "\n",
-                       info->postcopy_blocktime);
-    }
-
-    if (info->has_postcopy_vcpu_blocktime) {
-        uint32List *item = info->postcopy_vcpu_blocktime;
-        const char *sep = "";
-        int count = 0;
-
-        monitor_printf(mon, "Postcopy vCPU Blocktime (ms): \n [");
-
-        while (item) {
-            monitor_printf(mon, "%s%"PRIu32, sep, item->value);
-            item = item->next;
-            /* Each line 10 vcpu results, newline if there's more */
-            sep = ((++count % 10 == 0) && item) ? ",\n  " : ", ";
-        }
-        monitor_printf(mon, "]\n");
-    }
-
+    migration_dump_blocktime(mon, info);
 out:
     qapi_free_MigrationInfo(info);
 }
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index f5c58a6ca7..98d4c29532 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -166,20 +166,6 @@ static struct PostcopyBlocktimeContext *blocktime_context_new(void)
     return ctx;
 }
 
-static uint32List *get_vcpu_blocktime_list(PostcopyBlocktimeContext *ctx)
-{
-    MachineState *ms = MACHINE(qdev_get_machine());
-    uint32List *list = NULL;
-    int i;
-
-    for (i = ms->smp.cpus - 1; i >= 0; i--) {
-        QAPI_LIST_PREPEND(
-            list, (uint32_t)(ctx->vcpu_blocktime_total[i] / 1000));
-    }
-
-    return list;
-}
-
 /*
  * This function just populates MigrationInfo from postcopy's
  * blocktime context. It will not populate MigrationInfo,
@@ -191,15 +177,47 @@ void fill_destination_postcopy_migration_info(MigrationInfo *info)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
     PostcopyBlocktimeContext *bc = mis->blocktime_ctx;
+    MachineState *ms = MACHINE(qdev_get_machine());
+    uint64_t latency_total = 0, faults = 0;
+    uint32List *list_blocktime = NULL;
+    uint64List *list_latency = NULL;
+    int i;
 
     if (!bc) {
         return;
     }
 
+    for (i = ms->smp.cpus - 1; i >= 0; i--) {
+        uint64_t latency, total, count;
+
+        /* This is in milliseconds */
+        QAPI_LIST_PREPEND(list_blocktime,
+                          (uint32_t)(bc->vcpu_blocktime_total[i] / 1000));
+
+        /* The rest in microseconds */
+        total = bc->vcpu_blocktime_total[i];
+        latency_total += total;
+        count = bc->vcpu_faults_count[i];
+        faults += count;
+
+        if (count) {
+            latency = total / count;
+        } else {
+            /* No fault detected */
+            latency = 0;
+        }
+
+        QAPI_LIST_PREPEND(list_latency, latency);
+    }
+
     info->has_postcopy_blocktime = true;
     info->postcopy_blocktime = (uint32_t)(bc->total_blocktime / 1000);
     info->has_postcopy_vcpu_blocktime = true;
-    info->postcopy_vcpu_blocktime = get_vcpu_blocktime_list(bc);
+    info->postcopy_vcpu_blocktime = list_blocktime;
+    info->has_postcopy_latency = true;
+    info->postcopy_latency = faults ? (latency_total / faults) : 0;
+    info->has_postcopy_vcpu_latency = true;
+    info->postcopy_vcpu_latency = list_latency;
 }
 
 static uint64_t get_postcopy_total_blocktime(void)
diff --git a/tests/qtest/migration/migration-qmp.c b/tests/qtest/migration/migration-qmp.c
index fb59741b2c..1a5ab2d229 100644
--- a/tests/qtest/migration/migration-qmp.c
+++ b/tests/qtest/migration/migration-qmp.c
@@ -358,6 +358,9 @@ void read_blocktime(QTestState *who)
 
     rsp_return = migrate_query_not_failed(who);
     g_assert(qdict_haskey(rsp_return, "postcopy-blocktime"));
+    g_assert(qdict_haskey(rsp_return, "postcopy-vcpu-blocktime"));
+    g_assert(qdict_haskey(rsp_return, "postcopy-latency"));
+    g_assert(qdict_haskey(rsp_return, "postcopy-vcpu-latency"));
     qobject_unref(rsp_return);
 }
 
-- 
2.49.0

Re: [PATCH v2 08/13] migration/postcopy: Report fault latencies in blocktime

Posted by Dr. David Alan Gilbert 8 months ago

* Peter Xu (peterx@redhat.com) wrote:
> Blocktime so far only cares about the time one vcpu (or the whole system)
> got blocked.  It would be also be helpful if it can also report the latency
> of page requests, which could be very sensitive during postcopy.
> 
> Blocktime itself is sometimes not very important, especially when one
> thinks about KVM async PF support, which means vCPUs are literally almost
> not blocked at all because the guest OS is smart enough to switch to
> another task when a remote fault is needed.
> 
> However, latency is still sensitive and important because even if the guest
> vCPU is running on threads that do not need a remote fault, the workload
> that accesses some missing page is still affected.
> 
> Add two entries to the report, showing how long it takes to resolve a
> remote fault.  Mention in the QAPI doc that this is not the real average
> fault latency, but only the ones that was requested for a remote fault.
> 
> Unwrap get_vcpu_blocktime_list() so we don't need to walk the list twice,
> meanwhile add the entry checks in qtests for all postcopy tests.
> 
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Dr. David Alan Gilbert <dave@treblig.org>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qapi/migration.json                   | 13 +++++
>  migration/migration-hmp-cmds.c        | 68 ++++++++++++++++++---------
>  migration/postcopy-ram.c              | 48 +++++++++++++------
>  tests/qtest/migration/migration-qmp.c |  3 ++
>  4 files changed, 96 insertions(+), 36 deletions(-)
> 
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 4963f6ca12..e95b7402cb 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -236,6 +236,17 @@
>  #     This is only present when the postcopy-blocktime migration
>  #     capability is enabled.  (Since 3.0)
>  #
> +# @postcopy-latency: average remote page fault latency (in us).  Note that
> +#     this doesn't include all faults, but only the ones that require a
> +#     remote page request.  So it should be always bigger than the real
> +#     average page fault latency. This is only present when the
> +#     postcopy-blocktime migration capability is enabled.  (Since 10.1)
> +#
> +# @postcopy-vcpu-latency: average remote page fault latency per vCPU (in
> +#     us).  It has the same definition of @postcopy-latency, but instead
> +#     this is the per-vCPU statistics.  This is only present when the
> +#     postcopy-blocktime migration capability is enabled.  (Since 10.1)

I wonder if even 'us' is too big; given you have 64bits to play with, and your
examples show some samples landing in under 10us, perhaps it's best
to at least define the qapi  fields as ns, even if you keep with the same
buckets for now?

Dave

>  # @socket-address: Only used for tcp, to know what the real port is
>  #     (Since 4.0)
>  #
> @@ -275,6 +286,8 @@
>             '*blocked-reasons': ['str'],
>             '*postcopy-blocktime': 'uint32',
>             '*postcopy-vcpu-blocktime': ['uint32'],
> +           '*postcopy-latency': 'uint64',
> +           '*postcopy-vcpu-latency': ['uint64'],
>             '*socket-address': ['SocketAddress'],
>             '*dirty-limit-throttle-time-per-round': 'uint64',
>             '*dirty-limit-ring-full-time': 'uint64'} }
> diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
> index 6c36e202a0..600b0f8071 100644
> --- a/migration/migration-hmp-cmds.c
> +++ b/migration/migration-hmp-cmds.c
> @@ -52,6 +52,52 @@ static void migration_global_dump(Monitor *mon)
>                     ms->clear_bitmap_shift);
>  }
>  
> +static void migration_dump_blocktime(Monitor *mon, MigrationInfo *info)
> +{
> +    if (info->has_postcopy_blocktime) {
> +        monitor_printf(mon, "Postcopy Blocktime (ms): %" PRIu32 "\n",
> +                       info->postcopy_blocktime);
> +    }
> +
> +    if (info->has_postcopy_vcpu_blocktime) {
> +        uint32List *item = info->postcopy_vcpu_blocktime;
> +        const char *sep = "";
> +        int count = 0;
> +
> +        monitor_printf(mon, "Postcopy vCPU Blocktime (ms): \n [");
> +
> +        while (item) {
> +            monitor_printf(mon, "%s%"PRIu32, sep, item->value);
> +            item = item->next;
> +            /* Each line 10 vcpu results, newline if there's more */
> +            sep = ((++count % 10 == 0) && item) ? ",\n  " : ", ";
> +        }
> +        monitor_printf(mon, "]\n");
> +    }
> +
> +    if (info->has_postcopy_latency) {
> +        monitor_printf(mon, "Postcopy Latency (us): %" PRIu64 "\n",
> +                       info->postcopy_latency);
> +    }
> +
> +    if (info->has_postcopy_vcpu_latency) {
> +        uint64List *item = info->postcopy_vcpu_latency;
> +        int count = 0;
> +
> +        monitor_printf(mon, "Postcopy vCPU Latencies (us): \n [");
> +
> +        while (item) {
> +            monitor_printf(mon, "%"PRIu64", ", item->value);
> +            item = item->next;
> +            /* Each line 10 vcpu results, newline if there's more */
> +            if ((++count % 10 == 0) && item) {
> +                monitor_printf(mon, "\n  ");
> +            }
> +        }
> +        monitor_printf(mon, "\b\b]\n");
> +    }
> +}
> +
>  void hmp_info_migrate(Monitor *mon, const QDict *qdict)
>  {
>      bool show_all = qdict_get_try_bool(qdict, "all", false);
> @@ -202,27 +248,7 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
>                         info->dirty_limit_ring_full_time);
>      }
>  
> -    if (info->has_postcopy_blocktime) {
> -        monitor_printf(mon, "Postcopy Blocktime (ms): %" PRIu32 "\n",
> -                       info->postcopy_blocktime);
> -    }
> -
> -    if (info->has_postcopy_vcpu_blocktime) {
> -        uint32List *item = info->postcopy_vcpu_blocktime;
> -        const char *sep = "";
> -        int count = 0;
> -
> -        monitor_printf(mon, "Postcopy vCPU Blocktime (ms): \n [");
> -
> -        while (item) {
> -            monitor_printf(mon, "%s%"PRIu32, sep, item->value);
> -            item = item->next;
> -            /* Each line 10 vcpu results, newline if there's more */
> -            sep = ((++count % 10 == 0) && item) ? ",\n  " : ", ";
> -        }
> -        monitor_printf(mon, "]\n");
> -    }
> -
> +    migration_dump_blocktime(mon, info);
>  out:
>      qapi_free_MigrationInfo(info);
>  }
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index f5c58a6ca7..98d4c29532 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -166,20 +166,6 @@ static struct PostcopyBlocktimeContext *blocktime_context_new(void)
>      return ctx;
>  }
>  
> -static uint32List *get_vcpu_blocktime_list(PostcopyBlocktimeContext *ctx)
> -{
> -    MachineState *ms = MACHINE(qdev_get_machine());
> -    uint32List *list = NULL;
> -    int i;
> -
> -    for (i = ms->smp.cpus - 1; i >= 0; i--) {
> -        QAPI_LIST_PREPEND(
> -            list, (uint32_t)(ctx->vcpu_blocktime_total[i] / 1000));
> -    }
> -
> -    return list;
> -}
> -
>  /*
>   * This function just populates MigrationInfo from postcopy's
>   * blocktime context. It will not populate MigrationInfo,
> @@ -191,15 +177,47 @@ void fill_destination_postcopy_migration_info(MigrationInfo *info)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
>      PostcopyBlocktimeContext *bc = mis->blocktime_ctx;
> +    MachineState *ms = MACHINE(qdev_get_machine());
> +    uint64_t latency_total = 0, faults = 0;
> +    uint32List *list_blocktime = NULL;
> +    uint64List *list_latency = NULL;
> +    int i;
>  
>      if (!bc) {
>          return;
>      }
>  
> +    for (i = ms->smp.cpus - 1; i >= 0; i--) {
> +        uint64_t latency, total, count;
> +
> +        /* This is in milliseconds */
> +        QAPI_LIST_PREPEND(list_blocktime,
> +                          (uint32_t)(bc->vcpu_blocktime_total[i] / 1000));
> +
> +        /* The rest in microseconds */
> +        total = bc->vcpu_blocktime_total[i];
> +        latency_total += total;
> +        count = bc->vcpu_faults_count[i];
> +        faults += count;
> +
> +        if (count) {
> +            latency = total / count;
> +        } else {
> +            /* No fault detected */
> +            latency = 0;
> +        }
> +
> +        QAPI_LIST_PREPEND(list_latency, latency);
> +    }
> +
>      info->has_postcopy_blocktime = true;
>      info->postcopy_blocktime = (uint32_t)(bc->total_blocktime / 1000);
>      info->has_postcopy_vcpu_blocktime = true;
> -    info->postcopy_vcpu_blocktime = get_vcpu_blocktime_list(bc);
> +    info->postcopy_vcpu_blocktime = list_blocktime;
> +    info->has_postcopy_latency = true;
> +    info->postcopy_latency = faults ? (latency_total / faults) : 0;
> +    info->has_postcopy_vcpu_latency = true;
> +    info->postcopy_vcpu_latency = list_latency;
>  }
>  
>  static uint64_t get_postcopy_total_blocktime(void)
> diff --git a/tests/qtest/migration/migration-qmp.c b/tests/qtest/migration/migration-qmp.c
> index fb59741b2c..1a5ab2d229 100644
> --- a/tests/qtest/migration/migration-qmp.c
> +++ b/tests/qtest/migration/migration-qmp.c
> @@ -358,6 +358,9 @@ void read_blocktime(QTestState *who)
>  
>      rsp_return = migrate_query_not_failed(who);
>      g_assert(qdict_haskey(rsp_return, "postcopy-blocktime"));
> +    g_assert(qdict_haskey(rsp_return, "postcopy-vcpu-blocktime"));
> +    g_assert(qdict_haskey(rsp_return, "postcopy-latency"));
> +    g_assert(qdict_haskey(rsp_return, "postcopy-vcpu-latency"));
>      qobject_unref(rsp_return);
>  }
>  
> -- 
> 2.49.0
> 
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/

Re: [PATCH v2 08/13] migration/postcopy: Report fault latencies in blocktime

Posted by Peter Xu 8 months ago

On Tue, Jun 10, 2025 at 12:08:23AM +0000, Dr. David Alan Gilbert wrote:
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 4963f6ca12..e95b7402cb 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -236,6 +236,17 @@
> >  #     This is only present when the postcopy-blocktime migration
> >  #     capability is enabled.  (Since 3.0)
> >  #
> > +# @postcopy-latency: average remote page fault latency (in us).  Note that
> > +#     this doesn't include all faults, but only the ones that require a
> > +#     remote page request.  So it should be always bigger than the real
> > +#     average page fault latency. This is only present when the
> > +#     postcopy-blocktime migration capability is enabled.  (Since 10.1)
> > +#
> > +# @postcopy-vcpu-latency: average remote page fault latency per vCPU (in
> > +#     us).  It has the same definition of @postcopy-latency, but instead
> > +#     this is the per-vCPU statistics.  This is only present when the
> > +#     postcopy-blocktime migration capability is enabled.  (Since 10.1)
> 
> I wonder if even 'us' is too big; given you have 64bits to play with, and your
> examples show some samples landing in under 10us, perhaps it's best
> to at least define the qapi  fields as ns, even if you keep with the same
> buckets for now?

The few <10us ones should pretty much be outliers, I'd expect it happened
because some faulted pages got lucky to be migrated (in the background
stream rather than the preempt stream) right after sending the request.

But it's still a fair point, especially if there's nothing to lose to
switch to nanoseconds here when we have 64bits fields.. I also did a quick
check online, looks like RDMA over 100Gbps NIC may actually do a fast
round-robin transaction within a few microseconds indeed at least with zero
loads..

Let me do the switch in v3.

While at it, when thinking of possible future unit/format changes in the
report, maybe I should also mark all of these fields experimental from the
start? So we don't necessarily need to maintain the ABI - the expectation
is even if a mgmt would like to fetch those they should only fetch and dump
it into log so that human can read later only for debugging purposes.

-- 
Peter Xu

Re: [PATCH v2 08/13] migration/postcopy: Report fault latencies in blocktime

Posted by Dr. David Alan Gilbert 8 months ago

* Peter Xu (peterx@redhat.com) wrote:
> On Tue, Jun 10, 2025 at 12:08:23AM +0000, Dr. David Alan Gilbert wrote:
> > > diff --git a/qapi/migration.json b/qapi/migration.json
> > > index 4963f6ca12..e95b7402cb 100644
> > > --- a/qapi/migration.json
> > > +++ b/qapi/migration.json
> > > @@ -236,6 +236,17 @@
> > >  #     This is only present when the postcopy-blocktime migration
> > >  #     capability is enabled.  (Since 3.0)
> > >  #
> > > +# @postcopy-latency: average remote page fault latency (in us).  Note that
> > > +#     this doesn't include all faults, but only the ones that require a
> > > +#     remote page request.  So it should be always bigger than the real
> > > +#     average page fault latency. This is only present when the
> > > +#     postcopy-blocktime migration capability is enabled.  (Since 10.1)
> > > +#
> > > +# @postcopy-vcpu-latency: average remote page fault latency per vCPU (in
> > > +#     us).  It has the same definition of @postcopy-latency, but instead
> > > +#     this is the per-vCPU statistics.  This is only present when the
> > > +#     postcopy-blocktime migration capability is enabled.  (Since 10.1)
> > 
> > I wonder if even 'us' is too big; given you have 64bits to play with, and your
> > examples show some samples landing in under 10us, perhaps it's best
> > to at least define the qapi  fields as ns, even if you keep with the same
> > buckets for now?
> 
> The few <10us ones should pretty much be outliers, I'd expect it happened
> because some faulted pages got lucky to be migrated (in the background
> stream rather than the preempt stream) right after sending the request.
> 
> But it's still a fair point, especially if there's nothing to lose to
> switch to nanoseconds here when we have 64bits fields.. I also did a quick
> check online, looks like RDMA over 100Gbps NIC may actually do a fast
> round-robin transaction within a few microseconds indeed at least with zero
> loads..
> 
> Let me do the switch in v3.
> 
> While at it, when thinking of possible future unit/format changes in the
> report, maybe I should also mark all of these fields experimental from the
> start? So we don't necessarily need to maintain the ABI - the expectation
> is even if a mgmt would like to fetch those they should only fetch and dump
> it into log so that human can read later only for debugging purposes.

Yeh I think that's OK, although perhaps another way would be to add
a field indicating the time of the first bucket; i.e. you could specify
that all the values are in ns, but have first-bucket=1000 to be exactly
the same as you have it now.

Dave

> -- 
> Peter Xu
> 
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/

Re: [PATCH v2 08/13] migration/postcopy: Report fault latencies in blocktime

Posted by Peter Xu 8 months ago

On Tue, Jun 10, 2025 at 01:53:40PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > On Tue, Jun 10, 2025 at 12:08:23AM +0000, Dr. David Alan Gilbert wrote:
> > > > diff --git a/qapi/migration.json b/qapi/migration.json
> > > > index 4963f6ca12..e95b7402cb 100644
> > > > --- a/qapi/migration.json
> > > > +++ b/qapi/migration.json
> > > > @@ -236,6 +236,17 @@
> > > >  #     This is only present when the postcopy-blocktime migration
> > > >  #     capability is enabled.  (Since 3.0)
> > > >  #
> > > > +# @postcopy-latency: average remote page fault latency (in us).  Note that
> > > > +#     this doesn't include all faults, but only the ones that require a
> > > > +#     remote page request.  So it should be always bigger than the real
> > > > +#     average page fault latency. This is only present when the
> > > > +#     postcopy-blocktime migration capability is enabled.  (Since 10.1)
> > > > +#
> > > > +# @postcopy-vcpu-latency: average remote page fault latency per vCPU (in
> > > > +#     us).  It has the same definition of @postcopy-latency, but instead
> > > > +#     this is the per-vCPU statistics.  This is only present when the
> > > > +#     postcopy-blocktime migration capability is enabled.  (Since 10.1)
> > > 
> > > I wonder if even 'us' is too big; given you have 64bits to play with, and your
> > > examples show some samples landing in under 10us, perhaps it's best
> > > to at least define the qapi  fields as ns, even if you keep with the same
> > > buckets for now?
> > 
> > The few <10us ones should pretty much be outliers, I'd expect it happened
> > because some faulted pages got lucky to be migrated (in the background
> > stream rather than the preempt stream) right after sending the request.
> > 
> > But it's still a fair point, especially if there's nothing to lose to
> > switch to nanoseconds here when we have 64bits fields.. I also did a quick
> > check online, looks like RDMA over 100Gbps NIC may actually do a fast
> > round-robin transaction within a few microseconds indeed at least with zero
> > loads..
> > 
> > Let me do the switch in v3.
> > 
> > While at it, when thinking of possible future unit/format changes in the
> > report, maybe I should also mark all of these fields experimental from the
> > start? So we don't necessarily need to maintain the ABI - the expectation
> > is even if a mgmt would like to fetch those they should only fetch and dump
> > it into log so that human can read later only for debugging purposes.
> 
> Yeh I think that's OK, although perhaps another way would be to add
> a field indicating the time of the first bucket; i.e. you could specify
> that all the values are in ns, but have first-bucket=1000 to be exactly
> the same as you have it now.

Right, that should also be better than the current.  Marking experimental
could be slightly more flexible though, e.g. once we want to report in
other formats other than power-of-two buckets.  I'll think about it.

Thanks for the input!

-- 
Peter Xu

Re: [PATCH v2 08/13] migration/postcopy: Report fault latencies in blocktime

Posted by Peter Xu 8 months ago

On Mon, Jun 09, 2025 at 03:12:54PM -0400, Peter Xu wrote:
> +static void migration_dump_blocktime(Monitor *mon, MigrationInfo *info)
> +{
> +    if (info->has_postcopy_blocktime) {
> +        monitor_printf(mon, "Postcopy Blocktime (ms): %" PRIu32 "\n",
> +                       info->postcopy_blocktime);
> +    }
> +
> +    if (info->has_postcopy_vcpu_blocktime) {
> +        uint32List *item = info->postcopy_vcpu_blocktime;
> +        const char *sep = "";
> +        int count = 0;
> +
> +        monitor_printf(mon, "Postcopy vCPU Blocktime (ms): \n [");
> +
> +        while (item) {
> +            monitor_printf(mon, "%s%"PRIu32, sep, item->value);
> +            item = item->next;
> +            /* Each line 10 vcpu results, newline if there's more */
> +            sep = ((++count % 10 == 0) && item) ? ",\n  " : ", ";
> +        }
> +        monitor_printf(mon, "]\n");
> +    }
> +
> +    if (info->has_postcopy_latency) {
> +        monitor_printf(mon, "Postcopy Latency (us): %" PRIu64 "\n",
> +                       info->postcopy_latency);
> +    }
> +
> +    if (info->has_postcopy_vcpu_latency) {
> +        uint64List *item = info->postcopy_vcpu_latency;
> +        int count = 0;
> +
> +        monitor_printf(mon, "Postcopy vCPU Latencies (us): \n [");
> +
> +        while (item) {
> +            monitor_printf(mon, "%"PRIu64", ", item->value);
> +            item = item->next;
> +            /* Each line 10 vcpu results, newline if there's more */
> +            if ((++count % 10 == 0) && item) {
> +                monitor_printf(mon, "\n  ");
> +            }
> +        }
> +        monitor_printf(mon, "\b\b]\n");

In the review of the other series I posted, Markus pointed out we should
avoid using "\b" and suggested a better way.  I fixed it there, but I
overlooked I have this similar use case in this series.  I'll fix this too
when posting v3 with similar approach.

> +    }
> +}

-- 
Peter Xu

Re: [PATCH v2 08/13] migration/postcopy: Report fault latencies in blocktime

Posted by Peter Xu 8 months ago

On Mon, Jun 09, 2025 at 06:05:16PM -0400, Peter Xu wrote:
> On Mon, Jun 09, 2025 at 03:12:54PM -0400, Peter Xu wrote:
> > +static void migration_dump_blocktime(Monitor *mon, MigrationInfo *info)
> > +{
> > +    if (info->has_postcopy_blocktime) {
> > +        monitor_printf(mon, "Postcopy Blocktime (ms): %" PRIu32 "\n",
> > +                       info->postcopy_blocktime);
> > +    }
> > +
> > +    if (info->has_postcopy_vcpu_blocktime) {
> > +        uint32List *item = info->postcopy_vcpu_blocktime;
> > +        const char *sep = "";
> > +        int count = 0;
> > +
> > +        monitor_printf(mon, "Postcopy vCPU Blocktime (ms): \n [");
> > +
> > +        while (item) {
> > +            monitor_printf(mon, "%s%"PRIu32, sep, item->value);
> > +            item = item->next;
> > +            /* Each line 10 vcpu results, newline if there's more */
> > +            sep = ((++count % 10 == 0) && item) ? ",\n  " : ", ";
> > +        }
> > +        monitor_printf(mon, "]\n");
> > +    }
> > +
> > +    if (info->has_postcopy_latency) {
> > +        monitor_printf(mon, "Postcopy Latency (us): %" PRIu64 "\n",
> > +                       info->postcopy_latency);
> > +    }
> > +
> > +    if (info->has_postcopy_vcpu_latency) {
> > +        uint64List *item = info->postcopy_vcpu_latency;
> > +        int count = 0;
> > +
> > +        monitor_printf(mon, "Postcopy vCPU Latencies (us): \n [");
> > +
> > +        while (item) {
> > +            monitor_printf(mon, "%"PRIu64", ", item->value);
> > +            item = item->next;
> > +            /* Each line 10 vcpu results, newline if there's more */
> > +            if ((++count % 10 == 0) && item) {
> > +                monitor_printf(mon, "\n  ");
> > +            }
> > +        }
> > +        monitor_printf(mon, "\b\b]\n");
> 
> In the review of the other series I posted, Markus pointed out we should
> avoid using "\b" and suggested a better way.  I fixed it there, but I
> overlooked I have this similar use case in this series.  I'll fix this too
> when posting v3 with similar approach.

Just in case I won't need to repost again to pollute everyone's inbox..
this chunk now look like this:

+    if (info->has_postcopy_vcpu_latency) {
+        uint64List *item = info->postcopy_vcpu_latency;
+        const char *sep = "";
+        int count = 0;
+
+        monitor_printf(mon, "Postcopy vCPU Latencies (us): \n [");
+
+        while (item) {
+            monitor_printf(mon, "%s%"PRIu64, sep, item->value);
+            item = item->next;
+            /* Each line 10 vcpu results, newline if there's more */
+            sep = ((++count % 10 == 0) && item) ? ",\n  " : ", ";
+        }
+        monitor_printf(mon, "]\n");
+    }

I'm making bold to keep the R-b for Fabiano as of now unless there's
objection..

-- 
Peter Xu