[PATCH v2] migration: Fix qmp_query_migrate mbps value

Fabiano Rosas posted 1 patch 8 months, 2 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20240226143335.14282-1-farosas@suse.de
Maintainers: Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>
migration/migration.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
[PATCH v2] migration: Fix qmp_query_migrate mbps value
Posted by Fabiano Rosas 8 months, 2 weeks ago
The QMP command query-migrate might see incorrect throughput numbers
if it runs after we've set the migration completion status but before
migration_calculate_complete() has updated s->total_time and s->mbps.

The migration status would show COMPLETED, but the throughput value
would be the one from the last iteration and not the one from the
whole migration. This will usually be a larger value due to the time
period being smaller (one iteration).
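
As a rough illustration of the difference, the sketch below applies the
same formula the patch uses (bytes * 8 / milliseconds / 1000) to a
single iteration and to a whole migration. The helper name mbps_from()
and every byte count and duration mentioned here are made up for the
example; they do not come from a real run.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of QEMU. It mirrors the arithmetic in
 * the patch: bits divided by milliseconds gives kbit/s, and dividing
 * by 1000 again gives Mbit/s.
 *
 * Assumption for this sketch only: a zero duration returns 0.0. The
 * patch instead leaves s->mbps untouched in that case.
 */
static double mbps_from(uint64_t bytes, int64_t transfer_time_ms)
{
    assert(transfer_time_ms >= 0);
    if (transfer_time_ms == 0) {
        return 0.0;
    }
    return ((double) bytes * 8.0) / transfer_time_ms / 1000;
}
```

With assumed inputs of 125000000 bytes in 100 ms for the last iteration
and 10000000000 bytes in 20000 ms for the whole migration, mbps_from()
yields 10000 and 4000 respectively, so a query that races with
completion would report the higher per-iteration figure.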

Move migration_calculate_complete() earlier so that the status
MIGRATION_STATUS_COMPLETED is only emitted after the final counters
update. Keep everything under the BQL so the QMP thread sees the
updates as atomic.
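
The ordering argument can be sketched outside of QEMU as follows. This
is only a minimal model under stated assumptions: a pthread mutex
stands in for the BQL, and the demo_* types and functions are
hypothetical names invented for the example, not QEMU APIs. The writer
updates the counters and the status inside one critical section, and
the reader takes the same lock, so it can never observe the completed
status together with stale counters.

```c
#include <pthread.h>
#include <stdint.h>

enum demo_status { DEMO_ACTIVE, DEMO_COMPLETED };

/* Hypothetical stand-in for the relevant MigrationState fields. */
struct demo_migration {
    pthread_mutex_t lock;      /* plays the role of the BQL */
    enum demo_status state;
    int64_t start_time_ms;
    int64_t setup_time_ms;
    int64_t total_time_ms;
    double mbps;
};

/* Snapshot handed back to the caller of the query. */
struct demo_info {
    enum demo_status state;
    int64_t total_time_ms;
    double mbps;
};

/* Writer side: final counters first, status last, all under the lock. */
static void demo_completion_end(struct demo_migration *m, uint64_t bytes,
                                int64_t end_time_ms)
{
    int64_t transfer_time;

    pthread_mutex_lock(&m->lock);
    m->total_time_ms = end_time_ms - m->start_time_ms;
    transfer_time = m->total_time_ms - m->setup_time_ms;
    if (transfer_time) {
        m->mbps = ((double) bytes * 8.0) / transfer_time / 1000;
    }
    m->state = DEMO_COMPLETED;
    pthread_mutex_unlock(&m->lock);
}

/* Reader side: copy everything out while holding the same lock. */
static struct demo_info demo_query(struct demo_migration *m)
{
    struct demo_info info;

    pthread_mutex_lock(&m->lock);
    info.state = m->state;
    info.total_time_ms = m->total_time_ms;
    info.mbps = m->mbps;
    pthread_mutex_unlock(&m->lock);
    return info;
}
```

In this model a reader either sees DEMO_ACTIVE with the per-iteration
value or DEMO_COMPLETED with the whole-migration value, never a mix of
the two.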

Rename migration_calculate_complete to migration_completion_end to
reflect its new purpose of also updating s->state.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
v2:
- improved comments;
- took the suggestion of creating a new function but used 'end'
  instead of 'finalize' to avoid possible confusion with QOM.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1191024660
---
 migration/migration.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index ab21de2cad..7b0e528d01 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -102,6 +102,7 @@ static int migration_maybe_pause(MigrationState *s,
                                  int new_state);
 static void migrate_fd_cancel(MigrationState *s);
 static bool close_return_path_on_source(MigrationState *s);
+static void migration_completion_end(MigrationState *s);
 
 static void migration_downtime_start(MigrationState *s)
 {
@@ -2746,8 +2747,7 @@ static void migration_completion(MigrationState *s)
         migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
                           MIGRATION_STATUS_COLO);
     } else {
-        migrate_set_state(&s->state, current_active_state,
-                          MIGRATION_STATUS_COMPLETED);
+        migration_completion_end(s);
     }
 
     return;
@@ -2784,8 +2784,7 @@ static void bg_migration_completion(MigrationState *s)
         goto fail;
     }
 
-    migrate_set_state(&s->state, current_active_state,
-                      MIGRATION_STATUS_COMPLETED);
+    migration_completion_end(s);
     return;
 
 fail:
@@ -2987,18 +2986,28 @@ static MigThrError migration_detect_error(MigrationState *s)
     }
 }
 
-static void migration_calculate_complete(MigrationState *s)
+static void migration_completion_end(MigrationState *s)
 {
     uint64_t bytes = migration_transferred_bytes();
     int64_t end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     int64_t transfer_time;
 
+    /*
+     * Take the BQL here so that query-migrate on the QMP thread sees:
+     * - atomic update of s->total_time and s->mbps;
+     * - correct ordering of s->mbps update vs. s->state;
+     */
+    bql_lock();
     migration_downtime_end(s);
     s->total_time = end_time - s->start_time;
     transfer_time = s->total_time - s->setup_time;
     if (transfer_time) {
         s->mbps = ((double) bytes * 8.0) / transfer_time / 1000;
     }
+
+    migrate_set_state(&s->state, s->state,
+                      MIGRATION_STATUS_COMPLETED);
+    bql_unlock();
 }
 
 static void update_iteration_initial_status(MigrationState *s)
@@ -3145,7 +3154,6 @@ static void migration_iteration_finish(MigrationState *s)
     bql_lock();
     switch (s->state) {
     case MIGRATION_STATUS_COMPLETED:
-        migration_calculate_complete(s);
         runstate_set(RUN_STATE_POSTMIGRATE);
         break;
     case MIGRATION_STATUS_COLO:
@@ -3189,9 +3197,6 @@ static void bg_migration_iteration_finish(MigrationState *s)
     bql_lock();
     switch (s->state) {
     case MIGRATION_STATUS_COMPLETED:
-        migration_calculate_complete(s);
-        break;
-
     case MIGRATION_STATUS_ACTIVE:
     case MIGRATION_STATUS_FAILED:
     case MIGRATION_STATUS_CANCELLED:
-- 
2.35.3
Re: [PATCH v2] migration: Fix qmp_query_migrate mbps value
Posted by Peter Xu 8 months, 2 weeks ago
On Mon, Feb 26, 2024 at 11:33:35AM -0300, Fabiano Rosas wrote:
> The QMP command query-migrate might see incorrect throughput numbers
> if it runs after we've set the migration completion status but before
> migration_calculate_complete() has updated s->total_time and s->mbps.
> 
> The migration status would show COMPLETED, but the throughput value
> would be the one from the last iteration and not the one from the
> whole migration. This will usually be a larger value due to the time
> period being smaller (one iteration).
> 
> Move migration_calculate_complete() earlier so that the status
> MIGRATION_STATUS_COMPLETED is only emitted after the final counters
> update. Keep everything under the BQL so the QMP thread sees the
> updates as atomic.
> 
> Rename migration_calculate_complete to migration_completion_end to
> reflect its new purpose of also updating s->state.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

queued, thanks.

-- 
Peter Xu