[PATCH] migration: introduce MIGRATION_STATUS_FAILING

Prasad Pandit posted 1 patch 1 month, 2 weeks ago
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20251222114822.327623-1-ppandit@redhat.com
Maintainers: Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>, Laurent Vivier <lvivier@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
[PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Prasad Pandit 1 month, 2 weeks ago
From: Prasad Pandit <pjp@fedoraproject.org>

When the migration connection is broken, the QEMU and libvirtd(8)
processes on the source side receive a TCP connection reset
notification. QEMU sets the migration status to FAILED and
proceeds to migration_cleanup(). Meanwhile, libvirtd(8) sends
a QMP command that is handled by qmp_migrate_set_capabilities().

The migration_cleanup() and qmp_migrate_set_capabilities()
calls race with each other. When the latter is invoked first,
the migration is no longer running (FAILED), so the migration
capabilities are reset to false, and migration_cleanup() then
crashes the QEMU process with an assertion failure.

  Stack trace of thread 255014:
   #0  __pthread_kill_implementation (libc.so.6 + 0x822e8)
   #1  raise (libc.so.6 + 0x3a73c)
   #2  abort (libc.so.6 + 0x27034)
   #3  __assert_fail_base (libc.so.6 + 0x34090)
   #4  __assert_fail (libc.so.6 + 0x34100)
   #5  yank_unregister_instance (qemu-kvm + 0x8b8280)
   #6  migrate_fd_cleanup (qemu-kvm + 0x3c6308)
   #7  migration_bh_dispatch_bh (qemu-kvm + 0x3c2144)
   #8  aio_bh_poll (qemu-kvm + 0x8ba358)
   #9  aio_dispatch (qemu-kvm + 0x8a0ab4)
   #10 aio_ctx_dispatch (qemu-kvm + 0x8bb180)

Introduce a new migration status FAILING and use it as an
interim status when an error occurs. Once migration_cleanup()
is done, it sets the migration status to FAILED. This helps
avoid the above race condition and the ensuing failure.

The interim FAILING status is set wherever execution moves
towards migration_cleanup(), either directly or via:

  migration_iteration_finish  bg_migration_iteration_finish
  -> migration_bh_schedule    -> migration_bh_schedule
   -> migration_cleanup_bh     -> migration_cleanup_bh
    -> migration_cleanup        -> migration_cleanup
     -> FAILED                   -> FAILED

The migration status finally moves to FAILED and an
appropriate error is reported to the user.
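
A minimal, self-contained sketch of the idea (toy code only; the
toy_* names are made up and this is not QEMU code). While the status
is the interim FAILING, a concurrent capability command still sees
the migration as running and refuses to touch the capabilities, so
cleanup can rely on them; only after cleanup does the status become
FAILED:

  #include <stdbool.h>
  #include <stdio.h>

  enum toy_status { TOY_ACTIVE, TOY_FAILING, TOY_FAILED };
  static enum toy_status status = TOY_ACTIVE;

  static bool toy_migration_is_running(void)
  {
      /* FAILING still counts as "running", like the patched
       * migration_is_running() does. */
      return status == TOY_ACTIVE || status == TOY_FAILING;
  }

  static void toy_set_capabilities(void)
  {
      if (toy_migration_is_running()) {
          printf("refused: migration in progress\n");
          return;
      }
      printf("capabilities reset\n");  /* what used to race with cleanup */
  }

  static void toy_cleanup(void)
  {
      printf("cleanup done\n");
      status = TOY_FAILED;     /* FAILING -> FAILED only after cleanup */
  }

  int main(void)
  {
      status = TOY_FAILING;    /* error detected: interim status */
      toy_set_capabilities();  /* refused while cleanup is pending */
      toy_cleanup();
      toy_set_capabilities();  /* now allowed */
      return 0;
  }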

Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
 migration/migration.c                 | 33 +++++++++++++++------------
 migration/multifd.c                   |  4 ++--
 qapi/migration.json                   |  8 ++++---
 tests/qtest/migration/migration-qmp.c |  3 ++-
 tests/qtest/migration/precopy-tests.c |  5 ++--
 5 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index b316ee01ab..5c32bc8fe5 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1216,6 +1216,7 @@ bool migration_is_running(void)
     case MIGRATION_STATUS_DEVICE:
     case MIGRATION_STATUS_WAIT_UNPLUG:
     case MIGRATION_STATUS_CANCELLING:
+    case MIGRATION_STATUS_FAILING:
     case MIGRATION_STATUS_COLO:
         return true;
     default:
@@ -1351,6 +1352,7 @@ static void fill_source_migration_info(MigrationInfo *info)
         break;
     case MIGRATION_STATUS_ACTIVE:
     case MIGRATION_STATUS_CANCELLING:
+    case MIGRATION_STATUS_FAILING:
     case MIGRATION_STATUS_POSTCOPY_DEVICE:
     case MIGRATION_STATUS_POSTCOPY_ACTIVE:
     case MIGRATION_STATUS_PRE_SWITCHOVER:
@@ -1409,6 +1411,7 @@ static void fill_destination_migration_info(MigrationInfo *info)
     case MIGRATION_STATUS_POSTCOPY_ACTIVE:
     case MIGRATION_STATUS_POSTCOPY_PAUSED:
     case MIGRATION_STATUS_POSTCOPY_RECOVER:
+    case MIGRATION_STATUS_FAILING:
     case MIGRATION_STATUS_FAILED:
     case MIGRATION_STATUS_COLO:
         info->has_status = true;
@@ -1531,6 +1534,9 @@ static void migration_cleanup(MigrationState *s)
     if (s->state == MIGRATION_STATUS_CANCELLING) {
         migrate_set_state(&s->state, MIGRATION_STATUS_CANCELLING,
                           MIGRATION_STATUS_CANCELLED);
+    } else if (s->state == MIGRATION_STATUS_FAILING) {
+        migrate_set_state(&s->state, MIGRATION_STATUS_FAILING,
+                          MIGRATION_STATUS_FAILED);
     }
 
     if (s->error) {
@@ -1584,7 +1590,7 @@ static void migration_connect_set_error(MigrationState *s, const Error *error)
 
     switch (current) {
     case MIGRATION_STATUS_SETUP:
-        next = MIGRATION_STATUS_FAILED;
+        next = MIGRATION_STATUS_FAILING;
         break;
     case MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP:
         /* Never fail a postcopy migration; switch back to PAUSED instead */
@@ -1728,6 +1734,7 @@ bool migration_has_failed(MigrationState *s)
 {
     return (s->state == MIGRATION_STATUS_CANCELLING ||
             s->state == MIGRATION_STATUS_CANCELLED ||
+            s->state == MIGRATION_STATUS_FAILING ||
             s->state == MIGRATION_STATUS_FAILED);
 }
 
@@ -2744,7 +2751,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
         if (postcopy_preempt_establish_channel(ms)) {
             if (ms->state != MIGRATION_STATUS_CANCELLING) {
                 migrate_set_state(&ms->state, ms->state,
-                                  MIGRATION_STATUS_FAILED);
+                                  MIGRATION_STATUS_FAILING);
             }
             error_setg(errp, "%s: Failed to establish preempt channel",
                        __func__);
@@ -2907,7 +2914,7 @@ fail_closefb:
     qemu_fclose(fb);
 fail:
     if (ms->state != MIGRATION_STATUS_CANCELLING) {
-        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
+        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILING);
     }
     migration_block_activate(NULL);
     migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
@@ -3102,7 +3109,7 @@ fail:
     }
 
     if (s->state != MIGRATION_STATUS_CANCELLING) {
-        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILING);
     }
 }
 
@@ -3139,7 +3146,7 @@ static void bg_migration_completion(MigrationState *s)
 
 fail:
     migrate_set_state(&s->state, current_active_state,
-                      MIGRATION_STATUS_FAILED);
+                      MIGRATION_STATUS_FAILING);
 }
 
 typedef enum MigThrError {
@@ -3341,7 +3348,7 @@ static MigThrError migration_detect_error(MigrationState *s)
          * For precopy (or postcopy with error outside IO, or before dest
          * starts), we fail with no time.
          */
-        migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILED);
+        migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILING);
         trace_migration_thread_file_err();
 
         /* Time to stop the migration, now. */
@@ -3572,7 +3579,7 @@ static void migration_iteration_finish(MigrationState *s)
         migrate_start_colo_process(s);
         s->vm_old_state = RUN_STATE_RUNNING;
         /* Fallthrough */
-    case MIGRATION_STATUS_FAILED:
+    case MIGRATION_STATUS_FAILING:
     case MIGRATION_STATUS_CANCELLED:
     case MIGRATION_STATUS_CANCELLING:
         if (!migration_block_activate(&local_err)) {
@@ -3631,7 +3638,7 @@ static void bg_migration_iteration_finish(MigrationState *s)
     switch (s->state) {
     case MIGRATION_STATUS_COMPLETED:
     case MIGRATION_STATUS_ACTIVE:
-    case MIGRATION_STATUS_FAILED:
+    case MIGRATION_STATUS_FAILING:
     case MIGRATION_STATUS_CANCELLED:
     case MIGRATION_STATUS_CANCELLING:
         break;
@@ -3821,7 +3828,7 @@ static void *migration_thread(void *opaque)
         migrate_set_error(s, local_err);
         error_free(local_err);
         migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
-                          MIGRATION_STATUS_FAILED);
+                          MIGRATION_STATUS_FAILING);
         goto out;
     }
 
@@ -3945,8 +3952,6 @@ static void *bg_migration_thread(void *opaque)
     if (ret) {
         migrate_set_error(s, local_err);
         error_free(local_err);
-        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
-                          MIGRATION_STATUS_FAILED);
         goto fail_setup;
     }
 
@@ -4008,12 +4013,12 @@ static void *bg_migration_thread(void *opaque)
 
 fail:
     if (early_fail) {
-        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
-                MIGRATION_STATUS_FAILED);
         bql_unlock();
     }
 
 fail_setup:
+    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
+                        MIGRATION_STATUS_FAILING);
     bg_migration_iteration_finish(s);
 
     qemu_fclose(fb);
@@ -4128,7 +4133,7 @@ void migration_connect(MigrationState *s, Error *error_in)
 fail:
     migrate_set_error(s, local_err);
     if (s->state != MIGRATION_STATUS_CANCELLING) {
-        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILING);
     }
     error_report_err(local_err);
     migration_cleanup(s);
diff --git a/migration/multifd.c b/migration/multifd.c
index 3203dc98e1..970633474e 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -434,7 +434,7 @@ static void multifd_send_set_error(Error *err)
             s->state == MIGRATION_STATUS_DEVICE ||
             s->state == MIGRATION_STATUS_ACTIVE) {
             migrate_set_state(&s->state, s->state,
-                              MIGRATION_STATUS_FAILED);
+                              MIGRATION_STATUS_FAILING);
         }
     }
 }
@@ -1001,7 +1001,7 @@ bool multifd_send_setup(void)
 
 err:
     migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
-                      MIGRATION_STATUS_FAILED);
+                      MIGRATION_STATUS_FAILING);
     return false;
 }
 
diff --git a/qapi/migration.json b/qapi/migration.json
index cf023bd29d..85f4961781 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -158,7 +158,9 @@
 #
 # @completed: migration is finished.
 #
-# @failed: some error occurred during migration process.
+# @failing: error occurred during migration, clean-up underway.
+#
+# @failed: error occurred during migration, clean-up done.
 #
 # @colo: VM is in the process of fault tolerance, VM can not get into
 #     this state unless colo capability is enabled for migration.
@@ -181,8 +183,8 @@
   'data': [ 'none', 'setup', 'cancelling', 'cancelled',
             'active', 'postcopy-device', 'postcopy-active',
             'postcopy-paused', 'postcopy-recover-setup',
-            'postcopy-recover', 'completed', 'failed', 'colo',
-            'pre-switchover', 'device', 'wait-unplug' ] }
+            'postcopy-recover', 'completed', 'failing', 'failed',
+            'colo', 'pre-switchover', 'device', 'wait-unplug' ] }
 
 ##
 # @VfioStats:
diff --git a/tests/qtest/migration/migration-qmp.c b/tests/qtest/migration/migration-qmp.c
index c803fcee9d..ceb40efd56 100644
--- a/tests/qtest/migration/migration-qmp.c
+++ b/tests/qtest/migration/migration-qmp.c
@@ -241,7 +241,8 @@ void wait_for_migration_fail(QTestState *from, bool allow_active)
     do {
         status = migrate_query_status(from);
         bool result = !strcmp(status, "setup") || !strcmp(status, "failed") ||
-            (allow_active && !strcmp(status, "active"));
+            (allow_active && !strcmp(status, "active")) ||
+            !strcmp(status, "failing");
         if (!result) {
             fprintf(stderr, "%s: unexpected status status=%s allow_active=%d\n",
                     __func__, status, allow_active);
diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
index 57ca623de5..a04442df96 100644
--- a/tests/qtest/migration/precopy-tests.c
+++ b/tests/qtest/migration/precopy-tests.c
@@ -807,7 +807,8 @@ static void test_cancel_src_after_status(void *opaque)
     } else if (g_str_equal(phase, "completed")) {
         test_cancel_src_after_complete(from, to, uri, phase);
 
-    } else if (g_str_equal(phase, "failed")) {
+    } else if (g_str_equal(phase, "failing") ||
+               g_str_equal(phase, "failed")) {
         test_cancel_src_after_failed(from, to, uri, phase);
 
     } else if (g_str_equal(phase, "none")) {
@@ -1316,7 +1317,7 @@ void migration_test_add_precopy(MigrationTestEnv *env)
     }
 
     /* ensure new status don't go unnoticed */
-    assert(MIGRATION_STATUS__MAX == 16);
+    assert(MIGRATION_STATUS__MAX == 17);
 
     for (int i = MIGRATION_STATUS_NONE; i < MIGRATION_STATUS__MAX; i++) {
         switch (i) {
-- 
2.52.0
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Peter Xu 1 month, 2 weeks ago
On Mon, Dec 22, 2025 at 05:18:22PM +0530, Prasad Pandit wrote:
> From: Prasad Pandit <pjp@fedoraproject.org>
> 
> When migration connection is broken, the QEMU and libvirtd(8)
> process on the source side receive TCP connection reset
> notification. QEMU sets the migration status to FAILED and
> proceeds to migration_cleanup(). Meanwhile, Libvirtd(8) sends
> a QMP command to migrate_set_capabilities().
> 
> The migration_cleanup() and qmp_migrate_set_capabilities()
> calls race with each other. When the latter is invoked first,
> since the migration is not running (FAILED), migration
> capabilities are reset to false, so during migration_cleanup()
> the QEMU process crashes with assertion failure.
> 
>   Stack trace of thread 255014:
>    #0  __pthread_kill_implementation (libc.so.6 + 0x822e8)
>    #1  raise (libc.so.6 + 0x3a73c)
>    #2  abort (libc.so.6 + 0x27034)
>    #3  __assert_fail_base (libc.so.6 + 0x34090)
>    #4  __assert_fail (libc.so.6 + 0x34100)
>    #5  yank_unregister_instance (qemu-kvm + 0x8b8280)
>    #6  migrate_fd_cleanup (qemu-kvm + 0x3c6308)
>    #7  migration_bh_dispatch_bh (qemu-kvm + 0x3c2144)
>    #8  aio_bh_poll (qemu-kvm + 0x8ba358)
>    #9  aio_dispatch (qemu-kvm + 0x8a0ab4)
>    #10 aio_ctx_dispatch (qemu-kvm + 0x8bb180)
> 
> Introduce a new migration status FAILING and use it as an
> interim status when an error occurs. Once migration_cleanup()
> is done, it sets the migration status to FAILED. This helps
> to avoid the above race condition and ensuing failure.
> 
> Interim status FAILING is set wherever the execution moves
> towards migration_cleanup() directly OR via:
> 
>   migration_iteration_finish  bg_migration_iteration_finish
>   -> migration_bh_schedule    -> migration_bh_schedule
>    -> migration_cleanup_bh     -> migration_cleanup_bh
>     -> migration_cleanup        -> migration_cleanup
>      -> FAILED                   -> FAILED
> 
> The migration status finally moves to FAILED and reports an
> appropriate error to the user.

I raised a request while I was discussing with you internally; I didn't
see it addressed here, so I will request again:

Would you please list where you decided to switch from FAILED -> FAILING,
and where you decided not, with justifications for each of them?

Let me give a detailed example in this patch, please see below.

> 
> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
> ---
>  migration/migration.c                 | 33 +++++++++++++++------------
>  migration/multifd.c                   |  4 ++--
>  qapi/migration.json                   |  8 ++++---
>  tests/qtest/migration/migration-qmp.c |  3 ++-
>  tests/qtest/migration/precopy-tests.c |  5 ++--
>  5 files changed, 31 insertions(+), 22 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index b316ee01ab..5c32bc8fe5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1216,6 +1216,7 @@ bool migration_is_running(void)
>      case MIGRATION_STATUS_DEVICE:
>      case MIGRATION_STATUS_WAIT_UNPLUG:
>      case MIGRATION_STATUS_CANCELLING:
> +    case MIGRATION_STATUS_FAILING:
>      case MIGRATION_STATUS_COLO:
>          return true;
>      default:
> @@ -1351,6 +1352,7 @@ static void fill_source_migration_info(MigrationInfo *info)
>          break;
>      case MIGRATION_STATUS_ACTIVE:
>      case MIGRATION_STATUS_CANCELLING:
> +    case MIGRATION_STATUS_FAILING:
>      case MIGRATION_STATUS_POSTCOPY_DEVICE:
>      case MIGRATION_STATUS_POSTCOPY_ACTIVE:
>      case MIGRATION_STATUS_PRE_SWITCHOVER:
> @@ -1409,6 +1411,7 @@ static void fill_destination_migration_info(MigrationInfo *info)
>      case MIGRATION_STATUS_POSTCOPY_ACTIVE:
>      case MIGRATION_STATUS_POSTCOPY_PAUSED:
>      case MIGRATION_STATUS_POSTCOPY_RECOVER:
> +    case MIGRATION_STATUS_FAILING:
>      case MIGRATION_STATUS_FAILED:
>      case MIGRATION_STATUS_COLO:
>          info->has_status = true;
> @@ -1531,6 +1534,9 @@ static void migration_cleanup(MigrationState *s)
>      if (s->state == MIGRATION_STATUS_CANCELLING) {
>          migrate_set_state(&s->state, MIGRATION_STATUS_CANCELLING,
>                            MIGRATION_STATUS_CANCELLED);
> +    } else if (s->state == MIGRATION_STATUS_FAILING) {
> +        migrate_set_state(&s->state, MIGRATION_STATUS_FAILING,
> +                          MIGRATION_STATUS_FAILED);
>      }
>  
>      if (s->error) {
> @@ -1584,7 +1590,7 @@ static void migration_connect_set_error(MigrationState *s, const Error *error)
>  
>      switch (current) {
>      case MIGRATION_STATUS_SETUP:
> -        next = MIGRATION_STATUS_FAILED;
> +        next = MIGRATION_STATUS_FAILING;

This is the first real change: we'll switch to FAILING when
migration_connect_set_error() is invoked and the migration has failed.

Please justify why setting FAILING is correct here.

This function is invoked in three callers:

qmp_migrate[2302]              migration_connect_set_error(s, local_err);
qmp_migrate_finish[2347]       migration_connect_set_error(s, local_err);
migration_connect[4047]        migration_connect_set_error(s, error_in);

At least from the initial two callers, I don't see migration_cleanup()
invoked after setting FAILING.  Could this cause migration to get stuck
in the FAILING status forever, without finally moving to FAILED?

>          break;
>      case MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP:
>          /* Never fail a postcopy migration; switch back to PAUSED instead */
> @@ -1728,6 +1734,7 @@ bool migration_has_failed(MigrationState *s)
>  {
>      return (s->state == MIGRATION_STATUS_CANCELLING ||
>              s->state == MIGRATION_STATUS_CANCELLED ||
> +            s->state == MIGRATION_STATUS_FAILING ||
>              s->state == MIGRATION_STATUS_FAILED);
>  }
>  
> @@ -2744,7 +2751,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
>          if (postcopy_preempt_establish_channel(ms)) {
>              if (ms->state != MIGRATION_STATUS_CANCELLING) {
>                  migrate_set_state(&ms->state, ms->state,
> -                                  MIGRATION_STATUS_FAILED);
> +                                  MIGRATION_STATUS_FAILING);
>              }
>              error_setg(errp, "%s: Failed to establish preempt channel",
>                         __func__);
> @@ -2907,7 +2914,7 @@ fail_closefb:
>      qemu_fclose(fb);
>  fail:
>      if (ms->state != MIGRATION_STATUS_CANCELLING) {
> -        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILING);
>      }
>      migration_block_activate(NULL);
>      migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
> @@ -3102,7 +3109,7 @@ fail:
>      }
>  
>      if (s->state != MIGRATION_STATUS_CANCELLING) {
> -        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> +        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILING);
>      }
>  }
>  
> @@ -3139,7 +3146,7 @@ static void bg_migration_completion(MigrationState *s)
>  
>  fail:
>      migrate_set_state(&s->state, current_active_state,
> -                      MIGRATION_STATUS_FAILED);
> +                      MIGRATION_STATUS_FAILING);
>  }
>  
>  typedef enum MigThrError {
> @@ -3341,7 +3348,7 @@ static MigThrError migration_detect_error(MigrationState *s)
>           * For precopy (or postcopy with error outside IO, or before dest
>           * starts), we fail with no time.
>           */
> -        migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILED);
> +        migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILING);
>          trace_migration_thread_file_err();
>  
>          /* Time to stop the migration, now. */
> @@ -3572,7 +3579,7 @@ static void migration_iteration_finish(MigrationState *s)
>          migrate_start_colo_process(s);
>          s->vm_old_state = RUN_STATE_RUNNING;
>          /* Fallthrough */
> -    case MIGRATION_STATUS_FAILED:
> +    case MIGRATION_STATUS_FAILING:
>      case MIGRATION_STATUS_CANCELLED:
>      case MIGRATION_STATUS_CANCELLING:
>          if (!migration_block_activate(&local_err)) {
> @@ -3631,7 +3638,7 @@ static void bg_migration_iteration_finish(MigrationState *s)
>      switch (s->state) {
>      case MIGRATION_STATUS_COMPLETED:
>      case MIGRATION_STATUS_ACTIVE:
> -    case MIGRATION_STATUS_FAILED:
> +    case MIGRATION_STATUS_FAILING:
>      case MIGRATION_STATUS_CANCELLED:
>      case MIGRATION_STATUS_CANCELLING:
>          break;
> @@ -3821,7 +3828,7 @@ static void *migration_thread(void *opaque)
>          migrate_set_error(s, local_err);
>          error_free(local_err);
>          migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                          MIGRATION_STATUS_FAILED);
> +                          MIGRATION_STATUS_FAILING);
>          goto out;
>      }
>  
> @@ -3945,8 +3952,6 @@ static void *bg_migration_thread(void *opaque)
>      if (ret) {
>          migrate_set_error(s, local_err);
>          error_free(local_err);
> -        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                          MIGRATION_STATUS_FAILED);
>          goto fail_setup;
>      }
>  
> @@ -4008,12 +4013,12 @@ static void *bg_migration_thread(void *opaque)
>  
>  fail:
>      if (early_fail) {
> -        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                MIGRATION_STATUS_FAILED);
>          bql_unlock();
>      }
>  
>  fail_setup:
> +    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                        MIGRATION_STATUS_FAILING);
>      bg_migration_iteration_finish(s);
>  
>      qemu_fclose(fb);
> @@ -4128,7 +4133,7 @@ void migration_connect(MigrationState *s, Error *error_in)
>  fail:
>      migrate_set_error(s, local_err);
>      if (s->state != MIGRATION_STATUS_CANCELLING) {
> -        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> +        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILING);
>      }
>      error_report_err(local_err);
>      migration_cleanup(s);
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 3203dc98e1..970633474e 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -434,7 +434,7 @@ static void multifd_send_set_error(Error *err)
>              s->state == MIGRATION_STATUS_DEVICE ||
>              s->state == MIGRATION_STATUS_ACTIVE) {
>              migrate_set_state(&s->state, s->state,
> -                              MIGRATION_STATUS_FAILED);
> +                              MIGRATION_STATUS_FAILING);
>          }
>      }
>  }
> @@ -1001,7 +1001,7 @@ bool multifd_send_setup(void)
>  
>  err:
>      migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> -                      MIGRATION_STATUS_FAILED);
> +                      MIGRATION_STATUS_FAILING);
>      return false;
>  }
>  
> diff --git a/qapi/migration.json b/qapi/migration.json
> index cf023bd29d..85f4961781 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -158,7 +158,9 @@
>  #
>  # @completed: migration is finished.
>  #
> -# @failed: some error occurred during migration process.
> +# @failing: error occurred during migration, clean-up underway.
> +#
> +# @failed: error occurred during migration, clean-up done.
>  #
>  # @colo: VM is in the process of fault tolerance, VM can not get into
>  #     this state unless colo capability is enabled for migration.
> @@ -181,8 +183,8 @@
>    'data': [ 'none', 'setup', 'cancelling', 'cancelled',
>              'active', 'postcopy-device', 'postcopy-active',
>              'postcopy-paused', 'postcopy-recover-setup',
> -            'postcopy-recover', 'completed', 'failed', 'colo',
> -            'pre-switchover', 'device', 'wait-unplug' ] }
> +            'postcopy-recover', 'completed', 'failing', 'failed',
> +            'colo', 'pre-switchover', 'device', 'wait-unplug' ] }
>  
>  ##
>  # @VfioStats:
> diff --git a/tests/qtest/migration/migration-qmp.c b/tests/qtest/migration/migration-qmp.c
> index c803fcee9d..ceb40efd56 100644
> --- a/tests/qtest/migration/migration-qmp.c
> +++ b/tests/qtest/migration/migration-qmp.c
> @@ -241,7 +241,8 @@ void wait_for_migration_fail(QTestState *from, bool allow_active)
>      do {
>          status = migrate_query_status(from);
>          bool result = !strcmp(status, "setup") || !strcmp(status, "failed") ||
> -            (allow_active && !strcmp(status, "active"));
> +            (allow_active && !strcmp(status, "active")) ||
> +            !strcmp(status, "failing");
>          if (!result) {
>              fprintf(stderr, "%s: unexpected status status=%s allow_active=%d\n",
>                      __func__, status, allow_active);
> diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
> index 57ca623de5..a04442df96 100644
> --- a/tests/qtest/migration/precopy-tests.c
> +++ b/tests/qtest/migration/precopy-tests.c
> @@ -807,7 +807,8 @@ static void test_cancel_src_after_status(void *opaque)
>      } else if (g_str_equal(phase, "completed")) {
>          test_cancel_src_after_complete(from, to, uri, phase);
>  
> -    } else if (g_str_equal(phase, "failed")) {
> +    } else if (g_str_equal(phase, "failing") ||
> +               g_str_equal(phase, "failed")) {
>          test_cancel_src_after_failed(from, to, uri, phase);
>  
>      } else if (g_str_equal(phase, "none")) {
> @@ -1316,7 +1317,7 @@ void migration_test_add_precopy(MigrationTestEnv *env)
>      }
>  
>      /* ensure new status don't go unnoticed */
> -    assert(MIGRATION_STATUS__MAX == 16);
> +    assert(MIGRATION_STATUS__MAX == 17);
>  
>      for (int i = MIGRATION_STATUS_NONE; i < MIGRATION_STATUS__MAX; i++) {
>          switch (i) {
> -- 
> 2.52.0
> 

-- 
Peter Xu
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Prasad Pandit 1 month ago
Hello Peter, Fabiano,

* Thank you for the comments.

On Tue, 23 Dec 2025 at 20:47, Peter Xu <peterx@redhat.com> wrote:
> I raised a request while I was discussing with you internally, I didn't see
> this, I will request again:
>
> Would you please list where you decided to switch from FAILED -> FAILING,
> and where you decided not, with justifications for each of them?
>
> Let me give a detailed example in this patch, please see below.
> This is the first real change that we'll switch to FAILING when
> migration_connect_set_error() is invoked and migration failed.
>
> Please justify why setting FAILING is correct here.
>
> This function is invoked in three callers:
>
> qmp_migrate[2302]              migration_connect_set_error(s, local_err);
> qmp_migrate_finish[2347]       migration_connect_set_error(s, local_err);
> migration_connect[4047]        migration_connect_set_error(s, error_in);
>
> At least from the initial two callers, I don't see migration_cleanup()
> invoked after setting FAILING.  Could this cause migration to get into
> FAILING status forever without finally move to FAILED?

* Ah, sorry about this; I missed removing this and the
qmp_migrate_finish() change from this patch. The idea was to switch to
the interim FAILING state wherever execution follows the
migration_iteration_finish() OR bg_migration_iteration_finish() path
and leads to migration_cleanup(), which sets the final state to FAILED.
(I create/test patches on a pair of machines and then copy them to the
local repository on my laptop to send upstream; I cannot do git
send-email from the test machines, so these discrepancies slip through
at times.)

* As you requested, please see below the list of call sites where the
change was applied and where it was not. The justification for changing
to the interim 'FAILING' state is that execution follows through to the
migration_cleanup() path; the justification for not changing the state
is that execution does not follow through to the migration_cleanup()
path.
===
01  void migration_channel_process_incoming(QIOChannel *ioc)
      ...
        error_report_err(local_err);
        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);

02  static void cpr_exec_cb(void *opaque)
      ...
        error_report_err(error_copy(err));
        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);

03  static void coroutine_fn process_incoming_migration_co(void *opaque)
      ...
      fail:
         migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                           MIGRATION_STATUS_FAILED);
         migrate_error_propagate(s, local_err);

04  static void migration_connect_error_propagate(MigrationState *s,
Error *error)
      ...
        case MIGRATION_STATUS_SETUP:
            next = MIGRATION_STATUS_FAILED;
            break;
        migrate_error_propagate(s, error);

05  static void qmp_migrate_finish(MigrationAddress *addr, ...)
      ...
        else {
            error_setg(&local_err, "uri is not a valid migration protocol");
            migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
                              MIGRATION_STATUS_FAILED);

06  static void multifd_recv_terminate_threads(Error *err)
      ...
        if (s->state == MIGRATION_STATUS_SETUP ||
            s->state == MIGRATION_STATUS_ACTIVE) {
            migrate_set_state(&s->state, s->state,
                              MIGRATION_STATUS_FAILED);

07  static void *postcopy_listen_thread(void *opaque)
      ...
        error_report_err(local_err);
        migrate_set_state(&mis->state, mis->state, MIGRATION_STATUS_FAILED);
        goto out;

08  static int qemu_savevm_state(QEMUFile *f, Error **errp)
      ...
      cleanup:
        if (ret != 0) {
          status = MIGRATION_STATUS_FAILED;
        } else {
          status = MIGRATION_STATUS_COMPLETED;
        }
        migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP, status);

* The MIGRATION_STATUS_FAILED instances above are _not_ changed because
they do not follow the migration_cleanup() path. The instances below
are changed to the interim FAILING state because execution follows
through the migration_cleanup() path, where the interim state changes
to the FAILED state.

09  static int postcopy_start(MigrationState *ms, Error **errp)
      ...
        if (ms->state != MIGRATION_STATUS_CANCELLING) {
           migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);

      ...
      fail:
        if (ms->state != MIGRATION_STATUS_CANCELLING) {
            migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
        }

10  static void migration_completion(MigrationState *s)
      ...
      fail:
        if (s->state != MIGRATION_STATUS_CANCELLING) {
            migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
        }

11  static void bg_migration_completion(MigrationState *s)
      ...
      fail:
        migrate_set_state(&s->state, current_active_state,
MIGRATION_STATUS_FAILED);

12  static MigThrError migration_detect_error(MigrationState *s)
      ...
        migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILED);
        /* Time to stop the migration, now. */
        return MIG_THR_ERR_FATAL;

13  static void *migration_thread(void *opaque)
      ...
        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
        goto out;

14  static void *bg_migration_thread(void *opaque)
      ...
        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
        goto fail_setup;

      fail:
        if (early_fail) {
            migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_FAILED);
        }
      fail_setup:
        bg_migration_iteration_finish(s);

15  void migration_connect(MigrationState *s, Error *error_in)
      ...
        if (s->state != MIGRATION_STATUS_CANCELLING) {
            migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
        }
        migration_cleanup(s);

16  static void multifd_send_error_propagate(Error *err)
      ...
        s->state == MIGRATION_STATUS_DEVICE ||
        s->state == MIGRATION_STATUS_ACTIVE) {
            migrate_set_state(&s->state, s->state,
                              MIGRATION_STATUS_FAILED);

17  bool multifd_send_setup(void)
      ...
      err:
        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
                      MIGRATION_STATUS_FAILED);
      return false;
===

* As proposed earlier, I'd request dividing this patch into two (or
even more) patches:
    - One patch to fix the race condition at hand, by introducing the
interim FAILING state.
    - One or more patches to use the interim FAILING state where we
currently set the FAILED state.
That should help us develop/test/review these changes. How this state
change is done/handled is complicated, and IIUC the migration code
serves multiple use cases: one is live VM migration between two
machines, the second is taking snapshots of a VM, and the third is
migrating a VM to a new QEMU process on the same host. Doing such a
change in a single large patch seems risky.

Thank you.
---
  - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Peter Xu 1 month ago
On Tue, Jan 06, 2026 at 04:19:44PM +0530, Prasad Pandit wrote:
> Hello Peter, Fabiano,
> 
> * Thank you for the comments.
> 
> On Tue, 23 Dec 2025 at 20:47, Peter Xu <peterx@redhat.com> wrote:
> > I raised a request while I was discussing with you internally, I didn't see
> > this, I will request again:
> >
> > Would you please list where you decided to switch from FAILED -> FAILING,
> > and where you decided not, with justifications for each of them?
> >
> > Let me give a detailed example in this patch, please see below.
> > This is the first real change that we'll switch to FAILING when
> > migration_connect_set_error() is invoked and migration failed.
> >
> > Please justify why setting FAILING is correct here.
> >
> > This function is invoked in three callers:
> >
> > qmp_migrate[2302]              migration_connect_set_error(s, local_err);
> > qmp_migrate_finish[2347]       migration_connect_set_error(s, local_err);
> > migration_connect[4047]        migration_connect_set_error(s, error_in);
> >
> > At least from the initial two callers, I don't see migration_cleanup()
> > invoked after setting FAILING.  Could this cause migration to get into
> > FAILING status forever without finally move to FAILED?
> 
> * Ah, sorry about this; I missed removing this and
> qmp_migrate_finish() change from this patch. Idea was to switch to the
> interim FAILING state wherever execution follows the
> migration_iteration_finish() OR bg_migration_iteration_finish() path
> and leads to migration_cleanup() setting the final state to FAILED. (I
> create/test patches on a pair of machines and then copy them to the
> local repository on my laptop to send upstream, can not do git
> send-email from test machines, these discrepancies fall through at
> times)
> 
> * As you requested, please see below the list of calls where the
> change was applied and where it was not. Justification for changing to
> the interim 'FAILING' state is that execution follows through to
> migration_cleanup() path. And Justification for not changing the state
> is that execution does not follow through to the migration_cleanup()
> path.
> ===
> 01  void migration_channel_process_incoming(QIOChannel *ioc)
>       ...
>         error_report_err(local_err);
>         migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> 
> 02  static void cpr_exec_cb(void *opaque)
>       ...
>         error_report_err(error_copy(err));
>         migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> 
> 03  static void coroutine_fn process_incoming_migration_co(void *opaque)
>       ...
>       fail:
>          migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
>                            MIGRATION_STATUS_FAILED);
>          migrate_error_propagate(s, local_err);
> 
> 04  static void migration_connect_error_propagate(MigrationState *s,
> Error *error)
>       ...
>         case MIGRATION_STATUS_SETUP:
>             next = MIGRATION_STATUS_FAILED;
>             break;
>         migrate_error_propagate(s, error);
> 
> 05  static void qmp_migrate_finish(MigrationAddress *addr, ...)
>       ...
>         else {
>             error_setg(&local_err, "uri is not a valid migration protocol");
>             migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
>                               MIGRATION_STATUS_FAILED);
> 
> 06  static void multifd_recv_terminate_threads(Error *err)
>       ...
>         if (s->state == MIGRATION_STATUS_SETUP ||
>             s->state == MIGRATION_STATUS_ACTIVE) {
>             migrate_set_state(&s->state, s->state,
>                               MIGRATION_STATUS_FAILED);
> 
> 07  static void *postcopy_listen_thread(void *opaque)
>       ...
>         error_report_err(local_err);
>         migrate_set_state(&mis->state, mis->state, MIGRATION_STATUS_FAILED);
>         goto out;
> 
> 08  static int qemu_savevm_state(QEMUFile *f, Error **errp)
>       ...
>       cleanup:
>         if (ret != 0) {
>           status = MIGRATION_STATUS_FAILED;
>         } else {
>           status = MIGRATION_STATUS_COMPLETED;
>         }
>         migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP, status);
> 
> * Above MIGRATION_STATUS_FAILED instances are _not_ changed because
> they do not follow the migration_cleanup() path. The instances below
> are changed to the interim FAILING state because execution follows
> through the migration_cleanup() path, wherein the interim state
> changes to the FAILED state.
> 
> 09  static int postcopy_start(MigrationState *ms, Error **errp)
>       ...
>         if (ms->state != MIGRATION_STATUS_CANCELLING) {
>            migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> 
>       ...
>       fail:
>         if (ms->state != MIGRATION_STATUS_CANCELLING) {
>             migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
>         }
> 
> 10  static void migration_completion(MigrationState *s)
>       ...
>       fail:
>         if (s->state != MIGRATION_STATUS_CANCELLING) {
>             migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
>         }
> 
> 11  static void bg_migration_completion(MigrationState *s)
>       ...
>       fail:
>         migrate_set_state(&s->state, current_active_state,
> MIGRATION_STATUS_FAILED);
> 
> 12  static MigThrError migration_detect_error(MigrationState *s)
>       ...
>         migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILED);
>         /* Time to stop the migration, now. */
>         return MIG_THR_ERR_FATAL;
> 
> 13  static void *migration_thread(void *opaque)
>       ...
>         migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> MIGRATION_STATUS_FAILED);
>         goto out;
> 
> 14  static void *bg_migration_thread(void *opaque)
>       ...
>         migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> MIGRATION_STATUS_FAILED);
>         goto fail_setup;
> 
>       fail:
>         if (early_fail) {
>             migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> MIGRATION_STATUS_FAILED);
>         }
>       fail_setup:
>         bg_migration_iteration_finish(s);
> 
> 15  void migration_connect(MigrationState *s, Error *error_in)
>       ...
>         if (s->state != MIGRATION_STATUS_CANCELLING) {
>             migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
>         }
>         migration_cleanup(s);
> 
> 16  static void multifd_send_error_propagate(Error *err)
>       ...
>         s->state == MIGRATION_STATUS_DEVICE ||
>         s->state == MIGRATION_STATUS_ACTIVE) {
>             migrate_set_state(&s->state, s->state,
>                               MIGRATION_STATUS_FAILED);
> 
> 17  bool multifd_send_setup(void)
>       ...
>       err:
>         migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
>                       MIGRATION_STATUS_FAILED);
>       return false;
> ===

Yes, something like this would be more than welcome, thanks.  You can
provide a simplified version of it in the commit log when you repost.

Note that I'm not reading carefully into each of them, because we have
concurrent changes from Fabiano recently; essentially they will change
where migration_cleanup() gets used in the above examples:

https://lore.kernel.org/r/20260105190644.14072-12-farosas@suse.de

So we'll need to be careful landing these two changes.

Considering that the other series was close to landing, IMHO it might be
a good idea to have your patch (when reposted) based on that series.
Please have a look.

> 
> * As proposed earlier, I'll request to divide this patchset into two
> (or even) more patches as:
>     - One patch to fix the race condition issue at hand, by way of
> introducing the interim FAILING state.
>     - One or more patches to use the interim FAILING state, where we
> currently set the FAILED state.
> That should help us to do/test/review these changes. It is complicated
> how this state change is done/handled and IIUC, migration code seems
> to serve multiple use cases: one is live VM migration between two
> machines, second is taking snapshots of a VM, third is migrating a VM
> to a new QEMU process in the same host. Doing such a change in a
> single large patch seems risky.

I still think FAILING isn't such a huge change that it needs to be split
into multiple patches.  It's a new status and we need to review every
spot of FAILED status change, in which case one patch is very well
self-contained.

Even in a backport I think we should backport all relevant changes about
FAILING when necessary.  We should not backport part of it, causing the
FAILING status to behave differently over different paths.

Thanks,

-- 
Peter Xu
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Prasad Pandit 1 month ago
On Tue, 6 Jan 2026 at 20:53, Peter Xu <peterx@redhat.com> wrote:
> Yes, something like this would be more than welcomed, thanks.  You can
> provide a simplified version in the commit log when repost.

* Okay.

> Note that I'm not reading carefully into each of them, because we have
> concurrent changes from Fabiano recently, essentially it'll change when
> migration_cleanup() will be used in above examples:
>
> https://lore.kernel.org/r/20260105190644.14072-12-farosas@suse.de
>
> So we'll need to be careful landing these two changes.
> Considering that the other series was close to landing, IMHO it might be
> good idea to have your patch (when reposted) based on that series.  Please
> have a look.

* Yes, I see it is changing the same code areas. I'll wait for
Fabiano's series to merge upstream. Will it happen this week? @Fabiano
Rosas?

> I still think FAILING isn't such a huge change so it needs to be split into
> multiple patches.  It's a new status and we need to review every spot of
> FAILED status change, in which case one patch is very well self contained.
>
> Even in a backport I think we should backport all relevant changes about
> FAILING when necessary.  We should not backport part of it, causing FAILING
> status to behave different over different paths.

* Adding a new interim state to the .json file is not going to break
anything. But other places where we start using it via a bulk change
might break things we don't know about. So far we've only tested it in
the live migration use case. We have not exercised the snapshot
(bg_migration_iteration()) OR the same-host migration use cases.
Hopefully it will work fine there, without any breakage.

Thank you.
---
  - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Peter Xu 1 month ago
On Wed, Jan 07, 2026 at 04:31:54PM +0530, Prasad Pandit wrote:
> On Tue, 6 Jan 2026 at 20:53, Peter Xu <peterx@redhat.com> wrote:
> > Yes, something like this would be more than welcomed, thanks.  You can
> > provide a simplified version in the commit log when repost.
> 
> * Okay.
> 
> > Note that I'm not reading carefully into each of them, because we have
> > concurrent changes from Fabiano recently, essentially it'll change when
> > migration_cleanup() will be used in above examples:
> >
> > https://lore.kernel.org/r/20260105190644.14072-12-farosas@suse.de
> >
> > So we'll need to be careful landing these two changes.
> > Considering that the other series was close to landing, IMHO it might be
> > good idea to have your patch (when reposted) based on that series.  Please
> > have a look.
> 
> * Yes, I see it is changing the same code areas. I'll wait for
> Fabiano's series to merge upstream. Will it happen this week? @Fabiano
> Rosas?

IMHO you can work on top of that series directly, then when you repost you
can add "Based-on:" mentioning that series.

> 
> > I still think FAILING isn't such a huge change so it needs to be split into
> > multiple patches.  It's a new status and we need to review every spot of
> > FAILED status change, in which case one patch is very well self contained.
> >
> > Even in a backport I think we should backport all relevant changes about
> > FAILING when necessary.  We should not backport part of it, causing FAILING
> > status to behave different over different paths.
> 
> * Adding a new interim state to the .json file is not going to break
> anything. But other places where we start using it via a bulk change
> might break things we don't know. So far we've only tested it in a
> live migration use case. We have not run snapshots
> (bg_migration_iteration()) OR the same host migration use case.
> Hopefully it might work fine there, without any breakages.

Having a new status introduced only partially into the subsystem in a
separate patch, and especially backporting only that partial patch,
sounds more risky and wrong.  Let's try our best to make sure it won't
break anything and implement the status properly altogether.

Thanks,

-- 
Peter Xu
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Fabiano Rosas 1 month, 2 weeks ago
Peter Xu <peterx@redhat.com> writes:

> On Mon, Dec 22, 2025 at 05:18:22PM +0530, Prasad Pandit wrote:
>> From: Prasad Pandit <pjp@fedoraproject.org>
>> 
>> When migration connection is broken, the QEMU and libvirtd(8)
>> process on the source side receive TCP connection reset
>> notification. QEMU sets the migration status to FAILED and
>> proceeds to migration_cleanup(). Meanwhile, Libvirtd(8) sends
>> a QMP command to migrate_set_capabilities().
>> 
>> The migration_cleanup() and qmp_migrate_set_capabilities()
>> calls race with each other. When the latter is invoked first,
>> since the migration is not running (FAILED), migration
>> capabilities are reset to false, so during migration_cleanup()
>> the QEMU process crashes with assertion failure.
>> 
>>   Stack trace of thread 255014:
>>    #0  __pthread_kill_implementation (libc.so.6 + 0x822e8)
>>    #1  raise (libc.so.6 + 0x3a73c)
>>    #2  abort (libc.so.6 + 0x27034)
>>    #3  __assert_fail_base (libc.so.6 + 0x34090)
>>    #4  __assert_fail (libc.so.6 + 0x34100)
>>    #5  yank_unregister_instance (qemu-kvm + 0x8b8280)
>>    #6  migrate_fd_cleanup (qemu-kvm + 0x3c6308)
>>    #7  migration_bh_dispatch_bh (qemu-kvm + 0x3c2144)
>>    #8  aio_bh_poll (qemu-kvm + 0x8ba358)
>>    #9  aio_dispatch (qemu-kvm + 0x8a0ab4)
>>    #10 aio_ctx_dispatch (qemu-kvm + 0x8bb180)
>> 
>> Introduce a new migration status FAILING and use it as an
>> interim status when an error occurs. Once migration_cleanup()
>> is done, it sets the migration status to FAILED. This helps
>> to avoid the above race condition and ensuing failure.
>> 
>> Interim status FAILING is set wherever the execution moves
>> towards migration_cleanup() directly OR via:
>> 
>>   migration_iteration_finish  bg_migration_iteration_finish
>>   -> migration_bh_schedule    -> migration_bh_schedule
>>    -> migration_cleanup_bh     -> migration_cleanup_bh
>>     -> migration_cleanup        -> migration_cleanup
>>      -> FAILED                   -> FAILED
>> 
>> The migration status finally moves to FAILED and reports an
>> appropriate error to the user.
>
> I raised a request while I was discussing with you internally, I didn't see
> this, I will request again:
>
> Would you please list where you decided to switch from FAILED -> FAILING,
> and where you decided not, with justifications for each of them?
>
> Let me give a detailed example in this patch, please see below.
>
>> 
>> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
>> ---
>>  migration/migration.c                 | 33 +++++++++++++++------------
>>  migration/multifd.c                   |  4 ++--
>>  qapi/migration.json                   |  8 ++++---
>>  tests/qtest/migration/migration-qmp.c |  3 ++-
>>  tests/qtest/migration/precopy-tests.c |  5 ++--
>>  5 files changed, 31 insertions(+), 22 deletions(-)
>> 
>> diff --git a/migration/migration.c b/migration/migration.c
>> index b316ee01ab..5c32bc8fe5 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1216,6 +1216,7 @@ bool migration_is_running(void)
>>      case MIGRATION_STATUS_DEVICE:
>>      case MIGRATION_STATUS_WAIT_UNPLUG:
>>      case MIGRATION_STATUS_CANCELLING:
>> +    case MIGRATION_STATUS_FAILING:
>>      case MIGRATION_STATUS_COLO:
>>          return true;
>>      default:
>> @@ -1351,6 +1352,7 @@ static void fill_source_migration_info(MigrationInfo *info)
>>          break;
>>      case MIGRATION_STATUS_ACTIVE:
>>      case MIGRATION_STATUS_CANCELLING:
>> +    case MIGRATION_STATUS_FAILING:
>>      case MIGRATION_STATUS_POSTCOPY_DEVICE:
>>      case MIGRATION_STATUS_POSTCOPY_ACTIVE:
>>      case MIGRATION_STATUS_PRE_SWITCHOVER:
>> @@ -1409,6 +1411,7 @@ static void fill_destination_migration_info(MigrationInfo *info)
>>      case MIGRATION_STATUS_POSTCOPY_ACTIVE:
>>      case MIGRATION_STATUS_POSTCOPY_PAUSED:
>>      case MIGRATION_STATUS_POSTCOPY_RECOVER:
>> +    case MIGRATION_STATUS_FAILING:
>>      case MIGRATION_STATUS_FAILED:
>>      case MIGRATION_STATUS_COLO:
>>          info->has_status = true;
>> @@ -1531,6 +1534,9 @@ static void migration_cleanup(MigrationState *s)
>>      if (s->state == MIGRATION_STATUS_CANCELLING) {
>>          migrate_set_state(&s->state, MIGRATION_STATUS_CANCELLING,
>>                            MIGRATION_STATUS_CANCELLED);
>> +    } else if (s->state == MIGRATION_STATUS_FAILING) {
>> +        migrate_set_state(&s->state, MIGRATION_STATUS_FAILING,
>> +                          MIGRATION_STATUS_FAILED);
>>      }
>>  
>>      if (s->error) {
>> @@ -1584,7 +1590,7 @@ static void migration_connect_set_error(MigrationState *s, const Error *error)
>>  
>>      switch (current) {
>>      case MIGRATION_STATUS_SETUP:
>> -        next = MIGRATION_STATUS_FAILED;
>> +        next = MIGRATION_STATUS_FAILING;
>
> This is the first real change that we'll switch to FAILING when
> migration_connect_set_error() is invoked and migration failed.
>
> Please justify why setting FAILING is correct here.
>
> This function is invoked in three callers:
>
> qmp_migrate[2302]              migration_connect_set_error(s, local_err);
> qmp_migrate_finish[2347]       migration_connect_set_error(s, local_err);
> migration_connect[4047]        migration_connect_set_error(s, error_in);
>
> At least from the initial two callers, I don't see migration_cleanup()
> invoked after setting FAILING.  Could this cause migration to get into
> FAILING status forever without finally move to FAILED?
>

Good point. I'm working on some cleanups to the connection code, and one
change I did there is to add a migration_cleanup() call into
migration_connect_error_propagate().

1) branch WIP:
https://gitlab.com/farosas/qemu/-/commits/migration-connect-cleanup

2) the patch:
---
From 1f9eeb898f3a5efba7c183e351fa36a5471fd0b2 Mon Sep 17 00:00:00 2001
From: Fabiano Rosas <farosas@suse.de>
Date: Thu, 18 Dec 2025 10:54:46 -0300
Subject: [PATCH] migration: Fold migration_cleanup into
 migration_connect_error_propagate

The code path from qmp_migrate() until the migration thread starts is
a little twisty due to the async nature of some routines. One issue is
that the async functions cannot return errors to their callers and
must instead call forward into migration_channel_connect() and pass
the error as input.

Ideally we'd have a function that just receives the error as input and
handles it. However, currently migration_channel_connect() has a code
path that moves forward into migration_connect(), also passing the
error as input, only for migration_connect() to then check that an
error happened and call migration_cleanup().

Clean this up:

1) Make migration_connect_error_propagate() be the function that
handles the error and call it at the point the error happens in the
async code. (this is all migration code, there's no layering
violation)

2) Stop checking for an incoming error in migration_connect(), that
function should be only reached when everything that came before
succeeded.

3) Fold migration_cleanup() into migration_connect_error_propagate()
so the cleanup happens at the moment the error is detected and not
several calls down the stack.

4) To address the quirk that during postcopy recovery there should be
no cleanup, move that check into migration_cleanup() and return early
if doing resume.

Notable functional changes:

i) Assumes a larger window for what it means to be "in resume"
   before: from qmp_migrate until migration_connect
   after: from qmp_migrate until the state transition into
          MIGRATION_STATUS_POSTCOPY_RECOVER

ii) After an error, MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP changes to
    MIGRATION_STATUS_POSTCOPY_PAUSED.

    This is already the case when migration_connect_error_propagate()
    was used, but not when migration_connect() receives
    error_in. Seems like a pre-existing bug actually.

iii) If the socket_start_outgoing_migration function *returns* an
     error, now migration_cleanup() is called. Previously, cleanup
     only happened when the error was *passed* forward, i.e. only
     after the async call.

iv) If cpr_state_save() fails, now migration_cleanup() is called.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 35 ++++++++++++++++++-----------------
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 4b1afcab24..52c1a97e46 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1481,6 +1481,14 @@ static void migration_cleanup(MigrationState *s)
     MigrationEventType type;
     QEMUFile *tmp = NULL;
 
+    /*
+     * Don't do cleanup if we're waiting for another connection from
+     * the user.
+     */
+    if (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP) {
+        return;
+    }
+
     trace_migration_cleanup();
 
     migration_cleanup_json_writer(s);
@@ -1585,6 +1593,14 @@ static void migration_connect_error_propagate(MigrationState *s, Error *error)
     case MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP:
         /* Never fail a postcopy migration; switch back to PAUSED instead */
         next = MIGRATION_STATUS_POSTCOPY_PAUSED;
+
+        /*
+         * Give HMP user a hint on what failed.  It's normally done in
+         * migration_cleanup(), but call it here explicitly because we
+         * don't do cleanup when waiting for postcopy recover.
+         */
+        error_report_err(error_copy(error));
+
         break;
     default:
         /*
@@ -1598,6 +1614,7 @@ static void migration_connect_error_propagate(MigrationState *s, Error *error)
 
     migrate_set_state(&s->state, current, next);
     migrate_error_propagate(s, error);
+    migration_cleanup(s);
 }
 
 void migration_cancel(void)
@@ -2326,12 +2343,8 @@ static void qmp_migrate_finish(MigrationAddress *addr, bool resume_requested,
     }
 
     if (local_err) {
-        if (!resume_requested) {
-            yank_unregister_instance(MIGRATION_YANK_INSTANCE);
-        }
-        migration_connect_error_propagate(s, error_copy(local_err));
+        migration_connect_error_propagate(s, local_err);
         error_propagate(errp, local_err);
-        return;
     }
 }
 
@@ -4026,18 +4039,6 @@ void migration_connect(MigrationState *s, Error *error_in)
     s->expected_downtime = migrate_downtime_limit();
     if (error_in) {
         migration_connect_error_propagate(s, error_in);
-        if (resume) {
-            /*
-             * Don't do cleanup for resume if channel is invalid, but only dump
-             * the error.  We wait for another channel connect from the user.
-             * The error_report still gives HMP user a hint on what failed.
-             * It's normally done in migration_cleanup(), but call it here
-             * explicitly.
-             */
-            error_report_err(error_copy(s->error));
-        } else {
-            migration_cleanup(s);
-        }
         return;
     }
 
--
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Peter Xu 1 month, 2 weeks ago
On Tue, Dec 23, 2025 at 12:36:19PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, Dec 22, 2025 at 05:18:22PM +0530, Prasad Pandit wrote:
> >> From: Prasad Pandit <pjp@fedoraproject.org>
> >> 
> >> When migration connection is broken, the QEMU and libvirtd(8)
> >> process on the source side receive TCP connection reset
> >> notification. QEMU sets the migration status to FAILED and
> >> proceeds to migration_cleanup(). Meanwhile, Libvirtd(8) sends
> >> a QMP command to migrate_set_capabilities().
> >> 
> >> The migration_cleanup() and qmp_migrate_set_capabilities()
> >> calls race with each other. When the latter is invoked first,
> >> since the migration is not running (FAILED), migration
> >> capabilities are reset to false, so during migration_cleanup()
> >> the QEMU process crashes with assertion failure.
> >> 
> >>   Stack trace of thread 255014:
> >>    #0  __pthread_kill_implementation (libc.so.6 + 0x822e8)
> >>    #1  raise (libc.so.6 + 0x3a73c)
> >>    #2  abort (libc.so.6 + 0x27034)
> >>    #3  __assert_fail_base (libc.so.6 + 0x34090)
> >>    #4  __assert_fail (libc.so.6 + 0x34100)
> >>    #5  yank_unregister_instance (qemu-kvm + 0x8b8280)
> >>    #6  migrate_fd_cleanup (qemu-kvm + 0x3c6308)
> >>    #7  migration_bh_dispatch_bh (qemu-kvm + 0x3c2144)
> >>    #8  aio_bh_poll (qemu-kvm + 0x8ba358)
> >>    #9  aio_dispatch (qemu-kvm + 0x8a0ab4)
> >>    #10 aio_ctx_dispatch (qemu-kvm + 0x8bb180)
> >> 
> >> Introduce a new migration status FAILING and use it as an
> >> interim status when an error occurs. Once migration_cleanup()
> >> is done, it sets the migration status to FAILED. This helps
> >> to avoid the above race condition and ensuing failure.
> >> 
> >> Interim status FAILING is set wherever the execution moves
> >> towards migration_cleanup() directly OR via:
> >> 
> >>   migration_iteration_finish  bg_migration_iteration_finish
> >>   -> migration_bh_schedule    -> migration_bh_schedule
> >>    -> migration_cleanup_bh     -> migration_cleanup_bh
> >>     -> migration_cleanup        -> migration_cleanup
> >>      -> FAILED                   -> FAILED
> >> 
> >> The migration status finally moves to FAILED and reports an
> >> appropriate error to the user.
> >
> > I raised a request while I was discussing with you internally, I didn't see
> > this, I will request again:
> >
> > Would you please list where you decided to switch from FAILED -> FAILING,
> > and where you decided not, with justifications for each of them?
> >
> > Let me give a detailed example in this patch, please see below.
> >
> >> 
> >> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
> >> ---
> >>  migration/migration.c                 | 33 +++++++++++++++------------
> >>  migration/multifd.c                   |  4 ++--
> >>  qapi/migration.json                   |  8 ++++---
> >>  tests/qtest/migration/migration-qmp.c |  3 ++-
> >>  tests/qtest/migration/precopy-tests.c |  5 ++--
> >>  5 files changed, 31 insertions(+), 22 deletions(-)
> >> 
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index b316ee01ab..5c32bc8fe5 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -1216,6 +1216,7 @@ bool migration_is_running(void)
> >>      case MIGRATION_STATUS_DEVICE:
> >>      case MIGRATION_STATUS_WAIT_UNPLUG:
> >>      case MIGRATION_STATUS_CANCELLING:
> >> +    case MIGRATION_STATUS_FAILING:
> >>      case MIGRATION_STATUS_COLO:
> >>          return true;
> >>      default:
> >> @@ -1351,6 +1352,7 @@ static void fill_source_migration_info(MigrationInfo *info)
> >>          break;
> >>      case MIGRATION_STATUS_ACTIVE:
> >>      case MIGRATION_STATUS_CANCELLING:
> >> +    case MIGRATION_STATUS_FAILING:
> >>      case MIGRATION_STATUS_POSTCOPY_DEVICE:
> >>      case MIGRATION_STATUS_POSTCOPY_ACTIVE:
> >>      case MIGRATION_STATUS_PRE_SWITCHOVER:
> >> @@ -1409,6 +1411,7 @@ static void fill_destination_migration_info(MigrationInfo *info)
> >>      case MIGRATION_STATUS_POSTCOPY_ACTIVE:
> >>      case MIGRATION_STATUS_POSTCOPY_PAUSED:
> >>      case MIGRATION_STATUS_POSTCOPY_RECOVER:
> >> +    case MIGRATION_STATUS_FAILING:
> >>      case MIGRATION_STATUS_FAILED:
> >>      case MIGRATION_STATUS_COLO:
> >>          info->has_status = true;
> >> @@ -1531,6 +1534,9 @@ static void migration_cleanup(MigrationState *s)
> >>      if (s->state == MIGRATION_STATUS_CANCELLING) {
> >>          migrate_set_state(&s->state, MIGRATION_STATUS_CANCELLING,
> >>                            MIGRATION_STATUS_CANCELLED);
> >> +    } else if (s->state == MIGRATION_STATUS_FAILING) {
> >> +        migrate_set_state(&s->state, MIGRATION_STATUS_FAILING,
> >> +                          MIGRATION_STATUS_FAILED);
> >>      }
> >>  
> >>      if (s->error) {
> >> @@ -1584,7 +1590,7 @@ static void migration_connect_set_error(MigrationState *s, const Error *error)
> >>  
> >>      switch (current) {
> >>      case MIGRATION_STATUS_SETUP:
> >> -        next = MIGRATION_STATUS_FAILED;
> >> +        next = MIGRATION_STATUS_FAILING;
> >
> > This is the first real change that we'll switch to FAILING when
> > migration_connect_set_error() is invoked and migration failed.
> >
> > Please justify why setting FAILING is correct here.
> >
> > This function is invoked in three callers:
> >
> > qmp_migrate[2302]              migration_connect_set_error(s, local_err);
> > qmp_migrate_finish[2347]       migration_connect_set_error(s, local_err);
> > migration_connect[4047]        migration_connect_set_error(s, error_in);
> >
> > At least from the initial two callers, I don't see migration_cleanup()
> > invoked after setting FAILING.  Could this cause migration to get into
> > FAILING status forever without finally move to FAILED?
> >
> 
> Good point, I'm working on some cleanups to connection code and one
> change I did there is to add a migration_cleanup() call into
> migration_connect_error_propagate().

Yeh, this sounds like a good thing in general.  A few quick comments
below.

> 
> 1) branch WIP:
> https://gitlab.com/farosas/qemu/-/commits/migration-connect-cleanup
> 
> 2) the patch:
> ---
> From 1f9eeb898f3a5efba7c183e351fa36a5471fd0b2 Mon Sep 17 00:00:00 2001
> From: Fabiano Rosas <farosas@suse.de>
> Date: Thu, 18 Dec 2025 10:54:46 -0300
> Subject: [PATCH] migration: Fold migration_cleanup into
>  migration_connect_error_propagate
> 
> The code path from qmp_migrate() until the migration thread starts is
> a little twisty due to the async nature of some routines. One issue is
> that the async functions cannot return errors to their callers and
> must instead call forward into migration_channel_connect() and pass
> the error as input.
> 
> Ideally we'd have a function that just receives the error as input and
> handles it. However, currently migration_channel_connect() has a code
> path that moves forward into migration_connect(), also passing the
> error as input, only for migration_connect() to then check that an
> error happened and call migration_cleanup().
> 
> Clean this up:
> 
> 1) Make migration_connect_error_propagate() be the function that
> handles the error and call it at the point the error happens in the
> async code. (this is all migration code, there's no layering
> violation)
> 
> 2) Stop checking for an incoming error in migration_connect(), that
> function should be only reached when everything that came before
> succeeded.
> 
> 3) Fold migration_cleanup() into migration_connect_error_propagate()
> so the cleanup happens at the moment the error is detected and not
> several calls down the stack.
> 
> 4) To address the quirk that during postcopy recovery there should be
> no cleanup, move that check into migration_cleanup() and return early
> if doing resume.
> 
> Notable functional changes:
> 
> i) Assumes a larger window for what it means to be "in resume"
>    before: from qmp_migrate until migration_connect
>    after: from qmp_migration until the state transition into
>           MIGRATION_STATUS_POSTCOPY_RECOVER
> 
> ii) After an error, MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP changes to
>     MIGRATION_STATUS_POSTCOPY_PAUSED.
> 
>     This is already the case when migration_connect_error_propagate()
>     was used, but not when migration_connect() receives
>     error_in. Seems like a pre-existing bug actually.
> 
> iii) If the socket_start_outgoing_migration function *returns* an
>      error, now migration_cleanup() is called. Previously, cleanup
>      only happened when the error was *passed* forward, i.e. only
>      after the async call.
> 
> iv) If cpr_state_save() fails, now migration_cleanup() is called.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/migration.c | 35 ++++++++++++++++++-----------------
>  1 file changed, 18 insertions(+), 17 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 4b1afcab24..52c1a97e46 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1481,6 +1481,14 @@ static void migration_cleanup(MigrationState *s)
>      MigrationEventType type;
>      QEMUFile *tmp = NULL;
>  
> +    /*
> +     * Don't do cleanup if we're waiting for another connection from
> +     * the user.
> +     */
> +    if (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP) {
> +        return;
> +    }

I wonder if QEMU will ever be in RECOVER_SETUP when reaching here..  shouldn't
[2] below already have moved it to PAUSED for a postcopy?

The other thing is, migration_cleanup() currently also takes the job of
dumping the error to stderr:

    if (s->error) {
        /* It is used on info migrate.  We can't free it */
        error_report_err(error_copy(s->error));
    }

I wonder if we could merge [1] below, and then here, when it's PAUSED, we
"goto" that part (we may need to move the error_report_err to the end of
migration_cleanup, though; please double check).
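
Something like this, maybe (a rough sketch, untested; assumes the error dump
moves to the end of migration_cleanup()):

    static void migration_cleanup(MigrationState *s)
    {
        if (s->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
            /* waiting for another connection from the user, skip cleanup */
            goto report_error;
        }

        /* ... all the existing cleanup work ... */

    report_error:
        if (s->error) {
            /* It is used on info migrate.  We can't free it */
            error_report_err(error_copy(s->error));
        }
    }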

> +
>      trace_migration_cleanup();
>  
>      migration_cleanup_json_writer(s);
> @@ -1585,6 +1593,14 @@ static void migration_connect_error_propagate(MigrationState *s, Error *error)
>      case MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP:
>          /* Never fail a postcopy migration; switch back to PAUSED instead */
>          next = MIGRATION_STATUS_POSTCOPY_PAUSED;
> +
> +        /*
> +         * Give HMP user a hint on what failed.  It's normally done in
> +         * migration_cleanup(), but call it here explicitly because we
> +         * don't do cleanup when waiting for postcopy recover.
> +         */
> +        error_report_err(error_copy(error));

[1]

> +
>          break;
>      default:
>          /*
> @@ -1598,6 +1614,7 @@ static void migration_connect_error_propagate(MigrationState *s, Error *error)
>  
>      migrate_set_state(&s->state, current, next);

[2]

>      migrate_error_propagate(s, error);
> +    migration_cleanup(s);
>  }
>  
>  void migration_cancel(void)
> @@ -2326,12 +2343,8 @@ static void qmp_migrate_finish(MigrationAddress *addr, bool resume_requested,
>      }
>  
>      if (local_err) {
> -        if (!resume_requested) {
> -            yank_unregister_instance(MIGRATION_YANK_INSTANCE);
> -        }
> -        migration_connect_error_propagate(s, error_copy(local_err));
> +        migration_connect_error_propagate(s, local_err);

I feel like this is a rebase accident.. now that this function is renamed to
migration_connect_error_propagate() it takes over ownership of the error, so
IIUC error_copy() will still be needed here (as error_propagate() right below
will also take ownership..).
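
IOW something like (untested):

        migration_connect_error_propagate(s, error_copy(local_err));
        error_propagate(errp, local_err);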

>          error_propagate(errp, local_err);
> -        return;
>      }
>  }
>  
> @@ -4026,18 +4039,6 @@ void migration_connect(MigrationState *s, Error *error_in)
>      s->expected_downtime = migrate_downtime_limit();
>      if (error_in) {
>          migration_connect_error_propagate(s, error_in);
> -        if (resume) {
> -            /*
> -             * Don't do cleanup for resume if channel is invalid, but only dump
> -             * the error.  We wait for another channel connect from the user.
> -             * The error_report still gives HMP user a hint on what failed.
> -             * It's normally done in migration_cleanup(), but call it here
> -             * explicitly.
> -             */
> -            error_report_err(error_copy(s->error));
> -        } else {
> -            migration_cleanup(s);
> -        }

Yeah this part looks better.

>          return;
>      }
>  
> -- 
> 

-- 
Peter Xu
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Fabiano Rosas 1 month, 2 weeks ago
Prasad Pandit <ppandit@redhat.com> writes:

> From: Prasad Pandit <pjp@fedoraproject.org>
>
> When migration connection is broken, the QEMU and libvirtd(8)
> process on the source side receive TCP connection reset
> notification. QEMU sets the migration status to FAILED and
> proceeds to migration_cleanup(). Meanwhile, Libvirtd(8) sends
> a QMP command to migrate_set_capabilities().
>
> The migration_cleanup() and qmp_migrate_set_capabilities()
> calls race with each other. When the latter is invoked first,
> since the migration is not running (FAILED), migration
> capabilities are reset to false, so during migration_cleanup()
> the QEMU process crashes with assertion failure.
>
>   Stack trace of thread 255014:
>    #0  __pthread_kill_implementation (libc.so.6 + 0x822e8)
>    #1  raise (libc.so.6 + 0x3a73c)
>    #2  abort (libc.so.6 + 0x27034)
>    #3  __assert_fail_base (libc.so.6 + 0x34090)
>    #4  __assert_fail (libc.so.6 + 0x34100)
>    #5  yank_unregister_instance (qemu-kvm + 0x8b8280)
>    #6  migrate_fd_cleanup (qemu-kvm + 0x3c6308)
>    #7  migration_bh_dispatch_bh (qemu-kvm + 0x3c2144)
>    #8  aio_bh_poll (qemu-kvm + 0x8ba358)
>    #9  aio_dispatch (qemu-kvm + 0x8a0ab4)
>    #10 aio_ctx_dispatch (qemu-kvm + 0x8bb180)
>

This doesn't look like it's caused by set-capabilities. Seems like:

qmp_migrate()               
-> migrate_init()           
   s->to_dst_file = NULL;
bql_unlock() somewhere
                            bql_lock()
                            migration_cleanup()
                            tmp = s->to_dst_file;
                            if (tmp) {
                                migration_ioc_unregister_yank_from_file(tmp);
                            ...
                            yank_unregister_instance()
                            *sees the function was not unregistered*

Please clarify; there might be some other bug lurking around, such as a
locking issue with qemu_file_lock or even the BQL.

Also, if possible provide an upstream backtrace, or at least mention the
code path based on upstream code. It's ok to keep the downstream
backtrace, but point that out in the commit message.

> Introduce a new migration status FAILING and use it as an
> interim status when an error occurs. Once migration_cleanup()
> is done, it sets the migration status to FAILED. This helps
> to avoid the above race condition and ensuing failure.

I'm fine with the general idea:

i) FAILED and CANCELLED are terminal states. It makes sense to not have
work happen after they're set.

ii) Using an intermediate state, assuming the locking/atomics are correct,
is a suitable fix for the issue.

iii) Using a FAILING status seems appropriate.

However,

It would be great if we could stop exposing implementation details via
QAPI. Does the user really need to see events for CANCELLING and
FAILING?

It would probably be easier if we kept MigrationStatus as QAPI only and
used a separate mechanism to track the internal states.
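
Roughly what I have in mind (just a sketch, names invented):

    /* internal only, never exposed via QAPI */
    typedef enum {
        MIG_PHASE_NONE,
        MIG_PHASE_RUNNING,
        MIG_PHASE_WINDING_DOWN,   /* failing or cancelling, cleanup pending */
        MIG_PHASE_DONE,
    } MigrationPhase;

    /* in MigrationState, next to the QAPI-visible 'state' */
    MigrationPhase phase;

Then checks like migration_is_running() would look at s->phase, and the
QAPI MigrationStatus would only be touched at the points where we want to
emit an event.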

That said, we could merge this as is to fix the bug and think about that
later.

>
> Interim status FAILING is set wherever the execution moves
> towards migration_cleanup() directly OR via:
>
>   migration_iteration_finish  bg_migration_iteration_finish
>   -> migration_bh_schedule    -> migration_bh_schedule
>    -> migration_cleanup_bh     -> migration_cleanup_bh
>     -> migration_cleanup        -> migration_cleanup
>      -> FAILED                   -> FAILED
>
> The migration status finally moves to FAILED and reports an
> appropriate error to the user.
>
> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
> ---
>  migration/migration.c                 | 33 +++++++++++++++------------
>  migration/multifd.c                   |  4 ++--
>  qapi/migration.json                   |  8 ++++---
>  tests/qtest/migration/migration-qmp.c |  3 ++-
>  tests/qtest/migration/precopy-tests.c |  5 ++--
>  5 files changed, 31 insertions(+), 22 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index b316ee01ab..5c32bc8fe5 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1216,6 +1216,7 @@ bool migration_is_running(void)
>      case MIGRATION_STATUS_DEVICE:
>      case MIGRATION_STATUS_WAIT_UNPLUG:
>      case MIGRATION_STATUS_CANCELLING:
> +    case MIGRATION_STATUS_FAILING:
>      case MIGRATION_STATUS_COLO:
>          return true;
>      default:
> @@ -1351,6 +1352,7 @@ static void fill_source_migration_info(MigrationInfo *info)
>          break;
>      case MIGRATION_STATUS_ACTIVE:
>      case MIGRATION_STATUS_CANCELLING:
> +    case MIGRATION_STATUS_FAILING:
>      case MIGRATION_STATUS_POSTCOPY_DEVICE:
>      case MIGRATION_STATUS_POSTCOPY_ACTIVE:
>      case MIGRATION_STATUS_PRE_SWITCHOVER:
> @@ -1409,6 +1411,7 @@ static void fill_destination_migration_info(MigrationInfo *info)
>      case MIGRATION_STATUS_POSTCOPY_ACTIVE:
>      case MIGRATION_STATUS_POSTCOPY_PAUSED:
>      case MIGRATION_STATUS_POSTCOPY_RECOVER:
> +    case MIGRATION_STATUS_FAILING:
>      case MIGRATION_STATUS_FAILED:
>      case MIGRATION_STATUS_COLO:
>          info->has_status = true;
> @@ -1531,6 +1534,9 @@ static void migration_cleanup(MigrationState *s)
>      if (s->state == MIGRATION_STATUS_CANCELLING) {
>          migrate_set_state(&s->state, MIGRATION_STATUS_CANCELLING,
>                            MIGRATION_STATUS_CANCELLED);
> +    } else if (s->state == MIGRATION_STATUS_FAILING) {
> +        migrate_set_state(&s->state, MIGRATION_STATUS_FAILING,
> +                          MIGRATION_STATUS_FAILED);
>      }
>  
>      if (s->error) {
> @@ -1584,7 +1590,7 @@ static void migration_connect_set_error(MigrationState *s, const Error *error)
>  
>      switch (current) {
>      case MIGRATION_STATUS_SETUP:
> -        next = MIGRATION_STATUS_FAILED;
> +        next = MIGRATION_STATUS_FAILING;
>          break;
>      case MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP:
>          /* Never fail a postcopy migration; switch back to PAUSED instead */
> @@ -1728,6 +1734,7 @@ bool migration_has_failed(MigrationState *s)
>  {
>      return (s->state == MIGRATION_STATUS_CANCELLING ||
>              s->state == MIGRATION_STATUS_CANCELLED ||
> +            s->state == MIGRATION_STATUS_FAILING ||
>              s->state == MIGRATION_STATUS_FAILED);
>  }
>  
> @@ -2744,7 +2751,7 @@ static int postcopy_start(MigrationState *ms, Error **errp)
>          if (postcopy_preempt_establish_channel(ms)) {
>              if (ms->state != MIGRATION_STATUS_CANCELLING) {
>                  migrate_set_state(&ms->state, ms->state,
> -                                  MIGRATION_STATUS_FAILED);
> +                                  MIGRATION_STATUS_FAILING);
>              }
>              error_setg(errp, "%s: Failed to establish preempt channel",
>                         __func__);
> @@ -2907,7 +2914,7 @@ fail_closefb:
>      qemu_fclose(fb);
>  fail:
>      if (ms->state != MIGRATION_STATUS_CANCELLING) {
> -        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILING);
>      }

This is a good example of where having MigrationStatus makes the code more
complicated. This could be exiting=true, running=false, etc. It
shouldn't matter why this routine failed. If we reach
migration_cleanup() and, at the very end, the state is CANCELLING, then we
know the cancel command caused this and we set the state to
CANCELLED. If the state is something else, then it's unintended and we
set FAILED.
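
Something along these lines (rough sketch, untested; 'exiting' is a made-up
field):

    fail:
        ms->exiting = true;   /* instead of deciding FAILED/FAILING here */
        ...

    /* and only at the very end of migration_cleanup(): */
    if (s->state == MIGRATION_STATUS_CANCELLING) {
        migrate_set_state(&s->state, MIGRATION_STATUS_CANCELLING,
                          MIGRATION_STATUS_CANCELLED);
    } else if (s->exiting) {
        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
    }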

>      migration_block_activate(NULL);
>      migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
> @@ -3102,7 +3109,7 @@ fail:
>      }
>  
>      if (s->state != MIGRATION_STATUS_CANCELLING) {
> -        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> +        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILING);
>      }
>  }
>  
> @@ -3139,7 +3146,7 @@ static void bg_migration_completion(MigrationState *s)
>  
>  fail:
>      migrate_set_state(&s->state, current_active_state,
> -                      MIGRATION_STATUS_FAILED);
> +                      MIGRATION_STATUS_FAILING);
>  }
>  
>  typedef enum MigThrError {
> @@ -3341,7 +3348,7 @@ static MigThrError migration_detect_error(MigrationState *s)
>           * For precopy (or postcopy with error outside IO, or before dest
>           * starts), we fail with no time.
>           */
> -        migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILED);
> +        migrate_set_state(&s->state, state, MIGRATION_STATUS_FAILING);
>          trace_migration_thread_file_err();
>  
>          /* Time to stop the migration, now. */
> @@ -3572,7 +3579,7 @@ static void migration_iteration_finish(MigrationState *s)
>          migrate_start_colo_process(s);
>          s->vm_old_state = RUN_STATE_RUNNING;
>          /* Fallthrough */
> -    case MIGRATION_STATUS_FAILED:
> +    case MIGRATION_STATUS_FAILING:
>      case MIGRATION_STATUS_CANCELLED:
>      case MIGRATION_STATUS_CANCELLING:
>          if (!migration_block_activate(&local_err)) {
> @@ -3631,7 +3638,7 @@ static void bg_migration_iteration_finish(MigrationState *s)
>      switch (s->state) {
>      case MIGRATION_STATUS_COMPLETED:
>      case MIGRATION_STATUS_ACTIVE:
> -    case MIGRATION_STATUS_FAILED:
> +    case MIGRATION_STATUS_FAILING:
>      case MIGRATION_STATUS_CANCELLED:
>      case MIGRATION_STATUS_CANCELLING:
>          break;
> @@ -3821,7 +3828,7 @@ static void *migration_thread(void *opaque)
>          migrate_set_error(s, local_err);
>          error_free(local_err);
>          migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                          MIGRATION_STATUS_FAILED);
> +                          MIGRATION_STATUS_FAILING);
>          goto out;
>      }
>  
> @@ -3945,8 +3952,6 @@ static void *bg_migration_thread(void *opaque)
>      if (ret) {
>          migrate_set_error(s, local_err);
>          error_free(local_err);
> -        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                          MIGRATION_STATUS_FAILED);
>          goto fail_setup;
>      }
>  
> @@ -4008,12 +4013,12 @@ static void *bg_migration_thread(void *opaque)
>  
>  fail:
>      if (early_fail) {
> -        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> -                MIGRATION_STATUS_FAILED);
>          bql_unlock();
>      }
>  
>  fail_setup:
> +    migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +                        MIGRATION_STATUS_FAILING);
>      bg_migration_iteration_finish(s);
>  
>      qemu_fclose(fb);
> @@ -4128,7 +4133,7 @@ void migration_connect(MigrationState *s, Error *error_in)
>  fail:
>      migrate_set_error(s, local_err);
>      if (s->state != MIGRATION_STATUS_CANCELLING) {
> -        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> +        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILING);
>      }
>      error_report_err(local_err);
>      migration_cleanup(s);
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 3203dc98e1..970633474e 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -434,7 +434,7 @@ static void multifd_send_set_error(Error *err)
>              s->state == MIGRATION_STATUS_DEVICE ||
>              s->state == MIGRATION_STATUS_ACTIVE) {
>              migrate_set_state(&s->state, s->state,
> -                              MIGRATION_STATUS_FAILED);
> +                              MIGRATION_STATUS_FAILING);
>          }
>      }
>  }
> @@ -1001,7 +1001,7 @@ bool multifd_send_setup(void)
>  
>  err:
>      migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> -                      MIGRATION_STATUS_FAILED);
> +                      MIGRATION_STATUS_FAILING);
>      return false;
>  }
>  
> diff --git a/qapi/migration.json b/qapi/migration.json
> index cf023bd29d..85f4961781 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -158,7 +158,9 @@
>  #
>  # @completed: migration is finished.
>  #
> -# @failed: some error occurred during migration process.
> +# @failing: error occurred during migration, clean-up underway.

"wait a moment, we're picking up the pieces" =D

> +#
> +# @failed: error occurred during migration, clean-up done.
>  #
>  # @colo: VM is in the process of fault tolerance, VM can not get into
>  #     this state unless colo capability is enabled for migration.
> @@ -181,8 +183,8 @@
>    'data': [ 'none', 'setup', 'cancelling', 'cancelled',
>              'active', 'postcopy-device', 'postcopy-active',
>              'postcopy-paused', 'postcopy-recover-setup',
> -            'postcopy-recover', 'completed', 'failed', 'colo',
> -            'pre-switchover', 'device', 'wait-unplug' ] }
> +            'postcopy-recover', 'completed', 'failing', 'failed',
> +            'colo', 'pre-switchover', 'device', 'wait-unplug' ] }
>  
>  ##
>  # @VfioStats:
> diff --git a/tests/qtest/migration/migration-qmp.c b/tests/qtest/migration/migration-qmp.c
> index c803fcee9d..ceb40efd56 100644
> --- a/tests/qtest/migration/migration-qmp.c
> +++ b/tests/qtest/migration/migration-qmp.c
> @@ -241,7 +241,8 @@ void wait_for_migration_fail(QTestState *from, bool allow_active)
>      do {
>          status = migrate_query_status(from);
>          bool result = !strcmp(status, "setup") || !strcmp(status, "failed") ||
> -            (allow_active && !strcmp(status, "active"));
> +            (allow_active && !strcmp(status, "active")) ||
> +            !strcmp(status, "failing");
>          if (!result) {
>              fprintf(stderr, "%s: unexpected status status=%s allow_active=%d\n",
>                      __func__, status, allow_active);
> diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
> index 57ca623de5..a04442df96 100644
> --- a/tests/qtest/migration/precopy-tests.c
> +++ b/tests/qtest/migration/precopy-tests.c
> @@ -807,7 +807,8 @@ static void test_cancel_src_after_status(void *opaque)
>      } else if (g_str_equal(phase, "completed")) {
>          test_cancel_src_after_complete(from, to, uri, phase);
>  
> -    } else if (g_str_equal(phase, "failed")) {
> +    } else if (g_str_equal(phase, "failing") ||
> +               g_str_equal(phase, "failed")) {
>          test_cancel_src_after_failed(from, to, uri, phase);
>  
>      } else if (g_str_equal(phase, "none")) {
> @@ -1316,7 +1317,7 @@ void migration_test_add_precopy(MigrationTestEnv *env)
>      }
>  
>      /* ensure new status don't go unnoticed */
> -    assert(MIGRATION_STATUS__MAX == 16);
> +    assert(MIGRATION_STATUS__MAX == 17);
>  
>      for (int i = MIGRATION_STATUS_NONE; i < MIGRATION_STATUS__MAX; i++) {
>          switch (i) {
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Peter Xu 1 month, 2 weeks ago
On Mon, Dec 22, 2025 at 11:29:57AM -0300, Fabiano Rosas wrote:
> I'm fine with the general idea:
> 
> i) FAILED and CANCELLED are terminal states. It makes sense to not have
> work happen after they're set.
> 
> ii) Using an intermediate state, assuming locking/atomic are correct is
> a suitable fix for the issue.
> 
> iii) Using a FAILING status seems appropriate.
> 
> However,
> 
> It would be great if we could stop exposing implementation details via
> QAPI. Does the user really need to see events for CANCELLING and
> FAILING?
> 
> It would probably be easier if we kept MigrationStatus as QAPI only and
> used a separate mechanism to track the internal states.
> 
> That said, we could merge this as is to fix the bug and think about that
> later.

This bug looks like it has been there for a long time; IMHO we don't need to
rush a fix if that means risking adding a new status and reverting it
quickly...  Let's discuss it here; it's a valid question indeed.

One good thing about exposing such a status via QAPI is that it can help us
diagnose issues by seeing CANCELLING / FAILING even when just looking at
query-migrate results (normally, when a bug happens, we can't see the
internal status..), so that we know whether it was explicitly cancelled or
something went wrong.

If it's a completely hidden / internal status, we may see ACTIVE even if
something went wrong..

My current hope is that any mgmt app should normally ignore new migration
states by default..  If that's always the case, it looks to me like adding
FAILING directly to the migration status would still have some benefit for
debugging.

[...]

> > @@ -2907,7 +2914,7 @@ fail_closefb:
> >      qemu_fclose(fb);
> >  fail:
> >      if (ms->state != MIGRATION_STATUS_CANCELLING) {
> > -        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> > +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILING);
> >      }
> 
> This is a good example where having MigrationStatus makes the code more
> complicated. This could be exiting=true, running=false, etc. It
> shouldn't matter why this routine failed. If we reach
> migration_cleanup() and, at the very end, state is CANCELLING, then we
> know the cancel command has caused this and set the state to
> CANCELLED. If the state was something else, then it's unintended and we
> set FAILED.

If it's going to be an internal status, we'll still need to identify whether
someone has already cancelled it, right?

Assuming we introduce a stop_reason flag, making it:

typedef enum {
    MIG_STOP_REASON_CANCEL,
    MIG_STOP_REASON_FAIL,
} MigrationStopReason;

Then we can switch to CANCELLED / FAILED at cleanup time, based on those reasons.

Then here, logically we'd also need something like:

    if (stop_reason != MIG_STOP_REASON_CANCEL) {
        stop_reason = MIG_STOP_REASON_FAIL;
    }

Because we want to make sure that when the user has already triggered a
cancel, it won't show FAILED but only CANCELLED in the end?
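
And then at the end of migration_cleanup(), roughly (again, just a sketch):

    migrate_set_state(&s->state, s->state,
                      s->stop_reason == MIG_STOP_REASON_CANCEL ?
                      MIGRATION_STATUS_CANCELLED : MIGRATION_STATUS_FAILED);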

-- 
Peter Xu
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Prasad Pandit 1 month ago
Hi,

On Tue, 23 Dec 2025 at 21:00, Peter Xu <peterx@redhat.com> wrote:
> One thing good about exposing such status via QAPI is, it can help us
> diagnose issues by seeing CANCELLING / FAILING even looking at
> query-migrate results (as normally when bug happens we can't see the
> internal status..), so that we know either it's explicitly cancelled, or
> something went wrong.
>
> If it's a completely hidden / internal status, we may see ACTIVE even if
> something wrong happened..

* Both the process state and the reason(s) for the state change need to be
visible to the user. But states like cancelling/failing are redundant;
users would draw the same conclusion from CANCELLED as from CANCELLING,
or from FAILED as from FAILING. Besides, migration_cleanup() performs
exactly the same steps irrespective of whether migration is failing,
cancelling, failed or cancelled.

> My current hope is any mgmt should normally by default ignore new migration
> states..  If that's always achieved, it looks to me adding FAILING directly
> into migration status would still have some benefits on debugging.

* libvirtd(8) complains about unknown states multiple times:
      libvirtd[2194267]: unknown status 'failing' in migration event
      libvirtd[2194267]: unknown status 'failing' in migration event
      libvirtd[2194267]: unknown status 'failing' in migration event


> > > @@ -2907,7 +2914,7 @@ fail_closefb:
> > >      qemu_fclose(fb);
> > >  fail:
> > >      if (ms->state != MIGRATION_STATUS_CANCELLING) {
> > > -        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> > > +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILING);
> > >      }
> >
> > This is a good example where having MigrationStatus makes the code more
> > complicated. This could be exiting=true, running=false, etc. It
> > shouldn't matter why this routine failed. If we reach
> > migration_cleanup() and, at the very end, state is CANCELLING, then we
> > know the cancel command has caused this and set the state to
> > CANCELLED. If the state was something else, then it's unintended and we
> > set FAILED.
>
> If it'll be an internal status, we'll still need to identify if someone
> already have cancelled it, right?
>
> Assuming we introduce stop_reason flag, making it:
>
> enum {
>     MIG_STOP_REASON_CANCEL,
>     MIG_STOP_REASON_FAIL,
> } MigrationStopReason;
>
> Then we can switch to CANCELLED / FAILED when cleanup from those reasons.
>
> Then here, logically we also need logic like:
>
>     if (stop_reason != MIG_STOP_REASON_CANCEL) {
>         stop_reason = MIG_STOP_REASON_FAIL;
>     }
>
> Because we want to make sure when the user already triggered cancel, it
> won't show FAILED but only show CANCELLED at last?

* I think the way we are setting/changing these states in so many
locations only adds to the complexity. Do we have to set these states
explicitly like this? What if migration_cleanup() always set the state
to 'STOP', and similarly other places set the state to a predefined
state? OR
===
    struct {
        current_state;
        old_state;
        event/trigger;
        reason[];
    } MigrationState s;

    migration_change_state(s) {
          s->old_state = s->current_state;
          if (s->current_state == START && s->trigger ==
'connections-established') {
              s->current_state = ACTIVE;
              s->reason = "connections-established, migration starting"
          } else if (s->current_state == ACTIVE && s->trigger == 'completed') {
              s->current_state = STOP
              s->reason = "migration completed"
          } else if (s->current_state == ACTIVE  && s->trigger == 'pause') {
              s->current_state = PAUSE
              s->reason = "pause, migration paused"
          } else if (s->current_state == ACTIVE && s->trigger ==
'error-occurred') {
              s->current_state = STOP
              s->reason = "Error occurred, migration failed"
          } else if (s->current_state == ACTIVE && s->trigger ==
'user-cancel') {
              s->current_state = STOP
              s->reason = "user-cancel, migration cancelled"
         } else {
              s->current_state = s->current_state;
              warn_msg("unknown combination, maybe define a new rule?");
         }
    }
===
* We define explicit rules for the state changes, and accordingly we
only ever call migration_change_state(); it moves to the appropriate
next state, recording the reason for the change.

Wdyt...?

Thank you.
---
  - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Peter Xu 1 month ago
On Tue, Jan 06, 2026 at 05:15:23PM +0530, Prasad Pandit wrote:
> Hi,
> 
> On Tue, 23 Dec 2025 at 21:00, Peter Xu <peterx@redhat.com> wrote:
> > One thing good about exposing such status via QAPI is, it can help us
> > diagnose issues by seeing CANCELLING / FAILING even looking at
> > query-migrate results (as normally when bug happens we can't see the
> > internal status..), so that we know either it's explicitly cancelled, or
> > something went wrong.
> >
> > If it's a completely hidden / internal status, we may see ACTIVE even if
> > something wrong happened..
> 
> * Both process state and reason(s) for the state change needs to be
> visible to the user. But states like cancelling/failing are redundant,
> users would derive the same conclusion from CANCELLED and CANCELLING
> OR FAILED AND FAILING. Besides, migration_cleanup() does exactly the
> same steps irrespective of whether migration is failing or cancelling
> or failed or cancelled.
> 
> > My current hope is any mgmt should normally by default ignore new migration
> > states..  If that's always achieved, it looks to me adding FAILING directly
> > into migration status would still have some benefits on debugging.
> 
> * libvirtd(8) complains about unknown states multiple times:
>       libvirtd[2194267]: unknown status 'failing' in migration event
>       libvirtd[2194267]: unknown status 'failing' in migration event
>       libvirtd[2194267]: unknown status 'failing' in migration event

IIUC these are benign warnings, so it should be fine. We'll ultimately need
an entry in libvirt to avoid this warning.  Copying Dan and Jiri in case I
am wrong.

> 
> 
> > > > @@ -2907,7 +2914,7 @@ fail_closefb:
> > > >      qemu_fclose(fb);
> > > >  fail:
> > > >      if (ms->state != MIGRATION_STATUS_CANCELLING) {
> > > > -        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
> > > > +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILING);
> > > >      }
> > >
> > > This is a good example where having MigrationStatus makes the code more
> > > complicated. This could be exiting=true, running=false, etc. It
> > > shouldn't matter why this routine failed. If we reach
> > > migration_cleanup() and, at the very end, state is CANCELLING, then we
> > > know the cancel command has caused this and set the state to
> > > CANCELLED. If the state was something else, then it's unintended and we
> > > set FAILED.
> >
> > If it'll be an internal status, we'll still need to identify if someone
> > already have cancelled it, right?
> >
> > Assuming we introduce stop_reason flag, making it:
> >
> > enum {
> >     MIG_STOP_REASON_CANCEL,
> >     MIG_STOP_REASON_FAIL,
> > } MigrationStopReason;
> >
> > Then we can switch to CANCELLED / FAILED when cleanup from those reasons.
> >
> > Then here, logically we also need logic like:
> >
> >     if (stop_reason != MIG_STOP_REASON_CANCEL) {
> >         stop_reason = MIG_STOP_REASON_FAIL;
> >     }
> >
> > Because we want to make sure when the user already triggered cancel, it
> > won't show FAILED but only show CANCELLED at last?
> 
> * I think the way we are setting/changing these states in as many
> locations is only adding to the complications. Do we have to
> explicitly set these states like this? What if migration_cleanup()
> always sets the state to 'STOP'. Similarly other places set the state
> to a predefined state. OR
> ===
>     struct {
>         current_state;
>         old_state;
>         event/trigger;
>         reason[];
>     } MigrationState s;
> 
>     migration_change_state(s) {
>           s->old_state = s->current_state;
>           if (s->current_state == START && s->trigger ==
> 'connections-established') {
>               s->current_state = ACTIVE;
>               s->reason = "connections-established, migration starting"
>           } else if (s->current_state == ACTIVE && s->trigger == 'completed') {
>               s->current_state = STOP
>               s->reason = "migration completed"
>           } else if (s->current_state == ACTIVE  && s->trigger == 'pause') {
>               s->current_state = PAUSE
>               s->reason = "pause, migration paused"
>           } else if (s->current_state == ACTIVE && s->trigger ==
> 'error-occurred') {
>               s->current_state = STOP
>               s->reason = "Error occurred, migration failed"

We can't change statuses that are already in use, like FAILED.  Libvirt and
other mgmt apps may rely on them.

>           } else if (s->current_state == ACTIVE && s->trigger ==
> 'user-cancel') {
>               s->current_state = STOP
>               s->reason = "user-cancel, migration cancelled"
>          } else {
>               s->current_state = s->current_state;
>               warn_msg("unknown combination, maybe define a new rule?");
>          }
>     }
> ===
> * We define explicit rules for the state change and accordingly we
> only call migration_change_state() at any point and it'll change to an
> appropriate next state, recording the due reason for the change.
> 
> Wdyt...?

Personally I don't see much benefit in adding a new "trigger" internal API.
If we want to forbid some state machine transitions, we can use a
transition map.  That said, IMHO it's separate from what we're discussing
here.

Thanks,

> 
> Thank you.
> ---
>   - Prasad
> 

-- 
Peter Xu
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Prasad Pandit 1 month ago
On Tue, 6 Jan 2026 at 20:39, Peter Xu <peterx@redhat.com> wrote:
> Copying Dan and Jiri in case ...

* Thank you for copying.

>>           } else if (s->current_state == ACTIVE && s->trigger == 'error-occurred') {
>>               s->current_state = STOP
>>               s->reason = "Error occurred, migration failed"
>
> We can't change status that were already used, like FAILED.  Libvirt and
> all mgmt may rely on it.

* True; if we decide to go in that direction, we'll have to tell
libvirtd(8) and others about the new states.

> Personally I don't see much benefit on adding a new "trigger" internal API.
> If we want to forbid some state machine transitions, we can use a
> transition map.  Said that, IMHO it's separate from what we're discussing
> here.

* If we can reduce/rationalise the current 17-18 states and related
complexity, it'll help to simplify things.

Thank you.
---
  - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Peter Xu 1 month ago
On Wed, Jan 07, 2026 at 04:41:09PM +0530, Prasad Pandit wrote:
> On Tue, 6 Jan 2026 at 20:39, Peter Xu <peterx@redhat.com> wrote:
> > Copying Dan and Jiri in case ...
> 
> * Thank you for copying.
> 
> >>           } else if (s->current_state == ACTIVE && s->trigger == 'error-occurred') {
> >>               s->current_state = STOP
> >>               s->reason = "Error occurred, migration failed"
> >
> > We can't change status that were already used, like FAILED.  Libvirt and
> > all mgmt may rely on it.
> 
> * True; If we decide to go in that direction, we'll have to tell
> libvirtd(8) and others about new states.
> 
> > Personally I don't see much benefit on adding a new "trigger" internal API.
> > If we want to forbid some state machine transitions, we can use a
> > transition map.  Said that, IMHO it's separate from what we're discussing
> > here.
> 
> * If we can reduce/rationalise the current 17-18 states and related
> complexity, it'll help to simplify things.

Personally I still don't see the benefit of doing so.

I agree MIGRATION_STATUS_CANCELLING isn't a must-have, but since we already
have it, and libvirt should treat it almost the same as ACTIVE, we don't
have a problem with it either.  It doesn't sound like a big issue.

I don't see how we can shrink the rest of the statuses otherwise; what would
you propose for removal?

Thanks,

-- 
Peter Xu
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Fabiano Rosas 1 month ago
Prasad Pandit <ppandit@redhat.com> writes:

> Hi,
>
> On Tue, 23 Dec 2025 at 21:00, Peter Xu <peterx@redhat.com> wrote:
>> One thing good about exposing such status via QAPI is, it can help us
>> diagnose issues by seeing CANCELLING / FAILING even looking at
>> query-migrate results (as normally when bug happens we can't see the
>> internal status..), so that we know either it's explicitly cancelled, or
>> something went wrong.
>>
>> If it's a completely hidden / internal status, we may see ACTIVE even if
>> something wrong happened..
>
> * Both process state and reason(s) for the state change needs to be
> visible to the user. But states like cancelling/failing are redundant,
> users would derive the same conclusion from CANCELLED and CANCELLING
> OR FAILED AND FAILING. Besides, migration_cleanup() does exactly the
> same steps irrespective of whether migration is failing or cancelling
> or failed or cancelled.
>
>> My current hope is any mgmt should normally by default ignore new migration
>> states..  If that's always achieved, it looks to me adding FAILING directly
>> into migration status would still have some benefits on debugging.
>
> * libvirtd(8) complains about unknown states multiple times:
>       libvirtd[2194267]: unknown status 'failing' in migration event
>       libvirtd[2194267]: unknown status 'failing' in migration event
>       libvirtd[2194267]: unknown status 'failing' in migration event
>
>
>> > > @@ -2907,7 +2914,7 @@ fail_closefb:
>> > >      qemu_fclose(fb);
>> > >  fail:
>> > >      if (ms->state != MIGRATION_STATUS_CANCELLING) {
>> > > -        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
>> > > +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILING);
>> > >      }
>> >
>> > This is a good example where having MigrationStatus makes the code more
>> > complicated. This could be exiting=true, running=false, etc. It
>> > shouldn't matter why this routine failed. If we reach
>> > migration_cleanup() and, at the very end, state is CANCELLING, then we
>> > know the cancel command has caused this and set the state to
>> > CANCELLED. If the state was something else, then it's unintended and we
>> > set FAILED.
>>
>> If it'll be an internal status, we'll still need to identify if someone
>> already have cancelled it, right?
>>
>> Assuming we introduce stop_reason flag, making it:
>>
>> enum {
>>     MIG_STOP_REASON_CANCEL,
>>     MIG_STOP_REASON_FAIL,
>> } MigrationStopReason;
>>
>> Then we can switch to CANCELLED / FAILED when cleanup from those reasons.
>>
>> Then here, logically we also need logic like:
>>
>>     if (stop_reason != MIG_STOP_REASON_CANCEL) {
>>         stop_reason = MIG_STOP_REASON_FAIL;
>>     }
>>
>> Because we want to make sure when the user already triggered cancel, it
>> won't show FAILED but only show CANCELLED at last?
>
> * I think the way we are setting/changing these states in as many
> locations is only adding to the complications. Do we have to
> explicitly set these states like this? What if migration_cleanup()
> always sets the state to 'STOP'. Similarly other places set the state
> to a predefined state. OR
> ===
>     struct {
>         current_state;
>         old_state;
>         event/trigger;
>         reason[];
>     } MigrationState s;
>
>     migration_change_state(s) {
>           s->old_state = s->current_state;
>           if (s->current_state == START && s->trigger ==
> 'connections-established') {
>               s->current_state = ACTIVE;
>               s->reason = "connections-established, migration starting"
>           } else if (s->current_state == ACTIVE && s->trigger == 'completed') {
>               s->current_state = STOP
>               s->reason = "migration completed"
>           } else if (s->current_state == ACTIVE  && s->trigger == 'pause') {
>               s->current_state = PAUSE
>               s->reason = "pause, migration paused"
>           } else if (s->current_state == ACTIVE && s->trigger ==
> 'error-occurred') {
>               s->current_state = STOP
>               s->reason = "Error occurred, migration failed"
>           } else if (s->current_state == ACTIVE && s->trigger ==
> 'user-cancel') {
>               s->current_state = STOP
>               s->reason = "user-cancel, migration cancelled"
>          } else {
>               s->current_state = s->current_state;
>               warn_msg("unknown combination, maybe define a new rule?");
>          }
>     }
> ===
> * We define explicit rules for the state change and accordingly we
> only call migration_change_state() at any point and it'll change to an
> appropriate next state, recording the due reason for the change.
>

If we had a linear state transition table, i.e. a DFA without any
branching, that would be ideal. But since we have states that can reach
(and be reached from) multiple other states, we'll always need some
input to migration_change_state(). Here you're making it the
s->trigger. Where will that come from?

Looking at runstate.c and job.c, it seems we could at least define a
state transition table and do away with the second parameter to
migrate_set_state(s, old, new).
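
Maybe something like this (a rough sketch; the table entries are invented
just to show the shape, loosely modeled on runstate.c):

    static const bool migration_valid_transitions
        [MIGRATION_STATUS__MAX][MIGRATION_STATUS__MAX] = {
        [MIGRATION_STATUS_SETUP][MIGRATION_STATUS_ACTIVE] = true,
        [MIGRATION_STATUS_ACTIVE][MIGRATION_STATUS_CANCELLING] = true,
        [MIGRATION_STATUS_CANCELLING][MIGRATION_STATUS_CANCELLED] = true,
        /* ... */
    };

    void migrate_set_state(MigrationStatus *state, MigrationStatus new_state)
    {
        MigrationStatus old = *state;

        assert(migration_valid_transitions[old][new_state]);
        /* same atomic update + event emission as today */
    }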

As we've been discussing, the current state-change mechanism has the
dual purpose of emitting the state change event and also serving as
internal tracking of the migration state. It's not clear to me whether
you're covering both in this proposal or just one of them.

I don't think we've actually established what the goals of having any
state changes are. Do we even need state changes for internal tracking?
We could use your s->trigger as an enum and just check it wherever
necessary, and keep MIGRATION_STATUS exclusively for the external API,
in which case it's probably better to just set it unconditionally (in
many places migrate_set_state already takes the current state as an
argument, i.e. it doesn't care about the current state).

> Wdyt...?
>
> Thank you.
> ---
>   - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Prasad Pandit 1 month ago
Hi,

On Tue, 6 Jan 2026 at 19:17, Fabiano Rosas <farosas@suse.de> wrote:
> If we had a linear state transition table, i.e. a DFA without any
> branching, that would be ideal. But since we have states that can reach
> (and be reached from) multiple other states, then we'll always need some
> input to migration_change_state(). Here you're making it the
> s->trigger. Where will that come from?

* The trigger or reason can come from the place where we call
migration_change_state(); there we'll know whether migration has
paused, completed, failed or been cancelled.

* Even with branches, the process is still linear as it goes from
start to finish. Just that we can reach the end state via different
paths.
  ===
    $ grep -ri 'shutting' /var/log/libvirt/qemu/   | cut -d' ' -f 3- |
sort | uniq
     shutting down, reason=crashed
     shutting down, reason=destroyed
     shutting down, reason=failed
     shutting down, reason=migrated
     shutting down, reason=shutdown
===
As we can see, a guest VM can stop/shut down for various reasons.

* Between [migration-start] and [migration-end], we can define
events/triggers that will lead to the next state. For example:

      - START -> t:connection-established -> ACTIVE

We can reach the ACTIVE state only after connections are established,
not without that. If connection establishment fails, we reach the END.

     - START  -> t:connection-established -> ACTIVE ->  running   -> END

ACTIVE ->  t:error     ->  END

ACTIVE ->  t:cancel  ->  END

ACTIVE ->  t:pause   ->  PAUSED  -> t:resume -> ACTIVE

> Looking at runstate.c and job.c, it seems we could at least define a
> state transition table and do away with the second parameter to
> migrate_set_state(s, old, new).
>
> As we've been discussing, the current state-change mechanism has the
> dual purpose of emitting the state change event and also serving as
> internal tracking of the migration state. It's not clear to me whether
> you're covering both in this proposal or just one of them.

* We are not doing away with migration states, just reducing or
rationalising them to make things easier. Emitting state changes to
libvirtd(8) and internal tracking would still work the same. Just
that in functions like migration_is_running() we'd check only whether
the state is ACTIVE, instead of 10 other states which also indicate
that the migration is running.

> I don't think we've established actually what are the goals of having
> any state changes. Do we even need state changes for internal tracking?
> We could use your s->trigger as an enum and just check it wherever
> necessary. And keep the MIGRATION_STATUS exclusive for the external API,
> in which case, it's probably better to just set it unconditionally (in
> many places migrate_set_state already takes the current state as
> argument, i.e. it doesn't care about the current state).

* Well, as I see it, different states help us to
      1 - know where the process is at a given time, and in case of
errors/failures or other events, know what actions to take.
      2 - know what actions/triggers/events are possible.

ex. If an error/cancel occurs before the ACTIVE state, during connection
establishment, it may not have to go through migration_cleanup();
probably there's nothing to clean up. Whereas if an error/cancel occurs
after ACTIVE or in the PAUSED state, we know migration_cleanup() is
needed.  Similarly, if we receive a t:resume command when in ACTIVE
state, or a t:pause command in PAUSED state, we know there's
nothing to do and can ignore it.

Thank you.
---
  - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Fabiano Rosas 1 month ago
Prasad Pandit <ppandit@redhat.com> writes:

> Hi,
>
> On Tue, 6 Jan 2026 at 19:17, Fabiano Rosas <farosas@suse.de> wrote:
>> If we had a linear state transition table, i.e. a DFA without any
>> branching, that would be ideal. But since we have states that can reach
>> (and be reached from) multiple other states, then we'll always need some
>> input to migration_change_state(). Here you're making it the
>> s->trigger. Where will that come from?
>
> * The trigger or reason can come from the place where we call
> migration_change_state(), there we'll know whether migration has
> paused OR completed OR failed OR cancelled.
>

It might be interesting to restrict the set of
states/triggers/events (I'm not sure which) to user-visible phases only,
and those would be defined by anything that's triggered via a QMP
command. Plus the error state, which goes in the other direction.

In isolation, having a "trigger" for each QMP command seems like a good
idea to me. It could be just a flag that tells us which command is
currently being serviced. Most of migration consists of actions in
response to a QMP command. This could help with ensuring correctness in
concurrent invocations.
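
E.g. (names invented, just to illustrate):

    typedef enum {
        MIG_CMD_NONE,
        MIG_CMD_MIGRATE,
        MIG_CMD_CANCEL,
        MIG_CMD_PAUSE,
        MIG_CMD_RECOVER,
    } MigrationCommand;

    /* in MigrationState, set under the BQL while a QMP command is serviced */
    MigrationCommand active_cmd;

A concurrent qmp_migrate() could then be rejected simply because
active_cmd != MIG_CMD_NONE, without consulting MigrationStatus at all.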

> * Even with branches, the process is still linear as it goes from
> start to finish. Just that we can reach the end state via different
> paths.
>   ===
>     $ grep -ri 'shutting' /var/log/libvirt/qemu/   | cut -d' ' -f 3- |
> sort | uniq
>      shutting down, reason=crashed
>      shutting down, reason=destroyed
>      shutting down, reason=failed
>      shutting down, reason=migrated
>      shutting down, reason=shutdown
> ===
> As we see, guest VM can stop/shutdown due to various reasons.
>
> * Between [migration-start] and [migration-end], we can define
> events/triggers that will lead to the next state. Ex
>
>       - START -> t:connection-established -> ACTIVE

I like this because it forces us to determine more clearly what the
necessary condition for a state change is. This could eventually allow
abstracting qapi_event_send_migration() into a higher
layer. Something like:

void qmp_migrate() {
    t:migrate=true
    
    migration_setup() :: do setup, mig_setup_done=true
    migration_advance_state() :: checks the triggers, changes state and
                                 sends the event

    migration_start() :: migrate, mig_done=true
                         failure, mig_failed=true
                         etc
    migration_advance_state()

    migrate_vmstate() :: device state migration, mig_device_done=true
    migration_advance_state()

 etc..
}

IOW, we could do a better job of separating what is work, what is
migration control flow, what is error handling, etc.

>
> We can reach the ACTIVE state only after connections are established,
> not without that. If connection establishment fails, we reach the END.
>
>      - START  -> t:connection-established -> ACTIVE ->  running   -> END
>
> ACTIVE ->  t:error     ->  END
>
> ACTIVE ->  t:cancel  ->  END
>
> ACTIVE ->  t:pause   ->  PAUSED  -> t:resume -> ACTIVE
>

Looks good, not sure if the actual migration flow would fit this, but
let's assume it would.

>> Looking at runstate.c and job.c, it seems we could at least define a
>> state transition table and do away with the second parameter to
>> migrate_set_state(s, old, new).
>>
>> As we've been discussing, the current state-change mechanism has the
>> dual purpose of emitting the state change event and also serving as
>> internal tracking of the migration state. It's not clear to me whether
>> you're covering both in this proposal or just one of them.
>
> * We are not doing away with migration states, just reducing or
> rationalising them to make it easier. Emitting state change to
> libvirtd(8) and internal tracking should still serve the same. Just
> that in migration_is_running() etc. functions we'll check only if the
> state is ACTIVE, instead of 10 other states which also indicate that
> the migration is running.
>

What I'm trying to convey is that we have:

1) events API that needs to be kept stable, this list of states that
   libvirt sees and at what moments we emit them.

2) MigrationStatus being used as an internal record of the current
   (loosely defined) migration phase. This is "arbitrary", hence we're
   discussing adding a new MigrationStatus "just" to make sure we don't
   start a new migration at the wrong moment.

I'm trying to understand if you want to cover 1, 2 or both.

I would suggest we first take all of the internal tracking, i.e. #2, the
"if (state==MIGRATION_STATUS)" code and convert them to use some other
state tracking, either the triggers as you suggest, or random booleans
sprinkled all over, it's not immediately important.

Once that is done, then we could freeze the #1, MigrationStatus. It
would only change whenever we wanted to change the API and that should
be a well documented change.

>> I don't think we've established actually what are the goals of having
>> any state changes. Do we even need state changes for internal tracking?
>> We could use your s->trigger as an enum and just check it wherever
>> necessary. And keep the MIGRATION_STATUS exclusive for the external API,
>> in which case, it's probably better to just set it unconditionally (in
>> many places migrate_set_state already takes the current state as
>> argument, i.e. it doesn't care about the current state).
>
> * Well as I see it, different states help us to
>       1 - know where the process is at a given time. In case of
> errors/failures or other events to know what actions to take.
>       2 - what actions/triggers/events are possible.
>
> ex. If an error/cancel occurs before ACTIVE state, during connection
> establishment, it may not have to go through migration_cleanup(),
> probably there's nothing to cleanup. Vs if an error/cancel occurs
> after ACTIVE  or in PAUSED state, we know migration_cleanup() is
> needed.  Similarly if we receive t:resume command when in ACTIVE
> state, OR receive t:pause command in PAUSED state,  we know there's
> nothing to do and ignore it.
>

Ok, maybe I'm splitting hairs here, I was trying to understand whether
all of these "if (s->state ...)" have the same semantics.

a) For cases such as CANCELLING: that could be a simple
   s->trigger[MIGRATE_CANCEL]=1.

  (we're not removing the CANCELLING state due to the API stability, but
  still)

b) For error conditions: s->event[FAILED]=1, then (possibly at a later
   point in migration_change_state):

   if (s->event[FAILED] && !s->trigger[MIGRATE_CANCEL]) {
      migrate_set_state(s->state, MIGRATION_STATUS_FAILED);
   }

c) For postcopy resume/pause, etc., maybe an actual state machine that can
   only be in one state would be helpful.

c) For "we reached this point, so set this state", most of those could
   just be an invocation to migration_change_state() and, as you
   suggest, that would look for the evidence elsewhere to know what
   state to set:

   if (s->trigger[MIGRATE] && s->event[COMPLETED]) {
      migrate_set_state(s->state, MIGRATION_STATUS_COMPLETED);
   }
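
Putting (a) to (d) together, the final state could then be resolved in
one place, e.g. at cleanup time. A purely illustrative sketch, where
s->trigger/s->event are just the notation above, not existing fields:

    static void migration_change_state(MigrationState *s)
    {
        /* same logic as the !MIGRATE_CANCEL check in (b): cancel wins */
        if (s->trigger[MIGRATE_CANCEL]) {
            migrate_set_state(&s->state, s->state, MIGRATION_STATUS_CANCELLED);
        } else if (s->event[FAILED]) {
            migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
        } else if (s->trigger[MIGRATE] && s->event[COMPLETED]) {
            migrate_set_state(&s->state, s->state, MIGRATION_STATUS_COMPLETED);
        }
    }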
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Prasad Pandit 3 weeks, 4 days ago
Hello Fabiano,

On Wed, 7 Jan 2026 at 18:54, Fabiano Rosas <farosas@suse.de> wrote:
> I like this because it forces us to determine more clearly what is the
> necessary condition for a state change. This could eventually allow the
> abstraction of the qapi_event_send_migration() to a higher
> layer. Something like:
>
> void qmp_migrate() {
>     t:migrate=true
>
>     migration_setup() :: do setup, mig_setup_done=true
>     migration_advance_state() :: checks the triggers, changes state and
>                                  sends the event
>
>     migration_start() :: migrate, mig_done=true
>                          failure, mig_failed=true
>                          etc
>     migration_advance_state()
>
>     migrate_vmstate() :: device state migration, mig_device_done=true
>     migration_advance_state()
>
>  etc..
> }
>
> IOW, we could do a better job of separating what is work, what is
> migration control flow, what is error handling, etc.

* Yes, indeed. Above skeleton code conveys the plausible
segregation/stages well.

> What I'm trying to convey is that we have:
>
> 1) events API that needs to be kept stable, this list of states that
>    libvirt sees and at what moments we emit them.
===
  qemuProcessHandleMigrationStatus & qemuMigrationUpdateJobType
    -> https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_process.c#L1766
    -> https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_migration.c?ref_type=heads#L1931
===
* I was trying to see how libvirtd(8) handles QEMU migration states.
Looking at those functions, it seems they don't do much with the
states. Only MIGRATION_STATUS_POSTCOPY_* gets some handling, while the
other states are not used for anything special. Interestingly, there's
no _FAILED state in there; maybe they call it _ERROR.

* While I get the importance of not breaking APIs, still, simplifying
migration states on the QEMU side should help them too.

> 2) MigrationStatus being used as an internal record of the current
>    (loosely defined) migration phase. This is "arbitrary", hence we're
>    discussing adding a new MigrationStatus "just" to make sure we don't
>    start a new migration at the wrong moment.
>
> I'm trying to understand if you want to cover 1, 2 or both.
>
> I would suggest we first take all of the internal tracking, i.e. #2, the
> "if (state==MIGRATION_STATUS)" code and convert them to use some other
> state tracking, either the triggers as you suggest, or random booleans
> sprinkled all over, it's not immediately important.
>
> Once that is done, then we could freeze the #1, MigrationStatus. It
> would only change whenever we wanted to change the API and that should
> be a well documented change.

* Yes, sounds good. We could start with the QEMU internal state/phase
tracking and then go to #1 above once we see how it all works in
practice.

> Ok, maybe I'm splitting hairs here, I was trying to understand whether
> all of these "if (s->state ...)" have the same semantics.
>
> a) For cases such as CANCELLING: that could be a simple
>    s->trigger[MIGRATE_CANCEL]=1.
>
>   (we're not removing the CANCELLING state due to the API stability, but
>   still)
>
> b) For error conditions: s->event[FAILED]=1, then (possibly at a later
>    point in migration_change_state):
>
>    if (s->event[FAILED] && !s->trigger[MIGRATE_CANCEL]) {
>       migrate_set_state(s->state, MIGRATION_STATUS_FAILED);
>    }

* Do we have to check !MIGRATE_CANCEL like this? It's not clean.
Ideally if an error/failure event occurs before the user cancels, then
cancel can be ignored, no? Because migration is anyway going to stop
or end. OTOH, if we cancel while processing an error/failure, end user
may not see that error because we report - migration was cancelled.

> c) For postcopy resume/pause, etc., maybe an actual state machine that can
>    only be in one state would be helpful.
>
> c) For "we reached this point, so set this state", most of those could
>    just be an invocation to migration_change_state() and, as you
>    suggest, that would look for the evidence elsewhere to know what
>    state to set:
>
>    if (s->trigger[MIGRATE] && s->event[COMPLETED]) {
>       migrate_set_state(s->state, MIGRATION_STATUS_COMPLETED);
>    }

* Yes, right. We need to define/differentiate between _what_ is the
state and _why_ is that state.

* How do we go from here? Next step?

Thank you.
---
  - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Fabiano Rosas 3 weeks, 4 days ago
Prasad Pandit <ppandit@redhat.com> writes:

> Hello Fabiano,
>
> On Wed, 7 Jan 2026 at 18:54, Fabiano Rosas <farosas@suse.de> wrote:
>> I like this because it forces us to determine more clearly what is the
>> necessary condition for a state change. This could eventually allow the
>> abstraction of the qapi_event_send_migration() to a higher
>> layer. Something like:
>>
>> void qmp_migrate() {
>>     t:migrate=true
>>
>>     migration_setup() :: do setup, mig_setup_done=true
>>     migration_advance_state() :: checks the triggers, changes state and
>>                                  sends the event
>>
>>     migration_start() :: migrate, mig_done=true
>>                          failure, mig_failed=true
>>                          etc
>>     migration_advance_state()
>>
>>     migrate_vmstate() :: device state migration, mig_device_done=true
>>     migration_advance_state()
>>
>>  etc..
>> }
>>
>> IOW, we could do a better job of separating what is work, what is
>> migration control flow, what is error handling, etc.
>
> * Yes, indeed. Above skeleton code conveys the plausible
> segregation/stages well.
>
>> What I'm trying to convey is that we have:
>>
>> 1) events API that needs to be kept stable, this list of states that
>>    libvirt sees and at what moments we emit them.
> ===
>   qemuProcessHandleMigrationStatus & qemuMigrationUpdateJobType
>     -> https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_process.c#L1766
>     -> https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_migration.c?ref_type=heads#L1931
> ===
> * I was trying to see how libvirtd(8) handles QEMU migration states.
> Looking at those functions, it seems they don't do much with the
> states. Only MIGRATION_STATUS_POSTCOPY_* gets some handling, while the
> other states are not used for anything special. Interestingly, there's
> no _FAILED state in there; maybe they call it _ERROR.
>
> * While I get the importance of not breaking APIs, still, simplifying
> migration states on the QEMU side should help them too.
>

Also remember libvirt is not the only consumer of events from QEMU,
there are other platforms as well.

>> 2) MigrationStatus being used as an internal record of the current
>>    (loosely defined) migration phase. This is "arbitrary", hence we're
>>    discussing adding a new MigrationStatus "just" to make sure we don't
>>    start a new migration at the wrong moment.
>>
>> I'm trying to understand if you want to cover 1, 2 or both.
>>
>> I would suggest we first take all of the internal tracking, i.e. #2, the
>> "if (state==MIGRATION_STATUS)" code and convert them to use some other
>> state tracking, either the triggers as you suggest, or random booleans
>> sprinkled all over, it's not immediately important.
>>
>> Once that is done, then we could freeze the #1, MigrationStatus. It
>> would only change whenever we wanted to change the API and that should
>> be a well documented change.
>
> * Yes, sounds good. We could start with the QEMU internal state/phase
> tracking and then go to #1 above once we see how it all works in
> practice.
>
>> Ok, maybe I'm splitting hairs here, I was trying to understand whether
>> all of these "if (s->state ...)" have the same semantics.
>>
>> a) For cases such as CANCELLING: that could be a simple
>>    s->trigger[MIGRATE_CANCEL]=1.
>>
>>   (we're not removing the CANCELLING state due to the API stability, but
>>   still)
>>
>> b) For error conditions: s->event[FAILED]=1, then (possibly at a later
>>    point in migration_change_state):
>>
>>    if (s->event[FAILED] && !s->trigger[MIGRATE_CANCEL]) {
>>       migrate_set_state(s->state, MIGRATION_STATUS_FAILED);
>>    }
>
> * Do we have to check !MIGRATE_CANCEL like this? It's not clean.

There are failures that happen _because_ we cancelled. As I've mentioned
somewhere else before, the cancellation is not "informed" to all threads
running migration code, there are some code paths that will simply fail
as a result of migration_cancel(). We need to allow cancelling to work
in a possibly stuck thread (such as a blocked recv in the return path),
but this means we end up calling qemu_file_shutdown indiscriminately.

In these cases, parts of the code would set FAILED, but that failure is
a result of cancelling. We've determined that migrate-cancel should
always lead to CANCELLED and a new migration should always be possible.

> Ideally if an error/failure event occurs before the user cancels, then
> cancel can be ignored, no? Because migration is anyway going to stop
> or end.

This is ok, call it an error and done.

> OTOH, if we cancel while processing an error/failure, end user
> may not see that error because we report - migration was cancelled.
>

This is interesting, I _think_ it wouldn't be possible to cancel while
handling an error due to BQL locked, the migrate-cancel wouldn't be
issued while migration_cleanup is ongoing. However, I don't think we ever
tested this scenario in particular. Maybe you could try to catch
something by modifying the /migration/cancel tests, if you're willing.


>> c) For postcopy resume/pause, etc., maybe an actual state machine that can
>>    only be in one state would be helpful.
>>
>> c) For "we reached this point, so set this state", most of those could
>>    just be an invocation to migration_change_state() and, as you
>>    suggest, that would look for the evidence elsewhere to know what
>>    state to set:
>>
>>    if (s->trigger[MIGRATE] && s->event[COMPLETED]) {
>>       migrate_set_state(s->state, MIGRATION_STATUS_COMPLETED);
>>    }
>
> * Yes, right. We need to define/differentiate between _what_ is the
> state and _why_ is that state.
>
> * How do we go from here? Next step?
>

You could send a PoC patch with your idea fixing this FAILING bug? We'd
need a trigger for migrate, set_caps, etc and the failed event.

If that new patch doesn't get consensus then we merge this one and work
on a new design as time permits.

---

Aside from the QAPI states, there are some internal states we already
track with separate flags, e.g.:

rp_thread_created, start_postcopy, migration_thread_running,
switchover_acked, postcopy_package_loaded, fault_thread_quit,
preempt_thread_status, load_threads_abort.

A bit array could maybe cover all of these and more.
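
For illustration, with made-up names (a single word is enough while the
flag count stays below 64; real code would need to consider which
threads set which bits and use atomics where necessary):

    /* assume MigrationState grows an "unsigned long flags" member */
    enum {
        MIG_FLAG_RP_THREAD_CREATED,
        MIG_FLAG_START_POSTCOPY,
        MIG_FLAG_THREAD_RUNNING,
        MIG_FLAG_SWITCHOVER_ACKED,
        /* ... */
    };

    static inline void mig_flag_set(MigrationState *s, int flag)
    {
        s->flags |= 1UL << flag;
    }

    static inline bool mig_flag_test(MigrationState *s, int flag)
    {
        return !!(s->flags & (1UL << flag));
    }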
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Prasad Pandit 3 weeks, 3 days ago
On Tue, 13 Jan 2026 at 01:15, Fabiano Rosas <farosas@suse.de> wrote:
> There are failures that happen _because_ we cancelled. As I've mentioned
> somewhere else before, the cancellation is not "informed" to all threads
> running migration code, there are some code paths that will simply fail
> as a result of migration_cancel(). We need to allow cancelling to work
> in a possibly stuck thread (such as a blocked recv in the return path),
> but this means we end up calling qemu_file_shutdown indiscriminately.
> In these cases, parts of the code would set FAILED, but that failure is
> a result of cancelling. We've determined that migrate-cancel should
> always lead to CANCELLED and a new migration should always be possible.

* I see.

> This is ok, call it an error and done.
>
> > OTOH, if we cancel while processing an error/failure, end user
> > may not see that error because we report - migration was cancelled.
> >
>
> This is interesting, I _think_ it wouldn't be possible to cancel while
> handling an error due to BQL locked, the migrate-cancel wouldn't be
> issued while migration_cleanup is ongoing. However, I don't think we ever
> tested this scenario in particular. Maybe you could try to catch
> something by modifying the /migration/cancel tests, if you're willing.

* I have made a note of looking at it at a later time.

> Aside from the QAPI states, there are some internal states we already
> track with separate flags, e.g.:
>
> rp_thread_created, start_postcopy, migration_thread_running,
> switchover_acked, postcopy_package_loaded, fault_thread_quit,
> preempt_thread_status, load_threads_abort.
>
> A bit array could maybe cover all of these and more.
>
> ---
>
> You could send a PoC patch with your idea fixing this FAILING bug? We'd
> need a trigger for migrate, set_caps, etc and the failed event.
>
> If that new patch doesn't get consensus then we merge this one and work
> on a new design as time permits.

* Considering the above wider coverage area, I think it is best to
first fix the issue at hand and then move to this new change. For now
I'll try to rebase my current patch on your v3: cleanup early
connection code series. Once that is through, I'll take the states
change patch. Hope that's okay.

Thank you.
---
  - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Fabiano Rosas 3 weeks, 2 days ago
Prasad Pandit <ppandit@redhat.com> writes:

> On Tue, 13 Jan 2026 at 01:15, Fabiano Rosas <farosas@suse.de> wrote:
>> There are failures that happen _because_ we cancelled. As I've mentioned
>> somewhere else before, the cancellation is not "informed" to all threads
>> running migration code, there are some code paths that will simply fail
>> as a result of migration_cancel(). We need to allow cancelling to work
>> in a possibly stuck thread (such as a blocked recv in the return path),
>> but this means we end up calling qemu_file_shutdown indiscriminately.
>> In these cases, parts of the code would set FAILED, but that failure is
>> a result of cancelling. We've determined that migrate-cancel should
>> always lead to CANCELLED and a new migration should always be possible.
>
> * I see.
>
>> This is ok, call it an error and done.
>>
>> > OTOH, if we cancel while processing an error/failure, end user
>> > may not see that error because we report - migration was cancelled.
>> >
>>
>> This is interesting, I _think_ it wouldn't be possible to cancel while
>> handling an error due to BQL locked, the migrate-cancel wouldn't be
>> issued while migration_cleanup is ongoing. However, I don't think we ever
>> tested this scenario in particular. Maybe you could try to catch
>> something by modifying the /migration/cancel tests, if you're willing.
>
> * I have made a note of looking at it at a later time.
>
>> Aside from the QAPI states, there are some internal states we already
>> track with separate flags, e.g.:
>>
>> rp_thread_created, start_postcopy, migration_thread_running,
>> switchover_acked, postcopy_package_loaded, fault_thread_quit,
>> preempt_thread_status, load_threads_abort.
>>
>> A bit array could maybe cover all of these and more.
>>
>> ---
>>
>> You could send a PoC patch with your idea fixing this FAILING bug? We'd
>> need a trigger for migrate, set_caps, etc and the failed event.
>>
>> If that new patch doesn't get consensus then we merge this one and work
>> on a new design as time permits.
>
> * Considering the above wider coverage area, I think it is best to
> first fix the issue at hand and then move to this new change. For now
> I'll try to rebase my current patch on your v3: cleanup early
> connection code series. Once that is through, I'll take the states
> change patch. Hope that's okay.
>

Ok, go ahead.

> Thank you.
> ---
>   - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Fabiano Rosas 1 month, 2 weeks ago
Peter Xu <peterx@redhat.com> writes:

> On Mon, Dec 22, 2025 at 11:29:57AM -0300, Fabiano Rosas wrote:
>> I'm fine with the general idea:
>> 
>> i) FAILED and CANCELLED are terminal states. It makes sense to not have
>> work happen after they're set.
>> 
>> ii) Using an intermediate state, assuming locking/atomic are correct is
>> a suitable fix for the issue.
>> 
>> iii) Using a FAILING status seems appropriate.
>> 
>> However,
>> 
>> It would be great if we could stop exposing implementation details via
>> QAPI. Does the user really need to see events for CANCELLING and
>> FAILING?
>> 
>> It would probably be easier if we kept MigrationStatus as QAPI only and
>> used a separate mechanism to track the internal states.
>> 
>> That said, we could merge this as is to fix the bug and think about that
>> later.
>
> This bug looks to be there for a long time, IMHO we don't need to rush
> fixing it if we risk adding a new status and revert it quickly...  Let's
> discuss it here, and it's a valid question indeed.
>
> One thing good about exposing such status via QAPI is, it can help us
> diagnose issues by seeing CANCELLING / FAILING even looking at
> query-migrate results (as normally when bug happens we can't see the
> internal status..), so that we know either it's explicitly cancelled, or
> something went wrong.
>
> If it's a completely hidden / internal status, we may see ACTIVE even if
> something wrong happened..
>

On the other hand, we could have more fine-grained statuses and track
them with tracepoints.

> My current hope is any mgmt should normally by default ignore new migration
> states..  If that's always achieved, it looks to me adding FAILING directly
> into migration status would still have some benefits on debugging.
>

Maybe for FAILING it's ok, as we already have CANCELLING and FAILED is
currently mismatched.

> [...]
>
>> > @@ -2907,7 +2914,7 @@ fail_closefb:
>> >      qemu_fclose(fb);
>> >  fail:
>> >      if (ms->state != MIGRATION_STATUS_CANCELLING) {
>> > -        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
>> > +        migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILING);
>> >      }
>> 
>> This is a good example where having MigrationStatus makes the code more
>> complicated. This could be exiting=true, running=false, etc. It
>> shouldn't matter why this routine failed. If we reach
>> migration_cleanup() and, at the very end, state is CANCELLING, then we
>> know the cancel command has caused this and set the state to
>> CANCELLED. If the state was something else, then it's unintended and we
>> set FAILED.
>
> If it'll be an internal status, we'll still need to identify if someone
> already have cancelled it, right?
>

I was thinking of keeping CANCELLING, since it's already in the API.

> Assuming we introduce stop_reason flag, making it:
>
> enum {
>     MIG_STOP_REASON_CANCEL,
>     MIG_STOP_REASON_FAIL,
> } MigrationStopReason;
>
> Then we can switch to CANCELLED / FAILED when cleanup from those reasons.
>
> Then here, logically we also need logic like:
>
>     if (stop_reason != MIG_STOP_REASON_CANCEL) {
>         stop_reason = MIG_STOP_REASON_FAIL;
>     }
>
> Because we want to make sure when the user already triggered cancel, it
> won't show FAILED but only show CANCELLED at last?
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Prasad Pandit 1 month, 2 weeks ago
Hello Fabiano,

* Thanks so much for the quick comments.

On Mon, 22 Dec 2025 at 20:00, Fabiano Rosas <farosas@suse.de> wrote:
> This doesn't look like it's caused by set-capabilities. Seems like:
> Please clarify, there might be some other bug lurking around, such as a
> locking issue with qemu_file_lock or even the BQL.
>
> Also, if possible provide an upstream backtrace, or at least mention the
> code path based on upstream code. It's ok to keep the downstream
> backtrace, but point that out in the commit message.

* Right, migrate_fd_cleanup was renamed to migration_cleanup() in
    -> https://gitlab.com/qemu-project/qemu/-/commit/4bbadfc55e6ec608df75911b4360e6e995daa28c
===
libvirtd(8) 19:03:07.260+0000: 396587: error :
qemuMigrationJobCheckStatus:2056 : operation failed: job 'migration
out' failed: Unable to write to socket: Connection reset by peer
libvirtd(8) 19:03:07.261+0000: 396587: info : qemuMonitorSend:836 :
QEMU_MONITOR_SEND_MSG: mon=0xffffa000e010
msg={"execute":"migrate-set-capabilities","arguments":


qemu-system-aarch64:initiating migration
qemu-system-aarch64: migrate_fd_cleanup: to_dst_file: 1: 0xaaaaccce0110
qemu-system-aarch64: migrate_fd_cleanup: before multifd_send_shutdown:
0   <== multifd disabled
qemu-system-aarch64: migrate_fd_cleanup: to_dst_file: 2: (nil)
qemu-system-aarch64: Unable to write to socket: Connection reset by peer
qemu-system-aarch64: ../util/yank.c:107: yank_unregister_instance:
Assertion `QLIST_EMPTY(&entry->yankfns)' failed.
qemu-system-aarch64:shutting down, reason=crashed
===
* As seen above, when connection is reset
     1) libvirtd(8) sends the migrate-set-capabilities command to QEMU
to reset the migration options to false(0). This leads to resetting of
s->capabilities[MIGRATION_CAPABILITY_MULTIFD]  to false.
     2) When migration_cleanup (earlier migrate_fd_cleanup) is called
after above reset
          migration_cleanup
           -> multifd_send_shutdown
            -> if (!migrate_multifd()) {
                     return;   <== returns because _CAPABILITY_MULTIFD
is reset to false(0).
                }
         when _CAPABILITY_MULTIFD is reset to false,
multifd_send_shutdown() returns without really doing its multifd
cleanup, ie. multifd objects still stay alive, are not freed.

* And that leads to the said assert(3) failure in
     migration_cleanup()
     {
         ...
         multifd_send_shutdown()  <== does not cleanup
         ...
         yank_unregister_instance(MIGRATION_YANK_INSTANCE);   <==
assert(3) failure
     }
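
For reference, migrate_multifd() is essentially just a read of that
capability (roughly, per migration/options.c):

    bool migrate_multifd(void)
    {
        MigrationState *s = migrate_get_current();

        return s->capabilities[MIGRATION_CAPABILITY_MULTIFD];
    }

so once libvirtd(8) clears the capability, multifd_send_shutdown() takes
the early return above even though the multifd channels were set up.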

> I'm fine with the general idea:
>
> i) FAILED and CANCELLED are terminal states. It makes sense to not have
> work happen after they're set.
>
> ii) Using an intermediate state, assuming locking/atomic are correct is
> a suitable fix for the issue.
>
> iii) Using a FAILING status seems appropriate.
>
> However,
>
> It would be great if we could stop exposing implementation details via
> QAPI. Does the user really need to see events for CANCELLING and
> FAILING?

* Not really; libvirtd(8)/user only needs to know about the success OR
failure and appropriate reasons.

> This is a good example where having MigrationStatus makes the code more
> complicated. This could be exiting=true, running=false, etc. It
> shouldn't matter why this routine failed.

* True, it is rather complicated. Though currently we have 17-18
migration states defined, going by the functions
migration_is_running(), migration_is_active(), and
migration_has_failed(), migration process really has only 3-4 states:
      0 -> stopped (non-active, not-started-yet)
      1 -> running/active (sending-receive migration data)
      2 -> paused/active (not-running, not-send-recv-migration-data)
      3 -> stopped/failed/completed (non-active, not-running)
Other interim states of initiating/setting connections OR
cancelling/failing etc. could be tracked differently.
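
In code-ish terms, that reduced set could look like this (illustrative
only, names made up):

    typedef enum {
        MIG_PHASE_IDLE,      /* 0: not started yet */
        MIG_PHASE_ACTIVE,    /* 1: sending/receiving migration data */
        MIG_PHASE_PAUSED,    /* 2: e.g. postcopy paused, no data moving */
        MIG_PHASE_DONE,      /* 3: completed, failed or cancelled */
    } MigPhase;

    static inline bool mig_phase_is_running(MigPhase p)
    {
        return p == MIG_PHASE_ACTIVE || p == MIG_PHASE_PAUSED;
    }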


Thank you.
---
  - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Fabiano Rosas 1 month, 2 weeks ago
Prasad Pandit <ppandit@redhat.com> writes:

> Hello Fabiano,
>
> * Thanks so much for the quick comments.
>
> On Mon, 22 Dec 2025 at 20:00, Fabiano Rosas <farosas@suse.de> wrote:
>> This doesn't look like it's caused by set-capabilities. Seems like:
>> Please clarify, there might be some other bug lurking around, such as a
>> locking issue with qemu_file_lock or even the BQL.
>>
>> Also, if possible provide an upstream backtrace, or at least mention the
>> code path based on upstream code. It's ok to keep the downstream
>> backtrace, but point that out in the commit message.
>
> * Right, migrate_fd_cleanup was renamed to migration_cleanup() in
>     -> https://gitlab.com/qemu-project/qemu/-/commit/4bbadfc55e6ec608df75911b4360e6e995daa28c

Yep, 10 years from now someone will look at the code and the commit
message and be confused when they don't match. Also, for anyone
backporting or searching for bug fixes, it's good to keep things tight.

I can amend while queuing if you don't mind.

> ===
> libvirtd(8) 19:03:07.260+0000: 396587: error :
> qemuMigrationJobCheckStatus:2056 : operation failed: job 'migration
> out' failed: Unable to write to socket: Connection reset by peer
> libvirtd(8) 19:03:07.261+0000: 396587: info : qemuMonitorSend:836 :
> QEMU_MONITOR_SEND_MSG: mon=0xffffa000e010
> msg={"execute":"migrate-set-capabilities","arguments":
>
>
> qemu-system-aarch64:initiating migration
> qemu-system-aarch64: migrate_fd_cleanup: to_dst_file: 1: 0xaaaaccce0110
> qemu-system-aarch64: migrate_fd_cleanup: before multifd_send_shutdown:
> 0   <== multifd disabled
> qemu-system-aarch64: migrate_fd_cleanup: to_dst_file: 2: (nil)
> qemu-system-aarch64: Unable to write to socket: Connection reset by peer
> qemu-system-aarch64: ../util/yank.c:107: yank_unregister_instance:
> Assertion `QLIST_EMPTY(&entry->yankfns)' failed.
> qemu-system-aarch64:shutting down, reason=crashed
> ===
> * As seen above, when connection is reset
>      1) libvirtd(8) sends the migrate-set-capabilities command to QEMU
> to reset the migration options to false(0). This leads to resetting of
> s->capabilities[MIGRATION_CAPABILITY_MULTIFD]  to false.
>      2) When migration_cleanup (earlier migrate_fd_cleanup) is called
> after above reset
>           migration_cleanup
>            -> multifd_send_shutdown
>             -> if (!migrate_multifd()) {
>                      return;   <== returns because _CAPABILITY_MULTIFD
> is reset to false(0).
>                 }
>          when _CAPABILITY_MULTIFD is reset to false,
> multifd_send_shutdown() returns without really doing its multifd
> cleanup, ie. multifd objects still stay alive, are not freed.
>
> * And that leads to the said assert(3) failure in
>      migration_cleanup()
>      {
>          ...
>          multifd_send_shutdown()  <== does not cleanup

Ok, I forgot there were yank functions for the multifd channels as well.

It seems multifd.c could be made more robust so it doesn't need to check
migrate_multifd() so much. For the shutdown call, for instance, checking
(!multifd_send_state) would suffice. Maybe a general point for us to
keep in mind: relying on such a high-level knob might not be the best
idea.
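
I.e. roughly this shape (sketch only, the real function does more work):

    void multifd_send_shutdown(void)
    {
        if (!multifd_send_state) {
            /* multifd was never set up on this side, nothing to tear down */
            return;
        }
        /*
         * ... terminate the channels, join the send threads and free
         * multifd_send_state, independent of the capability setting ...
         */
    }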

>          ...
>          yank_unregister_instance(MIGRATION_YANK_INSTANCE);   <==
> assert(3) failure
>      }
>
>> I'm fine with the general idea:
>>
>> i) FAILED and CANCELLED are terminal states. It makes sense to not have
>> work happen after they're set.
>>
>> ii) Using an intermediate state, assuming locking/atomic are correct is
>> a suitable fix for the issue.
>>
>> iii) Using a FAILING status seems appropriate.
>>
>> However,
>>
>> It would be great if we could stop exposing implementation details via
>> QAPI. Does the user really need to see events for CANCELLING and
>> FAILING?
>
> * Not really; libvirtd(8)/user only needs to know about the success OR
> failure and appropriate reasons.
>
>> This is a good example where having MigrationStatus makes the code more
>> complicated. This could be exiting=true, running=false, etc. It
>> shouldn't matter why this routine failed.
>
> * True, it is rather complicated. Though currently we have 17-18
> migration states defined, going by the functions
> migration_is_running(), migration_is_active(), and
> migration_has_failed(), migration process really has only 3-4 states:
>       0 -> stopped (non-active, not-started-yet)
>       1 -> running/active (sending-receive migration data)
>       2 -> paused/active (not-running, not-send-recv-migration-data)
>       3 -> stopped/failed/completed (non-active, not-running)
> Other interim states of initiating/setting connections OR
> cancelling/failing etc. could be tracked differently.
>

Yes, the amount of states migration_is_running() checks is an indication
that we're making life harder for ourselves. It would be nice if we
could have some idea how to improve this the next time a patch like this
comes along.
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Prasad Pandit 1 month ago
Hi,

On Tue, 23 Dec 2025 at 20:34, Fabiano Rosas <farosas@suse.de> wrote:
> Yep, 10 years from now someone will look at the code and the commit
> message and be confused when they don't match. Also, for anyone
> backporting or searching for bug fixes, it's good to keep things tight.
>

* Looking at how we are changing function names, I think it is best to
avoid stack traces in commit messages.
...
> Yes, the amount of states migration_is_running() checks is an indication
> that we're making life harder for ourselves. It would be nice if we
> could have some idea how to improve this the next time a patch like this
> comes along.

True.

Thank you.
---
  - Prasad
Re: [PATCH] migration: introduce MIGRATION_STATUS_FAILING
Posted by Peter Xu 1 month ago
On Tue, Jan 06, 2026 at 04:24:23PM +0530, Prasad Pandit wrote:
> Hi,
> 
> On Tue, 23 Dec 2025 at 20:34, Fabiano Rosas <farosas@suse.de> wrote:
> > Yep, 10 years from now someone will look at the code and the commit
> > message and be confused when they don't match. Also, for anyone
> > backporting or searching for bug fixes, it's good to keep things tight.
> >
> 
> * Looking at how we are changing function names, I think it is best to
> avoid stack traces in commit messages.

Please always attach a backtrace when available.  One can always check out
the specific commit with all the correct function names.  Renaming is never
an issue.

Thanks,

-- 
Peter Xu