[PATCH v3 06/10] migration-test: Add COLO migration unit test
Posted by Lukas Straub 2 weeks, 1 day ago
Add a test for COLO migration and failover: migrate into COLO, wait for
a few checkpoints to complete, then fail over to either the primary or
the secondary side, optionally in the middle of a checkpoint, for both
plain and multifd migration.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
 MAINTAINERS                        |   1 +
 tests/qtest/meson.build            |   7 +-
 tests/qtest/migration-test.c       |   1 +
 tests/qtest/migration/colo-tests.c | 199 +++++++++++++++++++++++++++++++++++++
 tests/qtest/migration/framework.h  |   5 +
 5 files changed, 212 insertions(+), 1 deletion(-)
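
A note on running these (a sketch from memory; double-check the exact
g_test path and how the full test set is enabled in your tree):

  $ cd build
  $ QTEST_QEMU_BINARY=./qemu-system-x86_64 \
        ./tests/qtest/migration-test -p /x86_64/migration/colo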

diff --git a/MAINTAINERS b/MAINTAINERS
index 883f0a8f4eb92d0bf0f89fcab4674ccc4aed1cc1..2a8b9b2d051883c1b7adce9c1afec80d16a317f8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3856,6 +3856,7 @@ F: migration/colo*
 F: migration/multifd-colo.*
 F: include/migration/colo.h
 F: include/migration/failover.h
+F: tests/qtest/migration/colo-tests.c
 F: docs/COLO-FT.txt
 
 COLO Proxy
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index dfb83650c643d884daad53a66034ab7aa8c45509..624f7744ec9bd81c8823075b966bc95f7750a667 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -371,6 +371,11 @@ if gnutls.found()
   endif
 endif
 
+migration_colo_files = []
+if get_option('replication').allowed()
+  migration_colo_files = [files('migration/colo-tests.c')]
+endif
+
 qtests = {
   'aspeed_hace-test': files('aspeed-hace-utils.c', 'aspeed_hace-test.c'),
   'aspeed_smc-test': files('aspeed-smc-utils.c', 'aspeed_smc-test.c'),
@@ -382,7 +387,7 @@ qtests = {
                              'migration/migration-util.c') + dbus_vmstate1,
   'erst-test': files('erst-test.c'),
   'ivshmem-test': [rt, '../../contrib/ivshmem-server/ivshmem-server.c'],
-  'migration-test': test_migration_files + migration_tls_files,
+  'migration-test': test_migration_files + migration_tls_files + migration_colo_files,
   'pxe-test': files('boot-sector.c'),
   'pnv-xive2-test': files('pnv-xive2-common.c', 'pnv-xive2-flush-sync.c',
                           'pnv-xive2-nvpg_bar.c'),
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 08936871741535c926eeac40a7d7c3f461c72fd0..e582f05c7dc2673dbd05a936df8feb6c964b5bbc 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -55,6 +55,7 @@ int main(int argc, char **argv)
     migration_test_add_precopy(env);
     migration_test_add_cpr(env);
     migration_test_add_misc(env);
+    migration_test_add_colo(env);
 
     ret = g_test_run();
 
diff --git a/tests/qtest/migration/colo-tests.c b/tests/qtest/migration/colo-tests.c
new file mode 100644
index 0000000000000000000000000000000000000000..0586970e206f01ed6e7aa3429321aefc1de7be37
--- /dev/null
+++ b/tests/qtest/migration/colo-tests.c
@@ -0,0 +1,199 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * QTest testcases for COLO migration
+ *
+ * Copyright (c) 2025 Lukas Straub <lukasstraub2@web.de>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+#include "migration/framework.h"
+#include "migration/migration-qmp.h"
+#include "migration/migration-util.h"
+#include "qemu/module.h"
+
+static int test_colo_common(MigrateCommon *args,
+                            bool failover_during_checkpoint,
+                            bool primary_failover)
+{
+    QTestState *from, *to;
+    void *data_hook = NULL;
+
+    /*
+     * For the COLO test, both VMs will run in parallel. Thus both VMs want to
+     * open the image read/write at the same time. Using read-only=on is not
+     * possible here, because ide-hd does not support a read-only backing image.
+     *
+     * So use -snapshot, where each qemu instance creates its own writable
+     * snapshot internally while leaving the real image read-only.
+     */
+    args->start.opts_source = "-snapshot";
+    args->start.opts_target = "-snapshot";
+
+    /*
+     * COLO migration code logs many errors when the migration socket
+     * is shut down; these are expected, so we hide them here.
+     */
+    args->start.hide_stderr = true;
+
+    args->start.oob = true;
+    args->start.caps[MIGRATION_CAPABILITY_X_COLO] = true;
+
+    if (migrate_start(&from, &to, args->listen_uri, &args->start)) {
+        return -1;
+    }
+
+    migrate_set_parameter_int(from, "x-checkpoint-delay", 300);
+
+    if (args->start_hook) {
+        data_hook = args->start_hook(from, to);
+    }
+
+    migrate_ensure_converge(from);
+    wait_for_serial("src_serial");
+
+    migrate_qmp(from, to, args->connect_uri, NULL, "{}");
+
+    wait_for_migration_status(from, "colo", NULL);
+    wait_for_resume(to, get_dst());
+
+    wait_for_serial("src_serial");
+    wait_for_serial("dest_serial");
+
+    /* wait for 3 checkpoints */
+    for (int i = 0; i < 3; i++) {
+        qtest_qmp_eventwait(to, "RESUME");
+        wait_for_serial("src_serial");
+        wait_for_serial("dest_serial");
+    }
+
+    if (failover_during_checkpoint) {
+        qtest_qmp_eventwait(to, "STOP");
+    }
+    if (primary_failover) {
+        qtest_qmp_assert_success(from, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
+                                            "'arguments': {'instances':"
+                                                "[{'type': 'migration'}]}}");
+        qtest_qmp_assert_success(from, "{'execute': 'x-colo-lost-heartbeat'}");
+        wait_for_serial("src_serial");
+    } else {
+        qtest_qmp_assert_success(to, "{'exec-oob': 'yank', 'id': 'yank-cmd', "
+                                        "'arguments': {'instances':"
+                                            "[{'type': 'migration'}]}}");
+        qtest_qmp_assert_success(to, "{'execute': 'x-colo-lost-heartbeat'}");
+        wait_for_serial("dest_serial");
+    }
+
+    if (args->end_hook) {
+        args->end_hook(from, to, data_hook);
+    }
+
+    migrate_end(from, to, !primary_failover);
+
+    return 0;
+}
+
+static void test_colo_plain_common(MigrateCommon *args,
+                                   bool failover_during_checkpoint,
+                                   bool primary_failover)
+{
+    args->listen_uri = "tcp:127.0.0.1:0";
+    test_colo_common(args, failover_during_checkpoint, primary_failover);
+}
+
+static void *hook_start_multifd(QTestState *from, QTestState *to)
+{
+    return migrate_hook_start_precopy_tcp_multifd_common(from, to, "none");
+}
+
+static void test_colo_multifd_common(MigrateCommon *args,
+                                     bool failover_during_checkpoint,
+                                     bool primary_failover)
+{
+    args->listen_uri = "defer";
+    args->start_hook = hook_start_multifd;
+    args->start.caps[MIGRATION_CAPABILITY_MULTIFD] = true;
+    test_colo_common(args, failover_during_checkpoint, primary_failover);
+}
+
+static void test_colo_plain_primary_failover(char *name, MigrateCommon *args)
+{
+    test_colo_plain_common(args, false, true);
+}
+
+static void test_colo_plain_secondary_failover(char *name, MigrateCommon *args)
+{
+    test_colo_plain_common(args, false, false);
+}
+
+static void test_colo_multifd_primary_failover(char *name, MigrateCommon *args)
+{
+    test_colo_multifd_common(args, false, true);
+}
+
+static void test_colo_multifd_secondary_failover(char *name,
+                                                 MigrateCommon *args)
+{
+    test_colo_multifd_common(args, false, false);
+}
+
+static void test_colo_plain_primary_failover_checkpoint(char *name,
+                                                        MigrateCommon *args)
+{
+    test_colo_plain_common(args, true, true);
+}
+
+static void test_colo_plain_secondary_failover_checkpoint(char *name,
+                                                          MigrateCommon *args)
+{
+    test_colo_plain_common(args, true, false);
+}
+
+static void test_colo_multifd_primary_failover_checkpoint(char *name,
+                                                          MigrateCommon *args)
+{
+    test_colo_multifd_common(args, true, true);
+}
+
+static void test_colo_multifd_secondary_failover_checkpoint(char *name,
+                                                            MigrateCommon *args)
+{
+    test_colo_multifd_common(args, true, false);
+}
+
+void migration_test_add_colo(MigrationTestEnv *env)
+{
+    if (!env->has_kvm) {
+        g_test_skip("COLO requires KVM accelerator");
+        return;
+    }
+
+    if (!env->full_set) {
+        return;
+    }
+
+    migration_test_add("/migration/colo/plain/primary_failover",
+                       test_colo_plain_primary_failover);
+    migration_test_add("/migration/colo/plain/secondary_failover",
+                       test_colo_plain_secondary_failover);
+
+    migration_test_add("/migration/colo/multifd/primary_failover",
+                       test_colo_multifd_primary_failover);
+    migration_test_add("/migration/colo/multifd/secondary_failover",
+                       test_colo_multifd_secondary_failover);
+
+    migration_test_add("/migration/colo/plain/primary_failover_checkpoint",
+                       test_colo_plain_primary_failover_checkpoint);
+    migration_test_add("/migration/colo/plain/secondary_failover_checkpoint",
+                       test_colo_plain_secondary_failover_checkpoint);
+
+    migration_test_add("/migration/colo/multifd/primary_failover_checkpoint",
+                       test_colo_multifd_primary_failover_checkpoint);
+    migration_test_add("/migration/colo/multifd/secondary_failover_checkpoint",
+                       test_colo_multifd_secondary_failover_checkpoint);
+}
diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
index 40984d04930da2d181326d9f6a742bde49018103..80eef758932ce9c301ed6c0f6383d18756144870 100644
--- a/tests/qtest/migration/framework.h
+++ b/tests/qtest/migration/framework.h
@@ -264,5 +264,10 @@ void migration_test_add_file(MigrationTestEnv *env);
 void migration_test_add_precopy(MigrationTestEnv *env);
 void migration_test_add_cpr(MigrationTestEnv *env);
 void migration_test_add_misc(MigrationTestEnv *env);
+#ifdef CONFIG_REPLICATION
+void migration_test_add_colo(MigrationTestEnv *env);
+#else
+static inline void migration_test_add_colo(MigrationTestEnv *env) {}
+#endif
 
 #endif /* TEST_FRAMEWORK_H */

-- 
2.39.5
Re: [PATCH v3 06/10] migration-test: Add COLO migration unit test
Posted by Fabiano Rosas 1 week, 5 days ago
Lukas Straub <lukasstraub2@web.de> writes:

> Add a test for COLO migration and failover: migrate into COLO, wait for
> a few checkpoints to complete, then fail over to either the primary or
> the secondary side, optionally in the middle of a checkpoint, for both
> plain and multifd migration.
>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> [...]

It survived my stress run. It once hit the race at migration_shutdown()
where current_migration is already freed, but we can ignore that because
it's preexisting.

Tested-by: Fabiano Rosas <farosas@suse.de>
Re: [PATCH v3 06/10] migration-test: Add COLO migration unit test
Posted by Peter Xu 1 week, 6 days ago
On Sun, Jan 25, 2026 at 09:40:11PM +0100, Lukas Straub wrote:
> +void migration_test_add_colo(MigrationTestEnv *env)
> +{
> +    if (!env->has_kvm) {
> +        g_test_skip("COLO requires KVM accelerator");
> +        return;
> +    }

I'm OK if you want to explicitly bypass the others, but could you explain
why?

Thanks,

-- 
Peter Xu
Re: [PATCH v3 06/10] migration-test: Add COLO migration unit test
Posted by Lukas Straub 1 week, 3 days ago
On Tue, 27 Jan 2026 15:49:31 -0500
Peter Xu <peterx@redhat.com> wrote:

> On Sun, Jan 25, 2026 at 09:40:11PM +0100, Lukas Straub wrote:
> > +void migration_test_add_colo(MigrationTestEnv *env)
> > +{
> > +    if (!env->has_kvm) {
> > +        g_test_skip("COLO requires KVM accelerator");
> > +        return;
> > +    }  
> 
> I'm OK if you want to explicitly bypass the others, but could you explain
> why?
> 
> Thanks,
> 

It used to hang with TCG. Now it crashes, since
migration_bitmap_sync_precopy assumes bql is held. Something for later.

#6  0x00007ffff7471517 in __assert_fail
    (assertion=assertion@entry=0x555555f17aee "bql_locked() != locked", file=file@entry=0x555555f17ab0 "../system/cpus.c", line=line@entry=535, function=function@entry=0x55555609bfd0 <__PRETTY_FUNCTION__.9> "bql_update_status") at ./assert/assert.c:105
#7  0x0000555555b09f1e in bql_update_status (locked=locked@entry=false) at ../system/cpus.c:535
#8  0x0000555555ec60e7 in qemu_mutex_pre_unlock (mutex=0x555557166700 <bql>, file=0x555555efe1dc "../cpu-common.c", line=164) at ../util/qemu-thread-common.h:57
#9  qemu_mutex_pre_unlock (line=164, file=0x555555efe1dc "../cpu-common.c", mutex=0x555557166700 <bql>) at ../util/qemu-thread-common.h:48
#10 qemu_cond_wait_impl (cond=0x5555571442c0 <qemu_work_cond>, mutex=0x555557166700 <bql>, file=0x555555efe1dc "../cpu-common.c", line=164) at ../util/qemu-thread-posix.c:224
#11 0x000055555589e6c8 in do_run_on_cpu (cpu=<optimized out>, func=<optimized out>, data=..., mutex=0x555557166700 <bql>) at ../cpu-common.c:164
#12 0x0000555555b17a06 in memory_global_after_dirty_log_sync () at ../system/memory.c:2938
#13 0x0000555555b55b47 in migration_bitmap_sync (rs=0x7fffe8001340, last_stage=last_stage@entry=true) at ../migration/ram.c:1157
#14 0x0000555555b56721 in migration_bitmap_sync_precopy (last_stage=last_stage@entry=true) at ../migration/ram.c:1195
#15 0x0000555555b59f8a in ram_save_complete (f=0x5555575db620, opaque=<optimized out>) at ../migration/ram.c:3381
#16 0x0000555555b5e4f5 in qemu_savevm_complete (se=se@entry=0x5555574c0d80, f=f@entry=0x5555575db620) at ../migration/savevm.c:1521
#17 0x0000555555b60437 in qemu_savevm_state_complete_precopy_iterable (f=f@entry=0x5555575db620, in_postcopy=in_postcopy@entry=false) at ../migration/savevm.c:1627
#18 0x0000555555b60a4f in qemu_savevm_state_complete_precopy (iterable_only=true, f=0x5555575db620) at ../migration/savevm.c:1719
#19 qemu_savevm_live_state (f=0x5555575db620) at ../migration/savevm.c:1855
#20 0x0000555555b65ed9 in colo_do_checkpoint_transaction (fb=<optimized out>, bioc=<optimized out>, s=0x5555574c0070) at ../migration/colo.c:474
#21 colo_process_checkpoint (s=0x5555574c0070) at ../migration/colo.c:592
#22 migrate_start_colo_process (s=0x5555574c0070) at ../migration/colo.c:655
#23 0x0000555555b4971e in migration_iteration_finish (s=0x5555574c0070) at ../migration/migration.c:3297
#24 migration_thread (opaque=opaque@entry=0x5555574c0070) at ../migration/migration.c:3584
#25 0x0000555555ec58c0 in qemu_thread_start (args=0x5555576583e0) at ../util/qemu-thread-posix.c:393
#26 0x00007ffff74d2aa4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#27 0x00007ffff755fc6c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
Re: [PATCH v3 06/10] migration-test: Add COLO migration unit test
Posted by Peter Xu 1 week ago
On Fri, Jan 30, 2026 at 11:24:02AM +0100, Lukas Straub wrote:
> On Tue, 27 Jan 2026 15:49:31 -0500
> Peter Xu <peterx@redhat.com> wrote:
> 
> > On Sun, Jan 25, 2026 at 09:40:11PM +0100, Lukas Straub wrote:
> > > +void migration_test_add_colo(MigrationTestEnv *env)
> > > +{
> > > +    if (!env->has_kvm) {
> > > +        g_test_skip("COLO requires KVM accelerator");
> > > +        return;
> > > +    }  
> > 
> > I'm OK if you want to explicitly bypass the others, but could you explain
> > why?
> > 
> > Thanks,
> > 
> 
> It used to hang with TCG. Now it crashes, since
> migration_bitmap_sync_precopy assumes bql is held. Something for later.

If we want to keep COLO around and be serious, let's try to hold COLO to
the same standard we target for migration in general whenever possible.  We
shouldn't randomly work around bugs.  We should fix them.

It looks to me there's some locking issue instead.

Iterator's complete() requires BQL.  Would a patch like below make sense
to you?

diff --git a/migration/colo.c b/migration/colo.c
index db783f6fa7..b3ea137120 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -458,8 +458,8 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     /* Note: device state is saved into buffer */
     ret = qemu_save_device_state(fb);
 
-    bql_unlock();
     if (ret < 0) {
+        bql_unlock();
         goto out;
     }
 
@@ -473,6 +473,9 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
      */
     qemu_savevm_live_state(s->to_dst_file);
 
+    /* Save live state requires BQL */
+    bql_unlock();
+
     qemu_fflush(fb);
 
     /*

-- 
Peter Xu
Re: [PATCH v3 06/10] migration-test: Add COLO migration unit test
Posted by Lukas Straub 1 week ago
On Mon, 2 Feb 2026 09:26:06 -0500
Peter Xu <peterx@redhat.com> wrote:

> On Fri, Jan 30, 2026 at 11:24:02AM +0100, Lukas Straub wrote:
> > On Tue, 27 Jan 2026 15:49:31 -0500
> > Peter Xu <peterx@redhat.com> wrote:
> >   
> > > [...]
> > > 
> > > I'm OK if you want to explicitly bypass the others, but could you
> > > explain why?
> > 
> > It used to hang with TCG. Now it crashes, since
> > migration_bitmap_sync_precopy assumes bql is held. Something for later.  
> 
> If we want to keep COLO around and be serious, let's try to hold COLO to
> the same standard we target for migration in general whenever possible.  We
> shouldn't randomly work around bugs.  We should fix them.
> 
> It looks to me there's some locking issue instead.
> 
> Iterator's complete() requires BQL.  Would a patch like below make sense
> to you?
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index db783f6fa7..b3ea137120 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -458,8 +458,8 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
>      /* Note: device state is saved into buffer */
>      ret = qemu_save_device_state(fb);
>  
> -    bql_unlock();
>      if (ret < 0) {
> +        bql_unlock();
>          goto out;
>      }
>  
> @@ -473,6 +473,9 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
>       */
>      qemu_savevm_live_state(s->to_dst_file);
>  
> +    /* Save live state requires BQL */
> +    bql_unlock();
> +
>      qemu_fflush(fb);
>  
>      /*

I already tested that and it works. However, we have to be very careful
around the locking here and I don't think it is safe to take the bql on
the primary here:

The secondary has the bql held at this point:

    colo_receive_check_message(mis->from_src_file,
                       COLO_MESSAGE_VMSTATE_SEND, &local_err);
    ...
    bql_lock();
    cpu_synchronize_all_states();
    ret = qemu_loadvm_state_main(mis->from_src_file, mis, errp);
    bql_unlock();

On the primary there is a filter-mirror mirroring incoming packets to
the secondary filter-redirector. However, since the secondary migration
holds the bql, the receiving filter is blocked and will not receive
anything from the socket. Thus the filter-mirror on the primary may also
get blocked during send and block the main loop (it uses blocking IO).

Now if the primary migration thread wants to take the bql it will
deadlock.
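
To spell out the cycle (my reading of it, simplified):

  secondary migration thread: holds the bql across qemu_loadvm_state_main()
  secondary main loop:        blocked on the bql, filter-redirector never reads
  primary main loop:          filter-mirror blocks in send, with the bql held
  primary migration thread:   bql_lock() -> deadlock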

So I think this is something to fix in a separate series since it is
more involved.

Regards,
Lukas Straub

Re: [PATCH v3 06/10] migration-test: Add COLO migration unit test
Posted by Peter Xu 6 days, 12 hours ago
On Tue, Feb 03, 2026 at 10:18:22AM +0100, Lukas Straub wrote:
> On Mon, 2 Feb 2026 09:26:06 -0500
> Peter Xu <peterx@redhat.com> wrote:
> 
> > On Fri, Jan 30, 2026 at 11:24:02AM +0100, Lukas Straub wrote:
> > > On Tue, 27 Jan 2026 15:49:31 -0500
> > > Peter Xu <peterx@redhat.com> wrote:
> > > > [...]
> > > > 
> > > > I'm OK if you want to explicitly bypass the others, but could you
> > > > explain why?
> > > 
> > > It used to hang with TCG. Now it crashes, since
> > > migration_bitmap_sync_precopy assumes bql is held. Something for later.  
> > 
> > If we want to keep COLO around and be serious, let's try to hold COLO to
> > the same standard we target for migration in general whenever possible.  We
> > shouldn't randomly work around bugs.  We should fix them.
> > 
> > It looks to me there's some locking issue instead.
> > 
> > Iterator's complete() requires BQL.  Would a patch like below make sense
> > to you?
> > 
> > [diff snipped]
> 
> I already tested that and it works. However, we have to be very careful
> around the locking here and I don't think it is safe to take the bql on
> the primary here:
> 
> The secondary has the bql held at this point:

This is definitely an interesting piece of code... one question:

> 
>     colo_receive_check_message(mis->from_src_file,
>                        COLO_MESSAGE_VMSTATE_SEND, &local_err);
>     ...
>     bql_lock();
>     cpu_synchronize_all_states();

Why is this needed at all? ^^^^^^^^^^^^^^^

The qemu_loadvm_state_main() line right below should only load RAM.  I
don't see how it has anything to do with CPU register states.

>     ret = qemu_loadvm_state_main(mis->from_src_file, mis, errp);
>     bql_unlock();
> 
> On the primary there is a filter-mirror mirroring incoming packets to
> the secondary filter-redirector. However, since the secondary migration
> holds the bql, the receiving filter is blocked and will not receive
> anything from the socket. Thus the filter-mirror on the primary may also
> get blocked during send and block the main loop (it uses blocking IO).

Hmm... could you explain why a blocking IO operation to mirror some packets
requires holding the BQL?  This sounds wrong on its own.

> 
> Now if the primary migration thread wants to take the bql it will
> deadlock.
> 
> So I think this is something to fix in a separate series since it is
> more involved.

Yes it might be involved, but this is really not something like "let's make
it simple for now and improve it later".  This is "OK this function
_requires_ this lock, but let's not take this lock and leave it for
later".  It's not something we can put aside, afaiu.  We should really fix
it.

How far do you think we can fix it?  Could you explain the problem better?

It might be helpful if you can reproduce the hang, then attach the logs
from both QEMUs along with a full thread backtrace dump.  I'll see how I
can help.

Thanks,

-- 
Peter Xu
Re: [PATCH v3 06/10] migration-test: Add COLO migration unit test
Posted by Lukas Straub 3 days, 14 hours ago
On Tue, 3 Feb 2026 16:21:05 -0500
Peter Xu <peterx@redhat.com> wrote:

> On Tue, Feb 03, 2026 at 10:18:22AM +0100, Lukas Straub wrote:
> > On Mon, 2 Feb 2026 09:26:06 -0500
> > Peter Xu <peterx@redhat.com> wrote:
> >   
> > > [...]
> > 
> > I already tested that and it works. However, we have to be very careful
> > around the locking here and I don't think it is safe to take the bql on
> > the primary here:
> > 
> > The secondary has the bql held at this point:  
> 
> This is definitely an interesting piece of code... one question:
> 
> > 
> >     colo_receive_check_message(mis->from_src_file,
> >                        COLO_MESSAGE_VMSTATE_SEND, &local_err);
> >     ...
> >     bql_lock();
> >     cpu_synchronize_all_states();  
> 
> Why is this needed at all? ^^^^^^^^^^^^^^^
> 
> The qemu_loadvm_state_main() line right below should only load RAM.  I
> don't see how it has anything to do with CPU register states.

You are right, we don't need this, and neither is the lock needed here.
Then I'm fine with removing the lock here and adding one on the primary
side.
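
I.e. roughly this on the secondary side (untested sketch written from
memory, the function should be colo_incoming_process_checkpoint() but
the exact context may be off):

-    bql_lock();
-    cpu_synchronize_all_states();
     ret = qemu_loadvm_state_main(mis->from_src_file, mis, errp);
-    bql_unlock();

plus keeping the bql held across qemu_savevm_live_state() on the
primary, as in your patch.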

> 
> >     ret = qemu_loadvm_state_main(mis->from_src_file, mis, errp);
> >     bql_unlock();
> > 
> > On the primary there is a filter-mirror mirroring incoming packets to
> > the secondary filter-redirector. However, since the secondary migration
> > holds the bql, the receiving filter is blocked and will not receive
> > anything from the socket. Thus the filter-mirror on the primary may also
> > get blocked during send and block the main loop (it uses blocking IO).
> 
> Hmm... could you explain why a blocking IO operation to mirror some packets
> requires holding the BQL?  This sounds wrong on its own.

Yes, there is no need for the BQL; it is just wrong. The tap fd gets a
POLLIN event, the main loop takes the BQL and calls the tap fd callback.
Tap reads a packet from the fd and calls qemu_send_packet_async(), which
puts it through the net filters, and filter-mirror does a blocking send,
blocking the main loop while the BQL is held.
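
(IIRC the problematic path is the length-prefixed write in
net/filter-mirror.c, paraphrased from memory:

    len = htonl(size);
    ret = qio_channel_write_all(s->ioc, (char *)&len, sizeof(len), NULL);
    ...
    ret = qio_channel_write_all(s->ioc, buf, size, NULL);

with s->ioc left in blocking mode, so the main loop stalls as soon as
the socket buffer fills up.)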

Re: [PATCH v3 06/10] migration-test: Add COLO migration unit test
Posted by Fabiano Rosas 2 weeks ago
Lukas Straub <lukasstraub2@web.de> writes:

> Add a test for COLO migration and failover: migrate into COLO, wait for
> a few checkpoints to complete, then fail over to either the primary or
> the secondary side, optionally in the middle of a checkpoint, for both
> plain and multifd migration.
>
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>

Reviewed-by: Fabiano Rosas <farosas@suse.de>

Looks OK at first sight. I'll do some stress testing later, which usually
picks up subtle issues.