[PATCH v4 1/2] tests/migration: Setup pre-listened cpr.sock to remove race-condition.

Jaehoon Kim posted 2 patches 5 months ago
Maintainers: Steve Sistare <steven.sistare@oracle.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Laurent Vivier <lvivier@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
Posted by Jaehoon Kim 5 months ago
When the source VM attempts to connect to the destination VM's Unix
domain socket (cpr.sock) during a cpr-transfer test, race conditions can
occur if the socket file isn't ready. This can lead to connection
failures when running tests.

This patch creates and listens on the socket in advance, and passes the
pre-listened FD directly. This avoids timing issues and improves the
reliability of CPR tests.

Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Jaehoon Kim <jhkim@linux.ibm.com>
---
 tests/qtest/migration/cpr-tests.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/tests/qtest/migration/cpr-tests.c b/tests/qtest/migration/cpr-tests.c
index 5536e14610..f7bd5c4666 100644
--- a/tests/qtest/migration/cpr-tests.c
+++ b/tests/qtest/migration/cpr-tests.c
@@ -60,13 +60,12 @@ static void test_mode_transfer_common(bool incoming_defer)
     g_autofree char *cpr_path = g_strdup_printf("%s/cpr.sock", tmpfs);
     g_autofree char *mig_path = g_strdup_printf("%s/migsocket", tmpfs);
     g_autofree char *uri = g_strdup_printf("unix:%s", mig_path);
+    g_autofree char *opts_target;
 
     const char *opts = "-machine aux-ram-share=on -nodefaults";
     g_autofree const char *cpr_channel = g_strdup_printf(
         "cpr,addr.transport=socket,addr.type=unix,addr.path=%s",
         cpr_path);
-    g_autofree char *opts_target = g_strdup_printf("-incoming %s %s",
-                                                   cpr_channel, opts);
 
     g_autofree char *connect_channels = g_strdup_printf(
         "[ { 'channel-type': 'main',"
@@ -75,6 +74,17 @@ static void test_mode_transfer_common(bool incoming_defer)
         "              'path': '%s' } } ]",
         mig_path);
 
+    /*
+     * Set up a UNIX domain socket for the CPR channel before
+     * launching the destination VM, to avoid timing issues
+     * during connection setup.
+     */
+    int cpr_sockfd = qtest_socket_server(cpr_path);
+    g_assert(cpr_sockfd >= 0);
+
+    opts_target = g_strdup_printf("-incoming cpr,addr.transport=socket,"
+                                  "addr.type=fd,addr.str=%d %s",
+                                  cpr_sockfd, opts);
     MigrateCommon args = {
         .start.opts_source = opts,
         .start.opts_target = opts_target,
-- 
2.49.0
Re: [PATCH v4 1/2] tests/migration: Setup pre-listened cpr.sock to remove race-condition.
Posted by Steven Sistare 5 months ago
On 6/11/2025 4:56 PM, Jaehoon Kim wrote:
> When the source VM attempts to connect to the destination VM's Unix
> domain socket (cpr.sock) during a cpr-transfer test, race conditions can
> occur if the socket file isn't ready. This can lead to connection
> failures when running tests.
> 
> This patch creates and listens on the socket in advance, and passes the
> pre-listened FD directly. This avoids timing issues and improves the
> reliability of CPR tests.
> 
> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
> Signed-off-by: Jaehoon Kim <jhkim@linux.ibm.com>
> ---
>   tests/qtest/migration/cpr-tests.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/qtest/migration/cpr-tests.c b/tests/qtest/migration/cpr-tests.c
> index 5536e14610..f7bd5c4666 100644
> --- a/tests/qtest/migration/cpr-tests.c
> +++ b/tests/qtest/migration/cpr-tests.c
> @@ -60,13 +60,12 @@ static void test_mode_transfer_common(bool incoming_defer)
>       g_autofree char *cpr_path = g_strdup_printf("%s/cpr.sock", tmpfs);
>       g_autofree char *mig_path = g_strdup_printf("%s/migsocket", tmpfs);
>       g_autofree char *uri = g_strdup_printf("unix:%s", mig_path);
> +    g_autofree char *opts_target;

This must be initialized. Otherwise, if the function returns before opts_target
is assigned, then a garbage pointer can be freed.

     g_autofree char *opts_target = NULL;

Sorry I did not catch that before.
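
For illustration only, a minimal sketch of why the initializer matters.
g_autofree is built on the compiler's cleanup attribute, so the sketch uses
that attribute directly with plain malloc/free instead of GLib. AUTOFREE,
auto_free, free_calls, and build_opts are hypothetical names, not QEMU or
GLib code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int free_calls;   /* counts non-NULL frees, for the demo only */

/* Cleanup handler: receives a pointer to the variable going out of scope. */
static void auto_free(char **p)
{
    assert(p != NULL);
    if (*p != NULL) {
        free_calls++;
    }
    free(*p);            /* free(NULL) is a no-op */
}

/* Hypothetical stand-in for g_autofree (GCC/Clang only). */
#define AUTOFREE __attribute__((cleanup(auto_free)))

/* Returns 0 on the early-return path, 1 once the string was built. */
static int build_opts(int bail_out_early)
{
    AUTOFREE char *opts_target = NULL;   /* initialized, so early return is safe */

    if (bail_out_early) {
        return 0;                        /* auto_free() runs here and sees NULL */
    }

    opts_target = malloc(32);
    if (opts_target == NULL) {
        return 0;
    }
    strcpy(opts_target, "-incoming cpr,...");
    return 1;                            /* auto_free() runs here and frees it */
}
```

Without the "= NULL", the early-return path would hand an indeterminate
pointer to free(), which is undefined behavior.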

>       const char *opts = "-machine aux-ram-share=on -nodefaults";
>       g_autofree const char *cpr_channel = g_strdup_printf(
>           "cpr,addr.transport=socket,addr.type=unix,addr.path=%s",
>           cpr_path);
> -    g_autofree char *opts_target = g_strdup_printf("-incoming %s %s",
> -                                                   cpr_channel, opts);
>   
>       g_autofree char *connect_channels = g_strdup_printf(
>           "[ { 'channel-type': 'main',"
> @@ -75,6 +74,17 @@ static void test_mode_transfer_common(bool incoming_defer)
>           "              'path': '%s' } } ]",
>           mig_path);
>   
> +    /*
> +     * Set up a UNIX domain socket for the CPR channel before
> +     * launching the destination VM, to avoid timing issues
> +     * during connection setup.
> +     */
> +    int cpr_sockfd = qtest_socket_server(cpr_path);
> +    g_assert(cpr_sockfd >= 0);
> +
> +    opts_target = g_strdup_printf("-incoming cpr,addr.transport=socket,"
> +                                  "addr.type=fd,addr.str=%d %s",
> +                                  cpr_sockfd, opts);

Or, declare and assign the final value together:

     g_autofree char *opts_target = g_strdup_printf(
         "-incoming cpr,addr.transport=socket,addr.type=fd,addr.str=%d %s",
         cpr_sockfd, opts);

With either change:
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>

Thanks very much for fixing the cpr tests.

- Steve

>       MigrateCommon args = {
>           .start.opts_source = opts,
>           .start.opts_target = opts_target,
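
The race described in the commit message, and why listening in advance
avoids it, can be sketched with plain POSIX sockets. This is an
illustration under stated assumptions, not QEMU's qtest_socket_server():
fill_addr, listen_unix, and connect_unix are hypothetical helpers, and
handing the listening descriptor to a child process is not shown.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill a sockaddr_un for path; returns -1 if the path does not fit. */
static int fill_addr(struct sockaddr_un *addr, const char *path)
{
    if (strlen(path) >= sizeof(addr->sun_path)) {
        return -1;
    }
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Create, bind, and listen on a UNIX socket; returns the fd or -1. */
static int listen_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);
    if (fill_addr(&addr, path) < 0) {
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    unlink(path);   /* drop a stale socket file, if any */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Connect to a UNIX socket; returns the connected fd or -1. */
static int connect_unix(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);
    if (fill_addr(&addr, path) < 0) {
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A connect_unix() call made before listen_unix() fails because the socket
file does not exist yet, which is the failure the test could hit. Once
listen_unix() has returned, connect_unix() succeeds even if nobody has
called accept() yet, because the kernel queues the connection on the
listening socket.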