[v3] Migration: Make misc.h helpers available for whole VM lifecycle

[PATCH v3 1/8] migration: Take migration object refcount earlier for threads

Posted by Peter Xu 1 year, 3 months ago

Both the migration thread and the background snapshot thread take a
refcount of the migration object at the entrance of the thread function.

That makes sense, because it protects the object from being freed by the
main thread in migration_shutdown() later, but it can still race with
shutdown if the thread is scheduled too late.  Consider the case where,
right after pthread_create() returns, the VM shuts down and releases the
object, and only then does the migration thread start running,
dereferencing the MigrationState* in the opaque pointer, which has already
been freed.

The only 100% safe way to make sure it won't get freed is to take the
refcount right before the thread is created, while the BQL is still held.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 74812ca785..e82ffa8cf3 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3491,7 +3491,6 @@ static void *migration_thread(void *opaque)
 
     rcu_register_thread();
 
-    object_ref(OBJECT(s));
     update_iteration_initial_status(s);
 
     if (!multifd_send_setup()) {
@@ -3629,7 +3628,6 @@ static void *bg_migration_thread(void *opaque)
     int ret;
 
     rcu_register_thread();
-    object_ref(OBJECT(s));
 
     migration_rate_set(RATE_LIMIT_DISABLED);
 
@@ -3841,6 +3839,14 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
         }
     }
 
+    /*
+     * Take a refcount to make sure the migration object won't get freed by
+     * the main thread already in migration_shutdown().
+     *
+     * The refcount will be released at the end of the thread function.
+     */
+    object_ref(OBJECT(s));
+
     if (migrate_background_snapshot()) {
         qemu_thread_create(&s->thread, MIGRATION_THREAD_SNAPSHOT,
                 bg_migration_thread, s, QEMU_THREAD_JOINABLE);
-- 
2.45.0
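
The ordering argument in the commit message can be illustrated outside of
QEMU. The following is only a minimal sketch using plain pthreads and C11
atomics, not QEMU's object model: Obj, obj_new(), obj_ref(), obj_unref(),
worker() and spawn_with_early_ref() are hypothetical names invented for
this example. The creator takes the thread's reference before calling
pthread_create(), so the object stays valid even if the creator drops its
own reference before the worker is first scheduled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted object such as MigrationState. */
typedef struct {
    atomic_int refcount;
    atomic_int *freed;      /* set to 1 when the last reference is dropped */
} Obj;

static Obj *obj_new(atomic_int *freed)
{
    Obj *o = malloc(sizeof(*o));
    assert(o != NULL);
    atomic_init(&o->refcount, 1);   /* the creator's reference */
    o->freed = freed;
    return o;
}

static void obj_ref(Obj *o)
{
    atomic_fetch_add(&o->refcount, 1);
}

static void obj_unref(Obj *o)
{
    /* atomic_fetch_sub() returns the previous value. */
    if (atomic_fetch_sub(&o->refcount, 1) == 1) {
        atomic_store(o->freed, 1);
        free(o);
    }
}

static void *worker(void *opaque)
{
    Obj *o = opaque;

    /*
     * Safe even if the creator already dropped its reference: the
     * reference for this thread was taken before the thread existed.
     */
    assert(atomic_load(&o->refcount) >= 1);

    obj_unref(o);           /* release the thread's reference on exit */
    return NULL;
}

/* Returns 1 if the object was freed after both references were dropped. */
static int spawn_with_early_ref(void)
{
    atomic_int freed = 0;
    Obj *o = obj_new(&freed);
    pthread_t tid;

    obj_ref(o);             /* reference owned by the thread-to-be */
    if (pthread_create(&tid, NULL, worker, o) != 0) {
        obj_unref(o);       /* the thread never ran: drop its reference */
        obj_unref(o);
        return -1;
    }

    obj_unref(o);           /* "shutdown" may run before worker() starts */
    pthread_join(tid, NULL);
    return atomic_load(&freed);
}
```

If obj_ref() were instead the first statement of worker(), the creator's
obj_unref() could free the object before the worker ever ran, which is the
window the patch closes.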

Re: [PATCH v3 1/8] migration: Take migration object refcount earlier for threads

Posted by Fabiano Rosas 1 year, 3 months ago

Peter Xu <peterx@redhat.com> writes:

> Both the migration thread and the background snapshot thread take a
> refcount of the migration object at the entrance of the thread function.
>
> That makes sense, because it protects the object from being freed by the
> main thread in migration_shutdown() later, but it can still race with
> shutdown if the thread is scheduled too late.  Consider the case where,
> right after pthread_create() returns, the VM shuts down and releases the
> object, and only then does the migration thread start running,
> dereferencing the MigrationState* in the opaque pointer, which has already
> been freed.
>
> The only 100% safe way to make sure it won't get freed is to take the
> refcount right before the thread is created, while the BQL is still held.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>

Re: [PATCH v3 1/8] migration: Take migration object refcount earlier for threads

Posted by Cédric Le Goater 1 year, 3 months ago

On 10/24/24 23:30, Peter Xu wrote:
> Both the migration thread and the background snapshot thread take a
> refcount of the migration object at the entrance of the thread function.
> 
> That makes sense, because it protects the object from being freed by the
> main thread in migration_shutdown() later, but it can still race with
> shutdown if the thread is scheduled too late.  Consider the case where,
> right after pthread_create() returns, the VM shuts down and releases the
> object, and only then does the migration thread start running,
> dereferencing the MigrationState* in the opaque pointer, which has already
> been freed.
> 
> The only 100% safe way to make sure it won't get freed is to take the
> refcount right before the thread is created, while the BQL is still held.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   migration/migration.c | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 74812ca785..e82ffa8cf3 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3491,7 +3491,6 @@ static void *migration_thread(void *opaque)
>   
>       rcu_register_thread();
>   
> -    object_ref(OBJECT(s));
>       update_iteration_initial_status(s);
>   
>       if (!multifd_send_setup()) {
> @@ -3629,7 +3628,6 @@ static void *bg_migration_thread(void *opaque)
>       int ret;
>   
>       rcu_register_thread();
> -    object_ref(OBJECT(s));
>   
>       migration_rate_set(RATE_LIMIT_DISABLED);
>   
> @@ -3841,6 +3839,14 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>           }
>       }
>   
> +    /*
> +     * Take a refcount to make sure the migration object won't get freed by
> +     * the main thread already in migration_shutdown().
> +     *
> +     * The refcount will be released at the end of the thread function.
> +     */
> +    object_ref(OBJECT(s));
> +
>       if (migrate_background_snapshot()) {
>           qemu_thread_create(&s->thread, MIGRATION_THREAD_SNAPSHOT,
>                   bg_migration_thread, s, QEMU_THREAD_JOINABLE);

Yes, it is safer to take a ref before starting the migration thread.

Reviewed-by: Cédric Le Goater <clg@redhat.com>

Thanks,

C.

[PATCH v3 1/8] migration: Take migration object refcount earlier for threads
[PATCH v3 2/8] migration: Unexport dirty_bitmap_mig_init()
[PATCH v3 3/8] migration: Unexport ram_mig_init()
[PATCH v3 4/8] migration: Drop migration_is_setup_or_active()
[PATCH v3 5/8] migration: Drop migration_is_idle()
[PATCH v3 6/8] migration: Drop migration_is_device()
[PATCH v3 7/8] migration: Unexport migration_is_active()
[PATCH v3 8/8] migration: Protect updates to current_migration with a mutex