* Juan Quintela (quintela@redhat.com) wrote:
> From: Ivan Ren <renyime@gmail.com>
>
> When a multifd migration is cancelled with migrate_cancel, if the run sequence is like this:
>
>         [source]                        [destination]
>
> multifd_send_sync_main[finish]
>                                 multifd_recv_thread wait &p->sem_sync
> shutdown to_dst_file
>                                 detect error from_src_file
> send RAM_SAVE_FLAG_EOS[fail]    [no chance to run multifd_recv_sync_main]
>                                 multifd_load_cleanup
>                                 join multifd receive thread forever
>
> the destination QEMU hangs with the following stack:
>
> pthread_join
> qemu_thread_join
> multifd_load_cleanup
> process_incoming_migration_co
> coroutine_trampoline
>
> Signed-off-by: Ivan Ren <ivanren@tencent.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> Message-Id: <1561468699-9819-4-git-send-email-ivanren@tencent.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> migration/ram.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index eb6716710e..889148dd84 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1292,6 +1292,11 @@ int multifd_load_cleanup(Error **errp)
>
>          if (p->running) {
>              p->quit = true;
> +            /*
> +             * multifd_recv_thread may hang in the MULTIFD_FLAG_SYNC handling
> +             * code; posting sem_sync to wake it up is harmless during cleanup.
> +             */
> +            qemu_sem_post(&p->sem_sync);
>              qemu_thread_join(&p->thread);
>          }
>          object_unref(OBJECT(p->c));
> --
> 2.21.0
>
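
For anyone reading along, here is a minimal standalone sketch of the
pattern the patch relies on: set the quit flag, post the semaphore the
worker may be blocked on, and only then join. It is not QEMU code. It
uses plain pthreads, and DemoSem, DemoChannel, demo_recv_thread and
demo_cleanup are made-up names standing in for the QEMU semaphore, the
per-channel receive state, multifd_recv_thread and multifd_load_cleanup.

```c
#include <pthread.h>
#include <stdbool.h>

/* Minimal counting semaphore; an illustrative stand-in, not QemuSemaphore. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    unsigned count;
} DemoSem;

static void demo_sem_init(DemoSem *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->count = 0;
}

static void demo_sem_post(DemoSem *s)
{
    pthread_mutex_lock(&s->lock);
    s->count++;
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->lock);
}

static void demo_sem_wait(DemoSem *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0) {
        pthread_cond_wait(&s->cond, &s->lock);
    }
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Hypothetical per-channel state; only the fields the sketch needs. */
typedef struct {
    pthread_t thread;
    DemoSem sem_sync;
    bool quit;      /* written before the post, read after the wait */
    bool saw_quit;  /* written by the worker, read after the join */
} DemoChannel;

static void *demo_recv_thread(void *opaque)
{
    DemoChannel *p = opaque;

    /* Models the receive thread blocked waiting for the sync semaphore. */
    demo_sem_wait(&p->sem_sync);
    p->saw_quit = p->quit;
    return NULL;
}

/* Returns 1 if the worker woke up and saw quit, 0 if not, -1 on error. */
int demo_cleanup(void)
{
    DemoChannel p = { .quit = false, .saw_quit = false };

    demo_sem_init(&p.sem_sync);
    if (pthread_create(&p.thread, NULL, demo_recv_thread, &p) != 0) {
        return -1;
    }

    p.quit = true;
    /* Without this post, the join below would block forever. */
    demo_sem_post(&p.sem_sync);
    pthread_join(p.thread, NULL);

    pthread_cond_destroy(&p.sem_sync.cond);
    pthread_mutex_destroy(&p.sem_sync.lock);
    return p.saw_quit ? 1 : 0;
}
```

Built with -pthread and called from a main(), demo_cleanup() should
return 1. An extra post is harmless here for the same reason as in the
patch: the only waiter is about to exit, so a leftover count is never
consumed by anything that matters.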
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK