If any error happens during multifd send thread creating (e.g. channel broke
because new domain is destroyed by the dst), multifd_tls_handshake_thread
may exit silently, leaving main migration thread hanging (ram_save_setup ->
multifd_send_sync_main -> qemu_sem_wait(&p->sem_sync)).
Fix that by adding error handling in multifd_tls_handshake_thread.
Signed-off-by: Hao Wang <wanghao232@huawei.com>
---
migration/multifd.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index 2a1ea85ade..03527c564c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -739,7 +739,16 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
} else {
trace_multifd_tls_outgoing_handshake_complete(ioc);
}
- multifd_channel_connect(p, ioc, err);
+
+ if (!multifd_channel_connect(p, ioc, err)) {
+ /*
+ * Error happen, mark multifd_send_thread status as 'quit' although it
+ * is not created, and then tell who pay attention to me.
+ */
+ p->quit = true;
+ qemu_sem_post(&multifd_send_state->channels_ready);
+ qemu_sem_post(&p->sem_sync);
+ }
}
static void *multifd_tls_handshake_thread(void *opaque)
--
2.23.0
Reviewed-by: Chuan Zheng <zhengchuan@huawei.com>
On 2021/2/9 18:42, Hao Wang wrote:
> If any error happens during multifd send thread creating (e.g. channel broke
> because new domain is destroyed by the dst), multifd_tls_handshake_thread
> may exit silently, leaving main migration thread hanging (ram_save_setup ->
> multifd_send_sync_main -> qemu_sem_wait(&p->sem_sync)).
> Fix that by adding error handling in multifd_tls_handshake_thread.
>
> Signed-off-by: Hao Wang <wanghao232@huawei.com>
> ---
> migration/multifd.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 2a1ea85ade..03527c564c 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -739,7 +739,16 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
> } else {
> trace_multifd_tls_outgoing_handshake_complete(ioc);
> }
> - multifd_channel_connect(p, ioc, err);
> +
> + if (!multifd_channel_connect(p, ioc, err)) {
> + /*
> + * Error happen, mark multifd_send_thread status as 'quit' although it
> + * is not created, and then tell who pay attention to me.
> + */
> + p->quit = true;
> + qemu_sem_post(&multifd_send_state->channels_ready);
> + qemu_sem_post(&p->sem_sync);
> + }
> }
>
> static void *multifd_tls_handshake_thread(void *opaque)
>
--
Regards.
Chuan
On Tue, Feb 09, 2021 at 06:42:37PM +0800, Hao Wang wrote: > If any error happens during multifd send thread creating (e.g. channel broke > because new domain is destroyed by the dst), multifd_tls_handshake_thread > may exit silently, leaving main migration thread hanging (ram_save_setup -> > multifd_send_sync_main -> qemu_sem_wait(&p->sem_sync)). > Fix that by adding error handling in multifd_tls_handshake_thread. > > Signed-off-by: Hao Wang <wanghao232@huawei.com> > --- > migration/multifd.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
© 2016 - 2026 Red Hat, Inc.