We're currently leaking the resources of the TLS thread by not joining
it and also overwriting the p->thread pointer altogether.

Fixes: a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake")
Cc: qemu-stable <qemu-stable@nongnu.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
migration/multifd.c | 8 +++++++-
migration/multifd.h | 2 ++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index ef13e2e781..8195c1daf3 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -630,6 +630,10 @@ static void multifd_send_terminate_threads(void)
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
+        if (p->tls_thread_created) {
+            qemu_thread_join(&p->tls_thread);
+        }
+
         if (p->running) {
             qemu_thread_join(&p->thread);
         }
@@ -921,7 +925,9 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
     trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname);
     qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
     p->c = QIO_CHANNEL(tioc);
-    qemu_thread_create(&p->thread, "multifd-tls-handshake-worker",
+
+    p->tls_thread_created = true;
+    qemu_thread_create(&p->tls_thread, "multifd-tls-handshake-worker",
                        multifd_tls_handshake_thread, p,
                        QEMU_THREAD_JOINABLE);
     return true;
diff --git a/migration/multifd.h b/migration/multifd.h
index 78a2317263..720c9d50db 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -73,6 +73,8 @@ typedef struct {
     char *name;
     /* channel thread id */
     QemuThread thread;
+    QemuThread tls_thread;
+    bool tls_thread_created;
     /* communication channel */
     QIOChannel *c;
     /* is the yank function registered */
--
2.35.3

On Mon, Feb 05, 2024 at 04:49:24PM -0300, Fabiano Rosas wrote:
> We're currently leaking the resources of the TLS thread by not joining
> it and also overwriting the p->thread pointer altogether.

AFAICS, it is not overwriting 'p->thread' because at the time when the
TLS thread is created, the main 'send thread' has not yet been
created. The TLS thread and send thread execution times are mutually
exclusive.

The 'p->running' flag is already set to true when the TLS thread is
created, so the existing cleanup should be working too, so I'm not
seeing a bug that needs fixing here.

>
> Fixes: a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake")
> Cc: qemu-stable <qemu-stable@nongnu.org>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/multifd.c | 8 +++++++-
>  migration/multifd.h | 2 ++
>  2 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index ef13e2e781..8195c1daf3 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -630,6 +630,10 @@ static void multifd_send_terminate_threads(void)
>      for (i = 0; i < migrate_multifd_channels(); i++) {
>          MultiFDSendParams *p = &multifd_send_state->params[i];
>
> +        if (p->tls_thread_created) {
> +            qemu_thread_join(&p->tls_thread);
> +        }
> +
>          if (p->running) {
>              qemu_thread_join(&p->thread);
>          }
> @@ -921,7 +925,9 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
>      trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname);
>      qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
>      p->c = QIO_CHANNEL(tioc);
> -    qemu_thread_create(&p->thread, "multifd-tls-handshake-worker",
> +
> +    p->tls_thread_created = true;
> +    qemu_thread_create(&p->tls_thread, "multifd-tls-handshake-worker",
>                         multifd_tls_handshake_thread, p,
>                         QEMU_THREAD_JOINABLE);
>      return true;
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 78a2317263..720c9d50db 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -73,6 +73,8 @@ typedef struct {
>      char *name;
>      /* channel thread id */
>      QemuThread thread;
> +    QemuThread tls_thread;
> +    bool tls_thread_created;
>      /* communication channel */
>      QIOChannel *c;
>      /* is the yank function registered */
> --
> 2.35.3
>

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

On Tue, Feb 06, 2024 at 08:53:45AM +0000, Daniel P. Berrangé wrote:
> AFAICS, it is not overwriting 'p->thread' because at the time when the
> TLS thread is created, the main 'send thread' has not yet been
> created. The TLS thread and send thread execution times are mutually
> exclusive.

IIUC it'll be overwritten after the TLS handshake, where the TLS thread
uses multifd_channel_connect() to create the ultimate multifd thread with
the same p->thread variable:

    qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
                       QEMU_THREAD_JOINABLE);

There it'll overwrite the old value that was set up in p->thread for the
TLS thread, hence the TLS thread resource should be leaked until QEMU
quits, since the thread is created with JOINABLE in both contexts.

Thanks,

-- 
Peter Xu
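
To make the sequence described above concrete, here is a minimal sketch
using plain POSIX threads rather than QEMU's qemu_thread_* wrappers. The
Channel struct and the channel_start()/channel_cleanup() helpers are
hypothetical stand-ins invented for illustration, not QEMU code. The
sketch assumes the same shape as the patch: the handshake thread gets its
own handle, so that creating the send thread from inside it no longer
overwrites the only reference to a joinable thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MultiFDSendParams; only the relevant fields. */
typedef struct {
    pthread_t thread;         /* send thread handle */
    pthread_t tls_thread;     /* handshake thread handle, kept separately */
    bool running;             /* send thread was created */
    bool tls_thread_created;  /* handshake thread was created */
    int sent;                 /* set by the send thread, for demonstration */
} Channel;

static void *send_worker(void *opaque)
{
    Channel *c = opaque;
    c->sent = 1;
    return NULL;
}

/*
 * Plays the role of the handshake thread: when it is done, it starts the
 * send thread. If this thread's own handle were stored in c->thread, the
 * pthread_create() below would overwrite it and the handshake thread
 * could never be joined.
 */
static void *handshake_worker(void *opaque)
{
    Channel *c = opaque;
    if (pthread_create(&c->thread, NULL, send_worker, c) == 0) {
        c->running = true;
    }
    return NULL;
}

/* Start the handshake thread and remember that it needs joining. */
static int channel_start(Channel *c)
{
    int ret = pthread_create(&c->tls_thread, NULL, handshake_worker, c);
    c->tls_thread_created = (ret == 0);
    return ret;
}

/* Join whatever was created; returns the number of threads joined. */
static int channel_cleanup(Channel *c)
{
    int joined = 0;

    assert(c != NULL);
    /* Join the handshake thread first: it is the one that sets 'running'. */
    if (c->tls_thread_created) {
        pthread_join(c->tls_thread, NULL);
        c->tls_thread_created = false;
        joined++;
    }
    if (c->running) {
        pthread_join(c->thread, NULL);
        c->running = false;
        joined++;
    }
    return joined;
}
```

Calling channel_start() on a zero-initialized Channel and then
channel_cleanup() should join both threads. One difference from the patch:
the sketch sets the flag from the return value of pthread_create(), whereas
the patch sets tls_thread_created before calling qemu_thread_create().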

On Tue, Feb 06, 2024 at 05:15:07PM +0800, Peter Xu wrote:
> On Tue, Feb 06, 2024 at 08:53:45AM +0000, Daniel P. Berrangé wrote:
> > AFAICS, it is not overwriting 'p->thread' because at the time when the
> > TLS thread is created, the main 'send thread' has not yet been
> > created. The TLS thread and send thread execution times are mutually
> > exclusive.
>
> IIUC it'll be overwritten after the TLS handshake, where the TLS thread
> uses multifd_channel_connect() to create the ultimate multifd thread with
> the same p->thread variable:
>
>     qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
>                        QEMU_THREAD_JOINABLE);
>
> There it'll overwrite the old value that was set up in p->thread for the
> TLS thread, hence the TLS thread resource should be leaked until QEMU
> quits, since the thread is created with JOINABLE in both contexts.

Ah yes, missed that, you're right.

With regards,
Daniel