[PATCH] migration/multifd: fix hangup with TLS-Multifd due to blocking handshake

Chuan Zheng posted 1 patch 3 years, 5 months ago
Test checkpatch passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/1604643893-8223-1-git-send-email-zhengchuan@huawei.com
migration/multifd.c | 23 +++++++++++++++++------
1 file changed, 17 insertions(+), 6 deletions(-)
Posted by Chuan Zheng 3 years, 5 months ago
The QEMU main loop can hang forever when TLS and multifd are both enabled.
On the source, multifd_send_0 starts the TLS handshake: it sends a hello to
the server and waits for the response.
However, the destination main loop is already blocked in recvmsg() for
multifd_recv_1. Both the source and destination main loops are then blocked
waiting for a response, so both sides hang forever.

Src: (multifd_send_0)                                              Dst: (multifd_recv_1)
multifd_channel_connect                                            migration_channel_process_incoming
  multifd_tls_channel_connect                                        migration_tls_channel_process_incoming
    multifd_tls_channel_connect                                        qio_channel_tls_handshake_task
       qio_channel_tls_handshake                                         gnutls_handshake
          qio_channel_tls_handshake_task                                       ...
            qcrypto_tls_session_handshake                                      ...
              gnutls_handshake                                                 ...
                   ...                                                         ...
                recvmsg (Blocking I/O waiting for response)                recvmsg (Blocking I/O waiting for response)

Fix this by offloading the handshake work to a background thread.

Reported-by: Yan Jin <jinyan12@huawei.com>
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
---
 migration/multifd.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 68b171f..88486b9 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -739,6 +739,19 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
     multifd_channel_connect(p, ioc, err);
 }
 
+static void *multifd_tls_handshake_thread(void *opaque)
+{
+    MultiFDSendParams *p = opaque;
+    QIOChannelTLS *tioc = QIO_CHANNEL_TLS(p->c);
+
+    qio_channel_tls_handshake(tioc,
+                              multifd_tls_outgoing_handshake,
+                              p,
+                              NULL,
+                              NULL);
+    return NULL;
+}
+
 static void multifd_tls_channel_connect(MultiFDSendParams *p,
                                         QIOChannel *ioc,
                                         Error **errp)
@@ -754,12 +767,10 @@ static void multifd_tls_channel_connect(MultiFDSendParams *p,
 
     trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname);
     qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
-    qio_channel_tls_handshake(tioc,
-                              multifd_tls_outgoing_handshake,
-                              p,
-                              NULL,
-                              NULL);
-
+    p->c = QIO_CHANNEL(tioc);
+    qemu_thread_create(&p->thread, "multifd-tls-handshake-worker",
+                       multifd_tls_handshake_thread, p,
+                       QEMU_THREAD_JOINABLE);
 }
 
 static bool multifd_channel_connect(MultiFDSendParams *p,
-- 
1.8.3.1
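
For readers unfamiliar with the pattern, here is a minimal standalone sketch of
"offload the blocking handshake to a joinable thread". It is not the QEMU code:
it assumes POSIX threads, uses a pipe in place of the TLS socket, and the names
DemoChannel, demo_handshake_thread, demo_channel_connect and
demo_channel_finish are hypothetical, invented for this illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical per-channel state; a stand-in, not QEMU's MultiFDSendParams. */
typedef struct {
    int fds[2];          /* pipe standing in for the TLS socket */
    pthread_t thread;    /* joinable worker that runs the handshake */
    bool handshake_done; /* written by the worker, read after the join */
} DemoChannel;

/* Worker: blocks in read() until the peer answers, much like a handshake
 * that sits in recvmsg() waiting for the server's reply. */
static void *demo_handshake_thread(void *opaque)
{
    DemoChannel *ch = opaque;
    char reply;

    if (read(ch->fds[0], &reply, 1) == 1) {
        ch->handshake_done = true;
    }
    return NULL;
}

/* Caller side: start the handshake and return immediately, so the calling
 * thread (the main loop in the scenario above) is never the one blocked. */
static int demo_channel_connect(DemoChannel *ch)
{
    assert(ch != NULL);
    ch->handshake_done = false;
    if (pipe(ch->fds) != 0) {
        return -1;
    }
    if (pthread_create(&ch->thread, NULL, demo_handshake_thread, ch) != 0) {
        close(ch->fds[0]);
        close(ch->fds[1]);
        return -1;
    }
    return 0;
}

/* Simulate the peer's reply, then join the worker and report the result.
 * Closing the write end first guarantees the worker's read() returns even
 * if the write failed, so the join cannot hang. */
static bool demo_channel_finish(DemoChannel *ch)
{
    char hello = 'h';
    bool sent;

    assert(ch != NULL);
    sent = write(ch->fds[1], &hello, 1) == 1;
    close(ch->fds[1]);
    pthread_join(ch->thread, NULL);
    close(ch->fds[0]);
    return sent && ch->handshake_done;
}
```

A caller would invoke demo_channel_connect(), carry on with other work, and
only later call demo_channel_finish(). The thread is kept joinable, as in the
patch, so that its completion can be waited for explicitly.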


Re: [PATCH] migration/multifd: fix hangup with TLS-Multifd due to blocking handshake
Posted by Daniel P. Berrangé 3 years, 5 months ago
On Fri, Nov 06, 2020 at 02:24:53PM +0800, Chuan Zheng wrote:
> The QEMU main loop can hang forever when TLS and multifd are both enabled.
> On the source, multifd_send_0 starts the TLS handshake: it sends a hello to
> the server and waits for the response.
> However, the destination main loop is already blocked in recvmsg() for
> multifd_recv_1. Both the source and destination main loops are then blocked
> waiting for a response, so both sides hang forever.
> 
> Src: (multifd_send_0)                                              Dst: (multifd_recv_1)
> multifd_channel_connect                                            migration_channel_process_incoming
>   multifd_tls_channel_connect                                        migration_tls_channel_process_incoming
>     multifd_tls_channel_connect                                        qio_channel_tls_handshake_task
>        qio_channel_tls_handshake                                         gnutls_handshake
>           qio_channel_tls_handshake_task                                       ...
>             qcrypto_tls_session_handshake                                      ...
>               gnutls_handshake                                                 ...
>                    ...                                                         ...
>                 recvmsg (Blocking I/O waiting for response)                recvmsg (Blocking I/O waiting for response)
> 
> Fix this by offloading the handshake work to a background thread.
> 
> Reported-by: Yan Jin <jinyan12@huawei.com>
> Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
> Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>
> ---
>  migration/multifd.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH] migration/multifd: fix hangup with TLS-Multifd due to blocking handshake
Posted by Dr. David Alan Gilbert 3 years, 5 months ago
* Chuan Zheng (zhengchuan@huawei.com) wrote:
> The QEMU main loop can hang forever when TLS and multifd are both enabled.
> On the source, multifd_send_0 starts the TLS handshake: it sends a hello to
> the server and waits for the response.
> However, the destination main loop is already blocked in recvmsg() for
> multifd_recv_1. Both the source and destination main loops are then blocked
> waiting for a response, so both sides hang forever.
> 
> Src: (multifd_send_0)                                              Dst: (multifd_recv_1)
> multifd_channel_connect                                            migration_channel_process_incoming
>   multifd_tls_channel_connect                                        migration_tls_channel_process_incoming
>     multifd_tls_channel_connect                                        qio_channel_tls_handshake_task
>        qio_channel_tls_handshake                                         gnutls_handshake
>           qio_channel_tls_handshake_task                                       ...
>             qcrypto_tls_session_handshake                                      ...
>               gnutls_handshake                                                 ...
>                    ...                                                         ...
>                 recvmsg (Blocking I/O waiting for response)                recvmsg (Blocking I/O waiting for response)
> 
> Fix this by offloading the handshake work to a background thread.
> 
> Reported-by: Yan Jin <jinyan12@huawei.com>
> Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
> Signed-off-by: Chuan Zheng <zhengchuan@huawei.com>

Queued

> ---
>  migration/multifd.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 68b171f..88486b9 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -739,6 +739,19 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
>      multifd_channel_connect(p, ioc, err);
>  }
>  
> +static void *multifd_tls_handshake_thread(void *opaque)
> +{
> +    MultiFDSendParams *p = opaque;
> +    QIOChannelTLS *tioc = QIO_CHANNEL_TLS(p->c);
> +
> +    qio_channel_tls_handshake(tioc,
> +                              multifd_tls_outgoing_handshake,
> +                              p,
> +                              NULL,
> +                              NULL);
> +    return NULL;
> +}
> +
>  static void multifd_tls_channel_connect(MultiFDSendParams *p,
>                                          QIOChannel *ioc,
>                                          Error **errp)
> @@ -754,12 +767,10 @@ static void multifd_tls_channel_connect(MultiFDSendParams *p,
>  
>      trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname);
>      qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
> -    qio_channel_tls_handshake(tioc,
> -                              multifd_tls_outgoing_handshake,
> -                              p,
> -                              NULL,
> -                              NULL);
> -
> +    p->c = QIO_CHANNEL(tioc);
> +    qemu_thread_create(&p->thread, "multifd-tls-handshake-worker",
> +                       multifd_tls_handshake_thread, p,
> +                       QEMU_THREAD_JOINABLE);
>  }
>  
>  static bool multifd_channel_connect(MultiFDSendParams *p,
> -- 
> 1.8.3.1
> 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK