[libvirt PATCH] nodedev: Don't crash when exiting before init is done

Jonathon Jongsma posted 1 patch 3 years, 1 month ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20210316222725.998020-1-jjongsma@redhat.com
src/node_device/node_device_udev.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
[libvirt PATCH] nodedev: Don't crash when exiting before init is done
Posted by Jonathon Jongsma 3 years, 1 month ago
If libvirtd is terminated before the node driver finishes
initialization, it can crash with a backtrace similar to the following:

    Stack trace of thread 1922933:
    #0  0x00007f8515178774 g_hash_table_find (libglib-2.0.so.0)
    #1  0x00007f851593ea98 virHashSearch (libvirt.so.0)
    #2  0x00007f8515a1dd83 virNodeDeviceObjListSearch (libvirt.so.0)
    #3  0x00007f84cceb40a1 udevAddOneDevice (libvirt_driver_nodedev.so)
    #4  0x00007f84cceb5fae nodeStateInitializeEnumerate (libvirt_driver_nodedev.so)
    #5  0x00007f85159840cb virThreadHelper (libvirt.so.0)
    #6  0x00007f8511c7d14a start_thread (libpthread.so.0)
    #7  0x00007f851442bdb3 __clone (libc.so.6)

    Stack trace of thread 1922863:
    #0  0x00007f851442651d syscall (libc.so.6)
    #1  0x00007f85159842d4 virThreadSelfID (libvirt.so.0)
    #2  0x00007f851594e240 virLogFormatString (libvirt.so.0)
    #3  0x00007f851596635d vir_object_finalize (libvirt.so.0)
    #4  0x00007f8514efe8e9 g_object_unref (libgobject-2.0.so.0)
    #5  0x00007f85159667f8 virObjectUnref (libvirt.so.0)
    #6  0x00007f851517755f g_hash_table_remove_all_nodes.part.0 (libglib-2.0.so.0)
    #7  0x00007f8515177e62 g_hash_table_unref (libglib-2.0.so.0)
    #8  0x00007f851596637e vir_object_finalize (libvirt.so.0)
    #9  0x00007f8514efe8e9 g_object_unref (libgobject-2.0.so.0)
    #10 0x00007f85159667f8 virObjectUnref (libvirt.so.0)
    #11 0x00007f84cceb2b42 nodeStateCleanup (libvirt_driver_nodedev.so)
    #12 0x00007f8515b37950 virStateCleanup (libvirt.so.0)
    #13 0x00005648085348e8 main (libvirtd)
    #14 0x00007f8514352493 __libc_start_main (libc.so.6)
    #15 0x00005648085350fe _start (libvirtd)

This is because the initial population of the device list is done in a
separate initialization thread. If we attempt to exit libvirtd before
this init thread has completed, we'll try to free the device list while
accessing it from the other thread. In order to guarantee that this
init thread is not accessing the device list when we're cleaning up the
nodedev driver, make it joinable and wait for it to finish before
proceding with the cleanup. This is similar to how we handle the udev
event handler thread.

The separate initialization thread was added in commit
9f0ae0b1.

https://bugzilla.redhat.com/show_bug.cgi?id=1933590

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
---
 src/node_device/node_device_udev.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
index fe25f8235e..010ebf4e74 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -66,6 +66,9 @@ struct _udevEventData {
     virCond threadCond;
     bool threadQuit;
     bool dataReady;
+
+    /* init thread */
+    virThread initThread;
 };
 
 static virClassPtr udevEventDataClass;
@@ -1660,6 +1663,7 @@ nodeStateCleanup(void)
         priv->threadQuit = true;
         virCondSignal(&priv->threadCond);
         virObjectUnlock(priv);
+        virThreadJoin(&priv->initThread);
         virThreadJoin(&priv->th);
     }
 
@@ -1995,7 +1999,6 @@ nodeStateInitialize(bool privileged,
 {
     udevEventDataPtr priv = NULL;
     struct udev *udev = NULL;
-    virThread enumThread;
 
     if (root != NULL) {
         virReportError(VIR_ERR_INVALID_ARG, "%s",
@@ -2103,7 +2106,7 @@ nodeStateInitialize(bool privileged,
     if (udevSetupSystemDev() != 0)
         goto cleanup;
 
-    if (virThreadCreateFull(&enumThread, false, nodeStateInitializeEnumerate,
+    if (virThreadCreateFull(&priv->initThread, true, nodeStateInitializeEnumerate,
                             "nodedev-init", false, udev) < 0) {
         virReportSystemError(errno, "%s",
                              _("failed to create udev enumerate thread"));
-- 
2.26.2

Re: [libvirt PATCH] nodedev: Don't crash when exiting before init is done
Posted by Michal Privoznik 3 years, 1 month ago
On 3/16/21 11:27 PM, Jonathon Jongsma wrote:
> If libvirtd is terminated before the node driver finishes
> initialization, it can crash with a backtrace similar to the following:
> 
>      Stack trace of thread 1922933:
>      #0  0x00007f8515178774 g_hash_table_find (libglib-2.0.so.0)
>      #1  0x00007f851593ea98 virHashSearch (libvirt.so.0)
>      #2  0x00007f8515a1dd83 virNodeDeviceObjListSearch (libvirt.so.0)
>      #3  0x00007f84cceb40a1 udevAddOneDevice (libvirt_driver_nodedev.so)
>      #4  0x00007f84cceb5fae nodeStateInitializeEnumerate (libvirt_driver_nodedev.so)
>      #5  0x00007f85159840cb virThreadHelper (libvirt.so.0)
>      #6  0x00007f8511c7d14a start_thread (libpthread.so.0)
>      #7  0x00007f851442bdb3 __clone (libc.so.6)
> 
>      Stack trace of thread 1922863:
>      #0  0x00007f851442651d syscall (libc.so.6)
>      #1  0x00007f85159842d4 virThreadSelfID (libvirt.so.0)
>      #2  0x00007f851594e240 virLogFormatString (libvirt.so.0)
>      #3  0x00007f851596635d vir_object_finalize (libvirt.so.0)
>      #4  0x00007f8514efe8e9 g_object_unref (libgobject-2.0.so.0)
>      #5  0x00007f85159667f8 virObjectUnref (libvirt.so.0)
>      #6  0x00007f851517755f g_hash_table_remove_all_nodes.part.0 (libglib-2.0.so.0)
>      #7  0x00007f8515177e62 g_hash_table_unref (libglib-2.0.so.0)
>      #8  0x00007f851596637e vir_object_finalize (libvirt.so.0)
>      #9  0x00007f8514efe8e9 g_object_unref (libgobject-2.0.so.0)
>      #10 0x00007f85159667f8 virObjectUnref (libvirt.so.0)
>      #11 0x00007f84cceb2b42 nodeStateCleanup (libvirt_driver_nodedev.so)
>      #12 0x00007f8515b37950 virStateCleanup (libvirt.so.0)
>      #13 0x00005648085348e8 main (libvirtd)
>      #14 0x00007f8514352493 __libc_start_main (libc.so.6)
>      #15 0x00005648085350fe _start (libvirtd)
> 
> This is because the initial population of the device list is done in a
> separate initialization thread. If we attempt to exit libvirtd before
> this init thread has completed, we'll try to free the device list while
> accessing it from the other thread. In order to guarantee that this
> init thread is not accessing the device list when we're cleaning up the
> nodedev driver, make it joinable and wait for it to finish before
> proceding with the cleanup. This is similar to how we handle the udev
> event handler thread.
> 
> The separate initialization thread was added in commit
> 9f0ae0b1.
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1933590

Since this bug was closed as a duplicate of 1836865, I'm replacing this 
line. Hope you don't mind.

> 
> Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
> ---
>   src/node_device/node_device_udev.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>

And pushed.

Michal