[PATCH] rpc: ensure temporary GSource is removed from client event loop

Daniel P. Berrangé posted 1 patch 1 year, 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20240502125431.3210288-1-berrange@redhat.com
src/rpc/virnetclient.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
[PATCH] rpc: ensure temporary GSource is removed from client event loop
Posted by Daniel P. Berrangé 1 year, 9 months ago
Users are seeing periodic segfaults from libvirt client apps,
especially thread heavy ones like virt-manager. A typical
stack trace would end up in the virNetClientIOEventFD method,
with illegal access to stale stack data. eg

==238721==ERROR: AddressSanitizer: stack-use-after-return on address 0x75cd18709788 at pc 0x75cd3111f907 bp 0x75cd181ff550 sp 0x75cd181ff548
WRITE of size 4 at 0x75cd18709788 thread T11
    #0 0x75cd3111f906 in virNetClientIOEventFD /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclient.c:1634:15
    #1 0x75cd3210d198  (/usr/lib/libglib-2.0.so.0+0x5a198) (BuildId: 0a2311dfbbc6c215dc36f4b6bdd2b4b6fbae55a2)
    #2 0x75cd3216c3be  (/usr/lib/libglib-2.0.so.0+0xb93be) (BuildId: 0a2311dfbbc6c215dc36f4b6bdd2b4b6fbae55a2)
    #3 0x75cd3210ddc6 in g_main_loop_run (/usr/lib/libglib-2.0.so.0+0x5adc6) (BuildId: 0a2311dfbbc6c215dc36f4b6bdd2b4b6fbae55a2)
    #4 0x75cd3111a47c in virNetClientIOEventLoop /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclient.c:1722:9
    #5 0x75cd3111a47c in virNetClientIO /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclient.c:2002:10
    #6 0x75cd3111a47c in virNetClientSendInternal /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclient.c:2170:11
    #7 0x75cd311198a8 in virNetClientSendWithReply /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclient.c:2198:11
    #8 0x75cd31111653 in virNetClientProgramCall /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclientprogram.c:318:9
    #9 0x75cd31241c8f in callFull /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/remote/remote_driver.c:6054:10
    #10 0x75cd31241c8f in call /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/remote/remote_driver.c:6076:12
    #11 0x75cd31241c8f in remoteNetworkGetXMLDesc /usr/src/debug/libvirt/libvirt-10.2.0/build/src/remote/remote_client_bodies.h:5959:9
    #12 0x75cd31410ff7 in virNetworkGetXMLDesc /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/libvirt-network.c:952:15

The root cause is a bad assumption in the virNetClientIOEventLoop
method. This method is run by whichever thread currently owns the
buck, and is responsible for handling I/O. Inside a for(;;) loop,
this method creates a temporary GSource, adds it to the event loop
and runs g_main_loop_run(). When I/O is ready, the GSource callback
(virNetClientIOEventFD) will fire and call g_main_loop_quit(), and
return G_SOURCE_REMOVE which results in the temporary GSource being
destroyed. A g_autoptr() will then remove the last reference.

What was overlooked, is that a second thread can come along and
while it can't enter virNetClientIOEventLoop, it will register an
idle source that uses virNetClientIOWakeup to interrupt the
original thread's 'g_main_loop_run' call. When this happens the
virNetClientIOEventFD callback never runs, and so the temporary
GSource is not destroyed. The g_autoptr() will remove a reference,
but by virtue of still being attached to the event context, there
is an extra reference held causing GSource to be leaked. The
next time 'g_main_loop_run' is called, the original GSource will
trigger its callback, and access data that was allocated on the
stack by the previous thread, and likely SEGV.

To solve this, the thread calling 'g_main_loop_run' must call
g_source_destroy, immediately upon return, to guarantee that
the temporary GSource is removed.

CVE-2024-4418
Reported-by: Martin Shirokov <shirokovmartin@gmail.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 src/rpc/virnetclient.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
index 68098b1c8d..147b0d661a 100644
--- a/src/rpc/virnetclient.c
+++ b/src/rpc/virnetclient.c
@@ -1657,7 +1657,7 @@ static int virNetClientIOEventLoop(virNetClient *client,
 #endif /* !WIN32 */
         int timeout = -1;
         virNetMessage *msg = NULL;
-        g_autoptr(GSource) G_GNUC_UNUSED source = NULL;
+        g_autoptr(GSource) source = NULL;
         GIOCondition ev = 0;
         struct virNetClientIOEventData data = {
             .client = client,
@@ -1721,6 +1721,18 @@ static int virNetClientIOEventLoop(virNetClient *client,
 
         g_main_loop_run(client->eventLoop);
 
+        /*
+         * If virNetClientIOEventFD ran, this GSource will already be
+         * destroyed due to G_SOURCE_REMOVE. It is harmless to re-destroy
+         * it, since we still own a reference.
+         *
+         * If virNetClientIOWakeup ran, it will have interrupted the
+         * g_main_loop_run call, before virNetClientIOEventFD could
+         * run, and thus the GSource is still registered, and we need
+         * to destroy it since it is referencing stack memory for 'data'
+         */
+        g_source_destroy(source);
+
 #ifndef WIN32
         ignore_value(pthread_sigmask(SIG_SETMASK, &oldmask, NULL));
 #endif /* !WIN32 */
-- 
2.43.0
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org
Re: [PATCH] rpc: ensure temporary GSource is removed from client event loop
Posted by Ján Tomko 1 year, 9 months ago
On a Thursday in 2024, Daniel P. Berrangé wrote:
>Users are seeing periodic segfaults from libvirt client apps,
>especially thread heavy ones like virt-manager. A typical
>stack trace would end up in the virNetClientIOEventFD method,
>with illegal access to stale stack data. eg
>
>==238721==ERROR: AddressSanitizer: stack-use-after-return on address 0x75cd18709788 at pc 0x75cd3111f907 bp 0x75cd181ff550 sp 0x75cd181ff548
>WRITE of size 4 at 0x75cd18709788 thread T11
>    #0 0x75cd3111f906 in virNetClientIOEventFD /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclient.c:1634:15
>    #1 0x75cd3210d198  (/usr/lib/libglib-2.0.so.0+0x5a198) (BuildId: 0a2311dfbbc6c215dc36f4b6bdd2b4b6fbae55a2)
>    #2 0x75cd3216c3be  (/usr/lib/libglib-2.0.so.0+0xb93be) (BuildId: 0a2311dfbbc6c215dc36f4b6bdd2b4b6fbae55a2)
>    #3 0x75cd3210ddc6 in g_main_loop_run (/usr/lib/libglib-2.0.so.0+0x5adc6) (BuildId: 0a2311dfbbc6c215dc36f4b6bdd2b4b6fbae55a2)
>    #4 0x75cd3111a47c in virNetClientIOEventLoop /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclient.c:1722:9
>    #5 0x75cd3111a47c in virNetClientIO /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclient.c:2002:10
>    #6 0x75cd3111a47c in virNetClientSendInternal /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclient.c:2170:11
>    #7 0x75cd311198a8 in virNetClientSendWithReply /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclient.c:2198:11
>    #8 0x75cd31111653 in virNetClientProgramCall /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/rpc/virnetclientprogram.c:318:9
>    #9 0x75cd31241c8f in callFull /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/remote/remote_driver.c:6054:10
>    #10 0x75cd31241c8f in call /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/remote/remote_driver.c:6076:12
>    #11 0x75cd31241c8f in remoteNetworkGetXMLDesc /usr/src/debug/libvirt/libvirt-10.2.0/build/src/remote/remote_client_bodies.h:5959:9
>    #12 0x75cd31410ff7 in virNetworkGetXMLDesc /usr/src/debug/libvirt/libvirt-10.2.0/build/../src/libvirt-network.c:952:15
>
>The root cause is a bad assumption in the virNetClientIOEventLoop
>method. This method is run by whichever thread currently owns the
>buck, and is responsible for handling I/O. Inside a for(;;) loop,
>this method creates a temporary GSource, adds it to the event loop
>and runs g_main_loop_run(). When I/O is ready, the GSource callback
>(virNetClientIOEventFD) will fire and call g_main_loop_quit(), and
>return G_SOURCE_REMOVE which results in the temporary GSource being
>destroyed. A g_autoptr() will then remove the last reference.
>
>What was overlooked, is that a second thread can come along and
>while it can't enter virNetClientIOEventLoop, it will register an
>idle source that uses virNetClientIOWakeup to interrupt the
>original thread's 'g_main_loop_run' call. When this happens the
>virNetClientIOEventFD callback never runs, and so the temporary
>GSource is not destroyed. The g_autoptr() will remove a reference,
>but by virtue of still being attached to the event context, there
>is an extra reference held causing GSource to be leaked. The
>next time 'g_main_loop_run' is called, the original GSource will
>trigger its callback, and access data that was allocated on the
>stack by the previous thread, and likely SEGV.
>
>To solve this, the thread calling 'g_main_loop_run' must call
>g_source_destroy, immediately upon return, to guarantee that
>the temporary GSource is removed.
>
>CVE-2024-4418
>Reported-by: Martin Shirokov <shirokovmartin@gmail.com>
>Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
>---
> src/rpc/virnetclient.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>

Reviewed-by: Ján Tomko <jtomko@redhat.com>

Jano
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org