[PATCH] security: fix virSecurityManagerGetNested access illegal address

gongwei@smartx.com posted 1 patch 3 years ago
Failed in applying to current master (apply log)
src/security/security_manager.c | 3 +++
1 file changed, 3 insertions(+)
[PATCH] security: fix virSecurityManagerGetNested access illegal address
Posted by gongwei@smartx.com 3 years ago
    When stop libvirtd is used, libvirtd exits the eventloop and cleans up
    the driverState first. Then release threadPool. If the workers thread
    is still executing at this time, it needs to access driverState.
    If the value in driverState is not judged at this time, direct access
    will cause an abnormal exit and damage the cache file of libvirt.

    In our example, the migration task is in progress at this time,
    the source is waiting for the target libvirtd dstFinish to return,
    the source libvirtd is stopped, and a crash occurs. After start libvirtd,
    the corresponding virtual machine process cannot be managed by libvirt

    stack:
    #0  virSecurityManagerGetNested (mgr=0x7f76141143c0) at security/security_manager.c:1033
    1033	    if (STREQ("stack", mgr->drv->name))
    (gdb) bt
    #0  virSecurityManagerGetNested (mgr=0x7f76141143c0) at security/security_manager.c:1033
    #1  0x00007f761c31660e in virQEMUDriverCreateCapabilities (driver=driver@entry=0x7f7614111060)
        at qemu/qemu_conf.c:1043
    #2  0x00007f761c3168b3 in virQEMUDriverGetCapabilities (driver=0x7f7614111060,
        refresh=<optimized out>) at qemu/qemu_conf.c:1103
    #3  0x00007f761c334d16 in qemuMigrationCookieXMLParse (flags=32, ctxt=0x7f76040040c0,
        doc=0x7f76040425c0, driver=0x7f7614111060, mig=0x7f760400ee10)
        at qemu/qemu_migration_cookie.c:1209
    #4  qemuMigrationCookieXMLParseStr (flags=32,
        xml=0x7f7604004580 "<qemu-migration>\n  <name>519ed304-375a-4819-a2d5-2f0ba662b9bc</name>
        049152ab-efdf-4aaf-ab08-b57ac1816351</uuid>
        <hostname>gongwei-nestedcluster-20210330042359-1</me>
        <hostuuid>41d69"..., driver=0x7f7614111060, mig=0x7f760400ee10)
        at qemu/qemu_migration_cookie.c:1404
    #5  qemuMigrationEatCookie (driver=driver@entry=0x7f7614111060, dom=dom@entry=0x7f7604001ac0,
        cookiein=cookiein@entry=0x7f7604004580 "<qemu-migration>
        <name>519ed304-375a-4819-a2d5-2f09bc</name>
        <uuid>049152ab-efdf-4aaf-ab08-b57ac1816351</uuid>
        <hostname>gongwei-nestedcluste0330042359-1</hostname>
        <hostuuid>41d69"..., cookieinlen=cookieinlen@entry=1410,
        flags=flags@entry=32) at qemu/qemu_migration_cookie.c:1501
    #6  0x00007f761c3291d5 in qemuMigrationSrcConfirmPhase (driver=driver@entry=0x7f7614111060,
        vm=vm@entry=0x7f7604001ac0,
        cookiein=0x7f7604004580 "<qemu-migration>
        <name>519ed304-375a-4819-a2d5-2f0ba662b9bc</nameuuid>049152ab-efdf-4aaf-ab08-b57ac1816351</uuid>
        <hostname>gongwei-nestedcluster-2021033004235ostname>
        <hostuuid>41d69"..., cookieinlen=1410, flags=14875, retcode=retcode@entry=0)
        at qemu/qemu_migration.c:2805
    #7  0x00007f761c331539 in qemuMigrationSrcPerformPeer2Peer3 (flags=14875, useParams=true,
        bandwidth=0, migParams=0x7f760400f070, nbdPort=0, migrate_disks=<optimized out>,
        nmigrate_disks=0, listenAddress=<optimized out>, graphicsuri=<optimized out>,
        uri=<optimized out>, dname=0x0, persist_xml=0x0, xmlin=<optimized out>, vm=0x7f7604001ac0,
        dconnuri=0x7f7604000df0 "qemu+tcp://10.181.177.170/system", dconn=0x7f7604021680,
        sconn=0x7f7608001410, driver=0x7f7614111060) at qemu/qemu_migration.c:4202

    (gdb) frame 1
    #1  0x00007f761c31660e in virQEMUDriverCreateCapabilities (driver=driver@entry=0x7f7614111060)
        at qemu/qemu_conf.c:1043
    1043	    if (!(sec_managers = qemuSecurityGetNested(driver->securityManager)))

    (gdb) p *(driver->securityManager)
    $2 = {parent = {parent = {u = {dummy_align1 = 140145119544368, dummy_align2 = 0x7f7614114430, s =
          magic = 336675888, refs = 32630}}, klass = 0xdeadbeef}, lock = {lock = {__data = {
          __lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision
          __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0
          drv = 0x0, flags = 0, virtDriver = 0x0, privateData = 0x0}

    if (STREQ("stack", mgr->drv->name)  mgr->drv is 0x0

Signed-off-by: gongwei <gongwei@smartx.com>
---
src/security/security_manager.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/src/security/security_manager.c b/src/security/security_manager.c
index d8b84e2861..96ca9ee861 100644
--- a/src/security/security_manager.c
+++ b/src/security/security_manager.c
@@ -1030,6 +1030,9 @@ virSecurityManagerGetNested(virSecurityManager *mgr)
{
     virSecurityManager ** list = NULL;
+    if (mgr == NULL || mgr->drv == NULL)
+        return NULL;
+
     if (STREQ("stack", mgr->drv->name))
         return virSecurityStackGetNested(mgr);
--
2.24.1


Re: [PATCH] security: fix virSecurityManagerGetNested access illegal address
Posted by Peter Krempa 3 years ago
On Mon, Apr 26, 2021 at 17:23:15 +0800, gongwei@smartx.com wrote:
>     When stop libvirtd is used, libvirtd exits the eventloop and cleans up
>     the driverState first. Then release threadPool. If the workers thread
>     is still executing at this time, it needs to access driverState.
>     If the value in driverState is not judged at this time, direct access
>     will cause an abnormal exit and damage the cache file of libvirt.
> 
>     In our example, the migration task is in progress at this time,
>     the source is waiting for the target libvirtd dstFinish to return,
>     the source libvirtd is stopped, and a crash occurs. After start libvirtd,
>     the corresponding virtual machine process cannot be managed by libvirt

This explanation seems to suggest that this happens when you are
shutting down libvirtd ...

>     stack:
>     #0  virSecurityManagerGetNested (mgr=0x7f76141143c0) at security/security_manager.c:1033
>     1033	    if (STREQ("stack", mgr->drv->name))
>     (gdb) bt
>     #0  virSecurityManagerGetNested (mgr=0x7f76141143c0) at security/security_manager.c:1033
>     #1  0x00007f761c31660e in virQEMUDriverCreateCapabilities (driver=driver@entry=0x7f7614111060)
>         at qemu/qemu_conf.c:1043

... so I'd say that calling this function would be invalid in the first
place.

>     #2  0x00007f761c3168b3 in virQEMUDriverGetCapabilities (driver=0x7f7614111060,
>         refresh=<optimized out>) at qemu/qemu_conf.c:1103
>     #3  0x00007f761c334d16 in qemuMigrationCookieXMLParse (flags=32, ctxt=0x7f76040040c0,
>         doc=0x7f76040425c0, driver=0x7f7614111060, mig=0x7f760400ee10)
>         at qemu/qemu_migration_cookie.c:1209
>     #4  qemuMigrationCookieXMLParseStr (flags=32,
>         xml=0x7f7604004580 "<qemu-migration>\n  <name>519ed304-375a-4819-a2d5-2f0ba662b9bc</name>
>         049152ab-efdf-4aaf-ab08-b57ac1816351</uuid>
>         <hostname>gongwei-nestedcluster-20210330042359-1</me>
>         <hostuuid>41d69"..., driver=0x7f7614111060, mig=0x7f760400ee10)
>         at qemu/qemu_migration_cookie.c:1404
>     #5  qemuMigrationEatCookie (driver=driver@entry=0x7f7614111060, dom=dom@entry=0x7f7604001ac0,

This function was renamed in v6.8.0-17-g43f0944f66 so this backtrace
comes from an old version. Which version did you observe the bug in?

I'm asking because around v6.7.0-71-g399039a6b1 libvirt's shutdown code
was modified to actually wait for threads, so the bug as you are
describing it may no longer be valid.

In case you can reproduce it with upstream libvirtd which this patch
should be against please also include some reasonable steps to show that
it's still happening or at least a recent backtrace.