[libvirt] [PATCH] qemuProcessReconnect: ensure vm xml integrity when save status

Wang King posted 1 patch 4 years, 10 months ago
Test syntax-check passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20190518102436.5028-1-king.wang@huawei.com
src/qemu/qemu_process.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
[libvirt] [PATCH] qemuProcessReconnect: ensure vm xml integrity when save status
Posted by Wang King 4 years, 10 months ago
From: Dingzhong Weng <wengdingzhong@huawei.com>

libvirtd starts qemuProcessReconnect threads when libvirtd is starting
and there are live vm on the host. but as qemuProcessReconnect threads
are simply side job done by libvirtd upon start and are only called
for existing vm. these threads are not managed like that of worker
threads and event pool.

once libvirtd receives SIGTERM signal or any similar command right after
it starts, these qemuProcessReconnect threads might still be running and
libvirtd goes to clean up and frees qemu_driver and its components. In
this short window, qemuProcessReconnect threads might have a NULL
qemu_driver->xmlopt.format, skip this function, and only partial vm
status can be saved to disk. As a result, vm may fail to recover with
missing information.

this patch increases qemu_driver->xmlopt ref count during the lifecycle
of qemuProcessReconnect so that libvirtd will not release this resource
until qemuProcessReconnect threads finish using it. thus make sure all
of vm status information can be formatted and thus maintain its integrity
in virDomainSaveStatus.

no vm is killed in this patch.

Signed-off-by: Dingzhong Weng <wengdingzhong@huawei.com>
---
 src/qemu/qemu_process.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 90466771cd..36b9b5fd03 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -8000,6 +8000,7 @@ qemuProcessReconnect(void *opaque)
     struct qemuProcessReconnectData *data = opaque;
     virQEMUDriverPtr driver = data->driver;
     virDomainObjPtr obj = data->obj;
+    virDomainXMLOptionPtr xmlopt = NULL;
     qemuDomainObjPrivatePtr priv;
     qemuDomainJobObj oldjob;
     int state;
@@ -8023,6 +8024,9 @@ qemuProcessReconnect(void *opaque)
     cfg = virQEMUDriverGetConfig(driver);
     priv = obj->privateData;
 
+    /* need xmlopt later to save status, do not free */
+    xmlopt = virObjectRef(driver->xmlopt);
+
     if (!(caps = virQEMUDriverGetCapabilities(driver, false)))
         goto error;
 
@@ -8218,7 +8222,7 @@ qemuProcessReconnect(void *opaque)
     }
 
     /* update domain state XML with possibly updated state in virDomainObj */
-    if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, obj, driver->caps) < 0)
+    if (virDomainSaveStatus(xmlopt, cfg->stateDir, obj, driver->caps) < 0)
         goto error;
 
     /* Run an hook to allow admins to do some magic */
@@ -8253,6 +8257,7 @@ qemuProcessReconnect(void *opaque)
     virDomainObjEndAPI(&obj);
     virObjectUnref(cfg);
     virObjectUnref(caps);
+    virObjectUnref(xmlopt);
     virNWFilterUnlockFilterUpdates();
     virIdentitySetCurrent(NULL);
     return;
-- 
2.18.0.windows.1


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemuProcessReconnect: ensure vm xml integrity when save status
Posted by Peter Krempa 4 years, 10 months ago
On Sat, May 18, 2019 at 18:24:36 +0800, Wang King wrote:
> From: Dingzhong Weng <wengdingzhong@huawei.com>
> 
> libvirtd starts qemuProcessReconnect threads when libvirtd is starting
> and there are live vm on the host. but as qemuProcessReconnect threads
> are simply side job done by libvirtd upon start and are only called
> for existing vm. these threads are not managed like that of worker
> threads and event pool.
> 
> once libvirtd receives SIGTERM signal or any similar command right after
> it starts, these qemuProcessReconnect threads might still be running and
> libvirtd goes to clean up and frees qemu_driver and its components. In
> this short window, qemuProcessReconnect threads might have a NULL
> qemu_driver->xmlopt.format, skip this function, and only partial vm
> status can be saved to disk. As a result, vm may fail to recover with
> missing information.
> 
> this patch increases qemu_driver->xmlopt ref count during the lifecycle
> of qemuProcessReconnect so that libvirtd will not release this resource
> until qemuProcessReconnect threads finish using it. thus make sure all
> of vm status information can be formatted and thus maintain its integrity
> in virDomainSaveStatus.
> 
> no vm is killed in this patch.
> 
> Signed-off-by: Dingzhong Weng <wengdingzhong@huawei.com>
> ---
>  src/qemu/qemu_process.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index 90466771cd..36b9b5fd03 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -8000,6 +8000,7 @@ qemuProcessReconnect(void *opaque)
>      struct qemuProcessReconnectData *data = opaque;
>      virQEMUDriverPtr driver = data->driver;
>      virDomainObjPtr obj = data->obj;
> +    virDomainXMLOptionPtr xmlopt = NULL;
>      qemuDomainObjPrivatePtr priv;
>      qemuDomainJobObj oldjob;
>      int state;
> @@ -8023,6 +8024,9 @@ qemuProcessReconnect(void *opaque)
>      cfg = virQEMUDriverGetConfig(driver);
>      priv = obj->privateData;
>  
> +    /* need xmlopt later to save status, do not free */
> +    xmlopt = virObjectRef(driver->xmlopt);

So I presume the problem is that qemuStateCleanup is called before this
function finishes and thus accesses invalid memory.

This patch will not fix the problem entirely, because the access to
XMLopt here (and everywhere else) is not atomic. This means that if
qemuStateCleanup is called before the above line you'll try to reference
a pointer which was already freed.

Also even if qemuStateCleanup sets the pointer to NULL your patch does
not check it.

To fully fix this I think we need an accessor similar to
virQEMUDriverGetConfig which will access the xmlopt object.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemuProcessReconnect: ensure vm xml integrity when save status
Posted by Peter Krempa 4 years, 10 months ago
On Mon, May 20, 2019 at 12:57:17 +0200, Peter Krempa wrote:
> On Sat, May 18, 2019 at 18:24:36 +0800, Wang King wrote:

[...]

> > @@ -8023,6 +8024,9 @@ qemuProcessReconnect(void *opaque)
> >      cfg = virQEMUDriverGetConfig(driver);
> >      priv = obj->privateData;
> >  
> > +    /* need xmlopt later to save status, do not free */
> > +    xmlopt = virObjectRef(driver->xmlopt);
> 
> So I presume the problem is that qemuStateCleanup is called before this
> function finishes and thus accesses invalid memory.
> 
> This patch will not fix the problem entirely, because the access to
> XMLopt here (and everywhere else) is not atomic. This means that if
> qemuStateCleanup is called before the above line you'll try to reference
> a pointer which was already freed.
> 
> Also even if qemuStateCleanup sets the pointer to NULL your patch does
> not check it.
> 
> To fully fix this I think we need an accessor similar to
> virQEMUDriverGetConfig which will access the xmlopt object.

Or ideally we need to turn the qemu_driver struct into an virObject and
increase refcount prior to passing it into the threads reconnecting to
the instances. Otherwise there are potential other fields missing and
can cause problems.
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list