[libvirt PATCH] Fix incorrect uses of g_clear_pointer() introduced in 8.1.0

Mark Mielke posted 1 patch 1 year, 11 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20220612191650.292526-1-mark.mielke@gmail.com
src/qemu/qemu_agent.c   |  3 ++-
src/qemu/qemu_monitor.c |  3 ++-
src/util/vireventglib.c | 12 ++++++++----
3 files changed, 12 insertions(+), 6 deletions(-)
[libvirt PATCH] Fix incorrect uses of g_clear_pointer() introduced in 8.1.0
Posted by Mark Mielke 1 year, 11 months ago
This is a partial revert of 87a43a907f0ad4897a28ad7c216bc70f37270b93.

The change to use g_clear_pointer() in more places was accidentally
applied to cases involving vir_g_source_unref().

In some cases, the ordering of g_source_destroy() and
vir_g_source_unref() was reversed, which resulted in the source being
marked as destroyed, after it is already unreferenced. This
use-after-free case might work in many cases, but with versions of
glibc older than 2.64.0 it may defer unref to run within the main
thread to avoid a race condition, which creates a large distance
between the g_source_unref() and g_source_destroy().

In some cases, the call to vir_g_source_unref() was replaced with a
second call to g_source_destroy(), leading to a memory leak or worse.

In our experience, the symptoms were that use of libvirt-python became
slower over time, with OpenStack nova-compute initially taking around
one second to periodically query the host PCI devices, and within an
hour it was taking over a minute to complete the same operation, until
it is was eventually running this query back-to-back, resulting in the
nova-compute process consuming 100% of one CPU thread, losing its
RabbitMQ connection frequently, and showing up as down to the control
plane.
---
 src/qemu/qemu_agent.c   |  3 ++-
 src/qemu/qemu_monitor.c |  3 ++-
 src/util/vireventglib.c | 12 ++++++++----
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c
index f57a8d5f25..e6e92c7dc4 100644
--- a/src/qemu/qemu_agent.c
+++ b/src/qemu/qemu_agent.c
@@ -452,8 +452,9 @@ static void
 qemuAgentUnregister(qemuAgent *agent)
 {
     if (agent->watch) {
+        g_source_destroy(agent->watch);
         vir_g_source_unref(agent->watch, agent->context);
-        g_clear_pointer(&agent->watch, g_source_destroy);
+        agent->watch = NULL;
     }
 }
 
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 37bcbde31e..32c993a941 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -794,8 +794,9 @@ void
 qemuMonitorUnregister(qemuMonitor *mon)
 {
     if (mon->watch) {
+        g_source_destroy(mon->watch);
         vir_g_source_unref(mon->watch, mon->context);
-        g_clear_pointer(&mon->watch, g_source_destroy);
+        mon->watch = NULL;
     }
 }
 
diff --git a/src/util/vireventglib.c b/src/util/vireventglib.c
index fc04d8f712..983787932f 100644
--- a/src/util/vireventglib.c
+++ b/src/util/vireventglib.c
@@ -228,7 +228,8 @@ virEventGLibHandleUpdate(int watch,
 
         VIR_DEBUG("Removed old handle source=%p", data->source);
         g_source_destroy(data->source);
-        g_clear_pointer(&data->source, g_source_destroy);
+        vir_g_source_unref(data->source, NULL);
+        data->source = NULL;
         data->events = 0;
     }
 
@@ -275,8 +276,9 @@ virEventGLibHandleRemove(int watch)
               data, watch, data->fd);
 
     if (data->source != NULL) {
+        g_source_destroy(data->source);
         vir_g_source_unref(data->source, NULL);
-        g_clear_pointer(&data->source, g_source_destroy);
+        data->source = NULL;
         data->events = 0;
     }
 
@@ -417,8 +419,9 @@ virEventGLibTimeoutUpdate(int timer,
         if (data->source == NULL)
             goto cleanup;
 
+        g_source_destroy(data->source);
         vir_g_source_unref(data->source, NULL);
-        g_clear_pointer(&data->source, g_source_destroy);
+        data->source = NULL;
     }
 
  cleanup:
@@ -465,8 +468,9 @@ virEventGLibTimeoutRemove(int timer)
               data, timer);
 
     if (data->source != NULL) {
+        g_source_destroy(data->source);
         vir_g_source_unref(data->source, NULL);
-        g_clear_pointer(&data->source, g_source_destroy);
+        data->source = NULL;
     }
 
     /* since the actual timeout deletion is done asynchronously, a timeoutUpdate call may
-- 
2.36.1
Re: [libvirt PATCH] Fix incorrect uses of g_clear_pointer() introduced in 8.1.0
Posted by Ján Tomko 1 year, 11 months ago
On a Sunday in 2022, Mark Mielke wrote:
>This is a partial revert of 87a43a907f0ad4897a28ad7c216bc70f37270b93.

I'll drop the trailing period before pushing.

>
>The change to use g_clear_pointer() in more places was accidentally
>applied to cases involving vir_g_source_unref().
>
>In some cases, the ordering of g_source_destroy() and
>vir_g_source_unref() was reversed, which resulted in the source being
>marked as destroyed, after it is already unreferenced.

Oops, sorry I missed that during review.

>This
>use-after-free case might work in many cases, but with versions of
>glibc older than 2.64.0 it may defer unref to run within the main

s/glibc/glib/

>thread to avoid a race condition, which creates a large distance
>between the g_source_unref() and g_source_destroy().
>
>In some cases, the call to vir_g_source_unref() was replaced with a
>second call to g_source_destroy(), leading to a memory leak or worse.
>
>In our experience, the symptoms were that use of libvirt-python became
>slower over time, with OpenStack nova-compute initially taking around
>one second to periodically query the host PCI devices, and within an
>hour it was taking over a minute to complete the same operation, until
>it is was eventually running this query back-to-back, resulting in the
>nova-compute process consuming 100% of one CPU thread, losing its
>RabbitMQ connection frequently, and showing up as down to the control
>plane.

Your patch is missing a sign-off
https://libvirt.org/hacking.html#developer-certificate-of-origin

Just replying to this e-mail with the Signed-off-by line is enough - no
need to resend the patch. I'll push it with the sign-off included after
the pipeline succeds: https://gitlab.com/janotomko/libvirt/-/pipelines/562139546

>---
> src/qemu/qemu_agent.c   |  3 ++-
> src/qemu/qemu_monitor.c |  3 ++-
> src/util/vireventglib.c | 12 ++++++++----
> 3 files changed, 12 insertions(+), 6 deletions(-)
>

Reviewed-by: Ján Tomko <jtomko@redhat.com>

Jano
Re: [libvirt PATCH] Fix incorrect uses of g_clear_pointer() introduced in 8.1.0
Posted by Mark Mielke 1 year, 11 months ago
Sounds good - thank you...

Patch is

Signed-off-by: Mark Mielke <mark.mielke@gmail.com>


On Mon, Jun 13, 2022 at 5:08 AM Ján Tomko <jtomko@redhat.com> wrote:

> On a Sunday in 2022, Mark Mielke wrote:
> >This is a partial revert of 87a43a907f0ad4897a28ad7c216bc70f37270b93.
>
> I'll drop the trailing period before pushing.
>
> >
> >The change to use g_clear_pointer() in more places was accidentally
> >applied to cases involving vir_g_source_unref().
> >
> >In some cases, the ordering of g_source_destroy() and
> >vir_g_source_unref() was reversed, which resulted in the source being
> >marked as destroyed, after it is already unreferenced.
>
> Oops, sorry I missed that during review.
>
> >This
> >use-after-free case might work in many cases, but with versions of
> >glibc older than 2.64.0 it may defer unref to run within the main
>
> s/glibc/glib/
>
> >thread to avoid a race condition, which creates a large distance
> >between the g_source_unref() and g_source_destroy().
> >
> >In some cases, the call to vir_g_source_unref() was replaced with a
> >second call to g_source_destroy(), leading to a memory leak or worse.
> >
> >In our experience, the symptoms were that use of libvirt-python became
> >slower over time, with OpenStack nova-compute initially taking around
> >one second to periodically query the host PCI devices, and within an
> >hour it was taking over a minute to complete the same operation, until
> >it is was eventually running this query back-to-back, resulting in the
> >nova-compute process consuming 100% of one CPU thread, losing its
> >RabbitMQ connection frequently, and showing up as down to the control
> >plane.
>
> Your patch is missing a sign-off
> https://libvirt.org/hacking.html#developer-certificate-of-origin
>
> Just replying to this e-mail with the Signed-off-by line is enough - no
> need to resend the patch. I'll push it with the sign-off included after
> the pipeline succeds:
> https://gitlab.com/janotomko/libvirt/-/pipelines/562139546
>
> >---
> > src/qemu/qemu_agent.c   |  3 ++-
> > src/qemu/qemu_monitor.c |  3 ++-
> > src/util/vireventglib.c | 12 ++++++++----
> > 3 files changed, 12 insertions(+), 6 deletions(-)
> >
>
> Reviewed-by: Ján Tomko <jtomko@redhat.com>
>
> Jano
>


-- 
Mark Mielke <mark.mielke@gmail.com>
Re: [libvirt PATCH] Fix incorrect uses of g_clear_pointer() introduced in 8.1.0
Posted by Ján Tomko 1 year, 11 months ago
On a Monday in 2022, Mark Mielke wrote:
>Sounds good - thank you...
>
>Patch is
>
>Signed-off-by: Mark Mielke <mark.mielke@gmail.com>

Thanks, pushed now.

Jano