[PATCH] tests/iothread: Always connect iothread GSource to a GMainContext

Peter Maydell posted 1 patch 4 years, 3 months ago
Test asan failed
Test checkpatch failed
Test FreeBSD failed
Test docker-mingw@fedora failed
Test docker-clang@ubuntu failed
Test docker-quick@centos7 failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20200106144552.7205-1-peter.maydell@linaro.org
tests/iothread.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
[PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
Posted by Peter Maydell 4 years, 3 months ago
On older versions of glib (anything prior to glib commit 0f056ebe
from May 2019), the implementation of g_source_ref() and
g_source_unref() is not threadsafe for a GSource which is not
attached to a GMainContext.

QEMU's real iothread.c implementation always attaches its
iothread->ctx's GSource to a GMainContext created for that iothread,
so it is OK, but the simple test framework implementation in
tests/iothread.c was not doing this.  This was causing intermittent
assertion failures in the test-aio-multithread subtest
"/aio/multi/mutex/contended" test on the BSD hosts.  (It's unclear
why only BSD seems to have been affected -- perhaps a combination of
the specific glib version being used in the VMs and their happening
to run on a host with a lot of CPUs).

Borrow the iothread_init_gcontext() from the real iothread.c
and add the corresponding cleanup code and the calls to
g_main_context_push/pop_thread_default() so we actually use
the GMainContext we create.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
I don't really have a good understanding of the glib APIs here,
so I'm mostly just cribbing code from the real iothread.c;
review by people who do know the glib/iothread stuff better
welcomed. It does seem to fix the intermittent test failure
on NetBSD, at least, where we were running into an assertion
failure because a g_source_unref() incorrectly thought it
had decremented the refcount to 0 and should delete a context
that was actually still in use.

 tests/iothread.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/tests/iothread.c b/tests/iothread.c
index 13c9fdcd8df..d3a2ee9a014 100644
--- a/tests/iothread.c
+++ b/tests/iothread.c
@@ -21,6 +21,8 @@
 
 struct IOThread {
     AioContext *ctx;
+    GMainContext *worker_context;
+    GMainLoop *main_loop;
 
     QemuThread thread;
     QemuMutex init_done_lock;
@@ -35,6 +37,17 @@ AioContext *qemu_get_current_aio_context(void)
     return my_iothread ? my_iothread->ctx : qemu_get_aio_context();
 }
 
+static void iothread_init_gcontext(IOThread *iothread)
+{
+    GSource *source;
+
+    iothread->worker_context = g_main_context_new();
+    source = aio_get_g_source(iothread_get_aio_context(iothread));
+    g_source_attach(source, iothread->worker_context);
+    g_source_unref(source);
+    iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);
+}
+
 static void *iothread_run(void *opaque)
 {
     IOThread *iothread = opaque;
@@ -44,6 +57,20 @@ static void *iothread_run(void *opaque)
     my_iothread = iothread;
     qemu_mutex_lock(&iothread->init_done_lock);
     iothread->ctx = aio_context_new(&error_abort);
+
+    /*
+     * We must connect the ctx to a GMainContext, because in older versions
+     * of glib the g_source_ref()/unref() functions are not threadsafe
+     * on sources without a context.
+     */
+    iothread_init_gcontext(iothread);
+
+    /*
+     * g_main_context_push_thread_default() must be called before anything
+     * in this new thread uses glib.
+     */
+    g_main_context_push_thread_default(iothread->worker_context);
+
     qemu_cond_signal(&iothread->init_done_cond);
     qemu_mutex_unlock(&iothread->init_done_lock);
 
@@ -51,6 +78,7 @@ static void *iothread_run(void *opaque)
         aio_poll(iothread->ctx, true);
     }
 
+    g_main_context_pop_thread_default(iothread->worker_context);
     rcu_unregister_thread();
     return NULL;
 }
@@ -66,6 +94,8 @@ void iothread_join(IOThread *iothread)
 {
     aio_bh_schedule_oneshot(iothread->ctx, iothread_stop_bh, iothread);
     qemu_thread_join(&iothread->thread);
+    g_main_context_unref(iothread->worker_context);
+    g_main_loop_unref(iothread->main_loop);
     qemu_cond_destroy(&iothread->init_done_cond);
     qemu_mutex_destroy(&iothread->init_done_lock);
     aio_context_unref(iothread->ctx);
-- 
2.20.1


Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
Posted by Peter Xu 4 years, 3 months ago
On Mon, Jan 06, 2020 at 02:45:52PM +0000, Peter Maydell wrote:
> On older versions of glib (anything prior to glib commit 0f056ebe
> from May 2019), the implementation of g_source_ref() and
> g_source_unref() is not threadsafe for a GSource which is not
> attached to a GMainContext.
> 
> QEMU's real iothread.c implementation always attaches its
> iothread->ctx's GSource to a GMainContext created for that iothread,
> so it is OK, but the simple test framework implementation in
> tests/iothread.c was not doing this.  This was causing intermittent
> assertion failures in the test-aio-multithread subtest
> "/aio/multi/mutex/contended" test on the BSD hosts.  (It's unclear
> why only BSD seems to have been affected -- perhaps a combination of
> the specific glib version being used in the VMs and their happening
> to run on a host with a lot of CPUs).
> 
> Borrow the iothread_init_gcontext() from the real iothread.c
> and add the corresponding cleanup code and the calls to
> g_main_context_push/pop_thread_default() so we actually use
> the GMainContext we create.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

I've no idea on the g_source_ref() issue, but if so then a patch like
this makes sense to me especially if it fixes up test failures.

Reviewed-by: Peter Xu <peterx@redhat.com>

Still a few comments below.

> ---
> I don't really have a good understanding of the glib APIs here,
> so I'm mostly just cribbing code from the real iothread.c;
> review by people who do know the glib/iothread stuff better
> welcomed. It does seem to fix the intermittent test failure
> on NetBSD, at least, where we were running into an assertion
> failure because a g_source_unref() incorrectly thought it
> had decremented the refcount to 0 and should delete a context
> that was actually still in use.
> 
>  tests/iothread.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/tests/iothread.c b/tests/iothread.c
> index 13c9fdcd8df..d3a2ee9a014 100644
> --- a/tests/iothread.c
> +++ b/tests/iothread.c
> @@ -21,6 +21,8 @@
>  
>  struct IOThread {
>      AioContext *ctx;
> +    GMainContext *worker_context;
> +    GMainLoop *main_loop;
>  
>      QemuThread thread;
>      QemuMutex init_done_lock;
> @@ -35,6 +37,17 @@ AioContext *qemu_get_current_aio_context(void)
>      return my_iothread ? my_iothread->ctx : qemu_get_aio_context();
>  }
>  
> +static void iothread_init_gcontext(IOThread *iothread)
> +{
> +    GSource *source;
> +
> +    iothread->worker_context = g_main_context_new();
> +    source = aio_get_g_source(iothread_get_aio_context(iothread));
> +    g_source_attach(source, iothread->worker_context);
> +    g_source_unref(source);
> +    iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);

IIUC the main_loop is not required because in this case we only use
the aio context to run rather than the main context itself.

> +}
> +
>  static void *iothread_run(void *opaque)
>  {
>      IOThread *iothread = opaque;
> @@ -44,6 +57,20 @@ static void *iothread_run(void *opaque)
>      my_iothread = iothread;
>      qemu_mutex_lock(&iothread->init_done_lock);
>      iothread->ctx = aio_context_new(&error_abort);
> +
> +    /*
> +     * We must connect the ctx to a GMainContext, because in older versions
> +     * of glib the g_source_ref()/unref() functions are not threadsafe
> +     * on sources without a context.
> +     */
> +    iothread_init_gcontext(iothread);
> +
> +    /*
> +     * g_main_context_push_thread_default() must be called before anything
> +     * in this new thread uses glib.
> +     */
> +    g_main_context_push_thread_default(iothread->worker_context);

IMHO if all the users of tests/iothread.c are block layers who only
uses the aio context directly, then I think this is not needed too.

Thanks,

> +
>      qemu_cond_signal(&iothread->init_done_cond);
>      qemu_mutex_unlock(&iothread->init_done_lock);
>  
> @@ -51,6 +78,7 @@ static void *iothread_run(void *opaque)
>          aio_poll(iothread->ctx, true);
>      }
>  
> +    g_main_context_pop_thread_default(iothread->worker_context);
>      rcu_unregister_thread();
>      return NULL;
>  }
> @@ -66,6 +94,8 @@ void iothread_join(IOThread *iothread)
>  {
>      aio_bh_schedule_oneshot(iothread->ctx, iothread_stop_bh, iothread);
>      qemu_thread_join(&iothread->thread);
> +    g_main_context_unref(iothread->worker_context);
> +    g_main_loop_unref(iothread->main_loop);
>      qemu_cond_destroy(&iothread->init_done_cond);
>      qemu_mutex_destroy(&iothread->init_done_lock);
>      aio_context_unref(iothread->ctx);
> -- 
> 2.20.1
> 

-- 
Peter Xu


Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
Posted by Peter Maydell 4 years, 3 months ago
On Mon, 6 Jan 2020 at 17:28, Peter Xu <peterx@redhat.com> wrote:
>
> On Mon, Jan 06, 2020 at 02:45:52PM +0000, Peter Maydell wrote:
> > On older versions of glib (anything prior to glib commit 0f056ebe
> > from May 2019), the implementation of g_source_ref() and
> > g_source_unref() is not threadsafe for a GSource which is not
> > attached to a GMainContext.
> >
> > QEMU's real iothread.c implementation always attaches its
> > iothread->ctx's GSource to a GMainContext created for that iothread,
> > so it is OK, but the simple test framework implementation in
> > tests/iothread.c was not doing this.  This was causing intermittent
> > assertion failures in the test-aio-multithread subtest
> > "/aio/multi/mutex/contended" test on the BSD hosts.  (It's unclear
> > why only BSD seems to have been affected -- perhaps a combination of
> > the specific glib version being used in the VMs and their happening
> > to run on a host with a lot of CPUs).
> >
> > Borrow the iothread_init_gcontext() from the real iothread.c
> > and add the corresponding cleanup code and the calls to
> > g_main_context_push/pop_thread_default() so we actually use
> > the GMainContext we create.
> >
> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>
> I've no idea on the g_source_ref() issue, but if so then a patch like
> this makes sense to me especially if it fixes up test failures.
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
> Still a few comments below.

> > +static void iothread_init_gcontext(IOThread *iothread)
> > +{
> > +    GSource *source;
> > +
> > +    iothread->worker_context = g_main_context_new();
> > +    source = aio_get_g_source(iothread_get_aio_context(iothread));
> > +    g_source_attach(source, iothread->worker_context);
> > +    g_source_unref(source);
> > +    iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);
>
> IIUC the main_loop is not required because in this case we only use
> the aio context to run rather than the main context itself.

Mmm. I wasn't sure to what extent glib expects the
GMainContext and GMainLoop to match up, so I was mostly
just copying from iothread.c.

> > +    /*
> > +     * g_main_context_push_thread_default() must be called before anything
> > +     * in this new thread uses glib.
> > +     */
> > +    g_main_context_push_thread_default(iothread->worker_context);
>
> IMHO if all the users of tests/iothread.c are block layers who only
> uses the aio context directly, then I think this is not needed too.

So we're OK to not do this because tests/iothread.c's
main loop doesn't call g_main_loop_run(), and it doesn't
provide an iothread_get_g_main_context() ?

I'm kind of inclined towards being lazy and sticking with
what this patch has, because:
 * it matches the real iothread.c, which reduces the possiblity
   of future surprise bugs due to things not matching up
 * it's already been reviewed
 * it saves me having to do a respin and retest

But if people would prefer these bits deleted I'll stop
being lazy :-)

thanks
-- PMM

Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
Posted by Peter Xu 4 years, 3 months ago
On Mon, Jan 06, 2020 at 05:40:15PM +0000, Peter Maydell wrote:

[...]

> So we're OK to not do this because tests/iothread.c's
> main loop doesn't call g_main_loop_run(), and it doesn't
> provide an iothread_get_g_main_context() ?
> 
> I'm kind of inclined towards being lazy and sticking with
> what this patch has, because:
>  * it matches the real iothread.c, which reduces the possiblity
>    of future surprise bugs due to things not matching up
>  * it's already been reviewed
>  * it saves me having to do a respin and retest
> 
> But if people would prefer these bits deleted I'll stop
> being lazy :-)

Please feel free to be lazy and fix the test sooner (and that's why I
offered my r-b :).

Thanks,

-- 
Peter Xu


Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
Posted by Peter Maydell 4 years, 3 months ago
On Mon, 6 Jan 2020 at 18:01, Peter Xu <peterx@redhat.com> wrote:
>
> On Mon, Jan 06, 2020 at 05:40:15PM +0000, Peter Maydell wrote:
>
> [...]
>
> > So we're OK to not do this because tests/iothread.c's
> > main loop doesn't call g_main_loop_run(), and it doesn't
> > provide an iothread_get_g_main_context() ?
> >
> > I'm kind of inclined towards being lazy and sticking with
> > what this patch has, because:
> >  * it matches the real iothread.c, which reduces the possiblity
> >    of future surprise bugs due to things not matching up
> >  * it's already been reviewed
> >  * it saves me having to do a respin and retest
> >
> > But if people would prefer these bits deleted I'll stop
> > being lazy :-)
>
> Please feel free to be lazy and fix the test sooner (and that's why I
> offered my r-b :).

Thanks; I've applied this to master (though some other
problem with the NetBSD VM test run seems to have crept
in while I was ignoring failures on the assumption they
were due to this bug :-( )

-- PMM

Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
Posted by Marc-André Lureau 4 years, 3 months ago
Hi

On Mon, Jan 6, 2020 at 7:03 PM Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On older versions of glib (anything prior to glib commit 0f056ebe
> from May 2019), the implementation of g_source_ref() and
> g_source_unref() is not threadsafe for a GSource which is not
> attached to a GMainContext.
>
> QEMU's real iothread.c implementation always attaches its
> iothread->ctx's GSource to a GMainContext created for that iothread,
> so it is OK, but the simple test framework implementation in
> tests/iothread.c was not doing this.  This was causing intermittent
> assertion failures in the test-aio-multithread subtest
> "/aio/multi/mutex/contended" test on the BSD hosts.  (It's unclear
> why only BSD seems to have been affected -- perhaps a combination of
> the specific glib version being used in the VMs and their happening
> to run on a host with a lot of CPUs).
>
> Borrow the iothread_init_gcontext() from the real iothread.c
> and add the corresponding cleanup code and the calls to
> g_main_context_push/pop_thread_default() so we actually use
> the GMainContext we create.
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> I don't really have a good understanding of the glib APIs here,
> so I'm mostly just cribbing code from the real iothread.c;
> review by people who do know the glib/iothread stuff better
> welcomed. It does seem to fix the intermittent test failure
> on NetBSD, at least, where we were running into an assertion
> failure because a g_source_unref() incorrectly thought it
> had decremented the refcount to 0 and should delete a context
> that was actually still in use.
>
>  tests/iothread.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> diff --git a/tests/iothread.c b/tests/iothread.c
> index 13c9fdcd8df..d3a2ee9a014 100644
> --- a/tests/iothread.c
> +++ b/tests/iothread.c
> @@ -21,6 +21,8 @@
>
>  struct IOThread {
>      AioContext *ctx;
> +    GMainContext *worker_context;
> +    GMainLoop *main_loop;
>
>      QemuThread thread;
>      QemuMutex init_done_lock;
> @@ -35,6 +37,17 @@ AioContext *qemu_get_current_aio_context(void)
>      return my_iothread ? my_iothread->ctx : qemu_get_aio_context();
>  }
>
> +static void iothread_init_gcontext(IOThread *iothread)
> +{
> +    GSource *source;
> +
> +    iothread->worker_context = g_main_context_new();
> +    source = aio_get_g_source(iothread_get_aio_context(iothread));
> +    g_source_attach(source, iothread->worker_context);
> +    g_source_unref(source);
> +    iothread->main_loop = g_main_loop_new(iothread->worker_context, TRUE);
> +}
> +
>  static void *iothread_run(void *opaque)
>  {
>      IOThread *iothread = opaque;
> @@ -44,6 +57,20 @@ static void *iothread_run(void *opaque)
>      my_iothread = iothread;
>      qemu_mutex_lock(&iothread->init_done_lock);
>      iothread->ctx = aio_context_new(&error_abort);
> +
> +    /*
> +     * We must connect the ctx to a GMainContext, because in older versions
> +     * of glib the g_source_ref()/unref() functions are not threadsafe
> +     * on sources without a context.
> +     */
> +    iothread_init_gcontext(iothread);
> +
> +    /*
> +     * g_main_context_push_thread_default() must be called before anything
> +     * in this new thread uses glib.

in/if, I suppose

> +     */
> +    g_main_context_push_thread_default(iothread->worker_context);
> +
>      qemu_cond_signal(&iothread->init_done_cond);
>      qemu_mutex_unlock(&iothread->init_done_lock);
>
> @@ -51,6 +78,7 @@ static void *iothread_run(void *opaque)
>          aio_poll(iothread->ctx, true);
>      }
>
> +    g_main_context_pop_thread_default(iothread->worker_context);
>      rcu_unregister_thread();
>      return NULL;
>  }
> @@ -66,6 +94,8 @@ void iothread_join(IOThread *iothread)
>  {
>      aio_bh_schedule_oneshot(iothread->ctx, iothread_stop_bh, iothread);
>      qemu_thread_join(&iothread->thread);
> +    g_main_context_unref(iothread->worker_context);
> +    g_main_loop_unref(iothread->main_loop);
>      qemu_cond_destroy(&iothread->init_done_cond);
>      qemu_mutex_destroy(&iothread->init_done_lock);
>      aio_context_unref(iothread->ctx);
> --
> 2.20.1
>
>

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>


-- 
Marc-André Lureau

Re: [PATCH] tests/iothread: Always connect iothread GSource to a GMainContext
Posted by Peter Maydell 4 years, 3 months ago
On Mon, 6 Jan 2020 at 15:22, Marc-André Lureau
<marcandre.lureau@gmail.com> wrote:
>
> Hi
>
> On Mon, Jan 6, 2020 at 7:03 PM Peter Maydell <peter.maydell@linaro.org> wrote:
> >
> > On older versions of glib (anything prior to glib commit 0f056ebe
> > from May 2019), the implementation of g_source_ref() and
> > g_source_unref() is not threadsafe for a GSource which is not
> > attached to a GMainContext.
> >
> > QEMU's real iothread.c implementation always attaches its
> > iothread->ctx's GSource to a GMainContext created for that iothread,
> > so it is OK, but the simple test framework implementation in
> > tests/iothread.c was not doing this.  This was causing intermittent
> > assertion failures in the test-aio-multithread subtest
> > "/aio/multi/mutex/contended" test on the BSD hosts.  (It's unclear
> > why only BSD seems to have been affected -- perhaps a combination of
> > the specific glib version being used in the VMs and their happening
> > to run on a host with a lot of CPUs).

> >  static void *iothread_run(void *opaque)
> >  {
> >      IOThread *iothread = opaque;
> > @@ -44,6 +57,20 @@ static void *iothread_run(void *opaque)
> >      my_iothread = iothread;
> >      qemu_mutex_lock(&iothread->init_done_lock);
> >      iothread->ctx = aio_context_new(&error_abort);
> > +
> > +    /*
> > +     * We must connect the ctx to a GMainContext, because in older versions
> > +     * of glib the g_source_ref()/unref() functions are not threadsafe
> > +     * on sources without a context.
> > +     */
> > +    iothread_init_gcontext(iothread);
> > +
> > +    /*
> > +     * g_main_context_push_thread_default() must be called before anything
> > +     * in this new thread uses glib.
>
> in/if, I suppose

No; it means "before anything in this new thread specifically" as
opposed to "before anything in the whole process". (This comment is
verbatim copied from the main iothread.c, incidentally).

Thanks for the review.

-- PMM