[PATCH] thread-pool: signal "request_cond" while locked

Anthony PERARD via posted 1 patch 10 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20230714152720.5077-1-anthony.perard@citrix.com
util/thread-pool.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] thread-pool: signal "request_cond" while locked
Posted by Anthony PERARD via 10 months ago
From: Anthony PERARD <anthony.perard@citrix.com>

thread_pool_free() might have been called on the `pool`, which would
be a reason for worker_thread() to quit. In this case,
`pool->request_cond` is been destroyed.

If worker_thread() didn't managed to signal `request_cond` before it
been destroyed by thread_pool_free(), we got:
    util/qemu-thread-posix.c:198: qemu_cond_signal: Assertion `cond->initialized' failed.

One backtrace:
    __GI___assert_fail (assertion=0x55555614abcb "cond->initialized", file=0x55555614ab88 "util/qemu-thread-posix.c", line=198,
	function=0x55555614ad80 <__PRETTY_FUNCTION__.17104> "qemu_cond_signal") at assert.c:101
    qemu_cond_signal (cond=0x7fffb800db30) at util/qemu-thread-posix.c:198
    worker_thread (opaque=0x7fffb800dab0) at util/thread-pool.c:129
    qemu_thread_start (args=0x7fffb8000b20) at util/qemu-thread-posix.c:505
    start_thread (arg=<optimized out>) at pthread_create.c:486

Reported here:
    https://lore.kernel.org/all/ZJwoK50FcnTSfFZ8@MacBook-Air-de-Roger.local/T/#u

To avoid issue, keep lock while sending a signal to `request_cond`.

Fixes: 900fa208f506 ("thread-pool: replace semaphore with condition variable")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
---

There's maybe an issue in thread_pool_submit_aio() as well with
signalling `request_cond`, but maybe it's much less likely to be an
issue?
---
 util/thread-pool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/thread-pool.c b/util/thread-pool.c
index 0d97888df0..e3d8292d14 100644
--- a/util/thread-pool.c
+++ b/util/thread-pool.c
@@ -120,13 +120,13 @@ static void *worker_thread(void *opaque)
 
     pool->cur_threads--;
     qemu_cond_signal(&pool->worker_stopped);
-    qemu_mutex_unlock(&pool->lock);
 
     /*
      * Wake up another thread, in case we got a wakeup but decided
      * to exit due to pool->cur_threads > pool->max_threads.
      */
     qemu_cond_signal(&pool->request_cond);
+    qemu_mutex_unlock(&pool->lock);
     return NULL;
 }
 
-- 
Anthony PERARD
Re: [PATCH] thread-pool: signal "request_cond" while locked
Posted by Stefan Hajnoczi 10 months ago
On Fri, Jul 14, 2023 at 04:27:20PM +0100, Anthony PERARD wrote:
> From: Anthony PERARD <anthony.perard@citrix.com>
> 
> thread_pool_free() might have been called on the `pool`, which would
> be a reason for worker_thread() to quit. In this case,
> `pool->request_cond` is been destroyed.
> 
> If worker_thread() didn't managed to signal `request_cond` before it
> been destroyed by thread_pool_free(), we got:
>     util/qemu-thread-posix.c:198: qemu_cond_signal: Assertion `cond->initialized' failed.
> 
> One backtrace:
>     __GI___assert_fail (assertion=0x55555614abcb "cond->initialized", file=0x55555614ab88 "util/qemu-thread-posix.c", line=198,
> 	function=0x55555614ad80 <__PRETTY_FUNCTION__.17104> "qemu_cond_signal") at assert.c:101
>     qemu_cond_signal (cond=0x7fffb800db30) at util/qemu-thread-posix.c:198
>     worker_thread (opaque=0x7fffb800dab0) at util/thread-pool.c:129
>     qemu_thread_start (args=0x7fffb8000b20) at util/qemu-thread-posix.c:505
>     start_thread (arg=<optimized out>) at pthread_create.c:486
> 
> Reported here:
>     https://lore.kernel.org/all/ZJwoK50FcnTSfFZ8@MacBook-Air-de-Roger.local/T/#u
> 
> To avoid issue, keep lock while sending a signal to `request_cond`.
> 
> Fixes: 900fa208f506 ("thread-pool: replace semaphore with condition variable")
> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
> ---
> 
> There's maybe an issue in thread_pool_submit_aio() as well with
> signalling `request_cond`, but maybe it's much less likely to be an
> issue?

The caller must not submit work while destroying the pool, so I'm not
sure when this problem could occur with thread_pool_submit_aio()?

> ---
>  util/thread-pool.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

This is QEMU 8.1 material.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>