util/async: hold AioContext ref to prevent use-after-free

[Qemu-devel] [PATCH] util/async: hold AioContext ref to prevent use-after-free

Posted by Stefan Hajnoczi 6 years, 6 months ago

The tests/test-bdrv-drain /bdrv-drain/iothread/drain test case does the
following:

1. The preadv coroutine calls aio_bh_schedule_oneshot() and then yields.
2. The one-shot BH executes in another AioContext.  All it does is call
   aio_co_wake(preadv_co).
3. The preadv coroutine is re-entered and returns.

There is a race condition in aio_co_wake() where the preadv coroutine
returns and the test case destroys the preadv IOThread.  aio_co_wake()
can still be running in the other AioContext and it performs an access
to the freed IOThread AioContext.

Here is the race in aio_co_schedule():

  QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
                            co, co_scheduled_next);
  <-- race: co may execute before we invoke qemu_bh_schedule()!
  qemu_bh_schedule(ctx->co_schedule_bh);

So if co causes ctx to be freed then we're in trouble.  Fix this problem
by holding a reference to ctx.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/async.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/util/async.c b/util/async.c
index 8d2105729c..4e4c7af51e 100644
--- a/util/async.c
+++ b/util/async.c
@@ -459,9 +459,17 @@ void aio_co_schedule(AioContext *ctx, Coroutine *co)
         abort();
     }
 
+    /* The coroutine might run and release the last ctx reference before we
+     * invoke qemu_bh_schedule().  Take a reference to keep ctx alive until
+     * we're done.
+     */
+    aio_context_ref(ctx);
+
     QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
                               co, co_scheduled_next);
     qemu_bh_schedule(ctx->co_schedule_bh);
+
+    aio_context_unref(ctx);
 }
 
 void aio_co_wake(struct Coroutine *co)
-- 
2.21.0
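
The interleaving described in the commit message can be modelled with a small single-threaded sketch. This is not QEMU code: ToyContext, toy_ref(), toy_unref(), toy_coroutine_runs(), toy_kick(), toy_schedule() and toy_race_hits_finalized_ctx() are hypothetical names invented for illustration. The sketch assumes the worst case, where the coroutine runs and the owner drops its reference right after the list insert, and it only sets a flag where real code would free the memory.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for AioContext: a reference count plus flags.
 * "finalized" marks the point where real code would free the object. */
typedef struct {
    int refcnt;
    bool finalized;
    bool touched_after_finalize;
} ToyContext;

void toy_ref(ToyContext *ctx)
{
    ctx->refcnt++;
}

void toy_unref(ToyContext *ctx)
{
    assert(ctx->refcnt > 0);
    if (--ctx->refcnt == 0) {
        ctx->finalized = true;   /* real code would free ctx here */
    }
}

/* Models the coroutine running inside the race window: it returns and
 * the owner (the test case) drops its reference to the context. */
void toy_coroutine_runs(ToyContext *ctx)
{
    toy_unref(ctx);
}

/* Models the qemu_bh_schedule(ctx->co_schedule_bh) step: it reads ctx,
 * so it records whether ctx had already been finalized. */
void toy_kick(ToyContext *ctx)
{
    if (ctx->finalized) {
        ctx->touched_after_finalize = true;
    }
}

/* Models the scheduling function; hold_ref selects the patched
 * behaviour. Returns true if ctx was accessed after finalization. */
bool toy_schedule(ToyContext *ctx, bool hold_ref)
{
    bool bad;

    if (hold_ref) {
        toy_ref(ctx);            /* keep ctx alive across the window */
    }

    /* (the list insert would happen here) */
    toy_coroutine_runs(ctx);     /* worst-case interleaving */
    toy_kick(ctx);

    bad = ctx->touched_after_finalize;
    if (hold_ref) {
        toy_unref(ctx);          /* last reference may be dropped here */
    }
    return bad;
}

/* Runs one scenario with a context that starts with a single owner. */
bool toy_race_hits_finalized_ctx(bool hold_ref)
{
    ToyContext ctx = { .refcnt = 1 };

    return toy_schedule(&ctx, hold_ref);
}
```

In this model, toy_race_hits_finalized_ctx(false) reports an access to a finalized context, while toy_race_hits_finalized_ctx(true) does not, because the extra reference defers finalization until after the kick.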

Re: [Qemu-devel] [Qemu-block] [PATCH] util/async: hold AioContext ref to prevent use-after-free

Posted by Stefan Hajnoczi 6 years, 6 months ago

On Tue, Jul 23, 2019 at 8:06 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> So if co causes ctx to be freed then we're in trouble.  Fix this problem
> by holding a reference to ctx.

For QEMU 4.2.  I'm not aware of a way to trigger this bug in QEMU
proper.  This fix just makes tests/test-bdrv-drain more reliable.

Stefan

Re: [Qemu-devel] [Qemu-block] [PATCH] util/async: hold AioContext ref to prevent use-after-free

Posted by Philippe Mathieu-Daudé 6 years, 6 months ago

On 7/23/19 9:09 PM, Stefan Hajnoczi wrote:
> On Tue, Jul 23, 2019 at 8:06 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>> So if co causes ctx to be freed then we're in trouble.  Fix this problem
>> by holding a reference to ctx.
> 
> For QEMU 4.2.  I'm not aware of a way to trigger this bug in QEMU
> proper.  This fix just makes tests/test-bdrv-drain more reliable.

This looks harmless for 4.1-rc3.

Re: [Qemu-devel] [PATCH] util/async: hold AioContext ref to prevent use-after-free

Posted by Paolo Bonzini 6 years, 6 months ago

On 23/07/19 21:06, Stefan Hajnoczi wrote:
> The tests/test-bdrv-drain /bdrv-drain/iothread/drain test case does the
> following:
> 
> 1. The preadv coroutine calls aio_bh_schedule_oneshot() and then yields.
> 2. The one-shot BH executes in another AioContext.  All it does is call
>    aio_co_wake(preadv_co).
> 3. The preadv coroutine is re-entered and returns.
> 
> There is a race condition in aio_co_wake() where the preadv coroutine
> returns and the test case destroys the preadv IOThread.  aio_co_wake()
> can still be running in the other AioContext and it performs an access
> to the freed IOThread AioContext.
> 
> Here is the race in aio_co_schedule():
> 
>   QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
>                             co, co_scheduled_next);
>   <-- race: co may execute before we invoke qemu_bh_schedule()!
>   qemu_bh_schedule(ctx->co_schedule_bh);
> 
> So if co causes ctx to be freed then we're in trouble.  Fix this problem
> by holding a reference to ctx.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  util/async.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/util/async.c b/util/async.c
> index 8d2105729c..4e4c7af51e 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -459,9 +459,17 @@ void aio_co_schedule(AioContext *ctx, Coroutine *co)
>          abort();
>      }
>  
> +    /* The coroutine might run and release the last ctx reference before we
> +     * invoke qemu_bh_schedule().  Take a reference to keep ctx alive until
> +     * we're done.
> +     */
> +    aio_context_ref(ctx);
> +
>      QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
>                                co, co_scheduled_next);
>      qemu_bh_schedule(ctx->co_schedule_bh);
> +
> +    aio_context_unref(ctx);
>  }
>  
>  void aio_co_wake(struct Coroutine *co)
> 

This must have been painful to debug.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo

Re: [Qemu-devel] [PATCH] util/async: hold AioContext ref to prevent use-after-free

Posted by Stefan Hajnoczi 6 years, 6 months ago

On Tue, Jul 23, 2019 at 08:06:23PM +0100, Stefan Hajnoczi wrote:
> The tests/test-bdrv-drain /bdrv-drain/iothread/drain test case does the
> following:
> 
> 1. The preadv coroutine calls aio_bh_schedule_oneshot() and then yields.
> 2. The one-shot BH executes in another AioContext.  All it does is call
>    aio_co_wake(preadv_co).
> 3. The preadv coroutine is re-entered and returns.
> 
> There is a race condition in aio_co_wake() where the preadv coroutine
> returns and the test case destroys the preadv IOThread.  aio_co_wake()
> can still be running in the other AioContext and it performs an access
> to the freed IOThread AioContext.
> 
> Here is the race in aio_co_schedule():
> 
>   QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
>                             co, co_scheduled_next);
>   <-- race: co may execute before we invoke qemu_bh_schedule()!
>   qemu_bh_schedule(ctx->co_schedule_bh);
> 
> So if co causes ctx to be freed then we're in trouble.  Fix this problem
> by holding a reference to ctx.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  util/async.c | 8 ++++++++
>  1 file changed, 8 insertions(+)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan