AIO_WAIT_WHILE questions

Dietmar Maurer posted 1 patch 4 years ago
Test docker-mingw@fedora passed
Test docker-quick@centos7 failed
Test checkpatch failed
Test FreeBSD passed
Test asan failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/1242491200.59.1585326983523@webmail.proxmox.com
Maintainers: Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>, Max Reitz <mreitz@redhat.com>, Kevin Wolf <kwolf@redhat.com>
AIO_WAIT_WHILE questions
Posted by Dietmar Maurer 4 years ago
Hi all,

I have a question about AIO_WAIT_WHILE. The docs inside the code say:

 * The caller's thread must be the IOThread that owns @ctx or the main loop
 * thread (with @ctx acquired exactly once).

I wonder if that "with @ctx acquired exactly once" is always required?
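
For reference, the pattern I have in mind looks roughly like this (just a
sketch; bs and some_condition() are placeholders, not code from the tree):

    AioContext *ctx = bdrv_get_aio_context(bs);

    aio_context_acquire(ctx);                 /* @ctx acquired exactly once */
    AIO_WAIT_WHILE(ctx, some_condition(bs));  /* placeholder condition */
    aio_context_release(ctx);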

I have done a quick test (see the code below), which reveals that the condition is
not always met.

Or is my test wrong (or the docs)?

---debug helper---
diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h
index afeeb18f95..cf78dca9f9 100644
--- a/include/block/aio-wait.h
+++ b/include/block/aio-wait.h
@@ -82,6 +82,8 @@ extern AioWait global_aio_wait;
     atomic_inc(&wait_->num_waiters);                               \
     if (ctx_ && in_aio_context_home_thread(ctx_)) {                \
         while ((cond)) {                                           \
+            printf("AIO_WAIT_WHILE %p %d\n", ctx, ctx_->lock_count);     \
+            assert(ctx_->lock_count == 1);                   \
             aio_poll(ctx_, true);                                  \
             waited_ = true;                                        \
         }                                                          \
diff --git a/include/block/aio.h b/include/block/aio.h
index cb1989105a..51ef20e2f0 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -125,6 +125,7 @@ struct AioContext {
 
     /* Used by AioContext users to protect from multi-threaded access.  */
     QemuRecMutex lock;
+    int lock_count;
 
     /* The list of registered AIO handlers.  Protected by ctx->list_lock. */
     AioHandlerList aio_handlers;
diff --git a/util/async.c b/util/async.c
index b94518b948..9804c6c64f 100644
--- a/util/async.c
+++ b/util/async.c
@@ -594,9 +594,11 @@ void aio_context_unref(AioContext *ctx)
 void aio_context_acquire(AioContext *ctx)
 {
     qemu_rec_mutex_lock(&ctx->lock);
+    ctx->lock_count++;
 }
 
 void aio_context_release(AioContext *ctx)
 {
+    ctx->lock_count--;
     qemu_rec_mutex_unlock(&ctx->lock);
 }
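
With that instrumentation in place, any code path that reaches AIO_WAIT_WHILE()
while holding the context more than once trips the assert. A contrived sketch of
such a path (the function names are made up, not an actual call chain from the
failures below):

/* Contrived sketch, not real QEMU code: nested acquisition before draining. */
static void nested_helper(BlockDriverState *bs)
{
    AioContext *ctx = bdrv_get_aio_context(bs);

    aio_context_acquire(ctx);   /* lock_count: 1 -> 2 */
    bdrv_drained_begin(bs);     /* ends up in AIO_WAIT_WHILE -> assert fails */
    bdrv_drained_end(bs);
    aio_context_release(ctx);   /* lock_count: 2 -> 1 */
}

static void outer_caller(BlockDriverState *bs)
{
    AioContext *ctx = bdrv_get_aio_context(bs);

    aio_context_acquire(ctx);   /* lock_count: 0 -> 1 */
    nested_helper(bs);
    aio_context_release(ctx);   /* lock_count: 1 -> 0 */
}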


Re: AIO_WAIT_WHILE questions
Posted by no-reply@patchew.org 4 years ago
Patchew URL: https://patchew.org/QEMU/1242491200.59.1585326983523@webmail.proxmox.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: AIO_WAIT_WHILE questions
Message-id: 1242491200.59.1585326983523@webmail.proxmox.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
   77a48a7..127fe86  master     -> master
Switched to a new branch 'test'
a128ad0 AIO_WAIT_WHILE questions

=== OUTPUT BEGIN ===
ERROR: Missing Signed-off-by: line(s)

total: 1 errors, 0 warnings, 26 lines checked

Commit a128ad0b1009 (AIO_WAIT_WHILE questions) has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/1242491200.59.1585326983523@webmail.proxmox.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

Re: AIO_WAIT_WHILE questions
Posted by Markus Armbruster 4 years ago
Cc'ing people based on output of "scripts/get_maintainer.pl -f
include/block/aio-wait.h".

Dietmar Maurer <dietmar@proxmox.com> writes:

> Hi all,
>
> I have a question about AIO_WAIT_WHILE. The docs inside the code say:
>
>  * The caller's thread must be the IOThread that owns @ctx or the main loop
>  * thread (with @ctx acquired exactly once).
>
> I wonder if that "with @ctx acquired exactly once" is always required?
>
> I have done a quick test (see the code below), which reveals that the condition is
> not always met.
>
> Or is my test wrong (or the docs)?
>
> ---debug helper---
> diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h
> index afeeb18f95..cf78dca9f9 100644
> --- a/include/block/aio-wait.h
> +++ b/include/block/aio-wait.h
> @@ -82,6 +82,8 @@ extern AioWait global_aio_wait;
>      atomic_inc(&wait_->num_waiters);                               \
>      if (ctx_ && in_aio_context_home_thread(ctx_)) {                \
>          while ((cond)) {                                           \
> +            printf("AIO_WAIT_WHILE %p %d\n", ctx, ctx_->lock_count);     \
> +            assert(ctx_->lock_count == 1);                   \
>              aio_poll(ctx_, true);                                  \
>              waited_ = true;                                        \
>          }                                                          \
> diff --git a/include/block/aio.h b/include/block/aio.h
> index cb1989105a..51ef20e2f0 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -125,6 +125,7 @@ struct AioContext {
>  
>      /* Used by AioContext users to protect from multi-threaded access.  */
>      QemuRecMutex lock;
> +    int lock_count;
>  
>      /* The list of registered AIO handlers.  Protected by ctx->list_lock. */
>      AioHandlerList aio_handlers;
> diff --git a/util/async.c b/util/async.c
> index b94518b948..9804c6c64f 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -594,9 +594,11 @@ void aio_context_unref(AioContext *ctx)
>  void aio_context_acquire(AioContext *ctx)
>  {
>      qemu_rec_mutex_lock(&ctx->lock);
> +    ctx->lock_count++;
>  }
>  
>  void aio_context_release(AioContext *ctx)
>  {
> +    ctx->lock_count--;
>      qemu_rec_mutex_unlock(&ctx->lock);
>  }


Re: AIO_WAIT_WHILE questions
Posted by Stefan Hajnoczi 4 years ago
On Mon, Mar 30, 2020 at 10:09:45AM +0200, Markus Armbruster wrote:
> Cc'ing people based on output of "scripts/get_maintainer.pl -f
> include/block/aio-wait.h".
> 
> Dietmar Maurer <dietmar@proxmox.com> writes:
> 
> > Hi all,
> >
> > I have a question about AIO_WAIT_WHILE. The docs inside the code say:
> >
> >  * The caller's thread must be the IOThread that owns @ctx or the main loop
> >  * thread (with @ctx acquired exactly once).
> >
> > I wonder if that "with @ctx acquired exactly once" is always required?
> >
> > I have done a quick test (see the code below), which reveals that the condition is
> > not always met.
> >
> > Or is my test wrong (or the docs)?
> >
> > ---debug helper---
> > diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h
> > index afeeb18f95..cf78dca9f9 100644
> > --- a/include/block/aio-wait.h
> > +++ b/include/block/aio-wait.h
> > @@ -82,6 +82,8 @@ extern AioWait global_aio_wait;
> >      atomic_inc(&wait_->num_waiters);                               \
> >      if (ctx_ && in_aio_context_home_thread(ctx_)) {                \
> >          while ((cond)) {                                           \
> > +            printf("AIO_WAIT_WHILE %p %d\n", ctx, ctx_->lock_count);     \
> > +            assert(ctx_->lock_count == 1);                   \
> >              aio_poll(ctx_, true);                                  \
> >              waited_ = true;                                        \
> >          }                                                          \

In this case it doesn't matter.  Handlers invoked by aio_poll() that
acquire ctx's recursive mutex will succeed.

The "exactly once" requirement is there because nested locking is not
supported when waiting for an AioContext that runs in a different
thread:

    } else {                                                       \
        assert(qemu_get_current_aio_context() ==                   \
               qemu_get_aio_context());                            \
        while ((cond)) {                                           \
            if (ctx_) {                                            \
                aio_context_release(ctx_);                         \
		^--- doesn't work if we have acquired it multiple times
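
In other words, a single aio_context_release() only drops the recursion count by
one. If the context had been acquired twice, the lock would still be held after
that release, the IOThread could never acquire it to make @cond false, and the
wait would hang. A standalone sketch of that recursive-mutex property (plain
pthreads, not QEMU code; build with "cc -pthread"):

/* Releasing a recursive mutex once after acquiring it twice leaves it locked. */
#include <pthread.h>
#include <stdio.h>

int main(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t lock;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&lock, &attr);

    pthread_mutex_lock(&lock);     /* recursion count 1 */
    pthread_mutex_lock(&lock);     /* recursion count 2 */
    pthread_mutex_unlock(&lock);   /* count back to 1: still owned */

    /* Another thread calling pthread_mutex_lock() here would block forever,
     * which is why the "exactly once" rule exists for the else branch. */
    printf("mutex is still held after a single unlock\n");

    pthread_mutex_unlock(&lock);   /* now actually released */
    pthread_mutex_destroy(&lock);
    pthread_mutexattr_destroy(&attr);
    return 0;
}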

I think it would be okay to update the documentation to make this clear.

Re: AIO_WAIT_WHILE questions
Posted by no-reply@patchew.org 4 years ago
Patchew URL: https://patchew.org/QEMU/1242491200.59.1585326983523@webmail.proxmox.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

PASS 1 fdc-test /x86_64/fdc/cmos
PASS 2 fdc-test /x86_64/fdc/no_media_on_start
PASS 3 fdc-test /x86_64/fdc/read_without_media
==8595==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 4 fdc-test /x86_64/fdc/media_change
PASS 5 fdc-test /x86_64/fdc/sense_interrupt
PASS 6 fdc-test /x86_64/fdc/relative_seek
---
qemu-system-x86_64: /tmp/qemu-test/src/block/block-backend.c:1271: int blk_prw(BlockBackend *, int64_t, uint8_t *, int64_t, CoroutineEntry *, BdrvRequestFlags): Assertion `ctx_->lock_count == 1' failed.
Broken pipe
/tmp/qemu-test/src/tests/qtest/libqtest.c:175: kill_qemu() detected QEMU death from signal 6 (Aborted)
ERROR - too few tests run (expected 13, got 9)
make: *** [/tmp/qemu-test/src/tests/Makefile.include:636: check-qtest-x86_64] Error 1
make: *** Waiting for unfinished jobs....
PASS 1 check-qjson /literals/keyword
PASS 2 check-qjson /literals/string/escaped
---
PASS 32 test-opts-visitor /visitor/opts/range/beyond
PASS 33 test-opts-visitor /visitor/opts/dict/unvisited
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-coroutine -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-coroutine" 
==8649==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
==8649==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffe80cca000; bottom 0x7f7b14c10000; size: 0x00836c0ba000 (564453416960)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
PASS 1 test-coroutine /basic/no-dangling-access
---
PASS 11 test-aio /aio/event/wait
PASS 12 test-aio /aio/event/flush
PASS 13 test-aio /aio/event/wait/no-flush-cb
==8664==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 14 test-aio /aio/timer/schedule
PASS 15 test-aio /aio/coroutine/queue-chaining
PASS 16 test-aio /aio-gsource/flush
---
PASS 27 test-aio /aio-gsource/event/wait/no-flush-cb
PASS 28 test-aio /aio-gsource/timer/schedule
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-aio-multithread -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-aio-multithread" 
==8669==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-aio-multithread /aio/multi/lifecycle
PASS 2 test-aio-multithread /aio/multi/schedule
PASS 3 test-aio-multithread /aio/multi/mutex/contended
---
PASS 6 test-throttle /throttle/detach_attach
PASS 7 test-throttle /throttle/config_functions
PASS 8 test-throttle /throttle/accounting
==8703==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 9 test-throttle /throttle/groups
PASS 10 test-throttle /throttle/config/enabled
PASS 11 test-throttle /throttle/config/conflicting
---
PASS 14 test-throttle /throttle/config/max
PASS 15 test-throttle /throttle/config/iops_size
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-thread-pool -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-thread-pool" 
==8707==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 1 test-thread-pool /thread-pool/submit
PASS 2 test-thread-pool /thread-pool/submit-aio
PASS 3 test-thread-pool /thread-pool/submit-co
---
PASS 39 test-hbitmap /hbitmap/next_dirty_area/next_dirty_area_4
PASS 40 test-hbitmap /hbitmap/next_dirty_area/next_dirty_area_after_truncate
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}  tests/test-bdrv-drain -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-bdrv-drain" 
==8779==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
test-bdrv-drain: /tmp/qemu-test/src/block/io.c:429: void bdrv_do_drained_begin(BlockDriverState *, _Bool, BdrvChild *, _Bool, _Bool): Assertion `ctx_->lock_count == 1' failed.
ERROR - too few tests run (expected 42, got 0)
make: *** [/tmp/qemu-test/src/tests/Makefile.include:641: check-unit] Error 1
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=993ef2554b5d41a1bd1d6f3d9a3dd700', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=x86_64-softmmu', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-yyt7umnc/src/docker-src.2020-03-27-15.23.39.30469:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-debug']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=993ef2554b5d41a1bd1d6f3d9a3dd700
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-yyt7umnc/src'
make: *** [docker-run-test-debug@fedora] Error 2

real    28m41.492s
user    0m8.980s


The full log is available at
http://patchew.org/logs/1242491200.59.1585326983523@webmail.proxmox.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com