[Qemu-devel] [RFC PATCH 0/5] Fix some jobs/drain/aio_poll related hangs

Kevin Wolf posted 5 patches 7 years, 2 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20180817170246.14641-1-kwolf@redhat.com
Test checkpatch failed
Test docker-mingw@fedora passed
Test docker-clang@ubuntu passed
Test docker-quick@centos7 passed
There is a newer version of this series
include/block/blockjob.h | 13 +++++++++++++
include/qemu/job.h       |  9 +++++++++
block/io.c               | 31 ++++++++++++++++++++++++++++++-
blockjob.c               | 18 ++++++++++++++++++
job.c                    | 10 ++++++++++
tests/test-bdrv-drain.c  |  6 ++++++
tests/test-blockjob.c    |  6 ++++++
7 files changed, 92 insertions(+), 1 deletion(-)
[Qemu-devel] [RFC PATCH 0/5] Fix some jobs/drain/aio_poll related hangs
Posted by Kevin Wolf 7 years, 2 months ago
I'm running out of time and will be offline for the next two weeks, so
I'm just sending out what I have right now. This is probably not ready
to be merged yet, but if need be, someone else can pick it up. Otherwise
I'll do that myself when I return.

This is related to the following bug reports:
https://bugzilla.redhat.com/show_bug.cgi?id=1614623
https://bugzilla.redhat.com/show_bug.cgi?id=1601212

Essentially it tries to fix a few deadlocks into which we run when
running two commit jobs (probably just one example for problematic
scenarios) on nodes that are in a non-mainloop AioContext (i.e. with
dataplane).

Obviously, besides verifying that the code changes are actually correct
as they are, we also want to add some test cases for this. I suppose
test-bdrv-drain should test for the individual bugs, and a qemu-iotests
case could try the higher level scenario of multiple commit jobs with
dataplane.

Kevin Wolf (5):
  blockjob: Wake up BDS when job becomes idle
  tests: Acquire AioContext around job_finish_sync()
  job: Drop AioContext lock around aio_poll()
  block: Drop AioContext lock in bdrv_drain_poll_top_level()
  [WIP] Lock AioContext in bdrv_co_drain_bh_cb()

 include/block/blockjob.h | 13 +++++++++++++
 include/qemu/job.h       |  9 +++++++++
 block/io.c               | 31 ++++++++++++++++++++++++++++++-
 blockjob.c               | 18 ++++++++++++++++++
 job.c                    | 10 ++++++++++
 tests/test-bdrv-drain.c  |  6 ++++++
 tests/test-blockjob.c    |  6 ++++++
 7 files changed, 92 insertions(+), 1 deletion(-)

-- 
2.13.6


Re: [Qemu-devel] [RFC PATCH 0/5] Fix some jobs/drain/aio_poll related hangs
Posted by no-reply@patchew.org 7 years, 2 months ago
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180817170246.14641-1-kwolf@redhat.com
Subject: [Qemu-devel] [RFC PATCH 0/5] Fix some jobs/drain/aio_poll related hangs

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
fb25339a46 Lock AioContext in bdrv_co_drain_bh_cb()
11c6cc6aa9 block: Drop AioContext lock in bdrv_drain_poll_top_level()
1b110e2bb8 job: Drop AioContext lock around aio_poll()
52269c1c01 tests: Acquire AioContext around job_finish_sync()
25292b40af blockjob: Wake up BDS when job becomes idle

=== OUTPUT BEGIN ===
Checking PATCH 1/5: blockjob: Wake up BDS when job becomes idle...
Checking PATCH 2/5: tests: Acquire AioContext around job_finish_sync()...
Checking PATCH 3/5: job: Drop AioContext lock around aio_poll()...
Checking PATCH 4/5: block: Drop AioContext lock in bdrv_drain_poll_top_level()...
ERROR: trailing statements should be on next line
#45: FILE: block/io.c:292:
+    while (aio_poll(ctx, false));

ERROR: braces {} are necessary for all arms of this statement
#45: FILE: block/io.c:292:
+    while (aio_poll(ctx, false));
[...]

total: 2 errors, 0 warnings, 33 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 5/5: Lock AioContext in bdrv_co_drain_bh_cb()...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Re: [Qemu-devel] [RFC PATCH 0/5] Fix some jobs/drain/aio_poll related hangs
Posted by Fam Zheng 7 years, 2 months ago
On Fri, 08/17 19:02, Kevin Wolf wrote:
> I'm running out of time and will be offline for the next two weeks, so
> I'm just sending out what I have right now. This is probably not ready
> to be merged yet, but if need be, someone else can pick it up. Otherwise
> I'll do that myself when I return.

I'll try take a look at these. But I'll start from

https://bugzilla.redhat.com/show_bug.cgi?id=1609137

> 
> This is related to the following bug reports:
> https://bugzilla.redhat.com/show_bug.cgi?id=1614623
> https://bugzilla.redhat.com/show_bug.cgi?id=1601212
> 
> Essentially it tries to fix a few deadlocks into which we run when
> running two commit jobs (probably just one example for problematic
> scenarios) on nodes that are in a non-mainloop AioContext (i.e. with
> dataplane).
> 
> Obviously, besides verifying that the code changes are actually correct
> as they are, we also want to add some test cases for this. I suppose
> test-bdrv-drain should test for the individual bugs, and a qemu-iotests
> case could try the higher level scenario of multiple commit jobs with
> dataplane.

The test coverage of dataplane is terribly low. Let's think about adding
variants to existing iotests around block jobs at least.

Fam

> 
> Kevin Wolf (5):
>   blockjob: Wake up BDS when job becomes idle
>   tests: Acquire AioContext around job_finish_sync()
>   job: Drop AioContext lock around aio_poll()
>   block: Drop AioContext lock in bdrv_drain_poll_top_level()
>   [WIP] Lock AioContext in bdrv_co_drain_bh_cb()
> 
>  include/block/blockjob.h | 13 +++++++++++++
>  include/qemu/job.h       |  9 +++++++++
>  block/io.c               | 31 ++++++++++++++++++++++++++++++-
>  blockjob.c               | 18 ++++++++++++++++++
>  job.c                    | 10 ++++++++++
>  tests/test-bdrv-drain.c  |  6 ++++++
>  tests/test-blockjob.c    |  6 ++++++
>  7 files changed, 92 insertions(+), 1 deletion(-)
> 
> -- 
> 2.13.6
>