[Qemu-devel] [PATCH v2 00/20] Drain fixes and cleanups, part 3

Kevin Wolf posted 20 patches 5 years, 11 months ago
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20180529172156.29311-1-kwolf@redhat.com
Test checkpatch failed
Test docker-mingw@fedora passed
Test docker-quick@centos7 passed
Test s390x passed
[Qemu-devel] [PATCH v2 00/20] Drain fixes and cleanups, part 3
Posted by Kevin Wolf 5 years, 11 months ago
This is the third and, hopefully for now, the last part of my work to fix
drain. The main goal of this series is to make drain robust against
graph changes that happen in any callbacks of in-flight requests while
we drain a block node.

The individual patches describe the details, but the rough plan is to
change all three drain types (single node, subtree and all) to work like
this:

1. First call all the necessary callbacks to quiesce external sources
   of new requests. This includes the block driver callbacks, the child
   node callbacks and disabling external AioContext events. This is done
   recursively.

   Much of the trouble we had with drain resulted from the fact that the
   graph changed while we were traversing the graph recursively. None of
   the callbacks called in this phase may change the graph.

2. Then do a single AIO_WAIT_WHILE() to drain the requests of all
   affected nodes. The aio_poll() called by it is where graph changes
   can happen and where we need to be careful.

   However, while evaluating the loop condition, the graph can't change,
   so we can safely call all necessary callbacks, if needed recursively,
   to determine whether there are still pending requests in any affected
   nodes. We just need to make sure that we don't rely on the set of
   nodes being the same between any two evaluations of the condition.

There are a few more smaller, mostly self-contained changes needed
before we're actually safe, but this is the main mechanism that will
help you understand what we're working towards during the series.
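
As a rough illustration of that two-phase structure, here is a small,
self-contained model. It is only a sketch: the Node type and the helpers
quiesce_recursive(), drain_poll() and fake_aio_poll() are made-up
stand-ins, deliberately much simpler than the BlockDriverState and
AIO_WAIT_WHILE() code the series actually touches.

#include <stdbool.h>
#include <stdio.h>

/* Heavily simplified stand-in for a block node. */
typedef struct Node {
    struct Node *children[4];
    int n_children;
    int in_flight;          /* pending requests on this node */
    bool quiesced;          /* no new external requests accepted */
} Node;

/* Phase 1: quiesce recursively; nothing here may change the graph. */
static void quiesce_recursive(Node *node)
{
    node->quiesced = true;
    for (int i = 0; i < node->n_children; i++) {
        quiesce_recursive(node->children[i]);
    }
}

/*
 * Loop condition for phase 2: re-walk the graph on every evaluation,
 * so it keeps working even if polling changed the graph in between.
 */
static bool drain_poll(Node *node)
{
    if (node->in_flight > 0) {
        return true;
    }
    for (int i = 0; i < node->n_children; i++) {
        if (drain_poll(node->children[i])) {
            return true;
        }
    }
    return false;
}

/* Stand-in for aio_poll(): completes one request somewhere in the tree. */
static bool fake_aio_poll(Node *node)
{
    if (node->in_flight > 0) {
        node->in_flight--;
        return true;
    }
    for (int i = 0; i < node->n_children; i++) {
        if (fake_aio_poll(node->children[i])) {
            return true;
        }
    }
    return false;
}

static void drained_begin(Node *root)
{
    quiesce_recursive(root);       /* phase 1: no graph changes allowed */
    while (drain_poll(root)) {     /* phase 2: one wait loop for everything */
        fake_aio_poll(root);       /* graph changes may happen in here */
    }
}

int main(void)
{
    Node child = { .in_flight = 2 };
    Node root = { .children = { &child }, .n_children = 1, .in_flight = 1 };

    drained_begin(&root);
    printf("requests still in flight: %d\n",
           root.in_flight + child.in_flight);
    return 0;
}

The real series does the same thing with bdrv_drain_poll() as the
condition of a single AIO_WAIT_WHILE()/BDRV_POLL_WHILE(), of course.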

v2:

- Rebased on top of current master (e.g. including Job infrastructure)

- Avoid unnecessary parent callbacks for .drained_begin/poll/end:
  * subtree drains: Don't propagate the drain to the parent that we came
                    from recursively
  * drain_all:      Don't propagate the drain to BDS parents (which are
                    already separately drained), but only to non-BDS
                    parents like BBs or Jobs

- Separate bdrv_drain_poll_top_level() function instead of having a
  top_level parameter for bdrv_drain_poll().

- A few commit message and comment improvements


Kevin Wolf (19):
  test-bdrv-drain: bdrv_drain() works with cross-AioContext events
  block: Use bdrv_do_drain_begin/end in bdrv_drain_all()
  block: Remove 'recursive' parameter from bdrv_drain_invoke()
  block: Don't manually poll in bdrv_drain_all()
  tests/test-bdrv-drain: bdrv_drain_all() works in coroutines now
  block: Avoid unnecessary aio_poll() in AIO_WAIT_WHILE()
  block: Really pause block jobs on drain
  block: Remove bdrv_drain_recurse()
  block: Drain recursively with a single BDRV_POLL_WHILE()
  test-bdrv-drain: Test node deletion in subtree recursion
  block: Don't poll in parent drain callbacks
  test-bdrv-drain: Graph change through parent callback
  block: Defer .bdrv_drain_begin callback to polling phase
  test-bdrv-drain: Test that bdrv_drain_invoke() doesn't poll
  block: Allow AIO_WAIT_WHILE with NULL ctx
  block: Move bdrv_drain_all_begin() out of coroutine context
  block: ignore_bds_parents parameter for drain functions
  block: Allow graph changes in bdrv_drain_all_begin/end sections
  test-bdrv-drain: Test graph changes in drain_all section

Max Reitz (1):
  test-bdrv-drain: Add test for node deletion

 include/block/aio-wait.h     |  25 +-
 include/block/block.h        |  31 +-
 include/block/block_int.h    |  14 +
 include/block/blockjob_int.h |   8 +
 block.c                      |  52 +++-
 block/io.c                   | 332 ++++++++++++--------
 block/mirror.c               |   8 +
 block/vvfat.c                |   1 +
 blockjob.c                   |  23 ++
 tests/test-bdrv-drain.c      | 705 +++++++++++++++++++++++++++++++++++++++++--
 10 files changed, 1032 insertions(+), 167 deletions(-)

-- 
2.13.6


Re: [Qemu-devel] [PATCH v2 00/20] Drain fixes and cleanups, part 3
Posted by no-reply@patchew.org 5 years, 11 months ago
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180529172156.29311-1-kwolf@redhat.com
Subject: [Qemu-devel] [PATCH v2 00/20] Drain fixes and cleanups, part 3

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/20180529172156.29311-1-kwolf@redhat.com -> patchew/20180529172156.29311-1-kwolf@redhat.com
Switched to a new branch 'test'
eab5e8287f test-bdrv-drain: Test graph changes in drain_all section
3300d94a9b block: Allow graph changes in bdrv_drain_all_begin/end sections
3a37012ee9 block: ignore_bds_parents parameter for drain functions
a0cd9923b2 block: Move bdrv_drain_all_begin() out of coroutine context
7df9cb285f block: Allow AIO_WAIT_WHILE with NULL ctx
fa7d0ac25e test-bdrv-drain: Test that bdrv_drain_invoke() doesn't poll
3c98182ba5 block: Defer .bdrv_drain_begin callback to polling phase
928f6f85a4 test-bdrv-drain: Graph change through parent callback
75e58be3aa block: Don't poll in parent drain callbacks
84fea14482 test-bdrv-drain: Test node deletion in subtree recursion
6e4d9cda70 block: Drain recursively with a single BDRV_POLL_WHILE()
9531ba32b7 test-bdrv-drain: Add test for node deletion
f6fe14abad block: Remove bdrv_drain_recurse()
aa7ade75cf block: Really pause block jobs on drain
00d52a27b6 block: Avoid unnecessary aio_poll() in AIO_WAIT_WHILE()
c96450896a tests/test-bdrv-drain: bdrv_drain_all() works in coroutines now
955c52ae29 block: Don't manually poll in bdrv_drain_all()
09713c9ac7 block: Remove 'recursive' parameter from bdrv_drain_invoke()
e0ccd9c8f2 block: Use bdrv_do_drain_begin/end in bdrv_drain_all()
9e0ea8ed00 test-bdrv-drain: bdrv_drain() works with cross-AioContext events

=== OUTPUT BEGIN ===
Checking PATCH 1/20: test-bdrv-drain: bdrv_drain() works with cross-AioContext events...
Checking PATCH 2/20: block: Use bdrv_do_drain_begin/end in bdrv_drain_all()...
Checking PATCH 3/20: block: Remove 'recursive' parameter from bdrv_drain_invoke()...
Checking PATCH 4/20: block: Don't manually poll in bdrv_drain_all()...
Checking PATCH 5/20: tests/test-bdrv-drain: bdrv_drain_all() works in coroutines now...
Checking PATCH 6/20: block: Avoid unnecessary aio_poll() in AIO_WAIT_WHILE()...
ERROR: trailing statements should be on next line
#38: FILE: block/io.c:190:
+    while (aio_poll(bs->aio_context, false));

ERROR: braces {} are necessary for all arms of this statement
#38: FILE: block/io.c:190:
+    while (aio_poll(bs->aio_context, false));
[...]

total: 2 errors, 0 warnings, 60 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 7/20: block: Really pause block jobs on drain...
Checking PATCH 8/20: block: Remove bdrv_drain_recurse()...
Checking PATCH 9/20: test-bdrv-drain: Add test for node deletion...
Checking PATCH 10/20: block: Drain recursively with a single BDRV_POLL_WHILE()...
Checking PATCH 11/20: test-bdrv-drain: Test node deletion in subtree recursion...
WARNING: line over 80 characters
#85: FILE: tests/test-bdrv-drain.c:1034:
+    g_test_add_func("/bdrv-drain/detach/drain_subtree", test_detach_by_drain_subtree);

total: 0 errors, 1 warnings, 68 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 12/20: block: Don't poll in parent drain callbacks...
Checking PATCH 13/20: test-bdrv-drain: Graph change through parent callback...
WARNING: line over 80 characters
#81: FILE: tests/test-bdrv-drain.c:1049:
+    child_a = bdrv_attach_child(parent_b, a, "PB-A", &child_backing, &error_abort);

total: 0 errors, 1 warnings, 142 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 14/20: block: Defer .bdrv_drain_begin callback to polling phase...
Checking PATCH 15/20: test-bdrv-drain: Test that bdrv_drain_invoke() doesn't poll...
Checking PATCH 16/20: block: Allow AIO_WAIT_WHILE with NULL ctx...
Checking PATCH 17/20: block: Move bdrv_drain_all_begin() out of coroutine context...
WARNING: line over 80 characters
#27: FILE: block/io.c:270:
+            bdrv_do_drained_begin(bs, data->recursive, data->parent, data->poll);

total: 0 errors, 1 warnings, 41 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 18/20: block: ignore_bds_parents parameter for drain functions...
Checking PATCH 19/20: block: Allow graph changes in bdrv_drain_all_begin/end sections...
ERROR: do not initialise globals to 0 or NULL
#123: FILE: block/io.c:477:
+unsigned int bdrv_drain_all_count = 0;

ERROR: trailing statements should be on next line
#132: FILE: block/io.c:486:
+    while (aio_poll(qemu_get_aio_context(), false));

ERROR: braces {} are necessary for all arms of this statement
#132: FILE: block/io.c:486:
+    while (aio_poll(qemu_get_aio_context(), false));
[...]

total: 3 errors, 0 warnings, 193 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 20/20: test-bdrv-drain: Test graph changes in drain_all section...
=== OUTPUT END ===

Test command exited with code: 1
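
(Side note, purely to illustrate the style checkpatch asks for in the
reports on patches 6 and 19 above, not necessarily how the series will
end up being fixed: the flagged empty busy-wait loops want a braced
empty body on its own line, and the flagged global can simply drop its
redundant initializer, since C zero-initializes globals anyway.)

/* Empty busy-wait loop in checkpatch-friendly form */
while (aio_poll(bs->aio_context, false)) {
    /* Nothing to do here; aio_poll() already made progress */
}

/* Globals are zero-initialized by the language, no explicit = 0 needed */
unsigned int bdrv_drain_all_count;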


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Re: [Qemu-devel] [PATCH v2 00/20] Drain fixes and cleanups, part 3
Posted by Kevin Wolf 5 years, 10 months ago
ping?

Am 29.05.2018 um 19:21 hat Kevin Wolf geschrieben:
> This is the third and, hopefully for now, the last part of my work to fix
> drain. The main goal of this series is to make drain robust against
> graph changes that happen in any callbacks of in-flight requests while
> we drain a block node.
> 
> The individual patches describe the details, but the rough plan is to
> change all three drain types (single node, subtree and all) to work like
> this:
> 
> 1. First call all the necessary callbacks to quiesce external sources
>    of new requests. This includes the block driver callbacks, the child
>    node callbacks and disabling external AioContext events. This is done
>    recursively.
> 
>    Much of the trouble we had with drain resulted from the fact that the
>    graph changed while we were traversing the graph recursively. None of
>    the callbacks called in this phase may change the graph.
> 
> 2. Then do a single AIO_WAIT_WHILE() to drain the requests of all
>    affected nodes. The aio_poll() called by it is where graph changes
>    can happen and where we need to be careful.
> 
>    However, while evaluating the loop condition, the graph can't change,
>    so we can safely call all necessary callbacks, if needed recursively,
>    to determine whether there are still pending requests in any affected
>    nodes. We just need to make sure that we don't rely on the set of
>    nodes being the same between any two evaluations of the condition.
> 
> There are a few more smaller, mostly self-contained changes needed
> before we're actually safe, but this is the main mechanism that will
> help you understand what we're working towards during the series.
> 
> v2:
> 
> - Rebased on top of current master (e.g. including Job infrastructure)
> 
> - Avoid unnecessary parent callbacks for .drained_begin/poll/end:
>   * subtree drains: Don't propagate the drain to the parent that we came
>                     from recursively
>   * drain_all:      Don't propagate the drain to BDS parents (which are
>                     already separately drained), but only to non-BDS
>                     parents like BBs or Jobs
> 
> - Separate bdrv_drain_poll_top_level() function instead of having a
>   top_level parameter for bdrv_drain_poll().
> 
> - A few commit message and comment improvements
> 
> 
> Kevin Wolf (19):
>   test-bdrv-drain: bdrv_drain() works with cross-AioContext events
>   block: Use bdrv_do_drain_begin/end in bdrv_drain_all()
>   block: Remove 'recursive' parameter from bdrv_drain_invoke()
>   block: Don't manually poll in bdrv_drain_all()
>   tests/test-bdrv-drain: bdrv_drain_all() works in coroutines now
>   block: Avoid unnecessary aio_poll() in AIO_WAIT_WHILE()
>   block: Really pause block jobs on drain
>   block: Remove bdrv_drain_recurse()
>   block: Drain recursively with a single BDRV_POLL_WHILE()
>   test-bdrv-drain: Test node deletion in subtree recursion
>   block: Don't poll in parent drain callbacks
>   test-bdrv-drain: Graph change through parent callback
>   block: Defer .bdrv_drain_begin callback to polling phase
>   test-bdrv-drain: Test that bdrv_drain_invoke() doesn't poll
>   block: Allow AIO_WAIT_WHILE with NULL ctx
>   block: Move bdrv_drain_all_begin() out of coroutine context
>   block: ignore_bds_parents parameter for drain functions
>   block: Allow graph changes in bdrv_drain_all_begin/end sections
>   test-bdrv-drain: Test graph changes in drain_all section
> 
> Max Reitz (1):
>   test-bdrv-drain: Add test for node deletion
> 
>  include/block/aio-wait.h     |  25 +-
>  include/block/block.h        |  31 +-
>  include/block/block_int.h    |  14 +
>  include/block/blockjob_int.h |   8 +
>  block.c                      |  52 +++-
>  block/io.c                   | 332 ++++++++++++--------
>  block/mirror.c               |   8 +
>  block/vvfat.c                |   1 +
>  blockjob.c                   |  23 ++
>  tests/test-bdrv-drain.c      | 705 +++++++++++++++++++++++++++++++++++++++++--
>  10 files changed, 1032 insertions(+), 167 deletions(-)
> 
> -- 
> 2.13.6
> 

Re: [Qemu-devel] [Qemu-block] [PATCH v2 00/20] Drain fixes and cleanups, part 3
Posted by Kevin Wolf 5 years, 10 months ago
Am 11.06.2018 um 14:23 hat Kevin Wolf geschrieben:
> ping?
> 
> Am 29.05.2018 um 19:21 hat Kevin Wolf geschrieben:
> > This is the third and, hopefully for now, the last part of my work to fix
> > drain. The main goal of this series is to make drain robust against
> > graph changes that happen in any callbacks of in-flight requests while
> > we drain a block node.
> > 
> > The individual patches describe the details, but the rough plan is to
> > change all three drain types (single node, subtree and all) to work like
> > this:
> > 
> > 1. First call all the necessary callbacks to quiesce external sources
> >    of new requests. This includes the block driver callbacks, the child
> >    node callbacks and disabling external AioContext events. This is done
> >    recursively.
> > 
> >    Much of the trouble we had with drain resulted from the fact that the
> >    graph changed while we were traversing the graph recursively. None of
> >    the callbacks called in this phase may change the graph.
> > 
> > 2. Then do a single AIO_WAIT_WHILE() to drain the requests of all
> >    affected nodes. The aio_poll() called by it is where graph changes
> >    can happen and where we need to be careful.
> > 
> >    However, while evaluating the loop condition, the graph can't change,
> >    so we can safely call all necessary callbacks, if needed recursively,
> >    to determine whether there are still pending requests in any affected
> >    nodes. We just need to make sure that we don't rely on the set of
> >    nodes being the same between any two evaluations of the condition.
> > 
> > There are a few more smaller, mostly self-contained changes needed
> > before we're actually safe, but this is the main mechanism that will
> > help you understand what we're working towards during the series.

Without objection, applied to the block branch.

Kevin