[PATCH v2 00/15] block: Simplify drain

Kevin Wolf posted 15 patches 1 year, 5 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20221118174110.55183-1-kwolf@redhat.com
Maintainers: Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>, Wen Congyang <wencongyang2@huawei.com>, Xie Changlong <xiechanglong.d@gmail.com>, John Snow <jsnow@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[PATCH v2 00/15] block: Simplify drain
Posted by Kevin Wolf 1 year, 5 months ago
I'm aware that exactly nobody has been looking forward to a series with
this title, but it has to be. The way drain works means that we need to
poll in bdrv_replace_child_noperm() and that makes things rather messy
with Emanuele's multiqueue work because you must not poll while you hold
the graph lock.

The other reason why it has to be is that drain is way too complex and
there are too many different cases. Some simplification like this will
hopefully make it considerably more maintainable. The diffstat probably
tells something, too.

There are roughly speaking three parts in this series:

1. Make BlockDriver.bdrv_drained_begin/end() non-coroutine_fn again,
   which allows us to not poll on bdrv_drained_end() any more.

2. Remove subtree drains. They are a considerable complication in the
   whole drain machinery (in particular, they require polling in the
   BdrvChildClass.attach/detach() callbacks that are called during
   bdrv_replace_child_noperm()) and none of their users actually has a
   good reason to use them.

3. Finally get rid of polling in bdrv_replace_child_noperm() by
   requiring that the child is already drained by the caller and calling
   callbacks only once and not again for every nested drain section.
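
To illustrate the "callbacks only once" idea in part 3, here is a minimal
sketch. It is not the code from this series: ToyNode, drain_depth and all
toy_* functions are hypothetical names made up for the example, and only
the flag name quiesced_parent is borrowed from the v2 changelog below. The
assumption is that nested drain sections are tracked with a counter and
that the parent callback should run only when the outermost section is
entered or left.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy node for illustration; not QEMU's BlockDriverState. */
typedef struct ToyNode {
    int drain_depth;       /* number of currently open drain sections */
    bool quiesced_parent;  /* true while the parent has been told to stop */
    int begin_calls;       /* how many times the begin callback ran */
    int end_calls;         /* how many times the end callback ran */
} ToyNode;

/* Stand-in for a parent's drained_begin callback. */
static void toy_parent_drained_begin(ToyNode *n)
{
    n->quiesced_parent = true;
    n->begin_calls++;
}

/* Stand-in for a parent's drained_end callback. */
static void toy_parent_drained_end(ToyNode *n)
{
    n->quiesced_parent = false;
    n->end_calls++;
}

void toy_drained_begin(ToyNode *n)
{
    /* Only entering the outermost section (0 -> 1) invokes the callback. */
    if (n->drain_depth++ == 0) {
        toy_parent_drained_begin(n);
    }
}

void toy_drained_end(ToyNode *n)
{
    assert(n->drain_depth > 0);
    /* Only leaving the outermost section (1 -> 0) invokes the callback. */
    if (--n->drain_depth == 0) {
        toy_parent_drained_end(n);
    }
}
```

With this shape, a boolean is enough to record whether the parent is
quiesced, because the callback pair runs at most once per outermost drain
section no matter how deeply sections are nested.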

If necessary, a prefix of this series can be merged that covers only the
first or the first two parts and it would still make sense.

v2:
- Rebased on master
- Patch 3: Removed leftover _co parts in function names
- Patch 4: Updated function comments to reflect that we're not polling
  any more
- Patch 6 (new): Fix inconsistent AioContext locking for reopen code
- Patch 9 (was 8): Added comment to clarify when polling is allowed
  and the graph may change again
- Patch 11 (was 10):
  * Reworded some comments and the commit message.
  * Dropped a now unnecessary assertion that was dropped only in a later
    patch in v1 of the series.
  * Changed 'int parent_quiesce_counter' into 'bool quiesced_parent'
- Patch 12 (was 11): Don't remove ignore_bds_parents from
  bdrv_drain_poll(), it is actually still a valid optimisation there
  that makes polling O(n) instead of O(n²)
- Patch 13 (new): Instead of only removing assert(!qemu_in_coroutine())
  like in v1 of the series, drop out of coroutine context in
  bdrv_do_drained_begin_quiesce() just to be sure that we'll never get
  coroutine surprises in drain code.
- Patch 14 (was 12): More and reworded comments to make things hopefully
  a bit clearer

Kevin Wolf (15):
  qed: Don't yield in bdrv_qed_co_drain_begin()
  test-bdrv-drain: Don't yield in .bdrv_co_drained_begin/end()
  block: Revert .bdrv_drained_begin/end to non-coroutine_fn
  block: Remove drained_end_counter
  block: Inline bdrv_drain_invoke()
  block: Fix locking for bdrv_reopen_queue_child()
  block: Drain individual nodes during reopen
  block: Don't use subtree drains in bdrv_drop_intermediate()
  stream: Replace subtree drain with a single node drain
  block: Remove subtree drains
  block: Call drain callbacks only once
  block: Remove ignore_bds_parents parameter from drain_begin/end.
  block: Drop out of coroutine in bdrv_do_drained_begin_quiesce()
  block: Don't poll in bdrv_replace_child_noperm()
  block: Remove poll parameter from bdrv_parent_drained_begin_single()

 include/block/block-global-state.h |   3 +
 include/block/block-io.h           |  58 ++---
 include/block/block_int-common.h   |  25 +-
 include/block/block_int-io.h       |  12 -
 block.c                            | 185 ++++++++++-----
 block/block-backend.c              |   4 +-
 block/io.c                         | 290 +++++------------------
 block/qed.c                        |  26 +-
 block/replication.c                |   6 -
 block/stream.c                     |  26 +-
 block/throttle.c                   |   8 +-
 blockdev.c                         |  13 -
 blockjob.c                         |   2 +-
 tests/unit/test-bdrv-drain.c       | 369 +++++++----------------------
 14 files changed, 340 insertions(+), 687 deletions(-)

-- 
2.38.1


Re: [PATCH v2 00/15] block: Simplify drain
Posted by Hanna Reitz 1 year, 5 months ago
On 18.11.22 18:40, Kevin Wolf wrote:
> I'm aware that exactly nobody has been looking forward to a series with
> this title, but it has to be. The way drain works means that we need to
> poll in bdrv_replace_child_noperm() and that makes things rather messy
> with Emanuele's multiqueue work because you must not poll while you hold
> the graph lock.
>
> The other reason why it has to be is that drain is way too complex and
> there are too many different cases. Some simplification like this will
> hopefully make it considerably more maintainable. The diffstat probably
> tells something, too.
>
> There are roughly speaking three parts in this series:
>
> 1. Make BlockDriver.bdrv_drained_begin/end() non-coroutine_fn again,
>     which allows us to not poll on bdrv_drained_end() any more.
>
> 2. Remove subtree drains. They are a considerable complication in the
>     whole drain machinery (in particular, they require polling in the
>     BdrvChildClass.attach/detach() callbacks that are called during
>     bdrv_replace_child_noperm()) and none of their users actually has a
>     good reason to use them.
>
> 3. Finally get rid of polling in bdrv_replace_child_noperm() by
>     requiring that the child is already drained by the caller and calling
>     callbacks only once and not again for every nested drain section.
>
> If necessary, a prefix of this series can be merged that covers only the
> first or the first two parts and it would still make sense.
>
> v2:
> - Rebased on master
> - Patch 3: Removed leftover _co parts in function names
> - Patch 4: Updated function comments to reflect that we're not polling
>    any more
> - Patch 6 (new): Fix inconsistent AioContext locking for reopen code
> - Patch 9 (was 8): Added comment to clarify when polling is allowed
>    and the graph may change again
> - Patch 11 (was 10):
>    * Reworded some comments and the commit message.
>    * Dropped a now unnecessary assertion that was dropped only in a later
>      patch in v1 of the series.
>    * Changed 'int parent_quiesce_counter' into 'bool quiesced_parent'
> - Patch 12 (was 11): Don't remove ignore_bds_parents from
>    bdrv_drain_poll(), it is actually still a valid optimisation there
>    that makes polling O(n) instead of O(n²)
> - Patch 13 (new): Instead of only removing assert(!qemu_in_coroutine())
>    like in v1 of the series, drop out of coroutine context in
>    bdrv_do_drained_begin_quiesce() just to be sure that we'll never get
>    coroutine surprises in drain code.
> - Patch 14 (was 12): More and reworded comments to make things hopefully
>    a bit clearer

Thanks!

Reviewed-by: Hanna Reitz <hreitz@redhat.com>


Re: [PATCH v2 00/15] block: Simplify drain
Posted by Kevin Wolf 1 year, 4 months ago
On 18.11.2022 at 18:40, Kevin Wolf wrote:
> I'm aware that exactly nobody has been looking forward to a series with
> this title, but it has to be. The way drain works means that we need to
> poll in bdrv_replace_child_noperm() and that makes things rather messy
> with Emanuele's multiqueue work because you must not poll while you hold
> the graph lock.
> 
> The other reason why it has to be is that drain is way too complex and
> there are too many different cases. Some simplification like this will
> hopefully make it considerably more maintainable. The diffstat probably
> tells something, too.
> 
> There are roughly speaking three parts in this series:
> 
> 1. Make BlockDriver.bdrv_drained_begin/end() non-coroutine_fn again,
>    which allows us to not poll on bdrv_drained_end() any more.
> 
> 2. Remove subtree drains. They are a considerable complication in the
>    whole drain machinery (in particular, they require polling in the
>    BdrvChildClass.attach/detach() callbacks that are called during
>    bdrv_replace_child_noperm()) and none of their users actually has a
>    good reason to use them.
> 
> 3. Finally get rid of polling in bdrv_replace_child_noperm() by
>    requiring that the child is already drained by the caller and calling
>    callbacks only once and not again for every nested drain section.
> 
> If necessary, a prefix of this series can be merged that covers only the
> first or the first two parts and it would still make sense.

Thanks for the review, applied to block-next.

Kevin