[PATCH 00/16] block: Some multi-threading fixes

Hanna Czenczek posted 16 patches 2 weeks, 3 days ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20251028163343.116249-1-hreitz@redhat.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, "Alex Bennée" <alex.bennee@linaro.org>, Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, Fam Zheng <fam@euphon.net>, Ronnie Sahlberg <ronniesahlberg@gmail.com>, Peter Lieven <pl@dlhnet.de>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Ilya Dryomov <idryomov@gmail.com>, "Richard W.M. Jones" <rjones@redhat.com>, Stefan Weil <sw@weilnetz.de>
There is a newer version of this series
block/qcow2.h                    |  1 +
include/block/aio.h              | 15 ++++++
include/block/block_int-common.h |  7 ++-
block/blkreplay.c                |  3 +-
block/curl.c                     | 55 ++++++++++++-------
block/gluster.c                  | 17 +++---
block/io.c                       |  3 ++
block/iscsi.c                    | 46 +++++-----------
block/nfs.c                      | 23 +++-----
block/null.c                     |  7 ++-
block/nvme.c                     | 70 +++++++++++++------------
block/qcow2.c                    | 90 +++++++++++++++++++++++++++-----
block/rbd.c                      | 12 ++---
block/ssh.c                      | 22 ++++----
block/win32-aio.c                | 31 ++++++++---
15 files changed, 251 insertions(+), 151 deletions(-)
[PATCH 00/16] block: Some multi-threading fixes
Posted by Hanna Czenczek 2 weeks, 3 days ago
Hi,

As noted in “the original patch”[1] (which, in slightly different form,
is still part of this series), with multi-queue, a BDS’s “main”
AioContext should have little meaning.  Instead, every request can be in
any AioContext.  This series fixes some of the places where that doesn’t
seem to be considered yet, some of which can cause user-visible bugs
(those patches are Cc-ed to stable, I hope).

[1] https://lists.nongnu.org/archive/html/qemu-block/2025-02/msg00123.html

A common problem pattern is that the request coroutine yields, awaiting
a completion function that runs in a different thread.  Waking the
coroutine is then often done in the BDS’s main AioContext (e.g. via a BH
scheduled there), i.e. when using multiqueue, still a different thread
from the original request.  This can cause races, particularly in case
the coroutine yields in a loop awaiting request completion, which can
cause it to see completion before yielding for the first time, even
though the completion function is still going to wake it.  (A wake
without a yield is bad.)

In other cases, there is no race (patch 1 tries to clarify when
aio_co_wake() is safe to call), but scheduling the completion wake-up in
the BDS main AioContext still doesn’t make too much sense, and it should
instead run in the request’s context, so it can directly enter the
request coroutine instead of just scheduling it yet again.

Patches 7, 9, and 10 are general concurrency fixes that have nothing to
do with this wake-up pattern, I just found those issues along the way.

As for the last four patches: The block layer currently doesn’t specify
the context in which AIO callbacks run.  Callers probably expect this to
be the same context in which they issued the request, but we don’t
explicitly say so.  Now, the only caller of these AIO-style methods is
block/io.c, which immediately “maps” them to coroutines in a non-racey
manner, i.e. it doesn’t actually care much about the context.

So while it makes sense to specify the AIOCB context (and then make the
implementations adhere to it), in practice, the only caller doesn’t
really care, and the block layer as a whole doesn’t really care about
the AIO context either.  So maybe we should just drop the last four
patches, or keep patch 13, but instead of stating that the CB is run in
the request context, explicitly say that it may be run in any
AioContext.


Hanna Czenczek (16):
  block: Note on aio_co_wake use if not yet yielding
  rbd: Run co BH CB in the coroutine’s AioContext
  iscsi: Run co BH CB in the coroutine’s AioContext
  nfs: Run co BH CB in the coroutine’s AioContext
  curl: Fix coroutine waking
  gluster: Do not move coroutine into BDS context
  nvme: Kick and check completions in BDS context
  nvme: Fix coroutine waking
  block/io: Take reqs_lock for tracked_requests
  qcow2: Fix cache_clean_timer
  ssh: Run restart_coroutine in current AioContext
  blkreplay: Run BH in coroutine’s AioContext
  block: Note in which AioContext AIO CBs are called
  iscsi: Create AIO BH in original AioContext
  null-aio: Run CB in original AioContext
  win32-aio: Run CB in original context

 block/qcow2.h                    |  1 +
 include/block/aio.h              | 15 ++++++
 include/block/block_int-common.h |  7 ++-
 block/blkreplay.c                |  3 +-
 block/curl.c                     | 55 ++++++++++++-------
 block/gluster.c                  | 17 +++---
 block/io.c                       |  3 ++
 block/iscsi.c                    | 46 +++++-----------
 block/nfs.c                      | 23 +++-----
 block/null.c                     |  7 ++-
 block/nvme.c                     | 70 +++++++++++++------------
 block/qcow2.c                    | 90 +++++++++++++++++++++++++++-----
 block/rbd.c                      | 12 ++---
 block/ssh.c                      | 22 ++++----
 block/win32-aio.c                | 31 ++++++++---
 15 files changed, 251 insertions(+), 151 deletions(-)

-- 
2.51.0