Hi,
As noted in “the original patch”[1] (which, in slightly different form,
is still part of this series), with multi-queue, a BDS’s “main”
AioContext should have little meaning. Instead, every request can be in
any AioContext. This series fixes some of the places where that doesn’t
seem to be considered yet, some of which can cause user-visible bugs
(those patches are Cc-ed to stable, I hope).
[1] https://lists.nongnu.org/archive/html/qemu-block/2025-02/msg00123.html
A common problem pattern is that the request coroutine yields, awaiting
a completion function that runs in a different thread. Waking the
coroutine is then often done in the BDS’s main AioContext (e.g. via a BH
scheduled there), i.e. when using multiqueue, still a different thread
from the original request. This can cause races, particularly in case
the coroutine yields in a loop awaiting request completion, which can
cause it to see completion before yielding for the first time, even
though the completion function is still going to wake it. (A wake
without a yield is bad.)
In other cases, there is no race (patch 1 tries to clarify when
aio_co_wake() is safe to call), but scheduling the completion wake-up in
the BDS main AioContext still doesn’t make too much sense, and it should
instead run in the request’s context, so it can directly enter the
request coroutine instead of just scheduling it yet again.
Patches 7, 9, and 10 are general concurrency fixes that have nothing to
do with this wake-up pattern, I just found those issues along the way.
As for the last four patches: The block layer currently doesn’t specify
the context in which AIO callbacks run. Callers probably expect this to
be the same context in which they issued the request, but we don’t
explicitly say so. Now, the only caller of these AIO-style methods is
block/io.c, which immediately “maps” them to coroutines in a non-racey
manner, i.e. it doesn’t actually care much about the context.
So while it makes sense to specify the AIOCB context (and then make the
implementations adhere to it), in practice, the only caller doesn’t
really care, and the block layer as a whole doesn’t really care about
the AIO context either. So maybe we should just drop the last four
patches, or keep patch 13, but instead of stating that the CB is run in
the request context, explicitly say that it may be run in any
AioContext.
Hanna Czenczek (16):
block: Note on aio_co_wake use if not yet yielding
rbd: Run co BH CB in the coroutine’s AioContext
iscsi: Run co BH CB in the coroutine’s AioContext
nfs: Run co BH CB in the coroutine’s AioContext
curl: Fix coroutine waking
gluster: Do not move coroutine into BDS context
nvme: Kick and check completions in BDS context
nvme: Fix coroutine waking
block/io: Take reqs_lock for tracked_requests
qcow2: Fix cache_clean_timer
ssh: Run restart_coroutine in current AioContext
blkreplay: Run BH in coroutine’s AioContext
block: Note in which AioContext AIO CBs are called
iscsi: Create AIO BH in original AioContext
null-aio: Run CB in original AioContext
win32-aio: Run CB in original context
block/qcow2.h | 1 +
include/block/aio.h | 15 ++++++
include/block/block_int-common.h | 7 ++-
block/blkreplay.c | 3 +-
block/curl.c | 55 ++++++++++++-------
block/gluster.c | 17 +++---
block/io.c | 3 ++
block/iscsi.c | 46 +++++-----------
block/nfs.c | 23 +++-----
block/null.c | 7 ++-
block/nvme.c | 70 +++++++++++++------------
block/qcow2.c | 90 +++++++++++++++++++++++++++-----
block/rbd.c | 12 ++---
block/ssh.c | 22 ++++----
block/win32-aio.c | 31 ++++++++---
15 files changed, 251 insertions(+), 151 deletions(-)
--
2.51.0