There is no need for aio_context_use_g_source() now that epoll(7) and
io_uring(7) file descriptor monitoring works with the glib event loop.
AioContext doesn't need to be notified that GSource is being used.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 include/block/aio.h               |  3 ---
 tests/unit/test-nested-aio-poll.c |  6 ------
 util/aio-posix.c                  | 12 ------------
 util/aio-win32.c                  |  4 ----
 util/async.c                      |  1 -
 5 files changed, 26 deletions(-)
diff --git a/include/block/aio.h b/include/block/aio.h
index 39ed86d14d..1657740a0e 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -728,9 +728,6 @@ void aio_context_setup(AioContext *ctx);
  */
 void aio_context_destroy(AioContext *ctx);
 
-/* Used internally, do not call outside AioContext code */
-void aio_context_use_g_source(AioContext *ctx);
-
 /**
  * aio_context_set_poll_params:
  * @ctx: the aio context
diff --git a/tests/unit/test-nested-aio-poll.c b/tests/unit/test-nested-aio-poll.c
index 45484e745b..d13ecccd8c 100644
--- a/tests/unit/test-nested-aio-poll.c
+++ b/tests/unit/test-nested-aio-poll.c
@@ -83,12 +83,6 @@ static void test(void)
     /* Enable polling */
     aio_context_set_poll_params(td.ctx, 1000000, 2, 2, &error_abort);
 
-    /*
-     * The GSource is unused but this has the side-effect of changing the fdmon
-     * that AioContext uses.
-     */
-    aio_get_g_source(td.ctx);
-
     /* Make the event notifier active (set) right away */
     event_notifier_init(&td.poll_notifier, 1);
     aio_set_event_notifier(td.ctx, &td.poll_notifier,
diff --git a/util/aio-posix.c b/util/aio-posix.c
index 9de05ee7e8..bebd9ce3a2 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -743,18 +743,6 @@ void aio_context_destroy(AioContext *ctx)
     aio_free_deleted_handlers(ctx);
 }
 
-void aio_context_use_g_source(AioContext *ctx)
-{
-    /*
-     * Disable io_uring when the glib main loop is used because it doesn't
-     * support mixed glib/aio_poll() usage. It relies on aio_poll() being
-     * called regularly so that changes to the monitored file descriptors are
-     * submitted, otherwise a list of pending fd handlers builds up.
-     */
-    fdmon_io_uring_destroy(ctx);
-    aio_free_deleted_handlers(ctx);
-}
-
 void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
                                  int64_t grow, int64_t shrink, Error **errp)
 {
diff --git a/util/aio-win32.c b/util/aio-win32.c
index 6583d5c5f3..34c4074133 100644
--- a/util/aio-win32.c
+++ b/util/aio-win32.c
@@ -427,10 +427,6 @@ void aio_context_destroy(AioContext *ctx)
 {
 }
 
-void aio_context_use_g_source(AioContext *ctx)
-{
-}
-
 void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
                                  int64_t grow, int64_t shrink, Error **errp)
 {
diff --git a/util/async.c b/util/async.c
index 2719c629ae..a39410d675 100644
--- a/util/async.c
+++ b/util/async.c
@@ -430,7 +430,6 @@ static GSourceFuncs aio_source_funcs = {
 
 GSource *aio_get_g_source(AioContext *ctx)
 {
-    aio_context_use_g_source(ctx);
     g_source_ref(&ctx->source);
     return &ctx->source;
 }
--
2.51.0
On 10.09.2025 at 19:56, Stefan Hajnoczi wrote:
> There is no need for aio_context_use_g_source() now that epoll(7) and
> io_uring(7) file descriptor monitoring works with the glib event loop.
> AioContext doesn't need to be notified that GSource is being used.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
We should probably mention in the commit message that this causes the
default fdmon on Linux to change from poll to io_uring. It's a small
code change, but it makes QEMU use a completely different code path by
default.
With this added: Reviewed-by: Kevin Wolf <kwolf@redhat.com>
> diff --git a/tests/unit/test-nested-aio-poll.c b/tests/unit/test-nested-aio-poll.c
> index 45484e745b..d13ecccd8c 100644
> --- a/tests/unit/test-nested-aio-poll.c
> +++ b/tests/unit/test-nested-aio-poll.c
> @@ -83,12 +83,6 @@ static void test(void)
> /* Enable polling */
> aio_context_set_poll_params(td.ctx, 1000000, 2, 2, &error_abort);
>
> - /*
> - * The GSource is unused but this has the side-effect of changing the fdmon
> - * that AioContext uses.
> - */
> - aio_get_g_source(td.ctx);
> -
> /* Make the event notifier active (set) right away */
> event_notifier_init(&td.poll_notifier, 1);
> aio_set_event_notifier(td.ctx, &td.poll_notifier,
I wonder if it wouldn't make sense to squash this hunk into patch 3
('tests/unit: skip test-nested-aio-poll with io_uring').
Kevin
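The default change Kevin describes can be pictured with a toy model. This is only a sketch: every `Toy*`/`toy_*` name below is hypothetical and none of it is QEMU code. It assumes, based on the removed comment in util/aio-posix.c, that taking the GSource used to tear down io_uring monitoring and leave the context on poll, and it ignores epoll entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the fd monitoring backends discussed above. */
typedef enum {
    TOY_FDMON_POLL,
    TOY_FDMON_IO_URING,
} ToyFdmon;

typedef struct {
    ToyFdmon fdmon;
} ToyAioContext;

/* Assumption: io_uring is picked at setup time whenever it is available. */
static void toy_context_setup(ToyAioContext *ctx, bool io_uring_available)
{
    ctx->fdmon = io_uring_available ? TOY_FDMON_IO_URING : TOY_FDMON_POLL;
}

/* Before the patch: fetching the GSource fell back to poll as a side effect. */
static void toy_get_g_source_old(ToyAioContext *ctx)
{
    ctx->fdmon = TOY_FDMON_POLL;
}

/* After the patch: fetching the GSource leaves the fdmon untouched. */
static void toy_get_g_source_new(ToyAioContext *ctx)
{
    (void)ctx;
}
```

In this model, a context whose GSource is attached to the glib main loop ends up on poll before the patch and stays on io_uring after it, which is why the commit message should call out the new default code path.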
On Thu, Oct 09, 2025 at 05:46:57PM +0200, Kevin Wolf wrote:
> On 10.09.2025 at 19:56, Stefan Hajnoczi wrote:
> > There is no need for aio_context_use_g_source() now that epoll(7) and
> > io_uring(7) file descriptor monitoring works with the glib event loop.
> > AioContext doesn't need to be notified that GSource is being used.
> >
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > Reviewed-by: Eric Blake <eblake@redhat.com>
>
> We should probably mention in the commit message that this causes the
> default fdmon on Linux to change from poll to io_uring. It's a small
> code change, but it makes QEMU use a completely different code path by
> default.
>
> With this added: Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Will fix.
> > diff --git a/tests/unit/test-nested-aio-poll.c b/tests/unit/test-nested-aio-poll.c
> > index 45484e745b..d13ecccd8c 100644
> > --- a/tests/unit/test-nested-aio-poll.c
> > +++ b/tests/unit/test-nested-aio-poll.c
> > @@ -83,12 +83,6 @@ static void test(void)
> > /* Enable polling */
> > aio_context_set_poll_params(td.ctx, 1000000, 2, 2, &error_abort);
> >
> > - /*
> > - * The GSource is unused but this has the side-effect of changing the fdmon
> > - * that AioContext uses.
> > - */
> > - aio_get_g_source(td.ctx);
> > -
> > /* Make the event notifier active (set) right away */
> > event_notifier_init(&td.poll_notifier, 1);
> > aio_set_event_notifier(td.ctx, &td.poll_notifier,
>
> I wonder if it wouldn't make sense to squash this hunk into patch 3
> ('tests/unit: skip test-nested-aio-poll with io_uring').
Sure, I will move it.
On 09.10.2025 at 17:46, Kevin Wolf wrote:
> On 10.09.2025 at 19:56, Stefan Hajnoczi wrote:
> > There is no need for aio_context_use_g_source() now that epoll(7) and
> > io_uring(7) file descriptor monitoring works with the glib event loop.
> > AioContext doesn't need to be notified that GSource is being used.
> >
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > Reviewed-by: Eric Blake <eblake@redhat.com>
>
> We should probably mention in the commit message that this causes the
> default fdmon on Linux to change from poll to io_uring. It's a small
> code change, but it makes QEMU use a completely different code path by
> default.

Just to make sure, I ran 'make check' after this patch and it's failing
for me:

10/401 qemu:qtest+qtest-x86_64 / qtest-x86_64/ahci-test TIMEOUT 150.02s killed by signal 15 SIGTERM
133/401 qemu:unit / test-aio TIMEOUT 30.01s killed by signal 15 SIGTERM
137/401 qemu:unit / test-bdrv-drain TIMEOUT 30.01s killed by signal 15 SIGTERM
142/401 qemu:unit / test-block-iothread TIMEOUT 30.01s killed by signal 15 SIGTERM
192/401 qemu:doc+rust / rust-bql-rs-doctests FAIL 0.84s exit status 101
311/401 qemu:block / io-qcow2-267 ERROR 3.20s exit status 1
321/401 qemu:block / io-qcow2-copy-before-write TIMEOUT 180.01s killed by signal 15 SIGTERM

Some of them look unrelated, but I have confirmed that the three unit
tests still pass before this patch (and still hang after the complete
series).

Kevin
On Thu, Oct 09, 2025 at 06:59:20PM +0200, Kevin Wolf wrote:
> On 09.10.2025 at 17:46, Kevin Wolf wrote:
> > On 10.09.2025 at 19:56, Stefan Hajnoczi wrote:
> > > There is no need for aio_context_use_g_source() now that epoll(7) and
> > > io_uring(7) file descriptor monitoring works with the glib event loop.
> > > AioContext doesn't need to be notified that GSource is being used.
> > >
> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > Reviewed-by: Eric Blake <eblake@redhat.com>
> >
> > We should probably mention in the commit message that this causes the
> > default fdmon on Linux to change from poll to io_uring. It's a small
> > code change, but it makes QEMU use a completely different code path by
> > default.
>
> Just to make sure, I ran 'make check' after this patch and it's failing
> for me:
>
> 10/401 qemu:qtest+qtest-x86_64 / qtest-x86_64/ahci-test TIMEOUT 150.02s killed by signal 15 SIGTERM
> 133/401 qemu:unit / test-aio TIMEOUT 30.01s killed by signal 15 SIGTERM
> 137/401 qemu:unit / test-bdrv-drain TIMEOUT 30.01s killed by signal 15 SIGTERM
> 142/401 qemu:unit / test-block-iothread TIMEOUT 30.01s killed by signal 15 SIGTERM
> 192/401 qemu:doc+rust / rust-bql-rs-doctests FAIL 0.84s exit status 101
> 311/401 qemu:block / io-qcow2-267 ERROR 3.20s exit status 1
> 321/401 qemu:block / io-qcow2-copy-before-write TIMEOUT 180.01s killed by signal 15 SIGTERM
>
> Some of them look unrelated, but I have confirmed that the three unit
> tests still pass before this patch (and still hang after the complete
> series).

I pushed my latest rebased code with many of your review comments
addressed here:
https://gitlab.com/stefanha/qemu/-/tree/aio_add_sqe

It doesn't contain any fixes specifically for the hangs, but it's what
I've been testing here.

Stefan
On Thu, Oct 09, 2025 at 06:59:20PM +0200, Kevin Wolf wrote:
> On 09.10.2025 at 17:46, Kevin Wolf wrote:
> > On 10.09.2025 at 19:56, Stefan Hajnoczi wrote:
> > > There is no need for aio_context_use_g_source() now that epoll(7) and
> > > io_uring(7) file descriptor monitoring works with the glib event loop.
> > > AioContext doesn't need to be notified that GSource is being used.
> > >
> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > Reviewed-by: Eric Blake <eblake@redhat.com>
> >
> > We should probably mention in the commit message that this causes the
> > default fdmon on Linux to change from poll to io_uring. It's a small
> > code change, but it makes QEMU use a completely different code path by
> > default.
>
> Just to make sure, I ran 'make check' after this patch and it's failing
> for me:
>
> 10/401 qemu:qtest+qtest-x86_64 / qtest-x86_64/ahci-test TIMEOUT 150.02s killed by signal 15 SIGTERM
> 133/401 qemu:unit / test-aio TIMEOUT 30.01s killed by signal 15 SIGTERM
> 137/401 qemu:unit / test-bdrv-drain TIMEOUT 30.01s killed by signal 15 SIGTERM
> 142/401 qemu:unit / test-block-iothread TIMEOUT 30.01s killed by signal 15 SIGTERM
> 192/401 qemu:doc+rust / rust-bql-rs-doctests FAIL 0.84s exit status 101
> 311/401 qemu:block / io-qcow2-267 ERROR 3.20s exit status 1
> 321/401 qemu:block / io-qcow2-copy-before-write TIMEOUT 180.01s killed by signal 15 SIGTERM
>
> Some of them look unrelated, but I have confirmed that the three unit
> tests still pass before this patch (and still hang after the complete
> series).

I can't reproduce these failures, regardless of whether sysctl
kernel.io_uring_disabled is 0 or 1.

Can you launch the unit tests from your terminal and post the output?

$ cd qemu
$ build/tests/unit/test-aio
$ build/tests/unit/test-bdrv-drain
$ build/tests/unit/test-block-iothread

That will show exactly which sub-test case is hanging.

Other information that might help: your host kernel version and liburing
version.

Thank you!

Stefan
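When comparing setups, it can help to record the sysctl value alongside the test output. The following is a minimal sketch, not part of the series: the helper names are made up for illustration, and it assumes the setting is exposed at /proc/sys/kernel/io_uring_disabled with the values 0, 1 and 2 as described in the kernel's sysctl documentation.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the sysctl file contents; returns 0, 1 or 2, or -1 if malformed. */
static int parse_io_uring_disabled(const char *text)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || (*end != '\0' && *end != '\n')) {
        return -1;
    }
    return (value >= 0 && value <= 2) ? (int)value : -1;
}

/* Read and parse the setting; returns -1 if the file is missing or invalid. */
static int read_io_uring_disabled(const char *path)
{
    char buf[16];
    int ret = -1;
    FILE *f = fopen(path, "r");

    if (!f) {
        return -1; /* e.g. a kernel that predates this sysctl */
    }
    if (fgets(buf, sizeof(buf), f)) {
        ret = parse_io_uring_disabled(buf);
    }
    fclose(f);
    return ret;
}

/* Human-readable summary of each documented value. */
static const char *describe_io_uring_disabled(int value)
{
    switch (value) {
    case 0:
        return "io_uring enabled for all processes";
    case 1:
        return "io_uring restricted for unprivileged processes";
    case 2:
        return "io_uring disabled for all processes";
    default:
        return "unknown (sysctl missing or unreadable)";
    }
}
```

Calling `describe_io_uring_disabled(read_io_uring_disabled("/proc/sys/kernel/io_uring_disabled"))` and printing the result gives a one-line summary to attach to a bug report.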
On 21.10.2025 at 21:10, Stefan Hajnoczi wrote:
> On Thu, Oct 09, 2025 at 06:59:20PM +0200, Kevin Wolf wrote:
> > On 09.10.2025 at 17:46, Kevin Wolf wrote:
> > > On 10.09.2025 at 19:56, Stefan Hajnoczi wrote:
> > > > There is no need for aio_context_use_g_source() now that epoll(7) and
> > > > io_uring(7) file descriptor monitoring works with the glib event loop.
> > > > AioContext doesn't need to be notified that GSource is being used.
> > > >
> > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > Reviewed-by: Eric Blake <eblake@redhat.com>
> > >
> > > We should probably mention in the commit message that this causes the
> > > default fdmon on Linux to change from poll to io_uring. It's a small
> > > code change, but it makes QEMU use a completely different code path by
> > > default.
> >
> > Just to make sure, I ran 'make check' after this patch and it's failing
> > for me:
> >
> > 10/401 qemu:qtest+qtest-x86_64 / qtest-x86_64/ahci-test TIMEOUT 150.02s killed by signal 15 SIGTERM
> > 133/401 qemu:unit / test-aio TIMEOUT 30.01s killed by signal 15 SIGTERM
> > 137/401 qemu:unit / test-bdrv-drain TIMEOUT 30.01s killed by signal 15 SIGTERM
> > 142/401 qemu:unit / test-block-iothread TIMEOUT 30.01s killed by signal 15 SIGTERM
> > 192/401 qemu:doc+rust / rust-bql-rs-doctests FAIL 0.84s exit status 101
> > 311/401 qemu:block / io-qcow2-267 ERROR 3.20s exit status 1
> > 321/401 qemu:block / io-qcow2-copy-before-write TIMEOUT 180.01s killed by signal 15 SIGTERM
> >
> > Some of them look unrelated, but I have confirmed that the three unit
> > tests still pass before this patch (and still hang after the complete
> > series).
>
> I can't reproduce these failures, regardless of whether sysctl
> kernel.io_uring_disabled is 0 or 1.
>
> Can you launch the unit tests from your terminal and post the output?
>
> $ cd qemu
> $ build/tests/unit/test-aio

TAP version 14
# random seed: R02S48dcdde28634143f18bad3947c52d334
1..27
# Start of aio tests
# Start of bh tests
ok 1 /aio/bh/schedule
ok 2 /aio/bh/schedule10
ok 3 /aio/bh/cancel
ok 4 /aio/bh/delete
ok 5 /aio/bh/flush
# Start of callback-delete tests
ok 6 /aio/bh/callback-delete/one
ok 7 /aio/bh/callback-delete/many
# End of callback-delete tests
# End of bh tests
# Start of event tests
ok 8 /aio/event/add-remove
ok 9 /aio/event/wait
ok 10 /aio/event/flush
# Start of wait tests
ok 11 /aio/event/wait/no-flush-cb
# End of wait tests
# End of event tests
# Start of timer tests

> $ build/tests/unit/test-bdrv-drain

TAP version 14
# random seed: R02S7d6ba0fc81d5b90d323813d680a30644
1..30
# Start of bdrv-drain tests
ok 1 /bdrv-drain/nested
ok 2 /bdrv-drain/set_aio_context
# Start of driver-cb tests

> $ build/tests/unit/test-block-iothread

TAP version 14
# random seed: R02Sf81baf68887daa9b86be5c72b99df589
1..22
# Start of sync-op tests
ok 1 /sync-op/pread
ok 2 /sync-op/pwrite
ok 3 /sync-op/preadv
ok 4 /sync-op/pwritev
ok 5 /sync-op/preadv_part
ok 6 /sync-op/pwritev_part
ok 7 /sync-op/pwrite_compressed
ok 8 /sync-op/pwrite_zeroes
ok 9 /sync-op/load_vmstate
ok 10 /sync-op/save_vmstate
ok 11 /sync-op/pdiscard
ok 12 /sync-op/truncate
ok 13 /sync-op/block_status
ok 14 /sync-op/flush
ok 15 /sync-op/check
ok 16 /sync-op/activate
# End of sync-op tests
# Start of attach tests

> That will show exactly which sub-test case is hanging.
> Other information that might help: your host kernel version and liburing
> version.

This is a F42 system.

kernel-6.16.12-200.fc42.x86_64
liburing-2.9-1.fc42.x86_64

If you can't reproduce or find a hypothesis what's happening, I can try
to debug one of the hanging processes.

Kevin
On Wed, Oct 22, 2025 at 11:02:28AM +0200, Kevin Wolf wrote:
> On 21.10.2025 at 21:10, Stefan Hajnoczi wrote:
> > On Thu, Oct 09, 2025 at 06:59:20PM +0200, Kevin Wolf wrote:
> > > On 09.10.2025 at 17:46, Kevin Wolf wrote:
> > > > On 10.09.2025 at 19:56, Stefan Hajnoczi wrote:
> > > > > There is no need for aio_context_use_g_source() now that epoll(7) and
> > > > > io_uring(7) file descriptor monitoring works with the glib event loop.
> > > > > AioContext doesn't need to be notified that GSource is being used.
> > > > >
> > > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > > Reviewed-by: Eric Blake <eblake@redhat.com>
> > > >
> > > > We should probably mention in the commit message that this causes the
> > > > default fdmon on Linux to change from poll to io_uring. It's a small
> > > > code change, but it makes QEMU use a completely different code path by
> > > > default.
> > >
> > > Just to make sure, I ran 'make check' after this patch and it's failing
> > > for me:
> > >
> > > 10/401 qemu:qtest+qtest-x86_64 / qtest-x86_64/ahci-test TIMEOUT 150.02s killed by signal 15 SIGTERM
> > > 133/401 qemu:unit / test-aio TIMEOUT 30.01s killed by signal 15 SIGTERM
> > > 137/401 qemu:unit / test-bdrv-drain TIMEOUT 30.01s killed by signal 15 SIGTERM
> > > 142/401 qemu:unit / test-block-iothread TIMEOUT 30.01s killed by signal 15 SIGTERM
> > > 192/401 qemu:doc+rust / rust-bql-rs-doctests FAIL 0.84s exit status 101
> > > 311/401 qemu:block / io-qcow2-267 ERROR 3.20s exit status 1
> > > 321/401 qemu:block / io-qcow2-copy-before-write TIMEOUT 180.01s killed by signal 15 SIGTERM
> > >
> > > Some of them look unrelated, but I have confirmed that the three unit
> > > tests still pass before this patch (and still hang after the complete
> > > series).
> >
> > I can't reproduce these failures, regardless of whether sysctl
> > kernel.io_uring_disabled is 0 or 1.
> >
> > Can you launch the unit tests from your terminal and post the output?
> >
> > $ cd qemu
> > $ build/tests/unit/test-aio
>
> TAP version 14
> # random seed: R02S48dcdde28634143f18bad3947c52d334
> 1..27
> # Start of aio tests
> # Start of bh tests
> ok 1 /aio/bh/schedule
> ok 2 /aio/bh/schedule10
> ok 3 /aio/bh/cancel
> ok 4 /aio/bh/delete
> ok 5 /aio/bh/flush
> # Start of callback-delete tests
> ok 6 /aio/bh/callback-delete/one
> ok 7 /aio/bh/callback-delete/many
> # End of callback-delete tests
> # End of bh tests
> # Start of event tests
> ok 8 /aio/event/add-remove
> ok 9 /aio/event/wait
> ok 10 /aio/event/flush
> # Start of wait tests
> ok 11 /aio/event/wait/no-flush-cb
> # End of wait tests
> # End of event tests
> # Start of timer tests
>
> > $ build/tests/unit/test-bdrv-drain
>
> TAP version 14
> # random seed: R02S7d6ba0fc81d5b90d323813d680a30644
> 1..30
> # Start of bdrv-drain tests
> ok 1 /bdrv-drain/nested
> ok 2 /bdrv-drain/set_aio_context
> # Start of driver-cb tests
>
>
> > $ build/tests/unit/test-block-iothread
>
> TAP version 14
> # random seed: R02Sf81baf68887daa9b86be5c72b99df589
> 1..22
> # Start of sync-op tests
> ok 1 /sync-op/pread
> ok 2 /sync-op/pwrite
> ok 3 /sync-op/preadv
> ok 4 /sync-op/pwritev
> ok 5 /sync-op/preadv_part
> ok 6 /sync-op/pwritev_part
> ok 7 /sync-op/pwrite_compressed
> ok 8 /sync-op/pwrite_zeroes
> ok 9 /sync-op/load_vmstate
> ok 10 /sync-op/save_vmstate
> ok 11 /sync-op/pdiscard
> ok 12 /sync-op/truncate
> ok 13 /sync-op/block_status
> ok 14 /sync-op/flush
> ok 15 /sync-op/check
> ok 16 /sync-op/activate
> # End of sync-op tests
> # Start of attach tests
>
> > That will show exactly which sub-test case is hanging.
> > Other information that might help: your host kernel version and liburing
> > version.
>
> This is a F42 system.
>
> kernel-6.16.12-200.fc42.x86_64
> liburing-2.9-1.fc42.x86_64
>
> If you can't reproduce or find a hypothesis what's happening, I can try
> to debug one of the hanging processes.
Unfortunately I haven't been able to reproduce it on my system. It's
a F42 machine with the same package versions as your machine.
The test-aio timer tests look like good candidates for debugging. It is
likely that the test is either getting to an infinite do {} while
(!aio_poll(ctx, false)) loop or to an aio_poll(ctx, true) call that
hangs.
Thanks for your help with debugging!
Stefan
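One way to tell the two suspected hang sites apart while debugging is to put a temporary iteration cap on the non-blocking loop, so that a spin which never makes progress shows up as a failure instead of a timeout. The sketch below only models the idea: `ToyPollCtx`, `toy_aio_poll()` and `toy_poll_until_progress()` are hypothetical stand-ins, not QEMU functions, and the real aio_poll() is not called.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for an AioContext polled with aio_poll(ctx, false).
 * Progress is reported only after `ready_after` calls have been made;
 * a negative `ready_after` means progress is never reported.
 */
typedef struct {
    int calls;
    int ready_after;
} ToyPollCtx;

/* Models a single non-blocking poll: true if progress was made. */
static bool toy_aio_poll(ToyPollCtx *ctx)
{
    ctx->calls++;
    return ctx->ready_after >= 0 && ctx->calls > ctx->ready_after;
}

/*
 * Bounded variant of "do {} while (!aio_poll(ctx, false))".
 * Returns true if progress was made within max_iters polls, false if the
 * loop would otherwise have kept spinning.
 */
static bool toy_poll_until_progress(ToyPollCtx *ctx, int max_iters)
{
    for (int i = 0; i < max_iters; i++) {
        if (toy_aio_poll(ctx)) {
            return true;
        }
    }
    return false;
}
```

If the capped loop reports failure, the hang is the busy loop; if the cap is never reached, the blocking aio_poll(ctx, true) call is the more likely suspect.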