[PATCH 3/3] win32: fix main-loop busy loop on socket/fd event

Marc-André Lureau posted 3 patches 6 years, 1 month ago
Maintainers: Stefan Weil <sw@weilnetz.de>, Fam Zheng <fam@euphon.net>, Stefan Hajnoczi <stefanha@redhat.com>
Posted by Marc-André Lureau 6 years, 1 month ago
Commit 05e514b1d4d5bd4209e2c8bbc76ff05c85a235f3 introduced an AIO
context optimization to avoid calling event_notifier_test_and_clear() on
ctx->notifier. On Windows, the same notifier is also used to wake up the
wait on socket events (see commit
d3385eb448e38f828c78f8f68ec5d79c66a58b5d).

The ctx->notifier event is added to the gpoll sources in
aio_set_event_notifier(). aio_ctx_check() should clear the event
regardless of ctx->notified, since Windows sets the event by itself,
bypassing ctx->notified. Without this, QEMU never clears the event,
which results in a busy loop.
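
To illustrate the failure mode, here is a minimal, portable sketch in plain
C. It is not QEMU code: SimCtx and every sim_* name are hypothetical
stand-ins, and the Windows event is modelled by a plain bool. It only
assumes what is described above, namely that socket activity sets the event
without touching the "notified" flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for AioContext; not QEMU's real struct. */
typedef struct {
    bool notified;   /* set only by the simulated aio_notify() */
    bool event_set;  /* models the Windows event behind ctx->notifier */
} SimCtx;

/* Models aio_notify(): sets the flag and signals the event. */
static void sim_aio_notify(SimCtx *ctx)
{
    ctx->notified = true;
    ctx->event_set = true;
}

/* Models Windows signalling the event on socket activity:
 * the "notified" flag is left untouched. */
static void sim_socket_activity(SimCtx *ctx)
{
    ctx->event_set = true;
}

/* Models aio_notify_accept(). always_clear == true corresponds to
 * the "|| true" branch added under #ifdef WIN32. */
static void sim_notify_accept(SimCtx *ctx, bool always_clear)
{
    bool was_notified = ctx->notified;

    ctx->notified = false;
    if (was_notified || always_clear) {
        ctx->event_set = false;  /* event_notifier_test_and_clear() */
    }
}

/* After a single socket event, count how many loop iterations wake
 * up immediately because the event is still set (capped at max_iters). */
static int sim_spurious_wakeups(bool always_clear, int max_iters)
{
    SimCtx ctx = { false, false };
    int wakeups = 0;

    assert(max_iters >= 0);
    sim_socket_activity(&ctx);
    for (int i = 0; i < max_iters; i++) {
        if (!ctx.event_set) {
            break;               /* a real poll would block here */
        }
        wakeups++;
        sim_notify_accept(&ctx, always_clear);
    }
    return wakeups;
}
```

In this model, sim_spurious_wakeups(false, n) wakes up on every one of the n
iterations, since nothing ever clears the event, while
sim_spurious_wakeups(true, n) wakes up exactly once. The "notified" flag is
still maintained in both cases.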

Paolo suggested to me on IRC to call event_notifier_test_and_clear()
after select() > 0 in aio_prepare() in aio-win32.c. Unfortunately, not
all fds associated with ctx->notifier are in the AIO fd handlers set
(qemu_set_nonblock() in util/oslib-win32.c calls qemu_fd_register()).

This is essentially a v2 of a patch that was sent earlier:
https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg00420.html

that resurfaced when James investigated Spice performance issues on Windows:
https://gitlab.freedesktop.org/spice/spice/issues/36

In order to test the patch, I simply tried running test-char on
win32, and it hangs. Applying the patch solves it. Without it, QIO idle
sources are not dispatched. I haven't investigated much further, but I
suspect source priorities and busy looping still come into play.

This version keeps the "notified" field, so event_notifier_poll()
should still work as expected.

Cc: James Le Cuirot <chewi@gentoo.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 util/async.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/util/async.c b/util/async.c
index 4e4c7af51e..ca83e32c7f 100644
--- a/util/async.c
+++ b/util/async.c
@@ -354,7 +354,11 @@ void aio_notify(AioContext *ctx)
 
 void aio_notify_accept(AioContext *ctx)
 {
-    if (atomic_xchg(&ctx->notified, false)) {
+    if (atomic_xchg(&ctx->notified, false)
+#ifdef WIN32
+        || true
+#endif
+    ) {
         event_notifier_test_and_clear(&ctx->notifier);
     }
 }
-- 
2.23.0


Re: [PATCH 3/3] win32: fix main-loop busy loop on socket/fd event
Posted by James Le Cuirot 6 years, 1 month ago
On Tue,  1 Oct 2019 17:26:09 +0400
Marc-André Lureau <marcandre.lureau@redhat.com> wrote:

> Commit 05e514b1d4d5bd4209e2c8bbc76ff05c85a235f3 introduced an AIO
> context optimization to avoid calling event_notifier_test_and_clear() on
> ctx->notifier. On Windows, the same notifier is being used to wakeup the
> wait on socket events (see commit
> d3385eb448e38f828c78f8f68ec5d79c66a58b5d).
> 
> The ctx->notifier event is added to the gpoll sources in
> aio_set_event_notifier(), aio_ctx_check() should clear the event
> regardless of ctx->notified, since Windows sets the event by itself,
> bypassing the aio->notified. This fixes qemu not clearing the event
> resulting in a busy loop.
> 
> Paolo suggested to me on irc to call event_notifier_test_and_clear()
> after select() >0 from aio-win32.c's aio_prepare. Unfortunately, not all
> fds associated with ctx->notifiers are in AIO fd handlers set.
> (qemu_set_nonblock() in util/oslib-win32.c calls qemu_fd_register()).
> 
> This is essentially a v2 of a patch that was sent earlier:
> https://lists.gnu.org/archive/html/qemu-devel/2017-01/msg00420.html
> 
> that resurfaced when James investigated Spice performance issues on Windows:
> https://gitlab.freedesktop.org/spice/spice/issues/36
> 
> In order to test that patch, I simply tried running test-char on
> win32, and it hangs. Applying that patch solves it. QIO idle sources
> are not dispatched. I haven't investigated much further, I suspect
> source priorities and busy looping still come into play.
> 
> This version keeps the "notified" field, so event_notifier_poll()
> should still work as expected.
> 
> Cc: James Le Cuirot <chewi@gentoo.org>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  util/async.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/util/async.c b/util/async.c
> index 4e4c7af51e..ca83e32c7f 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -354,7 +354,11 @@ void aio_notify(AioContext *ctx)
>  
>  void aio_notify_accept(AioContext *ctx)
>  {
> -    if (atomic_xchg(&ctx->notified, false)) {
> +    if (atomic_xchg(&ctx->notified, false)
> +#ifdef WIN32
> +        || true
> +#endif
> +    ) {
>          event_notifier_test_and_clear(&ctx->notifier);
>      }
>  }

I can confirm that this updated patch fixes my performance issue. The
idle CPU usage drops from around 35% to around 2%. Moving the mouse now
makes the usage go up, not down. :) Many thanks!

Regards,
-- 
James Le Cuirot (chewi)
Gentoo Linux Developer