From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Fam Zheng, Kevin Wolf, qemu-block@nongnu.org, Max Reitz, Stefan Hajnoczi, Paolo Bonzini
Subject: [PATCH 6/7] aio-posix: support userspace polling of fd monitoring
Date: Thu, 5 Mar 2020 17:08:05 +0000
Message-Id: <20200305170806.1313245-7-stefanha@redhat.com>
In-Reply-To: <20200305170806.1313245-1-stefanha@redhat.com>
References: <20200305170806.1313245-1-stefanha@redhat.com>

Unlike ppoll(2) and epoll(7), Linux io_uring completions can be polled
from userspace. Previously userspace polling was only allowed when all
AioHandlers had an ->io_poll() callback. This prevented starvation of
fds by userspace-pollable handlers.

Add the FDMonOps->need_wait() callback that enables userspace polling
even when some AioHandlers lack ->io_poll().

For example, it's now possible to do userspace polling when a TCP/IP
socket is monitored thanks to Linux io_uring.

Signed-off-by: Stefan Hajnoczi
---
 include/block/aio.h   | 19 +++++++++++++++++++
 util/aio-posix.c      | 11 ++++++++---
 util/fdmon-epoll.c    |  1 +
 util/fdmon-io_uring.c |  6 ++++++
 util/fdmon-poll.c     |  1 +
 5 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index 83fc9b844d..f07ebb76b8 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -55,6 +55,9 @@ struct ThreadPool;
 struct LinuxAioState;
 struct LuringState;
 
+/* Is polling disabled? */
+bool aio_poll_disabled(AioContext *ctx);
+
 /* Callbacks for file descriptor monitoring implementations */
 typedef struct {
     /*
@@ -84,6 +87,22 @@ typedef struct {
      * Returns: number of ready file descriptors.
      */
     int (*wait)(AioContext *ctx, AioHandlerList *ready_list, int64_t timeout);
+
+    /*
+     * need_wait:
+     * @ctx: the AioContext
+     *
+     * Tell aio_poll() when to stop userspace polling early because ->wait()
+     * has fds ready.
+     *
+     * File descriptor monitoring implementations that cannot poll fd readiness
+     * from userspace should use aio_poll_disabled() here.  This ensures that
+     * file descriptors are not starved by handlers that frequently make
+     * progress via userspace polling.
+     *
+     * Returns: true if ->wait() should be called, false otherwise.
+     */
+    bool (*need_wait)(AioContext *ctx);
 } FDMonOps;
 
 /*
diff --git a/util/aio-posix.c b/util/aio-posix.c
index a24a33c15a..ede04a4bc2 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -22,6 +22,11 @@
 #include "trace.h"
 #include "aio-posix.h"
 
+bool aio_poll_disabled(AioContext *ctx)
+{
+    return atomic_read(&ctx->poll_disable_cnt);
+}
+
 void aio_add_ready_handler(AioHandlerList *ready_list,
                            AioHandler *node,
                            int revents)
@@ -423,7 +428,7 @@ static bool run_poll_handlers(AioContext *ctx, int64_t max_ns, int64_t *timeout)
         elapsed_time = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start_time;
         max_ns = qemu_soonest_timeout(*timeout, max_ns);
         assert(!(max_ns && progress));
-    } while (elapsed_time < max_ns && !atomic_read(&ctx->poll_disable_cnt));
+    } while (elapsed_time < max_ns && !ctx->fdmon_ops->need_wait(ctx));
 
     /* If time has passed with no successful polling, adjust *timeout to
      * keep the same ending time.
@@ -451,7 +456,7 @@ static bool try_poll_mode(AioContext *ctx, int64_t *timeout)
 {
     int64_t max_ns = qemu_soonest_timeout(*timeout, ctx->poll_ns);
 
-    if (max_ns && !atomic_read(&ctx->poll_disable_cnt)) {
+    if (max_ns && !ctx->fdmon_ops->need_wait(ctx)) {
         poll_set_started(ctx, true);
 
         if (run_poll_handlers(ctx, max_ns, timeout)) {
@@ -501,7 +506,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
     /* If polling is allowed, non-blocking aio_poll does not need the
      * system call---a single round of run_poll_handlers_once suffices.
      */
-    if (timeout || atomic_read(&ctx->poll_disable_cnt)) {
+    if (timeout || ctx->fdmon_ops->need_wait(ctx)) {
         ret = ctx->fdmon_ops->wait(ctx, &ready_list, timeout);
     }
 
diff --git a/util/fdmon-epoll.c b/util/fdmon-epoll.c
index d56b69468b..fcd989d47d 100644
--- a/util/fdmon-epoll.c
+++ b/util/fdmon-epoll.c
@@ -100,6 +100,7 @@ out:
 static const FDMonOps fdmon_epoll_ops = {
     .update = fdmon_epoll_update,
     .wait = fdmon_epoll_wait,
+    .need_wait = aio_poll_disabled,
 };
 
 static bool fdmon_epoll_try_enable(AioContext *ctx)
diff --git a/util/fdmon-io_uring.c b/util/fdmon-io_uring.c
index fb99b4b61e..893b79b622 100644
--- a/util/fdmon-io_uring.c
+++ b/util/fdmon-io_uring.c
@@ -288,9 +288,15 @@ static int fdmon_io_uring_wait(AioContext *ctx, AioHandlerList *ready_list,
     return process_cq_ring(ctx, ready_list);
 }
 
+static bool fdmon_io_uring_need_wait(AioContext *ctx)
+{
+    return io_uring_cq_ready(&ctx->fdmon_io_uring);
+}
+
 static const FDMonOps fdmon_io_uring_ops = {
     .update = fdmon_io_uring_update,
     .wait = fdmon_io_uring_wait,
+    .need_wait = fdmon_io_uring_need_wait,
 };
 
 bool fdmon_io_uring_setup(AioContext *ctx)
diff --git a/util/fdmon-poll.c b/util/fdmon-poll.c
index 28114a0f39..488067b679 100644
--- a/util/fdmon-poll.c
+++ b/util/fdmon-poll.c
@@ -103,4 +103,5 @@ static void fdmon_poll_update(AioContext *ctx,
 const FDMonOps fdmon_poll_ops = {
     .update = fdmon_poll_update,
     .wait = fdmon_poll_wait,
+    .need_wait = aio_poll_disabled,
 };
-- 
2.24.1
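
Illustration only, not part of the patch: a minimal standalone sketch of the dispatch that ->need_wait() introduces. It is not QEMU code. The names Ctx, MonOps, poll_disabled, cq_need_wait and run_poll_rounds are hypothetical stand-ins for AioContext, FDMonOps, aio_poll_disabled(), fdmon_io_uring_need_wait() and the polling loop, and the cq_ready counter is an assumed substitute for what io_uring_cq_ready() reports. The sketch also folds the entry check and the loop condition into a single while loop, which the real code keeps separate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for QEMU's AioContext and FDMonOps. */
typedef struct Ctx Ctx;

typedef struct {
    /* Returns true when polling should stop and ->wait() should run. */
    bool (*need_wait)(Ctx *ctx);
} MonOps;

struct Ctx {
    const MonOps *ops;
    int poll_disable_cnt; /* number of handlers without a poll callback */
    int cq_ready;         /* completions visible from userspace (assumed) */
};

/* ppoll/epoll-style backend: fd readiness is only visible via a syscall,
 * so any non-pollable handler forces a fall back to ->wait(). */
static bool poll_disabled(Ctx *ctx)
{
    return ctx->poll_disable_cnt > 0;
}

/* io_uring-style backend: readiness shows up in a shared completion
 * queue, so polling may continue until a completion is pending. */
static bool cq_need_wait(Ctx *ctx)
{
    return ctx->cq_ready > 0;
}

static const MonOps syscall_ops = { .need_wait = poll_disabled };
static const MonOps ring_ops = { .need_wait = cq_need_wait };

/* Busy-poll for at most max_rounds iterations, stopping early once the
 * backend asks for ->wait(). Returns the number of rounds performed. */
static int run_poll_rounds(Ctx *ctx, int max_rounds)
{
    int rounds = 0;

    assert(ctx->ops != NULL && ctx->ops->need_wait != NULL);
    while (rounds < max_rounds && !ctx->ops->need_wait(ctx)) {
        rounds++; /* a real loop would run the io_poll() handlers here */
    }
    return rounds;
}
```

With syscall_ops, a single non-pollable handler (poll_disable_cnt > 0) prevents any polling. With ring_ops, the same context keeps polling until the completion queue is non-empty, which is the behavior the commit message describes for a monitored TCP/IP socket.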