From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Fam Zheng, Peter Maydell, qemu-block@nongnu.org, Max Reitz,
    Stefan Hajnoczi, Paolo Bonzini, Kevin Wolf
Subject: [PULL 8/9] aio-posix: support userspace polling of fd monitoring
Date: Wed, 11 Mar 2020 12:40:44 +0000
Message-Id: <20200311124045.277969-9-stefanha@redhat.com>
In-Reply-To: <20200311124045.277969-1-stefanha@redhat.com>
References: <20200311124045.277969-1-stefanha@redhat.com>

Unlike ppoll(2) and epoll(7), Linux io_uring completions can be polled
from userspace.  Previously userspace polling was only allowed when all
AioHandlers had an ->io_poll() callback.  This prevented starvation of
fds by userspace-pollable handlers.

Add the FDMonOps->need_wait() callback that enables userspace polling
even when some AioHandlers lack ->io_poll().

For example, it's now possible to do userspace polling when a TCP/IP
socket is monitored thanks to Linux io_uring.

Signed-off-by: Stefan Hajnoczi
Link: https://lore.kernel.org/r/20200305170806.1313245-7-stefanha@redhat.com
Message-Id: <20200305170806.1313245-7-stefanha@redhat.com>
---
 include/block/aio.h   | 19 +++++++++++++++++++
 util/aio-posix.c      | 11 ++++++++---
 util/fdmon-epoll.c    |  1 +
 util/fdmon-io_uring.c |  6 ++++++
 util/fdmon-poll.c     |  1 +
 5 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index 83fc9b844d..f07ebb76b8 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -55,6 +55,9 @@ struct ThreadPool;
 struct LinuxAioState;
 struct LuringState;
 
+/* Is polling disabled? */
+bool aio_poll_disabled(AioContext *ctx);
+
 /* Callbacks for file descriptor monitoring implementations */
 typedef struct {
     /*
@@ -84,6 +87,22 @@ typedef struct {
      * Returns: number of ready file descriptors.
      */
     int (*wait)(AioContext *ctx, AioHandlerList *ready_list, int64_t timeout);
+
+    /*
+     * need_wait:
+     * @ctx: the AioContext
+     *
+     * Tell aio_poll() when to stop userspace polling early because ->wait()
+     * has fds ready.
+     *
+     * File descriptor monitoring implementations that cannot poll fd readiness
+     * from userspace should use aio_poll_disabled() here.  This ensures that
+     * file descriptors are not starved by handlers that frequently make
+     * progress via userspace polling.
+     *
+     * Returns: true if ->wait() should be called, false otherwise.
+     */
+    bool (*need_wait)(AioContext *ctx);
 } FDMonOps;
 
 /*
diff --git a/util/aio-posix.c b/util/aio-posix.c
index ffd9cc381b..759989b45b 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -22,6 +22,11 @@
 #include "trace.h"
 #include "aio-posix.h"
 
+bool aio_poll_disabled(AioContext *ctx)
+{
+    return atomic_read(&ctx->poll_disable_cnt);
+}
+
 void aio_add_ready_handler(AioHandlerList *ready_list,
                            AioHandler *node,
                            int revents)
@@ -423,7 +428,7 @@ static bool run_poll_handlers(AioContext *ctx, int64_t max_ns, int64_t *timeout)
         elapsed_time = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start_time;
         max_ns = qemu_soonest_timeout(*timeout, max_ns);
         assert(!(max_ns && progress));
-    } while (elapsed_time < max_ns && !atomic_read(&ctx->poll_disable_cnt));
+    } while (elapsed_time < max_ns && !ctx->fdmon_ops->need_wait(ctx));
 
     /* If time has passed with no successful polling, adjust *timeout to
      * keep the same ending time.
@@ -451,7 +456,7 @@ static bool try_poll_mode(AioContext *ctx, int64_t *timeout)
 {
     int64_t max_ns = qemu_soonest_timeout(*timeout, ctx->poll_ns);
 
-    if (max_ns && !atomic_read(&ctx->poll_disable_cnt)) {
+    if (max_ns && !ctx->fdmon_ops->need_wait(ctx)) {
         poll_set_started(ctx, true);
 
         if (run_poll_handlers(ctx, max_ns, timeout)) {
@@ -501,7 +506,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
     /* If polling is allowed, non-blocking aio_poll does not need the
      * system call---a single round of run_poll_handlers_once suffices.
      */
-    if (timeout || atomic_read(&ctx->poll_disable_cnt)) {
+    if (timeout || ctx->fdmon_ops->need_wait(ctx)) {
         ret = ctx->fdmon_ops->wait(ctx, &ready_list, timeout);
     }
 
diff --git a/util/fdmon-epoll.c b/util/fdmon-epoll.c
index d56b69468b..fcd989d47d 100644
--- a/util/fdmon-epoll.c
+++ b/util/fdmon-epoll.c
@@ -100,6 +100,7 @@ out:
 static const FDMonOps fdmon_epoll_ops = {
     .update = fdmon_epoll_update,
     .wait = fdmon_epoll_wait,
+    .need_wait = aio_poll_disabled,
 };
 
 static bool fdmon_epoll_try_enable(AioContext *ctx)
diff --git a/util/fdmon-io_uring.c b/util/fdmon-io_uring.c
index fb99b4b61e..893b79b622 100644
--- a/util/fdmon-io_uring.c
+++ b/util/fdmon-io_uring.c
@@ -288,9 +288,15 @@ static int fdmon_io_uring_wait(AioContext *ctx, AioHandlerList *ready_list,
     return process_cq_ring(ctx, ready_list);
 }
 
+static bool fdmon_io_uring_need_wait(AioContext *ctx)
+{
+    return io_uring_cq_ready(&ctx->fdmon_io_uring);
+}
+
 static const FDMonOps fdmon_io_uring_ops = {
     .update = fdmon_io_uring_update,
     .wait = fdmon_io_uring_wait,
+    .need_wait = fdmon_io_uring_need_wait,
 };
 
 bool fdmon_io_uring_setup(AioContext *ctx)
diff --git a/util/fdmon-poll.c b/util/fdmon-poll.c
index 28114a0f39..488067b679 100644
--- a/util/fdmon-poll.c
+++ b/util/fdmon-poll.c
@@ -103,4 +103,5 @@ static void fdmon_poll_update(AioContext *ctx,
 const FDMonOps fdmon_poll_ops = {
     .update = fdmon_poll_update,
     .wait = fdmon_poll_wait,
+    .need_wait = aio_poll_disabled,
 };
-- 
2.24.1
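
Illustration only, not part of the patch: a standalone toy model of how a
per-backend need_wait() callback changes the busy-poll loop.  Every name
here (ToyContext, ToyMonOps, toy_busy_poll, the polls_until_event counter)
is hypothetical and stands in for QEMU's real types; it assumes only what
the patch describes, namely that ppoll/epoll-style backends report "polling
disabled" while an io_uring-style backend reports whether completions are
already visible from userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for AioContext and FDMonOps; not QEMU's types. */
typedef struct ToyContext ToyContext;

typedef struct {
    /* true if the blocking wait path should run instead of busy polling */
    bool (*need_wait)(ToyContext *ctx);
} ToyMonOps;

struct ToyContext {
    const ToyMonOps *ops;
    int poll_disable_cnt;   /* handlers that have no poll callback */
    int cq_ready;           /* completions visible from userspace */
    int polls_until_event;  /* simulation: a completion appears after N polls */
};

/* ppoll/epoll-like backends: any non-pollable handler forbids busy polling. */
bool toy_poll_disabled(ToyContext *ctx)
{
    return ctx->poll_disable_cnt > 0;
}

/* io_uring-like backend: stop polling only once a completion is pending. */
bool toy_ring_need_wait(ToyContext *ctx)
{
    return ctx->cq_ready > 0;
}

const ToyMonOps toy_poll_ops = { .need_wait = toy_poll_disabled };
const ToyMonOps toy_ring_ops = { .need_wait = toy_ring_need_wait };

/*
 * Busy-poll for at most max_iters iterations and return how many were run.
 * Mirrors the shape of the checks in the patch: skip polling entirely if
 * need_wait() is already true, otherwise loop until it becomes true or the
 * iteration budget is exhausted.
 */
int toy_busy_poll(ToyContext *ctx, int max_iters)
{
    int iters = 0;

    assert(ctx->ops != NULL && ctx->ops->need_wait != NULL);

    if (max_iters <= 0 || ctx->ops->need_wait(ctx)) {
        return 0;
    }

    do {
        iters++;
        /* Simulated kernel activity: post a completion after N polls. */
        if (ctx->polls_until_event > 0 && --ctx->polls_until_event == 0) {
            ctx->cq_ready++;
        }
    } while (iters < max_iters && !ctx->ops->need_wait(ctx));

    return iters;
}
```

With poll_disable_cnt set to 1, a context using toy_poll_ops never busy
polls, whereas one using toy_ring_ops keeps polling until the simulated
completion shows up and then falls through to the wait path.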