From nobody Sat May 30 19:21:50 2026
From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Stefan Weil, Hanna Reitz, Peter Maydell, Kevin Wolf,
 Markus Armbruster, Stefan Hajnoczi, qemu-block@nongnu.org,
 "Dr. David Alan Gilbert", Paolo Bonzini, Fam Zheng,
 "Michael S. Tsirkin", Daniel P. Berrangé, Eric Blake, Jaehoon Kim
Subject: [PULL 1/3] aio-poll: avoid unnecessary polling time computation
Date: Wed, 29 Apr 2026 14:21:25 -0400
Message-ID: <20260429182127.219776-2-stefanha@redhat.com>
In-Reply-To: <20260429182127.219776-1-stefanha@redhat.com>
References: <20260429182127.219776-1-stefanha@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Jaehoon Kim

Nodes are no longer added to poll_aio_handlers when adaptive polling is
disabled, preventing unnecessary try_poll_mode() calls. This avoids
iterating over all nodes to compute max_ns unnecessarily when polling is
disabled.

Signed-off-by: Jaehoon Kim
Reviewed-by: Stefan Hajnoczi
Message-ID: <20260423195918.661299-2-jhkim@linux.ibm.com>
Signed-off-by: Stefan Hajnoczi
---
 util/aio-posix.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/util/aio-posix.c b/util/aio-posix.c
index 488d964611..351847c6fb 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -307,9 +307,8 @@ static bool aio_dispatch_handler(AioContext *ctx, AioHandler *node)
      * fdmon_supports_polling(), but only until the fd fires for the first
      * time.
      */
-    if (!QLIST_IS_INSERTED(node, node_deleted) &&
-        !QLIST_IS_INSERTED(node, node_poll) &&
-        node->io_poll) {
+    if (ctx->poll_max_ns && !QLIST_IS_INSERTED(node, node_deleted) &&
+        !QLIST_IS_INSERTED(node, node_poll) && node->io_poll) {
         trace_poll_add(ctx, node, node->pfd.fd, revents);
         if (ctx->poll_started && node->io_poll_begin) {
             node->io_poll_begin(node->opaque);
@@ -631,7 +630,7 @@ static void adjust_polling_time(AioContext *ctx, AioPolledEvent *poll,
 bool aio_poll(AioContext *ctx, bool blocking)
 {
     AioHandlerList ready_list = QLIST_HEAD_INITIALIZER(ready_list);
-    bool progress;
+    bool progress = false;
     bool use_notify_me;
     int64_t timeout;
     int64_t start = 0;
@@ -656,7 +655,9 @@ bool aio_poll(AioContext *ctx, bool blocking)
     }
 
     timeout = blocking ? aio_compute_timeout(ctx) : 0;
-    progress = try_poll_mode(ctx, &ready_list, &timeout);
+    if (ctx->poll_max_ns != 0) {
+        progress = try_poll_mode(ctx, &ready_list, &timeout);
+    }
     assert(!(timeout && progress));
 
     /*
-- 
2.54.0

From nobody Sat May 30 19:21:50 2026
From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Stefan Weil, Hanna Reitz, Peter Maydell, Kevin Wolf,
 Markus Armbruster, Stefan Hajnoczi, qemu-block@nongnu.org,
 "Dr. David Alan Gilbert", Paolo Bonzini, Fam Zheng,
 "Michael S. Tsirkin", Daniel P. Berrangé, Eric Blake, Jaehoon Kim
Subject: [PULL 2/3] aio-poll: refine iothread polling using weighted handler intervals
Date: Wed, 29 Apr 2026 14:21:26 -0400
Message-ID: <20260429182127.219776-3-stefanha@redhat.com>
In-Reply-To: <20260429182127.219776-1-stefanha@redhat.com>
References: <20260429182127.219776-1-stefanha@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Jaehoon Kim

Improve adaptive polling by updating each AioHandler's poll.ns every
loop iteration using weighted averages. This reduces CPU consumption while
minimizing performance impact.

Background:
Starting from QEMU 10.0, poll.ns was introduced per event handler to
mitigate excessive fluctuations in IOThread polling times observed in
earlier versions (QEMU 9.x). However, the current design has limitations:

1. poll.ns is updated only when an event occurs, making it difficult
   to treat block_ns as a reliable event interval.

2. The IOThread's next polling time is determined by the maximum poll.ns
   among all AioHandlers, meaning idle AioHandlers with high poll.ns can
   have an outsized impact on polling duration.

3. For io_uring, idle AioHandlers are cleared after POLL_IDLE_INTERVAL_NS
   (7s), but for ppoll/epoll there is no such mechanism, leading to
   increased CPU consumption from idle nodes.

Implementation:
This patch treats block_ns as an event interval and updates each
AioHandler's poll.ns in every loop iteration:

- Active handlers (with events): poll.ns is updated using a weighted
  average of the current block_ns and previous poll.ns, smoothing out
  adjustments and preventing excessive fluctuations.

- Inactive handlers (no events): poll.ns accumulates block_ns without
  weighting, allowing rapid isolation of idle nodes. When poll.ns exceeds
  poll_max_ns, it resets to 0, preventing sporadically active handlers
  from unnecessarily prolonging iothread polling.

- The iothread polling duration is set based on the largest poll.ns
  among active handlers.

The shrink divider defaults to 2, matching the grow rate, to reduce
frequent poll_ns resets for slow devices.

The implementation renames poll_idle_timeout to last_dispatch_timestamp
for use as an active handler identifier.

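The update rules described above can be sketched in isolation as follows.
This is only an illustrative model under stated assumptions, not the code
from this patch: the sketch_* function names and SKETCH_WEIGHT_SHIFT are
hypothetical stand-ins for the per-handler poll.ns update, the per-context
poll_ns adjustment and POLL_WEIGHT_SHIFT, and the grow and shrink factors
are assumed to be fixed at their default of 2.

```c
#include <stdint.h>

/* Hypothetical stand-in for POLL_WEIGHT_SHIFT: new interval weighs 1/8. */
#define SKETCH_WEIGHT_SHIFT 3

/*
 * Active handler: blend the latest interval (block_ns) into the running
 * estimate. An estimate above poll_max_ns is reset to 0.
 */
static int64_t sketch_update_active(int64_t poll_ns, int64_t block_ns,
                                    int64_t poll_max_ns)
{
    int64_t next = poll_ns
        ? (poll_ns - (poll_ns >> SKETCH_WEIGHT_SHIFT))
          + (block_ns >> SKETCH_WEIGHT_SHIFT)
        : block_ns;

    return next > poll_max_ns ? 0 : next;
}

/*
 * Inactive handler: accumulate block_ns without weighting so that idle
 * handlers age out quickly. 0 means "no estimate until the next event".
 */
static int64_t sketch_update_inactive(int64_t poll_ns, int64_t block_ns,
                                      int64_t poll_max_ns)
{
    if (poll_ns == 0) {
        return 0;
    }
    poll_ns += block_ns;
    return poll_ns > poll_max_ns ? 0 : poll_ns;
}

/*
 * Context-level polling time, driven by the largest estimate among the
 * active handlers (max_active_ns). Assumes grow == shrink == 2.
 */
static int64_t sketch_adjust(int64_t ctx_poll_ns, int64_t max_active_ns,
                             int64_t poll_max_ns)
{
    const int64_t factor = 2;

    if (max_active_ns < ctx_poll_ns) {
        /* Shrink only when the estimate dropped below half. */
        if (max_active_ns < ctx_poll_ns / factor) {
            ctx_poll_ns /= factor;
        }
    } else if (max_active_ns > ctx_poll_ns) {
        /* Jump straight to the estimate if doubling would not reach it. */
        if (max_active_ns > ctx_poll_ns * factor) {
            ctx_poll_ns = max_active_ns;
        } else {
            ctx_poll_ns *= factor;
        }
        if (ctx_poll_ns > poll_max_ns) {
            ctx_poll_ns = poll_max_ns;
        }
    }
    return ctx_poll_ns;
}
```

For example, with a previous estimate of 8000 ns and a new interval of
16000 ns, sketch_update_active() computes 8000 - 1000 + 2000 = 9000 ns, so
a single slow event moves the estimate by only one eighth of the
difference.
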
Testing:
POLL_WEIGHT_SHIFT=3 (12.5% weight) was selected based on testing
comparing baseline vs weight=2/3 across various workloads:

Performance results (RHEL 10.1 + QEMU 10.0.0, FCP/FICON, 1-8 iothreads,
numjobs 1/4/8 averaged):

                    | poll-weight=2      | poll-weight=3
--------------------|--------------------|-------------------
Throughput avg      | -2.4% (all tests)  | -2.2% (all tests)
CPU consumption avg | -10.9% (all tests) | -9.4% (all tests)

Both configurations achieve ~10% CPU reduction with minimal throughput
impact (~2%). Weight=3 is chosen as default for slightly better
throughput while maintaining substantial CPU savings.

Additional validation testing on s390x SSD with fio (bs=8k, iodepth=8,
numjobs=1) shows how poll_weight affects polling time (poll.ns) behavior:

RandRead workload:
+-------------+-----------+-----------+-------------+-------------+
| poll_weight | #samples  | Mean (ns) | 50th % (ns) | 90th % (ns) |
+-------------+-----------+-----------+-------------+-------------+
| 1           | 4.79M     | 8,034     | 5,116       | 20,509      |
| 2           | 5.01M     | 12,584    | 11,078      | 24,693      |
| 3           | 5.01M     | 15,647    | 14,863      | 28,695      |
| 4           | 5.12M     | 16,430    | 15,556      | 30,848      |
| 5           | 5.14M     | 16,461    | 15,306      | 32,123      |
+-------------+-----------+-----------+-------------+-------------+

RandWrite workload:
+-------------+-----------+-----------+-------------+-------------+
| poll_weight | #samples  | Mean (ns) | 50th % (ns) | 90th % (ns) |
+-------------+-----------+-----------+-------------+-------------+
| 1           | 6.37M     | 2,049     | 1,262       | 4,301       |
| 2           | 7.46M     | 4,118     | 3,226       | 7,476       |
| 3           | 7.97M     | 7,034     | 5,984       | 11,645      |
| 4           | 7.96M     | 12,789    | 11,362      | 20,040      |
| 5           | 7.82M     | 22,992    | 20,644      | 32,768      |
+-------------+-----------+-----------+-------------+-------------+

Signed-off-by: Jaehoon Kim
Message-ID: <20260423195918.661299-3-jhkim@linux.ibm.com>
Signed-off-by: Stefan Hajnoczi
---
 include/qemu/aio.h |   3 +-
 util/aio-posix.h   |   2 +-
 util/aio-posix.c   | 128 ++++++++++++++++++++++++++++++---------------
 util/async.c       |   1 +
 4 files changed, 89 insertions(+), 45 deletions(-)

diff --git a/include/qemu/aio.h b/include/qemu/aio.h
index 8cca2360d1..6c22064a28 100644
--- a/include/qemu/aio.h
+++ b/include/qemu/aio.h
@@ -195,7 +195,7 @@ struct BHListSlice {
 typedef QSLIST_HEAD(, AioHandler) AioHandlerSList;
 
 typedef struct AioPolledEvent {
-    int64_t ns;        /* current polling time in nanoseconds */
+    int64_t ns;        /* estimated block time in nanoseconds */
 } AioPolledEvent;
 
 struct AioContext {
@@ -306,6 +306,7 @@ struct AioContext {
     int poll_disable_cnt;
 
     /* Polling mode parameters */
+    int64_t poll_ns;        /* current polling time in nanoseconds */
     int64_t poll_max_ns;    /* maximum polling time in nanoseconds */
     int64_t poll_grow;      /* polling time growth factor */
     int64_t poll_shrink;    /* polling time shrink factor */
diff --git a/util/aio-posix.h b/util/aio-posix.h
index ab894a3c0f..cd459bbbae 100644
--- a/util/aio-posix.h
+++ b/util/aio-posix.h
@@ -38,7 +38,7 @@ struct AioHandler {
     unsigned flags; /* see fdmon-io_uring.c */
     CqeHandler internal_cqe_handler; /* used for POLL_ADD/POLL_REMOVE */
 #endif
-    int64_t poll_idle_timeout; /* when to stop userspace polling */
+    int64_t last_dispatch_timestamp; /* when last handler was dispatched */
     bool poll_ready; /* has polling detected an event? */
     AioPolledEvent poll;
 };
diff --git a/util/aio-posix.c b/util/aio-posix.c
index 351847c6fb..8e9e9e5d8f 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -29,9 +29,11 @@
 
 /* Stop userspace polling on a handler if it isn't active for some time */
 #define POLL_IDLE_INTERVAL_NS (7 * NANOSECONDS_PER_SECOND)
+#define POLL_WEIGHT_SHIFT   (3)
 
-static void adjust_polling_time(AioContext *ctx, AioPolledEvent *poll,
-                                int64_t block_ns);
+static void update_handler_poll_times(AioContext *ctx, int64_t block_ns,
+                                      int64_t dispatch_time);
+static void adjust_polling_time(AioContext *ctx, int64_t block_ns);
 
 bool aio_poll_disabled(AioContext *ctx)
 {
@@ -359,7 +361,7 @@ static bool aio_dispatch_handler(AioContext *ctx, AioHandler *node)
 
 static bool aio_dispatch_ready_handlers(AioContext *ctx,
                                         AioHandlerList *ready_list,
-                                        int64_t block_ns)
+                                        int64_t dispatch_time)
 {
     bool progress = false;
     AioHandler *node;
@@ -369,11 +371,11 @@ static bool aio_dispatch_ready_handlers(AioContext *ctx,
         progress = aio_dispatch_handler(ctx, node) || progress;
 
         /*
-         * Adjust polling time only after aio_dispatch_handler(), which can
-         * add the handler to ctx->poll_aio_handlers.
+         * Update last_dispatch_timestamp to mark this as an active
+         * handler for polling time adjustment and prevent idle removal.
          */
         if (ctx->poll_max_ns && QLIST_IS_INSERTED(node, node_poll)) {
-            adjust_polling_time(ctx, &node->poll, block_ns);
+            node->last_dispatch_timestamp = dispatch_time;
         }
     }
 
@@ -394,7 +396,7 @@ void aio_dispatch(AioContext *ctx)
         ctx->fdmon_ops->dispatch(ctx);
     }
 
-    /* block_ns is 0 because polling is disabled in the glib event loop */
+    /* Set now to 0 as polling is disabled in the glib event loop */
     aio_dispatch_ready_handlers(ctx, &ready_list, 0);
 
     aio_free_deleted_handlers(ctx);
@@ -415,9 +417,6 @@ static bool run_poll_handlers_once(AioContext *ctx,
     QLIST_FOREACH_SAFE(node, &ctx->poll_aio_handlers, node_poll, tmp) {
         if (node->io_poll(node->opaque)) {
             aio_add_poll_ready_handler(ready_list, node);
-
-            node->poll_idle_timeout = now + POLL_IDLE_INTERVAL_NS;
-
             /*
              * Polling was successful, exit try_poll_mode immediately
              * to adjust the next polling time.
@@ -458,11 +457,10 @@ static bool remove_idle_poll_handlers(AioContext *ctx,
     }
 
     QLIST_FOREACH_SAFE(node, &ctx->poll_aio_handlers, node_poll, tmp) {
-        if (node->poll_idle_timeout == 0LL) {
-            node->poll_idle_timeout = now + POLL_IDLE_INTERVAL_NS;
-        } else if (now >= node->poll_idle_timeout) {
+        if (node->poll_ready == false &&
+            now >= node->last_dispatch_timestamp + POLL_IDLE_INTERVAL_NS) {
             trace_poll_remove(ctx, node, node->pfd.fd);
-            node->poll_idle_timeout = 0LL;
+            node->last_dispatch_timestamp = 0LL;
             QLIST_SAFE_REMOVE(node, node_poll);
             if (ctx->poll_started && node->io_poll_end) {
                 node->io_poll_end(node->opaque);
@@ -560,18 +558,13 @@ static bool run_poll_handlers(AioContext *ctx, AioHandlerList *ready_list,
 static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list,
                           int64_t *timeout)
 {
-    AioHandler *node;
     int64_t max_ns;
 
     if (QLIST_EMPTY_RCU(&ctx->poll_aio_handlers)) {
         return false;
     }
 
-    max_ns = 0;
-    QLIST_FOREACH(node, &ctx->poll_aio_handlers, node_poll) {
-        max_ns = MAX(max_ns, node->poll.ns);
-    }
-    max_ns = qemu_soonest_timeout(*timeout, max_ns);
+    max_ns = qemu_soonest_timeout(*timeout, ctx->poll_ns);
 
     if (max_ns && !ctx->fdmon_ops->need_wait(ctx)) {
         /*
@@ -587,43 +580,85 @@ static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list,
     return false;
 }
 
-static void adjust_polling_time(AioContext *ctx, AioPolledEvent *poll,
-                                int64_t block_ns)
+static void adjust_polling_time(AioContext *ctx, int64_t block_ns)
 {
-    if (block_ns <= poll->ns) {
-        /* This is the sweet spot, no adjustment needed */
-    } else if (block_ns > ctx->poll_max_ns) {
-        /* We'd have to poll for too long, poll less */
-        int64_t old = poll->ns;
+    if (block_ns < ctx->poll_ns) {
+        int64_t old = ctx->poll_ns;
+        int64_t shrink = ctx->poll_shrink;
 
-        if (ctx->poll_shrink) {
-            poll->ns /= ctx->poll_shrink;
-        } else {
-            poll->ns = 0;
+        if (shrink == 0) {
+            shrink = 2;
         }
 
-        trace_poll_shrink(ctx, old, poll->ns);
-    } else if (poll->ns < ctx->poll_max_ns &&
-               block_ns < ctx->poll_max_ns) {
+        if (block_ns < (ctx->poll_ns / shrink)) {
+            ctx->poll_ns /= shrink;
+        }
+
+        trace_poll_shrink(ctx, old, ctx->poll_ns);
+    } else if (block_ns > ctx->poll_ns) {
         /* There is room to grow, poll longer */
-        int64_t old = poll->ns;
+        int64_t old = ctx->poll_ns;
         int64_t grow = ctx->poll_grow;
 
         if (grow == 0) {
             grow = 2;
         }
 
-        if (poll->ns) {
-            poll->ns *= grow;
+        if (block_ns > ctx->poll_ns * grow) {
+            ctx->poll_ns = block_ns;
         } else {
-            poll->ns = 4000; /* start polling at 4 microseconds */
+            ctx->poll_ns *= grow;
         }
 
-        if (poll->ns > ctx->poll_max_ns) {
-            poll->ns = ctx->poll_max_ns;
+        if (ctx->poll_ns > ctx->poll_max_ns) {
+            ctx->poll_ns = ctx->poll_max_ns;
         }
 
-        trace_poll_grow(ctx, old, poll->ns);
+        trace_poll_grow(ctx, old, ctx->poll_ns);
+    }
+}
+
+static void update_handler_poll_times(AioContext *ctx, int64_t block_ns,
+                                      int64_t dispatch_time)
+{
+    AioHandler *node;
+    int64_t max_poll_ns = -1;
+
+    QLIST_FOREACH(node, &ctx->poll_aio_handlers, node_poll) {
+        if (node->last_dispatch_timestamp == dispatch_time) {
+            /*
+             * Active handler: had an event in this aio_poll() call.
+             * Update poll.ns using a weighted average of the current
+             * block_ns and previous poll.ns to smooth adjustments.
+             */
+            node->poll.ns = node->poll.ns
+                ? (node->poll.ns - (node->poll.ns >> POLL_WEIGHT_SHIFT))
+                + (block_ns >> POLL_WEIGHT_SHIFT) : block_ns;
+
+            if (node->poll.ns > ctx->poll_max_ns) {
+                node->poll.ns = 0;
+            }
+            /*
+             * Track the maximum poll.ns among active handlers to
+             * calculate the next polling time.
+             */
+            max_poll_ns = MAX(max_poll_ns, node->poll.ns);
+        } else {
+            /*
+             * Inactive handler: no event in this aio_poll() call but
+             * was active before. Increase poll.ns by block_ns. If it
+             * exceeds poll_max_ns, reset to 0 until next event.
+             */
+            if (node->poll.ns != 0) {
+                node->poll.ns += block_ns;
+                if (node->poll.ns > ctx->poll_max_ns) {
+                    node->poll.ns = 0;
+                }
+            }
+        }
+    }
+    if (max_poll_ns >= 0) {
+        adjust_polling_time(ctx, max_poll_ns);
     }
 }
 
@@ -635,6 +670,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
     int64_t timeout;
     int64_t start = 0;
     int64_t block_ns = 0;
+    int64_t dispatch_ns = 0;
 
     /*
      * There cannot be two concurrent aio_poll calls for the same AioContext (or
@@ -711,7 +747,8 @@ bool aio_poll(AioContext *ctx, bool blocking)
 
     /* Calculate blocked time for adaptive polling */
     if (ctx->poll_max_ns) {
-        block_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start;
+        dispatch_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+        block_ns = dispatch_ns - start;
     }
 
     if (ctx->fdmon_ops->dispatch) {
@@ -719,10 +756,14 @@ bool aio_poll(AioContext *ctx, bool blocking)
     }
 
     progress |= aio_bh_poll(ctx);
-    progress |= aio_dispatch_ready_handlers(ctx, &ready_list, block_ns);
+    progress |= aio_dispatch_ready_handlers(ctx, &ready_list, dispatch_ns);
 
     aio_free_deleted_handlers(ctx);
 
+    if (ctx->poll_max_ns) {
+        update_handler_poll_times(ctx, block_ns, dispatch_ns);
+    }
+
     qemu_lockcnt_dec(&ctx->list_lock);
 
     progress |= timerlistgroup_run_timers(&ctx->tlg);
@@ -794,6 +835,7 @@ void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
     ctx->poll_max_ns = max_ns;
     ctx->poll_grow = grow;
     ctx->poll_shrink = shrink;
+    ctx->poll_ns = 0;
 
     aio_notify(ctx);
 }
diff --git a/util/async.c b/util/async.c
index 80d6b01a8a..9d3627566f 100644
--- a/util/async.c
+++ b/util/async.c
@@ -606,6 +606,7 @@ AioContext *aio_context_new(Error **errp)
     timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
 
     ctx->poll_max_ns = 0;
+    ctx->poll_ns = 0;
     ctx->poll_grow = 0;
     ctx->poll_shrink = 0;
 
-- 
2.54.0

From nobody Sat May 30 19:21:50 2026
From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Stefan Weil, Hanna Reitz, Peter Maydell, Kevin Wolf,
 Markus Armbruster, Stefan Hajnoczi, qemu-block@nongnu.org,
 "Dr. David Alan Gilbert", Paolo Bonzini, Fam Zheng,
 "Michael S. Tsirkin", Daniel P. Berrangé, Eric Blake, Jaehoon Kim
Subject: [PULL 3/3] qapi/iothread: introduce poll-weight parameter for aio-poll
Date: Wed, 29 Apr 2026 14:21:27 -0400
Message-ID: <20260429182127.219776-4-stefanha@redhat.com>
In-Reply-To: <20260429182127.219776-1-stefanha@redhat.com>
References: <20260429182127.219776-1-stefanha@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Jaehoon Kim

Introduce a configurable poll-weight parameter for adaptive polling in
IOThread. This parameter replaces the hardcoded POLL_WEIGHT_SHIFT constant,
allowing runtime control over how much the most recent event interval
affects the next polling duration calculation.

The poll-weight parameter uses a shift value where larger values decrease
the weight of the current interval, enabling more gradual adjustments.
When set to 0, a default value of 3 is used (meaning the current interval
contributes approximately 1/8 to the weighted average).

This patch also removes the hardcoded default value checks from
adjust_polling_time(). Instead, poll-grow, poll-shrink, and poll-weight
now use default values initialized in iothread.c during IOThread
creation.

Signed-off-by: Jaehoon Kim
Reviewed-by: Stefan Hajnoczi
Acked-by: Markus Armbruster
Message-ID: <20260423195918.661299-4-jhkim@linux.ibm.com>
Signed-off-by: Stefan Hajnoczi
---
 qapi/misc.json                    |  6 ++++
 qapi/qom.json                     | 10 ++++++-
 include/qemu/aio.h                |  4 ++-
 include/system/iothread.h         | 18 ++++++++++++
 iothread.c                        | 47 ++++++++++++++++++++++---------
 monitor/hmp-cmds.c                |  1 +
 tests/unit/test-nested-aio-poll.c |  2 +-
 util/aio-posix.c                  | 37 +++++++++---------------
 util/aio-win32.c                  |  3 +-
 util/async.c                      |  1 +
 qemu-options.hx                   |  8 +++++-
 11 files changed, 95 insertions(+), 42 deletions(-)

diff --git a/qapi/misc.json b/qapi/misc.json
index 28c641fe2f..22b7afed9f 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -85,6 +85,11 @@
 # @poll-shrink: how many ns will be removed from polling time, 0 means
 #     that it's not configured (since 2.9)
 #
+# @poll-weight: the weight factor for adaptive polling.
+#     Determines how much the current event interval contributes to
+#     the next polling time calculation. Valid values are 1 or
+#     greater (since 11.1)
+#
 # @aio-max-batch: maximum number of requests in a batch for the AIO
 #     engine, 0 means that the engine will use its default (since 6.1)
 #
@@ -96,6 +101,7 @@
            'poll-max-ns': 'int',
            'poll-grow': 'int',
            'poll-shrink': 'int',
+           'poll-weight': 'int',
            'aio-max-batch': 'int' } }
 
 ##
diff --git a/qapi/qom.json b/qapi/qom.json
index c653248f85..dd45ac1087 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -606,6 +606,13 @@
 #     algorithm detects it is spending too long polling without
 #     encountering events. 0 selects a default behaviour (default: 0)
 #
+# @poll-weight: the weight factor for adaptive polling. Determines
+#     how much the most recent event interval affects the next
+#     polling duration calculation. If set to 0, the system default
+#     value of 3 is used. Typical values: 1 (high weight on recent
+#     interval), 2-4 (moderate weight on recent interval).
+#     (default: 0) (since 11.1)
+#
 # The @aio-max-batch option is available since 6.1.
 #
 # Since: 2.0
@@ -614,7 +621,8 @@
   'base': 'EventLoopBaseProperties',
   'data': { '*poll-max-ns': 'int',
             '*poll-grow': 'int',
-            '*poll-shrink': 'int' } }
+            '*poll-shrink': 'int',
+            '*poll-weight': 'int' } }
 
 ##
 # @MainLoopProperties:
diff --git a/include/qemu/aio.h b/include/qemu/aio.h
index 6c22064a28..e65e90093a 100644
--- a/include/qemu/aio.h
+++ b/include/qemu/aio.h
@@ -310,6 +310,7 @@ struct AioContext {
     int64_t poll_max_ns;    /* maximum polling time in nanoseconds */
     int64_t poll_grow;      /* polling time growth factor */
     int64_t poll_shrink;    /* polling time shrink factor */
+    int64_t poll_weight;    /* weight of current interval in calculation */
 
     /* AIO engine parameters */
     int64_t aio_max_batch;  /* maximum number of requests in a batch */
@@ -791,12 +792,13 @@ void aio_context_destroy(AioContext *ctx);
  * @max_ns: how long to busy poll for, in nanoseconds
  * @grow: polling time growth factor
  * @shrink: polling time shrink factor
+ * @weight: weight factor applied to the current polling interval
  *
  * Poll mode can be disabled by setting poll_max_ns to 0.
  */
 void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
                                  int64_t grow, int64_t shrink,
-                                 Error **errp);
+                                 int64_t weight, Error **errp);
 
 /**
  * aio_context_set_aio_params:
diff --git a/include/system/iothread.h b/include/system/iothread.h
index e26d13c6c7..a1ef7696cb 100644
--- a/include/system/iothread.h
+++ b/include/system/iothread.h
@@ -21,6 +21,23 @@
 
 #define TYPE_IOTHREAD "iothread"
 
+#ifdef CONFIG_POSIX
+/*
+ * Benchmark results from 2016 on NVMe SSD drives show max polling times around
+ * 16-32 microseconds yield IOPS improvements for both iodepth=1 and iodepth=32
+ * workloads.
+ */
+#define IOTHREAD_POLL_MAX_NS_DEFAULT 32768ULL
+#define IOTHREAD_POLL_GROW_DEFAULT 2ULL
+#define IOTHREAD_POLL_SHRINK_DEFAULT 2ULL
+#define IOTHREAD_POLL_WEIGHT_DEFAULT 3ULL
+#else
+#define IOTHREAD_POLL_MAX_NS_DEFAULT 0ULL
+#define IOTHREAD_POLL_GROW_DEFAULT 0ULL
+#define IOTHREAD_POLL_SHRINK_DEFAULT 0ULL
+#define IOTHREAD_POLL_WEIGHT_DEFAULT 0ULL
+#endif
+
 struct IOThread {
     EventLoopBase parent_obj;
 
@@ -38,6 +55,7 @@ struct IOThread {
     int64_t poll_max_ns;
     int64_t poll_grow;
     int64_t poll_shrink;
+    int64_t poll_weight;
 };
 typedef struct IOThread IOThread;
 
diff --git a/iothread.c b/iothread.c
index caf68e0764..3558535b40 100644
--- a/iothread.c
+++ b/iothread.c
@@ -25,17 +25,6 @@
 #include "qemu/rcu.h"
 #include "qemu/main-loop.h"
 
-
-#ifdef CONFIG_POSIX
-/* Benchmark results from 2016 on NVMe SSD drives show max polling times around
- * 16-32 microseconds yield IOPS improvements for both iodepth=1 and iodepth=32
- * workloads.
- */
-#define IOTHREAD_POLL_MAX_NS_DEFAULT 32768ULL
-#else
-#define IOTHREAD_POLL_MAX_NS_DEFAULT 0ULL
-#endif
-
 static void *iothread_run(void *opaque)
 {
     IOThread *iothread = opaque;
@@ -103,6 +92,10 @@ static void iothread_instance_init(Object *obj)
     IOThread *iothread = IOTHREAD(obj);
 
     iothread->poll_max_ns = IOTHREAD_POLL_MAX_NS_DEFAULT;
+    iothread->poll_grow = IOTHREAD_POLL_GROW_DEFAULT;
+    iothread->poll_shrink = IOTHREAD_POLL_SHRINK_DEFAULT;
+    iothread->poll_weight = IOTHREAD_POLL_WEIGHT_DEFAULT;
+
     iothread->thread_id = -1;
     qemu_sem_init(&iothread->init_done_sem, 0);
     /* By default, we don't run gcontext */
@@ -164,6 +157,7 @@ static void iothread_set_aio_context_params(EventLoopBase *base, Error **errp)
                                 iothread->poll_max_ns,
                                 iothread->poll_grow,
                                 iothread->poll_shrink,
+                                iothread->poll_weight,
                                 errp);
     if (*errp) {
         return;
@@ -233,6 +227,9 @@ static IOThreadParamInfo poll_grow_info = {
 static IOThreadParamInfo poll_shrink_info = {
     "poll-shrink", offsetof(IOThread, poll_shrink),
 };
+static IOThreadParamInfo poll_weight_info = {
+    "poll-weight", offsetof(IOThread, poll_weight),
+};
 
 static void iothread_get_param(Object *obj, Visitor *v,
         const char *name, IOThreadParamInfo *info, Error **errp)
@@ -254,13 +251,31 @@ static bool iothread_set_param(Object *obj, Visitor *v,
         return false;
     }
 
-    if (value < 0) {
+    if (info->offset == offsetof(IOThread, poll_weight)) {
+        if (value < 0 || value > 63) {
+            error_setg(errp, "%s value must be in range [0, 63]",
+                       info->name);
+            return false;
+        }
+    } else if (value < 0) {
         error_setg(errp, "%s value must be in range [0, %" PRId64 "]",
                    info->name, INT64_MAX);
         return false;
     }
 
-    *field = value;
+    if (value == 0) {
+        if (info->offset == offsetof(IOThread, poll_grow)) {
+            *field = IOTHREAD_POLL_GROW_DEFAULT;
+        } else if (info->offset == offsetof(IOThread, poll_shrink)) {
+            *field = IOTHREAD_POLL_SHRINK_DEFAULT;
+        } else if (info->offset == offsetof(IOThread, poll_weight)) {
+            *field = IOTHREAD_POLL_WEIGHT_DEFAULT;
+        } else {
+            *field = value;
+        }
+    } else {
+        *field = value;
+    }
 
     return true;
 }
@@ -288,6 +303,7 @@ static void iothread_set_poll_param(Object *obj, Visitor *v,
                                     iothread->poll_max_ns,
                                     iothread->poll_grow,
                                     iothread->poll_shrink,
+                                    iothread->poll_weight,
                                     errp);
     }
 }
@@ -311,6 +327,10 @@ static void iothread_class_init(ObjectClass *klass, const void *class_data)
                               iothread_get_poll_param,
                               iothread_set_poll_param,
                               NULL, &poll_shrink_info);
+    object_class_property_add(klass, "poll-weight", "int",
+                              iothread_get_poll_param,
+                              iothread_set_poll_param,
+                              NULL, &poll_weight_info);
 }
 
 static const TypeInfo iothread_info = {
@@ -356,6 +376,7 @@ static int query_one_iothread(Object *object, void *opaque)
     info->poll_max_ns = iothread->poll_max_ns;
     info->poll_grow = iothread->poll_grow;
     info->poll_shrink = iothread->poll_shrink;
+    info->poll_weight = iothread->poll_weight;
     info->aio_max_batch = iothread->parent_obj.aio_max_batch;
 
     QAPI_LIST_APPEND(*tail, info);
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index bc26b39d70..afa7b709a6 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -206,6 +206,7 @@ void hmp_info_iothreads(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "  poll-max-ns=%" PRId64 "\n", value->poll_max_ns);
         monitor_printf(mon, "  poll-grow=%" PRId64 "\n", value->poll_grow);
         monitor_printf(mon, "  poll-shrink=%" PRId64 "\n", value->poll_shrink);
+        monitor_printf(mon, "  poll-weight=%" PRId64 "\n", value->poll_weight);
         monitor_printf(mon, "  aio-max-batch=%" PRId64 "\n",
                        value->aio_max_batch);
     }
diff --git a/tests/unit/test-nested-aio-poll.c b/tests/unit/test-nested-aio-poll.c
index 9ab1ad08a7..4c38f36fd4 100644
--- a/tests/unit/test-nested-aio-poll.c
+++ b/tests/unit/test-nested-aio-poll.c
@@ -81,7 +81,7 @@ static void test(void)
     qemu_set_current_aio_context(td.ctx);
 
     /* Enable polling */
-    aio_context_set_poll_params(td.ctx, 1000000, 2, 2, &error_abort);
+    aio_context_set_poll_params(td.ctx, 1000000, 2, 2, 3, &error_abort);
 
     /* Make the event notifier active (set) right away */
     event_notifier_init(&td.poll_notifier, 1);
diff --git a/util/aio-posix.c b/util/aio-posix.c
index 8e9e9e5d8f..df1c213ce5 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -29,7 +29,6 @@
 
 /* Stop userspace polling on a handler if it isn't active for some time */
 #define POLL_IDLE_INTERVAL_NS (7 * NANOSECONDS_PER_SECOND)
-#define POLL_WEIGHT_SHIFT (3)
 
 static void update_handler_poll_times(AioContext *ctx, int64_t block_ns,
                                       int64_t dispatch_time);
@@ -582,28 +581,11 @@ static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list,
 
 static void adjust_polling_time(AioContext *ctx, int64_t block_ns)
 {
-    if (block_ns < ctx->poll_ns) {
-        int64_t old = ctx->poll_ns;
-        int64_t shrink = ctx->poll_shrink;
-
-        if (shrink == 0) {
-            shrink = 2;
-        }
-
-        if (block_ns < (ctx->poll_ns / shrink)) {
-            ctx->poll_ns /= shrink;
-        }
-
-        trace_poll_shrink(ctx, old, ctx->poll_ns);
-    } else if (block_ns > ctx->poll_ns) {
+    if (block_ns > ctx->poll_ns) {
         /* There is room to grow, poll longer */
         int64_t old = ctx->poll_ns;
         int64_t grow = ctx->poll_grow;
 
-        if (grow == 0) {
-            grow = 2;
-        }
-
         if (block_ns > ctx->poll_ns * grow) {
             ctx->poll_ns = block_ns;
         } else {
@@ -615,6 +597,11 @@ static void adjust_polling_time(AioContext *ctx, int64_t block_ns)
         }
 
         trace_poll_grow(ctx, old, ctx->poll_ns);
+    } else if (block_ns < (ctx->poll_ns / ctx->poll_shrink)) {
+        int64_t old = ctx->poll_ns;
+        ctx->poll_ns /= ctx->poll_shrink;
+
+        trace_poll_shrink(ctx, old, ctx->poll_ns);
     }
 }
 
@@ -632,8 +619,8 @@ static void update_handler_poll_times(AioContext *ctx, int64_t block_ns,
              * block_ns and previous poll.ns to smooth adjustments.
              */
             node->poll.ns = node->poll.ns
-                ? (node->poll.ns - (node->poll.ns >> POLL_WEIGHT_SHIFT))
-                + (block_ns >> POLL_WEIGHT_SHIFT) : block_ns;
+                ? (node->poll.ns - (node->poll.ns >> ctx->poll_weight))
+                + (block_ns >> ctx->poll_weight) : block_ns;
 
             if (node->poll.ns > ctx->poll_max_ns) {
                 node->poll.ns = 0;
@@ -819,7 +806,8 @@ void aio_context_destroy(AioContext *ctx)
 }
 
 void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
-                                 int64_t grow, int64_t shrink, Error **errp)
+                                 int64_t grow, int64_t shrink,
+                                 int64_t weight, Error **errp)
 {
     AioHandler *node;
 
@@ -833,8 +821,9 @@ void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
      * is used once.
      */
     ctx->poll_max_ns = max_ns;
-    ctx->poll_grow = grow;
-    ctx->poll_shrink = shrink;
+    ctx->poll_grow = (grow ? grow : IOTHREAD_POLL_GROW_DEFAULT);
+    ctx->poll_shrink = (shrink ? shrink : IOTHREAD_POLL_SHRINK_DEFAULT);
+    ctx->poll_weight = (weight ? weight : IOTHREAD_POLL_WEIGHT_DEFAULT);
     ctx->poll_ns = 0;
 
     aio_notify(ctx);
diff --git a/util/aio-win32.c b/util/aio-win32.c
index 6e6f699e4b..1985843233 100644
--- a/util/aio-win32.c
+++ b/util/aio-win32.c
@@ -429,7 +429,8 @@ void aio_context_destroy(AioContext *ctx)
 }
 
 void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
-                                 int64_t grow, int64_t shrink, Error **errp)
+                                 int64_t grow, int64_t shrink,
+                                 int64_t weight, Error **errp)
 {
     if (max_ns) {
         error_setg(errp, "AioContext polling is not implemented on Windows");
diff --git a/util/async.c b/util/async.c
index 9d3627566f..741fcfd6a7 100644
--- a/util/async.c
+++ b/util/async.c
@@ -609,6 +609,7 @@ AioContext *aio_context_new(Error **errp)
     ctx->poll_ns = 0;
     ctx->poll_grow = 0;
     ctx->poll_shrink = 0;
+    ctx->poll_weight = 0;
 
     ctx->aio_max_batch = 0;
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 21972f8326..29c09415c1 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -6443,7 +6443,7 @@ SRST
 
             CN=laptop.example.com,O=Example Home,L=London,ST=London,C=GB
 
-    ``-object iothread,id=id,poll-max-ns=poll-max-ns,poll-grow=poll-grow,poll-shrink=poll-shrink,aio-max-batch=aio-max-batch``
+    ``-object iothread,id=id,poll-max-ns=poll-max-ns,poll-grow=poll-grow,poll-shrink=poll-shrink,poll-weight=poll-weight,aio-max-batch=aio-max-batch``
         Creates a dedicated event loop thread that devices can be
         assigned to. This is known as an IOThread. By default device
         emulation happens in vCPU threads or the main event loop thread.
@@ -6479,6 +6479,12 @@ SRST
         the polling time when the algorithm detects it is spending too
         long polling without encountering events.
 
+        The ``poll-weight`` parameter is the weight factor for adaptive
+        polling. It determines how much the most recent event interval
+        affects the next polling duration calculation. If set to 0, the
+        system default value of 3 is used. Typical values: 1 (high weight
+        on recent interval), 2-4 (moderate weight on recent interval).
+
         The ``aio-max-batch`` parameter is the maximum number of requests
         in a batch for the AIO engine, 0 means that the engine will use
         its default.
-- 
2.54.0