From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Paolo Bonzini, Fam Zheng, qemu-block@nongnu.org, Stefan Hajnoczi, Peter Maydell, Jens Axboe, Kevin Wolf
Subject: [PULL 1/2] aio-posix: notify main loop when SQEs are queued
Date: Mon, 23 Feb 2026 09:20:30 -0500
Message-ID: <20260223142031.1397832-2-stefanha@redhat.com>
In-Reply-To: <20260223142031.1397832-1-stefanha@redhat.com>
References: <20260223142031.1397832-1-stefanha@redhat.com>

From: Jens Axboe

When a vCPU thread handles MMIO (holding BQL), aio_co_enter() runs the
block I/O coroutine inline on the vCPU thread because
qemu_get_current_aio_context() returns the main AioContext when BQL is
held. The coroutine calls luring_co_submit() which queues an SQE via
fdmon_io_uring_add_sqe(), but the actual io_uring_submit() only happens
in gsource_prepare() on the main loop thread.
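
A miniature of the sleeping-loop situation may help here. The sketch
below is not QEMU code: toy_notify() and toy_wait() are hypothetical
names, and it assumes only Linux eventfd(2) and poll(2). It shows that
a poll() call with a timeout returns early only when something has made
the watched descriptor readable; work queued without such a write goes
unnoticed until the timeout expires.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical analogue of a notify call: make the eventfd readable. */
static void toy_notify(int efd)
{
    uint64_t one = 1;
    ssize_t n = write(efd, &one, sizeof(one));
    assert(n == (ssize_t)sizeof(one));
    (void)n;
}

/*
 * Sleep the way an event loop does. Returns 1 if the eventfd became
 * readable (a notification arrived), 0 if the timeout expired first.
 */
static int toy_wait(int efd, int timeout_ms)
{
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret > 0 && (pfd.revents & POLLIN)) {
        uint64_t count;
        /* Reading resets the eventfd counter so the next wait blocks again. */
        ssize_t n = read(efd, &count, sizeof(count));
        (void)n;
        return 1;
    }
    return 0;
}
```

With a descriptor from eventfd(0, EFD_NONBLOCK), toy_wait() times out
when nothing has been written and returns right away once toy_notify()
has been called, whatever timeout was requested.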

Since the coroutine ran inline (not via aio_co_schedule()), no BH is
scheduled and aio_notify() is never called. The main loop remains
asleep in ppoll() with up to a 499ms timeout, leaving the SQE
unsubmitted until the next timer fires.

Fix this by calling aio_notify() after queuing the SQE. This wakes the
main loop via the eventfd so it can run gsource_prepare() and submit
the pending SQE promptly.

This is a generic fix that benefits all devices using aio=io_uring.
Without it, AHCI/SATA devices see MUCH worse I/O latency since they
use MMIO (not ioeventfd like virtio) and have no other mechanism to
wake the main loop after queuing block I/O.

This is usually a bit hard to detect, as it also relies on the ppoll
loop not waking up for other activity, and micro benchmarks tend not to
see it because they don't have any real processing time. With a
synthetic test case that has a few usleep() to simulate processing of
read data, it's very noticeable. The below example reads 128MB with
O_DIRECT in 128KB chunks in batches of 16, and has a 1ms delay before
each batch submit, and a 1ms delay after processing each completion.
Running it on /dev/sda yields:

time sudo ./iotest /dev/sda

________________________________________________________
Executed in   25.76 secs      fish           external
   usr time    6.19 millis  783.00 micros    5.41 millis
   sys time   12.43 millis  642.00 micros   11.79 millis

while on a virtio-blk or NVMe device we get:

time sudo ./iotest /dev/vdb

________________________________________________________
Executed in    1.25 secs      fish           external
   usr time    1.40 millis    0.30 millis    1.10 millis
   sys time   17.61 millis    1.43 millis   16.18 millis

time sudo ./iotest /dev/nvme0n1

________________________________________________________
Executed in    1.26 secs      fish           external
   usr time    6.11 millis    0.52 millis    5.59 millis
   sys time   13.94 millis    1.50 millis   12.43 millis

where the latter are consistent.
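
As a rough cross-check of these numbers, the test parameters can be
turned into a back-of-the-envelope model. This is only a sketch: the
struct and function names are hypothetical, and it assumes that every
batch incurs exactly one main-loop stall, which the measurements above
do not prove. With 128MB in 128KB chunks there are 1024 reads in 64
batches, so the usleep() calls alone account for about 1.09 s, close to
the virtio-blk and NVMe results. Attributing the remaining gap on
/dev/sda evenly to the 64 batches gives roughly 383 ms per batch, which
is below the 499ms ppoll() timeout.

```c
#include <assert.h>

/* Hypothetical description of the synthetic test's parameters. */
struct iotest_model {
    long total_kb;  /* total data read, in KB */
    long chunk_kb;  /* size of one read, in KB */
    long batch;     /* reads submitted per batch */
    long delay_us;  /* usleep() before each batch and after each completion */
};

/* Number of individual reads. */
static long model_chunks(const struct iotest_model *m)
{
    assert(m->chunk_kb > 0);
    return m->total_kb / m->chunk_kb;
}

/* Number of batches, rounding up a final partial batch. */
static long model_batches(const struct iotest_model *m)
{
    assert(m->batch > 0);
    return (model_chunks(m) + m->batch - 1) / m->batch;
}

/* Time spent in usleep() alone: one delay per batch plus one per completion. */
static long model_sleep_us(const struct iotest_model *m)
{
    return (model_batches(m) + model_chunks(m)) * m->delay_us;
}

/*
 * Average extra stall per batch, in ms, implied by a slow and a fast
 * wall-clock time in seconds. Assumes one stall per batch.
 */
static double model_stall_ms_per_batch(const struct iotest_model *m,
                                       double slow_s, double fast_s)
{
    return (slow_s - fast_s) * 1000.0 / (double)model_batches(m);
}
```

Feeding in { 128 * 1024, 128, 16, 1000 } together with the 25.76 s and
1.25 s wall-clock times reproduces the figures quoted in the lead-in.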
If we run the same test but keep the socket for the ssh connection
active by having activity there, then the sda test looks as follows:

time sudo ./iotest /dev/sda

________________________________________________________
Executed in    1.23 secs      fish           external
   usr time    2.70 millis   39.00 micros    2.66 millis
   sys time    4.97 millis  977.00 micros    3.99 millis

as now the ppoll loop is woken all the time anyway.

After this fix, on an idle system:

time sudo ./iotest /dev/sda

________________________________________________________
Executed in    1.30 secs      fish           external
   usr time    2.14 millis    0.14 millis    2.00 millis
   sys time   16.93 millis    1.16 millis   15.76 millis

Signed-off-by: Jens Axboe
Message-Id: <07d701b9-3039-4f9b-99a2-abeae51146a5@kernel.dk>
Reviewed-by: Kevin Wolf
[Generalize the comment since this applies to all vCPU thread activity,
not just coroutines, as suggested by Kevin Wolf.
--Stefan]
Signed-off-by: Stefan Hajnoczi
---
 util/aio-posix.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/util/aio-posix.c b/util/aio-posix.c
index e24b955fd9..488d964611 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -23,6 +23,7 @@
 #include "qemu/rcu_queue.h"
 #include "qemu/sockets.h"
 #include "qemu/cutils.h"
+#include "system/iothread.h"
 #include "trace.h"
 #include "aio-posix.h"
 
@@ -813,5 +814,13 @@ void aio_add_sqe(void (*prep_sqe)(struct io_uring_sqe *sqe, void *opaque),
 {
     AioContext *ctx = qemu_get_current_aio_context();
     ctx->fdmon_ops->add_sqe(ctx, prep_sqe, opaque, cqe_handler);
+
+    /*
+     * Wake the main loop if it is sleeping in ppoll(). When a vCPU thread
+     * queues SQEs, the actual io_uring_submit() only happens in
+     * gsource_prepare() in the main loop thread. Without this notify, the
+     * main loop thread's ppoll() can sleep up to 499ms before submitting.
+     */
+    aio_notify(ctx);
 }
 #endif /* CONFIG_LINUX_IO_URING */
-- 
2.53.0

From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Paolo Bonzini, Fam Zheng, qemu-block@nongnu.org, Stefan Hajnoczi, Peter Maydell, Jens Axboe
Subject: [PULL 2/2] fdmon-io_uring: check CQ ring directly in gsource_check
Date: Mon, 23 Feb 2026 09:20:31 -0500
Message-ID: <20260223142031.1397832-3-stefanha@redhat.com>
In-Reply-To: <20260223142031.1397832-1-stefanha@redhat.com>
References: <20260223142031.1397832-1-stefanha@redhat.com>

From: Jens Axboe

gsource_check() only looks at the ppoll revents for the io_uring fd,
but CQEs can be posted during gsource_prepare()'s io_uring_submit()
call via kernel task_work processing on syscall exit. These completions
are already sitting in the CQ ring but the ring fd may not be signaled
yet, causing gsource_check() to return false.

Add a fallback io_uring_cq_ready() check so completions that arrive
during submission are dispatched immediately rather than waiting for
the next ppoll() cycle.

Signed-off-by: Jens Axboe
Message-ID: <20260213143225.161043-3-axboe@kernel.dk>
Signed-off-by: Stefan Hajnoczi
---
 util/fdmon-io_uring.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/util/fdmon-io_uring.c b/util/fdmon-io_uring.c
index d0b56127c6..b81e412402 100644
--- a/util/fdmon-io_uring.c
+++ b/util/fdmon-io_uring.c
@@ -344,7 +344,19 @@ static void fdmon_io_uring_gsource_prepare(AioContext *ctx)
 static bool fdmon_io_uring_gsource_check(AioContext *ctx)
 {
     gpointer tag = ctx->io_uring_fd_tag;
-    return g_source_query_unix_fd(&ctx->source, tag) & G_IO_IN;
+
+    /* Check ppoll revents (normal path) */
+    if (g_source_query_unix_fd(&ctx->source, tag) & G_IO_IN) {
+        return true;
+    }
+
+    /*
+     * Also check for CQEs that may have been posted during prepare's
+     * io_uring_submit() via task_work on syscall exit. Without this,
+     * the main loop can miss completions and sleep in ppoll() until the
+     * next timer fires.
+     */
+    return io_uring_cq_ready(&ctx->fdmon_io_uring);
 }
 
 /* Dispatch CQE handlers that are ready */
-- 
2.53.0
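
The shape of the check in this patch can be tried out in isolation with
a toy model. The sketch below does not use liburing or GLib: struct
toy_cq, toy_cq_ready() and toy_check() are hypothetical stand-ins, with
a pair of free-running indices playing the role of the CQ ring and a
plain bool playing the role of the ppoll revents. It only illustrates
the "fd readable, or else entries already in the ring" decision.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a completion queue. The producer advances
 * tail when it posts an entry; the consumer advances head when it has
 * handled one. Both indices run freely and wrap around.
 */
struct toy_cq {
    uint32_t head;
    uint32_t tail;
};

/* Number of posted but not yet consumed entries (safe across wraparound). */
static unsigned toy_cq_ready(const struct toy_cq *cq)
{
    return (uint32_t)(cq->tail - cq->head);
}

/*
 * Decide whether to dispatch: the normal path is the fd being reported
 * readable; the fallback is entries that are already in the queue even
 * though the fd was not reported readable.
 */
static bool toy_check(bool fd_readable, const struct toy_cq *cq)
{
    if (fd_readable) {
        return true;
    }
    return toy_cq_ready(cq) != 0;
}
```

With an empty queue and an unsignaled fd, toy_check() returns false; as
soon as tail moves ahead of head it returns true even if fd_readable is
still false, which is the case the fallback is meant to catch.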