From nobody Fri Apr 3 16:03:50 2026
From: Hanna Czenczek
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, Hanna Czenczek, Kevin Wolf, Julia Suvorova,
 Aarushi Mehta, Stefan Hajnoczi, Stefano Garzarella
Subject: [PATCH for-11.0 v2 1/3] linux-aio: Put all parameters into qemu_laiocb
Date: Tue, 24 Mar 2026 09:43:34 +0100
Message-ID: <20260324084338.37453-2-hreitz@redhat.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260324084338.37453-1-hreitz@redhat.com>
References: <20260324084338.37453-1-hreitz@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Put all request parameters into the qemu_laiocb struct, which will allow
re-submitting the tail of short reads/writes.

Reviewed-by: Kevin Wolf
Signed-off-by: Hanna Czenczek
---
 block/linux-aio.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index 53c3e9af8a..3843f45eac 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -41,9 +41,15 @@ struct qemu_laiocb {
     LinuxAioState *ctx;
     struct iocb iocb;
     ssize_t ret;
+    off_t offset;
     size_t nbytes;
     QEMUIOVector *qiov;
-    bool is_read;
+
+    int fd;
+    int type;
+    BdrvRequestFlags flags;
+
+    uint64_t dev_max_batch;
     QSIMPLEQ_ENTRY(qemu_laiocb) next;
 };
 
@@ -87,7 +93,7 @@ static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
             ret = 0;
         } else if (ret >= 0) {
             /* Short reads mean EOF, pad with zeros. */
-            if (laiocb->is_read) {
+            if (laiocb->type == QEMU_AIO_READ) {
                 qemu_iovec_memset(laiocb->qiov, ret, 0,
                     laiocb->qiov->size - ret);
             } else {
@@ -367,23 +373,23 @@ static void laio_deferred_fn(void *opaque)
     }
 }
 
-static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
-                          int type, BdrvRequestFlags flags,
-                          uint64_t dev_max_batch)
+static int laio_do_submit(struct qemu_laiocb *laiocb)
 {
     LinuxAioState *s = laiocb->ctx;
     struct iocb *iocbs = &laiocb->iocb;
     QEMUIOVector *qiov = laiocb->qiov;
+    int fd = laiocb->fd;
+    off_t offset = laiocb->offset;
 
-    switch (type) {
+    switch (laiocb->type) {
     case QEMU_AIO_WRITE:
 #ifdef HAVE_IO_PREP_PWRITEV2
     {
-        int laio_flags = (flags & BDRV_REQ_FUA) ? RWF_DSYNC : 0;
+        int laio_flags = (laiocb->flags & BDRV_REQ_FUA) ? RWF_DSYNC : 0;
         io_prep_pwritev2(iocbs, fd, qiov->iov, qiov->niov, offset, laio_flags);
     }
 #else
-        assert(flags == 0);
+        assert(laiocb->flags == 0);
         io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset);
 #endif
         break;
@@ -399,7 +405,7 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
     /* Currently Linux kernel does not support other operations */
     default:
         fprintf(stderr, "%s: invalid AIO request type 0x%x.\n",
-                        __func__, type);
+                        __func__, laiocb->type);
         return -EIO;
     }
     io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
@@ -407,7 +413,7 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
     QSIMPLEQ_INSERT_TAIL(&s->io_q.pending, laiocb, next);
     s->io_q.in_queue++;
     if (!s->io_q.blocked) {
-        if (s->io_q.in_queue >= laio_max_batch(s, dev_max_batch)) {
+        if (s->io_q.in_queue >= laio_max_batch(s, laiocb->dev_max_batch)) {
             ioq_submit(s);
         } else {
             defer_call(laio_deferred_fn, s);
@@ -425,14 +431,18 @@ int coroutine_fn laio_co_submit(int fd, uint64_t offset, QEMUIOVector *qiov,
     AioContext *ctx = qemu_get_current_aio_context();
     struct qemu_laiocb laiocb = {
         .co         = qemu_coroutine_self(),
+        .offset     = offset,
         .nbytes     = qiov ? qiov->size : 0,
         .ctx        = aio_get_linux_aio(ctx),
         .ret        = -EINPROGRESS,
-        .is_read    = (type == QEMU_AIO_READ),
         .qiov       = qiov,
+        .fd         = fd,
+        .type       = type,
+        .flags      = flags,
+        .dev_max_batch = dev_max_batch,
     };
 
-    ret = laio_do_submit(fd, &laiocb, offset, type, flags, dev_max_batch);
+    ret = laio_do_submit(&laiocb);
     if (ret < 0) {
         return ret;
     }
-- 
2.53.0

From nobody Fri Apr 3 16:03:50 2026
From: Hanna Czenczek
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, Hanna Czenczek, Kevin Wolf, Julia Suvorova,
 Aarushi Mehta, Stefan Hajnoczi, Stefano Garzarella
Subject: [PATCH for-11.0 v2 2/3] linux-aio: Resubmit tails of short reads/writes
Date: Tue, 24 Mar 2026 09:43:35 +0100
Message-ID: <20260324084338.37453-3-hreitz@redhat.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260324084338.37453-1-hreitz@redhat.com>
References: <20260324084338.37453-1-hreitz@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Short reads/writes can happen.
One way to reproduce them is via our FUSE export, with the following
diff applied (%s/escaped // to apply -- if you put plain diffs in commit
messages, git-am will apply them, and I would rather avoid breaking FUSE
accidentally via this patch):

escaped diff --git a/block/export/fuse.c b/block/export/fuse.c
escaped index a2a478d293..67dc50a412 100644
escaped --- a/block/export/fuse.c
escaped +++ b/block/export/fuse.c
@@ -828,7 +828,7 @@ static ssize_t coroutine_fn GRAPH_RDLOCK
 fuse_co_init(FuseExport *exp, struct fuse_init_out *out,
              const struct fuse_init_in_compat *in)
 {
-    const uint32_t supported_flags = FUSE_ASYNC_READ | FUSE_ASYNC_DIO;
+    const uint32_t supported_flags = FUSE_ASYNC_READ;

     if (in->major != 7) {
         error_report("FUSE major version mismatch: We have 7, but kernel has %"
@@ -1060,6 +1060,8 @@ fuse_co_read(FuseExport *exp, void **bufptr, uint64_t offset, uint32_t size)
     void *buf;
     int ret;

+    size = MIN(size, 4096);
+
     /* Limited by max_read, should not happen */
     if (size > FUSE_MAX_READ_BYTES) {
         return -EINVAL;
@@ -1110,6 +1112,8 @@ fuse_co_write(FuseExport *exp, struct fuse_write_out *out,
     int64_t blk_len;
     int ret;

+    size = MIN(size, 4096);
+
     QEMU_BUILD_BUG_ON(FUSE_MAX_WRITE_BYTES > BDRV_REQUEST_MAX_BYTES);
     /* Limited by max_write, should not happen */
     if (size > FUSE_MAX_WRITE_BYTES) {

Then:

$ ./qemu-img create -f raw test.raw 8k
Formatting 'test.raw', fmt=raw size=8192
$ ./qemu-io -f raw -c 'write -P 42 0 8k' test.raw
wrote 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (64.804 MiB/sec and 8294.9003 ops/sec)
$ hexdump -C test.raw
00000000  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
*
00002000

With aio=threads, short I/O works:

$ storage-daemon/qemu-storage-daemon \
    --blockdev file,node-name=test,filename=test.raw \
    --export fuse,id=exp,node-name=test,mountpoint=test.raw,writable=true

Other shell:
$ ./qemu-io --image-opts -c 'read -P 42 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=threads
read 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (36.563 MiB/sec and 4680.0923 ops/sec)
$ ./qemu-io --image-opts -c 'write -P 23 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=threads
wrote 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (35.995 MiB/sec and 4607.2970 ops/sec)
$ hexdump -C test.raw
00000000  17 17 17 17 17 17 17 17  17 17 17 17 17 17 17 17  |................|
*
00002000

But with aio=native, it does not:

$ ./qemu-io --image-opts -c 'read -P 23 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=native
Pattern verification failed at offset 0, 8192 bytes
read 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (86.155 MiB/sec and 11027.7900 ops/sec)
$ ./qemu-io --image-opts -c 'write -P 42 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=native
write failed: No space left on device
$ hexdump -C test.raw
00000000  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
*
00001000  17 17 17 17 17 17 17 17  17 17 17 17 17 17 17 17  |................|
*
00002000

This patch fixes that.

Reviewed-by: Kevin Wolf
Signed-off-by: Hanna Czenczek
---
 block/linux-aio.c | 56 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 50 insertions(+), 6 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index 3843f45eac..0a7424fbb3 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -45,6 +45,10 @@ struct qemu_laiocb {
     size_t nbytes;
     QEMUIOVector *qiov;
 
+    /* For handling short reads/writes */
+    size_t total_done;
+    QEMUIOVector resubmit_qiov;
+
     int fd;
     int type;
     BdrvRequestFlags flags;
@@ -74,28 +78,61 @@ struct LinuxAioState {
 };
 
 static void ioq_submit(LinuxAioState *s);
+static int laio_do_submit(struct qemu_laiocb *laiocb);
 
 static inline ssize_t io_event_ret(struct io_event *ev)
 {
     return (ssize_t)(((uint64_t)ev->res2 << 32) | ev->res);
 }
 
+/**
+ * Retry tail of short requests.
+ */
+static int laio_resubmit_short_io(struct qemu_laiocb *laiocb, size_t done)
+{
+    QEMUIOVector *resubmit_qiov = &laiocb->resubmit_qiov;
+
+    laiocb->total_done += done;
+
+    if (!resubmit_qiov->iov) {
+        qemu_iovec_init(resubmit_qiov, laiocb->qiov->niov);
+    } else {
+        qemu_iovec_reset(resubmit_qiov);
+    }
+    qemu_iovec_concat(resubmit_qiov, laiocb->qiov,
+                      laiocb->total_done, laiocb->nbytes - laiocb->total_done);
+
+    return laio_do_submit(laiocb);
+}
+
 /*
  * Completes an AIO request.
  */
 static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
 {
-    int ret;
+    ssize_t ret;
 
     ret = laiocb->ret;
     if (ret != -ECANCELED) {
-        if (ret == laiocb->nbytes) {
+        if (ret == laiocb->nbytes - laiocb->total_done) {
             ret = 0;
+        } else if (ret > 0 && (laiocb->type == QEMU_AIO_READ ||
+                               laiocb->type == QEMU_AIO_WRITE)) {
+            ret = laio_resubmit_short_io(laiocb, ret);
+            if (!ret) {
+                return;
+            }
         } else if (ret >= 0) {
-            /* Short reads mean EOF, pad with zeros. */
+            /*
+             * For normal reads and writes, we only get here if ret == 0, which
+             * means EOF for reads and ENOSPC for writes.
+             * For zone-append, we get here with any ret >= 0, which we just
+             * treat as ENOSPC, too (safer than resubmitting, probably, but not
+             * 100 % clear).
+             */
             if (laiocb->type == QEMU_AIO_READ) {
-                qemu_iovec_memset(laiocb->qiov, ret, 0,
-                    laiocb->qiov->size - ret);
+                qemu_iovec_memset(laiocb->qiov, laiocb->total_done, 0,
+                                  laiocb->qiov->size - laiocb->total_done);
             } else {
                 ret = -ENOSPC;
             }
@@ -103,6 +140,9 @@ static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
     }
 
     laiocb->ret = ret;
+    if (laiocb->resubmit_qiov.iov) {
+        qemu_iovec_destroy(&laiocb->resubmit_qiov);
+    }
 
     /*
      * If the coroutine is already entered it must be in ioq_submit() and
@@ -379,7 +419,11 @@ static int laio_do_submit(struct qemu_laiocb *laiocb)
     struct iocb *iocbs = &laiocb->iocb;
     QEMUIOVector *qiov = laiocb->qiov;
     int fd = laiocb->fd;
-    off_t offset = laiocb->offset;
+    off_t offset = laiocb->offset + laiocb->total_done;
+
+    if (laiocb->resubmit_qiov.iov) {
+        qiov = &laiocb->resubmit_qiov;
+    }
 
     switch (laiocb->type) {
     case QEMU_AIO_WRITE:
-- 
2.53.0

From nobody Fri Apr 3 16:03:50 2026
From: Hanna Czenczek
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, Hanna Czenczek, Kevin Wolf, Julia Suvorova,
 Aarushi Mehta, Stefan Hajnoczi, Stefano Garzarella
Subject: [PATCH for-11.0 v2 3/3] io-uring: Resubmit tails of short writes
Date: Tue, 24 Mar 2026 09:43:36 +0100
Message-ID: <20260324084338.37453-4-hreitz@redhat.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260324084338.37453-1-hreitz@redhat.com>
References: <20260324084338.37453-1-hreitz@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Short writes can happen, too, not just short reads.  The difference to
aio=native is that the kernel will actually retry the tail of short
requests internally already -- so it is harder to reproduce.  But if the
tail of a short request returns an error to the kernel, we will see it
in userspace still.


To reproduce this, apply the following patch on top of the one shown in
HEAD^ (again %s/escaped // to apply):

escaped diff --git a/block/export/fuse.c b/block/export/fuse.c
escaped index 67dc50a412..2b98489a32 100644
escaped --- a/block/export/fuse.c
escaped +++ b/block/export/fuse.c
@@ -1059,8 +1059,15 @@ fuse_co_read(FuseExport *exp, void **bufptr, uint64_t offset, uint32_t size)
     int64_t blk_len;
     void *buf;
     int ret;
+    static uint32_t error_size;
 
-    size = MIN(size, 4096);
+    if (error_size == size) {
+        error_size = 0;
+        return -EIO;
+    } else if (size > 4096) {
+        error_size = size - 4096;
+        size = 4096;
+    }
 
     /* Limited by max_read, should not happen */
     if (size > FUSE_MAX_READ_BYTES) {
@@ -1111,8 +1118,15 @@ fuse_co_write(FuseExport *exp, struct fuse_write_out *out,
 {
     int64_t blk_len;
     int ret;
+    static uint32_t error_size;
 
-    size = MIN(size, 4096);
+    if (error_size == size) {
+        error_size = 0;
+        return -EIO;
+    } else if (size > 4096) {
+        error_size = size - 4096;
+        size = 4096;
+    }
 
     QEMU_BUILD_BUG_ON(FUSE_MAX_WRITE_BYTES > BDRV_REQUEST_MAX_BYTES);
     /* Limited by max_write, should not happen */

I know this is a bit artificial because producing it requires an I/O
error somewhere anyway, but if it does happen, qemu will interpret a
short write as ENOSPC, which is incorrect. So I believe we need to
resubmit the tail, either to have it succeed this time or at least to
get the correct error code.
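
For illustration only (this is not part of the patch): the idea of
resubmitting the tail can be sketched as a synchronous loop in plain C.
All names here (write_all, fake_write, struct fake_backend) are
hypothetical, and fake_write merely imitates the modified
fuse_co_write() above. The real code resubmits asynchronously through
io_uring rather than looping.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical backend callback: returns bytes written or -errno. */
typedef ssize_t (*write_fn)(void *opaque, const char *buf, size_t len,
                            off_t offset);

/* Imitates the modified FUSE export: cap at 4096, then fail the tail. */
struct fake_backend {
    size_t error_size;
};

static ssize_t fake_write(void *opaque, const char *buf, size_t len,
                          off_t offset)
{
    struct fake_backend *b = opaque;

    (void)buf;
    (void)offset;
    if (b->error_size == len) {
        b->error_size = 0;
        return -EIO;
    } else if (len > 4096) {
        b->error_size = len - 4096;
        return 4096;
    }
    return (ssize_t)len;
}

/*
 * Resubmit the tail of short writes until everything is written or the
 * backend reports an error.  Only a write that makes no progress at all
 * (ret == 0) is reported as ENOSPC.
 */
static int write_all(write_fn do_write, void *opaque, const char *buf,
                     size_t len, off_t offset)
{
    size_t total_done = 0;

    assert(do_write != NULL);
    while (total_done < len) {
        ssize_t ret = do_write(opaque, buf + total_done, len - total_done,
                               offset + (off_t)total_done);
        if (ret < 0) {
            return (int)ret; /* the tail's real error code */
        } else if (ret == 0) {
            return -ENOSPC;
        }
        total_done += (size_t)ret;
    }
    return 0;
}
```

With this fake backend, an 8k write_all() call returns -EIO (the tail's
error) instead of -ENOSPC, while a 4k call succeeds.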

Reproducer as before:
$ ./qemu-img create -f raw test.raw 8k
Formatting 'test.raw', fmt=raw size=8192
$ ./qemu-io -f raw -c 'write -P 42 0 8k' test.raw
wrote 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (64.804 MiB/sec and 8294.9003 ops/sec)
$ hexdump -C test.raw
00000000  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
*
00002000
$ storage-daemon/qemu-storage-daemon \
    --blockdev file,node-name=test,filename=test.raw \
    --export fuse,id=exp,node-name=test,mountpoint=test.raw,writable=true

$ ./qemu-io --image-opts -c 'read -P 23 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=io_uring
read 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (58.481 MiB/sec and 7485.5342 ops/sec)
$ ./qemu-io --image-opts -c 'write -P 23 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=io_uring
write failed: No space left on device
$ hexdump -C test.raw
00000000  17 17 17 17 17 17 17 17  17 17 17 17 17 17 17 17  |................|
*
00001000  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
*
00002000

So short reads already work (because there is code for that), but short
writes incorrectly produce ENOSPC. This patch fixes that by resubmitting
the tail not only of short reads but also of short writes.

(This patch also takes the opportunity to call qemu_iovec_destroy() only
if req->resubmit_qiov.iov is non-NULL. Functionally this is a no-op, but
it matches how the code generally checks whether the resubmit_qiov has
been set up.)

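
As a rough sketch (again not part of the patch), the decision that the
completion handler makes for a non-negative result can be written as a
small pure function. The enum and function names are hypothetical
stand-ins for the QEMU_AIO_* types and for the branches in
luring_cqe_handler(). Negative results are assumed to be handled before
this point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the QEMU_AIO_* request types. */
enum io_kind { IO_READ, IO_WRITE, IO_ZONE_APPEND };

/* What the completion handler should do next. */
enum io_action {
    IO_COMPLETE,
    IO_RESUBMIT_TAIL,
    IO_PAD_ZEROES,
    IO_FAIL_ENOSPC,
};

/*
 * ret:        bytes transferred by this completion (must be >= 0)
 * total_done: bytes transferred by earlier submissions of this request
 * size:       total request size
 */
static enum io_action classify_completion(enum io_kind kind, int ret,
                                          size_t total_done, size_t size)
{
    assert(ret >= 0);
    if ((size_t)ret + total_done == size) {
        return IO_COMPLETE;
    } else if (ret > 0 && (kind == IO_READ || kind == IO_WRITE)) {
        /* Short read or write that made progress: retry the rest */
        return IO_RESUBMIT_TAIL;
    } else if (kind == IO_READ) {
        /* Read with ret == 0: EOF */
        return IO_PAD_ZEROES;
    }
    /* Write with ret == 0, or any short zone-append */
    return IO_FAIL_ENOSPC;
}
```

For example, a 4k completion of an 8k write maps to IO_RESUBMIT_TAIL,
whereas the same completion for a zone-append maps to IO_FAIL_ENOSPC.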
Reviewed-by: Kevin Wolf
Signed-off-by: Hanna Czenczek
---
 block/io_uring.c   | 82 +++++++++++++++++++++++++---------------------
 block/trace-events |  2 +-
 2 files changed, 46 insertions(+), 38 deletions(-)

diff --git a/block/io_uring.c b/block/io_uring.c
index cb131d3b8b..c48a72d37e 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -27,10 +27,10 @@ typedef struct {
     BdrvRequestFlags flags;
 
     /*
-     * Buffered reads may require resubmission, see
-     * luring_resubmit_short_read().
+     * Short reads/writes require resubmission, see
+     * luring_resubmit_short_io().
      */
-    int total_read;
+    int total_done;
     QEMUIOVector resubmit_qiov;
 
     CqeHandler cqe_handler;
@@ -40,10 +40,14 @@ static void luring_prep_sqe(struct io_uring_sqe *sqe, void *opaque)
 {
     LuringRequest *req = opaque;
     QEMUIOVector *qiov = req->qiov;
-    uint64_t offset = req->offset;
+    uint64_t offset = req->offset + req->total_done;
     int fd = req->fd;
     BdrvRequestFlags flags = req->flags;
 
+    if (req->resubmit_qiov.iov) {
+        qiov = &req->resubmit_qiov;
+    }
+
     switch (req->type) {
     case QEMU_AIO_WRITE:
     {
@@ -73,17 +77,12 @@ static void luring_prep_sqe(struct io_uring_sqe *sqe, void *opaque)
         break;
     case QEMU_AIO_READ:
     {
-        if (req->resubmit_qiov.iov != NULL) {
-            qiov = &req->resubmit_qiov;
-        }
         if (qiov->niov > 1) {
-            io_uring_prep_readv(sqe, fd, qiov->iov, qiov->niov,
-                                offset + req->total_read);
+            io_uring_prep_readv(sqe, fd, qiov->iov, qiov->niov, offset);
         } else {
             /* The man page says non-vectored is faster than vectored */
             struct iovec *iov = qiov->iov;
-            io_uring_prep_read(sqe, fd, iov->iov_base, iov->iov_len,
-                               offset + req->total_read);
+            io_uring_prep_read(sqe, fd, iov->iov_base, iov->iov_len, offset);
         }
         break;
     }
@@ -98,21 +97,26 @@ static void luring_prep_sqe(struct io_uring_sqe *sqe, void *opaque)
 }
 
 /**
- * luring_resubmit_short_read:
+ * luring_resubmit_short_io:
  *
- * Short reads are rare but may occur. The remaining read request needs to be
- * resubmitted.
+ * Short reads and writes are rare but may occur. The remaining request needs
+ * to be resubmitted.
+ *
+ * For example, short reads can be reproduced by a FUSE export deliberately
+ * executing short reads. The tail of short writes is generally resubmitted by
+ * io-uring in the kernel, but if that resubmission encounters an I/O error, the
+ * already submitted portion will be returned as a short write.
  */
-static void luring_resubmit_short_read(LuringRequest *req, int nread)
+static void luring_resubmit_short_io(LuringRequest *req, int ndone)
 {
     QEMUIOVector *resubmit_qiov;
     size_t remaining;
 
-    trace_luring_resubmit_short_read(req, nread);
+    trace_luring_resubmit_short_io(req, ndone);
 
-    /* Update read position */
-    req->total_read += nread;
-    remaining = req->qiov->size - req->total_read;
+    /* Update I/O position */
+    req->total_done += ndone;
+    remaining = req->qiov->size - req->total_done;
 
     /* Shorten qiov */
     resubmit_qiov = &req->resubmit_qiov;
@@ -121,7 +125,7 @@ static void luring_resubmit_short_read(LuringRequest *req, int nread)
     } else {
         qemu_iovec_reset(resubmit_qiov);
     }
-    qemu_iovec_concat(resubmit_qiov, req->qiov, req->total_read, remaining);
+    qemu_iovec_concat(resubmit_qiov, req->qiov, req->total_done, remaining);
 
     aio_add_sqe(luring_prep_sqe, req, &req->cqe_handler);
 }
@@ -153,31 +157,35 @@ static void luring_cqe_handler(CqeHandler *cqe_handler)
             return;
         }
     } else if (req->qiov) {
-        /* total_read is non-zero only for resubmitted read requests */
-        int total_bytes = ret + req->total_read;
+        /* total_done is non-zero only for resubmitted requests */
+        int total_bytes = ret + req->total_done;
 
         if (total_bytes == req->qiov->size) {
             ret = 0;
-        } else {
+        } else if (ret > 0 && (req->type == QEMU_AIO_READ ||
+                               req->type == QEMU_AIO_WRITE)) {
             /* Short Read/Write */
-            if (req->type == QEMU_AIO_READ) {
-                if (ret > 0) {
-                    luring_resubmit_short_read(req, ret);
-                    return;
-                }
-
-                /* Pad with zeroes */
-                qemu_iovec_memset(req->qiov, total_bytes, 0,
-                                  req->qiov->size - total_bytes);
-                ret = 0;
-            } else {
-                ret = -ENOSPC;
-            }
+            luring_resubmit_short_io(req, ret);
+            return;
+        } else if (req->type == QEMU_AIO_READ) {
+            /* Read ret == 0: EOF, pad with zeroes */
+            qemu_iovec_memset(req->qiov, total_bytes, 0,
+                              req->qiov->size - total_bytes);
+            ret = 0;
+        } else {
+            /*
+             * Normal write ret == 0 means ENOSPC.
+             * For zone-append, we treat any 0 <= ret < qiov->size as ENOSPC,
+             * too, because resubmitting the tail seems a little unsafe.
+             */
+            ret = -ENOSPC;
         }
     }
 
     req->ret = ret;
-    qemu_iovec_destroy(&req->resubmit_qiov);
+    if (req->resubmit_qiov.iov) {
+        qemu_iovec_destroy(&req->resubmit_qiov);
+    }
 
     /*
      * If the coroutine is already entered it must be in luring_co_submit() and
diff --git a/block/trace-events b/block/trace-events
index d170fc96f1..950c82d4b8 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -64,7 +64,7 @@ file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "
 # io_uring.c
 luring_cqe_handler(void *req, int ret) "req %p ret %d"
 luring_co_submit(void *bs, void *req, int fd, uint64_t offset, size_t nbytes, int type) "bs %p req %p fd %d offset %" PRId64 " nbytes %zd type %d"
-luring_resubmit_short_read(void *req, int nread) "req %p nread %d"
+luring_resubmit_short_io(void *req, int ndone) "req %p ndone %d"
 
 # qcow2.c
 qcow2_add_task(void *co, void *bs, void *pool, const char *action, int cluster_type, uint64_t host_offset, uint64_t offset, uint64_t bytes, void *qiov, size_t qiov_offset) "co %p bs %p pool %p: %s: cluster_type %d file_cluster_offset %" PRIu64 " offset %" PRIu64 " bytes %" PRIu64 " qiov %p qiov_offset %zu"
-- 
2.53.0