From nobody Mon Apr 6 17:27:47 2026
From: Hanna Czenczek
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, Hanna Czenczek, Kevin Wolf, Julia Suvorova,
 Aarushi Mehta, Stefan Hajnoczi, Stefano Garzarella
Subject: [PATCH for-11.0 1/3] linux-aio: Put all parameters into qemu_laiocb
Date: Wed, 18 Mar 2026 16:32:03 +0100
Message-ID: <20260318153206.171494-2-hreitz@redhat.com>
In-Reply-To: <20260318153206.171494-1-hreitz@redhat.com>
References: <20260318153206.171494-1-hreitz@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Put all request parameters into the qemu_laiocb struct, which will allow
re-submitting the tail of short reads/writes.

Signed-off-by: Hanna Czenczek
Reviewed-by: Kevin Wolf
---
 block/linux-aio.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index 53c3e9af8a..1f25339dc9 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -40,10 +40,15 @@ struct qemu_laiocb {
     Coroutine *co;
     LinuxAioState *ctx;
     struct iocb iocb;
+    int fd;
     ssize_t ret;
+    off_t offset;
     size_t nbytes;
     QEMUIOVector *qiov;
-    bool is_read;
+
+    int type;
+    BdrvRequestFlags flags;
+    uint64_t dev_max_batch;
     QSIMPLEQ_ENTRY(qemu_laiocb) next;
 };
 
@@ -87,7 +92,7 @@ static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
             ret = 0;
         } else if (ret >= 0) {
             /* Short reads mean EOF, pad with zeros. */
-            if (laiocb->is_read) {
+            if (laiocb->type == QEMU_AIO_READ) {
                 qemu_iovec_memset(laiocb->qiov, ret, 0,
                     laiocb->qiov->size - ret);
             } else {
@@ -367,23 +372,23 @@ static void laio_deferred_fn(void *opaque)
     }
 }
 
-static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
-                          int type, BdrvRequestFlags flags,
-                          uint64_t dev_max_batch)
+static int laio_do_submit(struct qemu_laiocb *laiocb)
 {
     LinuxAioState *s = laiocb->ctx;
     struct iocb *iocbs = &laiocb->iocb;
     QEMUIOVector *qiov = laiocb->qiov;
+    int fd = laiocb->fd;
+    off_t offset = laiocb->offset;
 
-    switch (type) {
+    switch (laiocb->type) {
     case QEMU_AIO_WRITE:
 #ifdef HAVE_IO_PREP_PWRITEV2
     {
-        int laio_flags = (flags & BDRV_REQ_FUA) ? RWF_DSYNC : 0;
+        int laio_flags = (laiocb->flags & BDRV_REQ_FUA) ? RWF_DSYNC : 0;
         io_prep_pwritev2(iocbs, fd, qiov->iov, qiov->niov, offset, laio_flags);
     }
 #else
-        assert(flags == 0);
+        assert(laiocb->flags == 0);
         io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset);
 #endif
         break;
@@ -399,7 +404,7 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
     /* Currently Linux kernel does not support other operations */
     default:
         fprintf(stderr, "%s: invalid AIO request type 0x%x.\n",
-                        __func__, type);
+                        __func__, laiocb->type);
         return -EIO;
     }
     io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
@@ -407,7 +412,7 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
     QSIMPLEQ_INSERT_TAIL(&s->io_q.pending, laiocb, next);
     s->io_q.in_queue++;
     if (!s->io_q.blocked) {
-        if (s->io_q.in_queue >= laio_max_batch(s, dev_max_batch)) {
+        if (s->io_q.in_queue >= laio_max_batch(s, laiocb->dev_max_batch)) {
             ioq_submit(s);
         } else {
             defer_call(laio_deferred_fn, s);
@@ -425,14 +430,18 @@ int coroutine_fn laio_co_submit(int fd, uint64_t offset, QEMUIOVector *qiov,
     AioContext *ctx = qemu_get_current_aio_context();
     struct qemu_laiocb laiocb = {
         .co = qemu_coroutine_self(),
-        .nbytes = qiov ? qiov->size : 0,
+        .fd = fd,
+        .offset = offset,
+        .nbytes = (qiov ? qiov->size : 0),
         .ctx = aio_get_linux_aio(ctx),
         .ret = -EINPROGRESS,
-        .is_read = (type == QEMU_AIO_READ),
         .qiov = qiov,
+        .type = type,
+        .flags = flags,
+        .dev_max_batch = dev_max_batch,
     };
 
-    ret = laio_do_submit(fd, &laiocb, offset, type, flags, dev_max_batch);
+    ret = laio_do_submit(&laiocb);
     if (ret < 0) {
         return ret;
     }
-- 
2.53.0

From nobody Mon Apr 6 17:27:47 2026
From: Hanna Czenczek
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, Hanna Czenczek, Kevin Wolf, Julia Suvorova,
 Aarushi Mehta, Stefan Hajnoczi, Stefano Garzarella
Subject: [PATCH for-11.0 2/3] linux-aio: Resubmit tails of short reads/writes
Date: Wed, 18 Mar 2026 16:32:04 +0100
Message-ID: <20260318153206.171494-3-hreitz@redhat.com>
In-Reply-To: <20260318153206.171494-1-hreitz@redhat.com>
References: <20260318153206.171494-1-hreitz@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Short reads/writes can happen.
One way to reproduce them is via our FUSE export, with the following
diff applied (%s/escaped // to apply -- if you put plain diffs in commit
messages, git-am will apply them, and I would rather avoid breaking FUSE
accidentally via this patch):

escaped diff --git a/block/export/fuse.c b/block/export/fuse.c
escaped index a2a478d293..67dc50a412 100644
escaped --- a/block/export/fuse.c
escaped +++ b/block/export/fuse.c
@@ -828,7 +828,7 @@ static ssize_t coroutine_fn GRAPH_RDLOCK
 fuse_co_init(FuseExport *exp, struct fuse_init_out *out,
              const struct fuse_init_in_compat *in)
 {
-    const uint32_t supported_flags = FUSE_ASYNC_READ | FUSE_ASYNC_DIO;
+    const uint32_t supported_flags = FUSE_ASYNC_READ;
 
     if (in->major != 7) {
         error_report("FUSE major version mismatch: We have 7, but kernel has %"
@@ -1060,6 +1060,8 @@ fuse_co_read(FuseExport *exp, void **bufptr, uint64_t offset, uint32_t size)
     void *buf;
     int ret;
 
+    size = MIN(size, 4096);
+
     /* Limited by max_read, should not happen */
     if (size > FUSE_MAX_READ_BYTES) {
         return -EINVAL;
@@ -1110,6 +1112,8 @@ fuse_co_write(FuseExport *exp, struct fuse_write_out *out,
     int64_t blk_len;
     int ret;
 
+    size = MIN(size, 4096);
+
     QEMU_BUILD_BUG_ON(FUSE_MAX_WRITE_BYTES > BDRV_REQUEST_MAX_BYTES);
     /* Limited by max_write, should not happen */
     if (size > FUSE_MAX_WRITE_BYTES) {

Then:
$ ./qemu-img create -f raw test.raw 8k
Formatting 'test.raw', fmt=raw size=8192
$ ./qemu-io -f raw -c 'write -P 42 0 8k' test.raw
wrote 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (64.804 MiB/sec and 8294.9003 ops/sec)
$ hexdump -C test.raw
00000000  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
*
00002000

With aio=threads, short I/O works:
$ storage-daemon/qemu-storage-daemon \
    --blockdev file,node-name=test,filename=test.raw \
    --export fuse,id=exp,node-name=test,mountpoint=test.raw,writable=true

Other shell:
$ ./qemu-io --image-opts -c 'read -P 42 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=threads
read 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (36.563 MiB/sec and 4680.0923 ops/sec)
$ ./qemu-io --image-opts -c 'write -P 23 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=threads
wrote 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (35.995 MiB/sec and 4607.2970 ops/sec)
$ hexdump -C test.raw
00000000  17 17 17 17 17 17 17 17  17 17 17 17 17 17 17 17  |................|
*
00002000

But with aio=native, it does not:
$ ./qemu-io --image-opts -c 'read -P 23 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=native
Pattern verification failed at offset 0, 8192 bytes
read 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (86.155 MiB/sec and 11027.7900 ops/sec)
$ ./qemu-io --image-opts -c 'write -P 42 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=native
write failed: No space left on device
$ hexdump -C test.raw
00000000  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
*
00001000  17 17 17 17 17 17 17 17  17 17 17 17 17 17 17 17  |................|
*
00002000

This patch fixes that.

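For readers who want the idea without the AIO plumbing, here is a rough
sketch of the same resubmission logic using synchronous pread()/pwrite()
instead of Linux AIO. It is not the code in this patch: read_full(),
write_full() and short_io_demo() are hypothetical names made up for this
illustration, and the temporary file path is an assumption. Only the
rules mirror the patch: a positive short result means "resubmit the
tail", zero means EOF for reads (pad with zeros) and ENOSPC for writes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU code): read nbytes at offset, resubmitting
 * the tail after every short read.  Returns 0 or a negative errno value.
 */
static int read_full(int fd, void *buf, size_t nbytes, off_t offset)
{
    size_t total_done = 0;

    while (total_done < nbytes) {
        ssize_t ret = pread(fd, (char *)buf + total_done,
                            nbytes - total_done, offset + (off_t)total_done);
        if (ret < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -errno;
        }
        if (ret == 0) {
            /* Zero bytes read means EOF: pad the remainder with zeros */
            memset((char *)buf + total_done, 0, nbytes - total_done);
            return 0;
        }
        total_done += (size_t)ret;
    }
    return 0;
}

/* Hypothetical counterpart for writes; a zero-length result is ENOSPC. */
static int write_full(int fd, const void *buf, size_t nbytes, off_t offset)
{
    size_t total_done = 0;

    while (total_done < nbytes) {
        ssize_t ret = pwrite(fd, (const char *)buf + total_done,
                             nbytes - total_done, offset + (off_t)total_done);
        if (ret < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -errno;
        }
        if (ret == 0) {
            return -ENOSPC;
        }
        total_done += (size_t)ret;
    }
    return 0;
}

/* Usage sketch: write 8 bytes, then ask for 16 so the read hits EOF. */
static int short_io_demo(void)
{
    char path[] = "/tmp/short-io-XXXXXX";
    unsigned char in[8], out[16];
    int fd = mkstemp(path);
    int ret;

    if (fd < 0) {
        return -errno;
    }
    unlink(path);
    memset(in, 42, sizeof(in));
    memset(out, 0xff, sizeof(out));

    ret = write_full(fd, in, sizeof(in), 0);
    if (ret == 0) {
        ret = read_full(fd, out, sizeof(out), 0);
    }
    if (ret == 0) {
        /* First half is the data written, second half is zero padding */
        assert(memcmp(out, in, sizeof(in)) == 0);
        assert(out[8] == 0 && out[15] == 0);
    }
    close(fd);
    return ret;
}
```

In the demo, the first pread() returns 8 of the 16 requested bytes, the
resubmitted tail returns 0, and the rest of the buffer is zero-filled.
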
Signed-off-by: Hanna Czenczek
Reviewed-by: Kevin Wolf
---
 block/linux-aio.c | 61 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 53 insertions(+), 8 deletions(-)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index 1f25339dc9..01621d4794 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -46,6 +46,10 @@ struct qemu_laiocb {
     size_t nbytes;
     QEMUIOVector *qiov;
 
+    /* For handling short reads/writes */
+    size_t total_done;
+    QEMUIOVector resubmit_qiov;
+
     int type;
     BdrvRequestFlags flags;
     uint64_t dev_max_batch;
@@ -73,28 +77,61 @@ struct LinuxAioState {
 };
 
 static void ioq_submit(LinuxAioState *s);
+static int laio_do_submit(struct qemu_laiocb *laiocb);
 
 static inline ssize_t io_event_ret(struct io_event *ev)
 {
     return (ssize_t)(((uint64_t)ev->res2 << 32) | ev->res);
 }
 
+/**
+ * Retry tail of short requests.
+ */
+static int laio_resubmit_short_io(struct qemu_laiocb *laiocb, size_t done)
+{
+    QEMUIOVector *resubmit_qiov = &laiocb->resubmit_qiov;
+
+    laiocb->total_done += done;
+
+    if (!resubmit_qiov->iov) {
+        qemu_iovec_init(resubmit_qiov, laiocb->qiov->niov);
+    } else {
+        qemu_iovec_reset(resubmit_qiov);
+    }
+    qemu_iovec_concat(resubmit_qiov, laiocb->qiov,
+                      laiocb->total_done, laiocb->nbytes - laiocb->total_done);
+
+    return laio_do_submit(laiocb);
+}
+
 /*
  * Completes an AIO request.
  */
 static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
 {
-    int ret;
+    ssize_t ret;
 
     ret = laiocb->ret;
     if (ret != -ECANCELED) {
-        if (ret == laiocb->nbytes) {
+        if (ret == laiocb->nbytes - laiocb->total_done) {
             ret = 0;
+        } else if (ret > 0 && (laiocb->type == QEMU_AIO_READ ||
+                               laiocb->type == QEMU_AIO_WRITE)) {
+            ret = laio_resubmit_short_io(laiocb, ret);
+            if (!ret) {
+                return;
+            }
         } else if (ret >= 0) {
-            /* Short reads mean EOF, pad with zeros. */
+            /*
+             * For normal reads and writes, we only get here if ret == 0, which
+             * means EOF for reads and ENOSPC for writes.
+             * For zone-append, we get here with any ret >= 0, which we just
+             * treat as ENOSPC, too (safer than resubmitting, probably, but not
+             * 100 % clear).
+             */
             if (laiocb->type == QEMU_AIO_READ) {
-                qemu_iovec_memset(laiocb->qiov, ret, 0,
-                    laiocb->qiov->size - ret);
+                qemu_iovec_memset(laiocb->qiov, laiocb->total_done, 0,
+                                  laiocb->qiov->size - laiocb->total_done);
             } else {
                 ret = -ENOSPC;
             }
@@ -102,6 +139,7 @@ static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
     }
 
     laiocb->ret = ret;
+    qemu_iovec_destroy(&laiocb->resubmit_qiov);
 
     /*
      * If the coroutine is already entered it must be in ioq_submit() and
@@ -380,23 +418,30 @@ static int laio_do_submit(struct qemu_laiocb *laiocb)
     int fd = laiocb->fd;
     off_t offset = laiocb->offset;
 
+    if (laiocb->resubmit_qiov.iov) {
+        qiov = &laiocb->resubmit_qiov;
+    }
+
     switch (laiocb->type) {
     case QEMU_AIO_WRITE:
 #ifdef HAVE_IO_PREP_PWRITEV2
     {
         int laio_flags = (laiocb->flags & BDRV_REQ_FUA) ? RWF_DSYNC : 0;
-        io_prep_pwritev2(iocbs, fd, qiov->iov, qiov->niov, offset, laio_flags);
+        io_prep_pwritev2(iocbs, fd, qiov->iov, qiov->niov,
+                         offset + laiocb->total_done, laio_flags);
     }
 #else
         assert(laiocb->flags == 0);
-        io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset);
+        io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov,
+                        offset + laiocb->total_done);
 #endif
         break;
     case QEMU_AIO_ZONE_APPEND:
         io_prep_pwritev(iocbs, fd, qiov->iov, qiov->niov, offset);
         break;
     case QEMU_AIO_READ:
-        io_prep_preadv(iocbs, fd, qiov->iov, qiov->niov, offset);
+        io_prep_preadv(iocbs, fd, qiov->iov, qiov->niov,
+                       offset + laiocb->total_done);
         break;
     case QEMU_AIO_FLUSH:
         io_prep_fdsync(iocbs, fd);
-- 
2.53.0

From nobody Mon Apr 6 17:27:47 2026
From: Hanna Czenczek
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, Hanna Czenczek, Kevin Wolf, Julia Suvorova,
 Aarushi Mehta, Stefan Hajnoczi, Stefano Garzarella
Subject: [PATCH for-11.0 3/3] io-uring: Resubmit tails of short writes
Date: Wed, 18 Mar 2026 16:32:05 +0100
Message-ID: <20260318153206.171494-4-hreitz@redhat.com>
In-Reply-To: <20260318153206.171494-1-hreitz@redhat.com>
References: <20260318153206.171494-1-hreitz@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Short writes can happen, too, not just short reads. The difference to
aio=native is that the kernel will actually retry the tail of short
requests internally already -- so it is harder to reproduce. But if the
tail of a short request returns an error to the kernel, we will see it
in userspace still.


To reproduce this, apply the following patch on top of the one shown in
HEAD^ (again %s/escaped // to apply):

escaped diff --git a/block/export/fuse.c b/block/export/fuse.c
escaped index 67dc50a412..2b98489a32 100644
escaped --- a/block/export/fuse.c
escaped +++ b/block/export/fuse.c
@@ -1059,8 +1059,15 @@ fuse_co_read(FuseExport *exp, void **bufptr, uint64_t offset, uint32_t size)
     int64_t blk_len;
     void *buf;
     int ret;
+    static uint32_t error_size;

-    size = MIN(size, 4096);
+    if (error_size == size) {
+        error_size = 0;
+        return -EIO;
+    } else if (size > 4096) {
+        error_size = size - 4096;
+        size = 4096;
+    }

     /* Limited by max_read, should not happen */
     if (size > FUSE_MAX_READ_BYTES) {
@@ -1111,8 +1118,15 @@ fuse_co_write(FuseExport *exp, struct fuse_write_out *out,
 {
     int64_t blk_len;
     int ret;
+    static uint32_t error_size;

-    size = MIN(size, 4096);
+    if (error_size == size) {
+        error_size = 0;
+        return -EIO;
+    } else if (size > 4096) {
+        error_size = size - 4096;
+        size = 4096;
+    }

     QEMU_BUILD_BUG_ON(FUSE_MAX_WRITE_BYTES > BDRV_REQUEST_MAX_BYTES);
     /* Limited by max_write, should not happen */

I know this is a bit artificial: to produce a short write, there must be
an I/O error somewhere anyway. But when it does happen, qemu interprets
the short write as ENOSPC, which is incorrect. So I believe we need to
resubmit the tail, which may then succeed or at least return the correct
error code.
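
The effect of resubmitting the tail can be illustrated outside of io_uring.
The following is only a minimal sketch under stated assumptions, not QEMU
code: write_fn, write_all() and flaky_write() are hypothetical names, and
flaky_write() merely imitates the FUSE hack above (a 4k short write followed
by EIO for the tail). It shows that retrying the tail surfaces the real error
instead of a guessed ENOSPC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical backend signature: returns bytes written, or -errno. */
typedef ssize_t (*write_fn)(void *opaque, const char *buf, size_t len,
                            off_t offset);

/*
 * Write all of buf, resubmitting the tail after every short write.
 * Returns 0 on success, or the -errno reported for the tail.
 */
static int write_all(write_fn fn, void *opaque, const char *buf, size_t len,
                     off_t offset)
{
    size_t total_done = 0;

    while (total_done < len) {
        ssize_t ret = fn(opaque, buf + total_done, len - total_done,
                         offset + (off_t)total_done);
        if (ret < 0) {
            return (int)ret;    /* real error reported by the backend */
        }
        if (ret == 0) {
            return -ENOSPC;     /* no progress at all: treat as ENOSPC */
        }
        assert((size_t)ret <= len - total_done);
        total_done += (size_t)ret;
    }
    return 0;
}

/* Hypothetical backend imitating the hack: 4k short write, then EIO. */
static ssize_t flaky_write(void *opaque, const char *buf, size_t len,
                           off_t offset)
{
    size_t *error_size = opaque;

    (void)buf;
    (void)offset;

    if (*error_size == len) {
        *error_size = 0;
        return -EIO;
    } else if (len > 4096) {
        *error_size = len - 4096;
        return 4096;
    }
    return (ssize_t)len;
}
```

With an 8k request, the first call completes 4k, and the resubmitted 4k tail
then fails with EIO, which is what the caller gets to see.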

Reproducer as before:

$ ./qemu-img create -f raw test.raw 8k
Formatting 'test.raw', fmt=raw size=8192
$ ./qemu-io -f raw -c 'write -P 42 0 8k' test.raw
wrote 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (64.804 MiB/sec and 8294.9003 ops/sec)
$ hexdump -C test.raw
00000000  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
*
00002000
$ storage-daemon/qemu-storage-daemon \
    --blockdev file,node-name=test,filename=test.raw \
    --export fuse,id=exp,node-name=test,mountpoint=test.raw,writable=true

$ ./qemu-io --image-opts -c 'read -P 23 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=io_uring
read 8192/8192 bytes at offset 0
8 KiB, 1 ops; 00.00 sec (58.481 MiB/sec and 7485.5342 ops/sec)
$ ./qemu-io --image-opts -c 'write -P 23 0 8k' \
    driver=file,filename=test.raw,cache.direct=on,aio=io_uring
write failed: No space left on device
$ hexdump -C test.raw
00000000  17 17 17 17 17 17 17 17  17 17 17 17 17 17 17 17  |................|
*
00001000  2a 2a 2a 2a 2a 2a 2a 2a  2a 2a 2a 2a 2a 2a 2a 2a  |****************|
*
00002000

So short reads already work (because there is code for that), but short
writes incorrectly produce ENOSPC. This patch fixes that by resubmitting
the tail not only of short reads but also of short writes.

Signed-off-by: Hanna Czenczek
Reviewed-by: Kevin Wolf
---
 block/io_uring.c   | 83 ++++++++++++++++++++++++++--------------------
 block/trace-events |  2 +-
 2 files changed, 48 insertions(+), 37 deletions(-)

diff --git a/block/io_uring.c b/block/io_uring.c
index cb131d3b8b..61b54647ae 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -27,10 +27,10 @@ typedef struct {
     BdrvRequestFlags flags;

     /*
-     * Buffered reads may require resubmission, see
-     * luring_resubmit_short_read().
+     * Short reads/writes require resubmission, see
+     * luring_resubmit_short_io().
      */
-    int total_read;
+    int total_done;
     QEMUIOVector resubmit_qiov;

     CqeHandler cqe_handler;
@@ -44,6 +44,10 @@ static void luring_prep_sqe(struct io_uring_sqe *sqe, void *opaque)
     int fd = req->fd;
     BdrvRequestFlags flags = req->flags;

+    if (req->resubmit_qiov.iov != NULL) {
+        qiov = &req->resubmit_qiov;
+    }
+
     switch (req->type) {
     case QEMU_AIO_WRITE:
     {
@@ -51,7 +55,8 @@ static void luring_prep_sqe(struct io_uring_sqe *sqe, void *opaque)
         if (luring_flags != 0 || qiov->niov > 1) {
 #ifdef HAVE_IO_URING_PREP_WRITEV2
             io_uring_prep_writev2(sqe, fd, qiov->iov,
-                                  qiov->niov, offset, luring_flags);
+                                  qiov->niov, offset + req->total_done,
+                                  luring_flags);
 #else
             /*
              * FUA should only be enabled with HAVE_IO_URING_PREP_WRITEV2, see
@@ -59,12 +64,14 @@ static void luring_prep_sqe(struct io_uring_sqe *sqe, void *opaque)
              */
             assert(luring_flags == 0);

-            io_uring_prep_writev(sqe, fd, qiov->iov, qiov->niov, offset);
+            io_uring_prep_writev(sqe, fd, qiov->iov, qiov->niov,
+                                 offset + req->total_done);
 #endif
         } else {
             /* The man page says non-vectored is faster than vectored */
             struct iovec *iov = qiov->iov;
-            io_uring_prep_write(sqe, fd, iov->iov_base, iov->iov_len, offset);
+            io_uring_prep_write(sqe, fd, iov->iov_base, iov->iov_len,
+                                offset + req->total_done);
         }
         break;
     }
@@ -73,17 +80,14 @@ static void luring_prep_sqe(struct io_uring_sqe *sqe, void *opaque)
         break;
     case QEMU_AIO_READ:
     {
-        if (req->resubmit_qiov.iov != NULL) {
-            qiov = &req->resubmit_qiov;
-        }
         if (qiov->niov > 1) {
             io_uring_prep_readv(sqe, fd, qiov->iov, qiov->niov,
-                                offset + req->total_read);
+                                offset + req->total_done);
         } else {
             /* The man page says non-vectored is faster than vectored */
             struct iovec *iov = qiov->iov;
             io_uring_prep_read(sqe, fd, iov->iov_base, iov->iov_len,
-                               offset + req->total_read);
+                               offset + req->total_done);
         }
         break;
     }
@@ -98,21 +102,26 @@ static void luring_prep_sqe(struct io_uring_sqe *sqe, void *opaque)
 }

 /**
- * luring_resubmit_short_read:
+ * luring_resubmit_short_io:
  *
- * Short reads are rare but may occur. The remaining read request needs to be
- * resubmitted.
+ * Short reads and writes are rare but may occur.  The remaining request needs
+ * to be resubmitted.
+ *
+ * For example, short reads can be reproduced by a FUSE export deliberately
+ * executing short reads.  The tail of short writes is generally resubmitted by
+ * io-uring in the kernel, but if that resubmission encounters an I/O error, the
+ * already submitted portion will be returned as a short write.
  */
-static void luring_resubmit_short_read(LuringRequest *req, int nread)
+static void luring_resubmit_short_io(LuringRequest *req, int ndone)
 {
     QEMUIOVector *resubmit_qiov;
     size_t remaining;

-    trace_luring_resubmit_short_read(req, nread);
+    trace_luring_resubmit_short_io(req, ndone);

-    /* Update read position */
-    req->total_read += nread;
-    remaining = req->qiov->size - req->total_read;
+    /* Update I/O position */
+    req->total_done += ndone;
+    remaining = req->qiov->size - req->total_done;

     /* Shorten qiov */
     resubmit_qiov = &req->resubmit_qiov;
@@ -121,7 +130,7 @@ static void luring_resubmit_short_read(LuringRequest *req, int nread)
     } else {
         qemu_iovec_reset(resubmit_qiov);
     }
-    qemu_iovec_concat(resubmit_qiov, req->qiov, req->total_read, remaining);
+    qemu_iovec_concat(resubmit_qiov, req->qiov, req->total_done, remaining);

     aio_add_sqe(luring_prep_sqe, req, &req->cqe_handler);
 }
@@ -153,26 +162,28 @@ static void luring_cqe_handler(CqeHandler *cqe_handler)
             return;
         }
     } else if (req->qiov) {
-        /* total_read is non-zero only for resubmitted read requests */
-        int total_bytes = ret + req->total_read;
+        /* total_done is non-zero only for resubmitted requests */
+        int total_bytes = ret + req->total_done;

         if (total_bytes == req->qiov->size) {
             ret = 0;
-        } else {
+        } else if (ret > 0 && (req->type == QEMU_AIO_READ ||
+                               req->type == QEMU_AIO_WRITE)) {
             /* Short Read/Write */
-            if (req->type == QEMU_AIO_READ) {
-                if (ret > 0) {
-                    luring_resubmit_short_read(req, ret);
-                    return;
-                }
-
-                /* Pad with zeroes */
-                qemu_iovec_memset(req->qiov, total_bytes, 0,
-                                  req->qiov->size - total_bytes);
-                ret = 0;
-            } else {
-                ret = -ENOSPC;
-            }
+            luring_resubmit_short_io(req, ret);
+            return;
+        } else if (req->type == QEMU_AIO_READ) {
+            /* Read ret == 0: EOF, pad with zeroes */
+            qemu_iovec_memset(req->qiov, total_bytes, 0,
+                              req->qiov->size - total_bytes);
+            ret = 0;
+        } else {
+            /*
+             * Normal write ret == 0 means ENOSPC.
+             * For zone-append, we treat any 0 <= ret < qiov->size as ENOSPC,
+             * too, because resubmitting the tail seems a little unsafe.
+             */
+            ret = -ENOSPC;
         }
     }

diff --git a/block/trace-events b/block/trace-events
index d170fc96f1..950c82d4b8 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -64,7 +64,7 @@ file_paio_submit(void *acb, void *opaque, int64_t offset, int count, int type) "
 # io_uring.c
 luring_cqe_handler(void *req, int ret) "req %p ret %d"
 luring_co_submit(void *bs, void *req, int fd, uint64_t offset, size_t nbytes, int type) "bs %p req %p fd %d offset %" PRId64 " nbytes %zd type %d"
-luring_resubmit_short_read(void *req, int nread) "req %p nread %d"
+luring_resubmit_short_io(void *req, int ndone) "req %p ndone %d"

 # qcow2.c
 qcow2_add_task(void *co, void *bs, void *pool, const char *action, int cluster_type, uint64_t host_offset, uint64_t offset, uint64_t bytes, void *qiov, size_t qiov_offset) "co %p bs %p pool %p: %s: cluster_type %d file_cluster_offset %" PRIu64 " offset %" PRIu64 " bytes %" PRIu64 " qiov %p qiov_offset %zu"
--
2.53.0
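
For reference, the completion handling introduced in luring_cqe_handler()
above can be summarized as a small standalone decision function. This is a
sketch under assumptions, not the QEMU implementation: the enums and
classify_cqe() are hypothetical stand-ins for QEMU_AIO_* and the handler's
branches, and only non-negative results are modeled, because negative results
are handled by an earlier branch that this patch does not touch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request types, standing in for QEMU_AIO_*. */
enum req_type { REQ_READ, REQ_WRITE, REQ_ZONE_APPEND };

/* Hypothetical outcomes of handling one completion. */
enum cqe_action {
    ACT_COMPLETE,       /* request fully done, ret = 0 */
    ACT_RESUBMIT_TAIL,  /* short read/write: resubmit the remainder */
    ACT_PAD_ZEROES,     /* read hit EOF: zero-fill the rest, ret = 0 */
    ACT_ENOSPC,         /* ret = -ENOSPC */
};

/*
 * Decide what to do with a non-negative completion result `ret`, given
 * `total_done` bytes finished by earlier submissions of the same request
 * and a total request size of `size` bytes.
 */
static enum cqe_action classify_cqe(enum req_type type, int ret,
                                    size_t total_done, size_t size)
{
    assert(ret >= 0);
    assert(total_done + (size_t)ret <= size);

    if (total_done + (size_t)ret == size) {
        return ACT_COMPLETE;
    }
    if (ret > 0 && (type == REQ_READ || type == REQ_WRITE)) {
        return ACT_RESUBMIT_TAIL;
    }
    if (type == REQ_READ) {
        return ACT_PAD_ZEROES;      /* ret == 0 means EOF */
    }
    /* Write without progress, or any incomplete zone append */
    return ACT_ENOSPC;
}
```

In this model, a 4k completion of an 8k write yields ACT_RESUBMIT_TAIL,
whereas the same completion for a zone append yields ACT_ENOSPC.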