From nobody Mon Feb 9 22:37:55 2026
From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Subject: [RFC 4/8] block: add BDRV_REQ_REGISTERED_BUF request flag
Date: Wed, 23 Mar 2022 11:17:23 +0000
Message-Id: <20220323111727.1100209-5-stefanha@redhat.com>
In-Reply-To: <20220323111727.1100209-1-stefanha@redhat.com>
References: <20220323111727.1100209-1-stefanha@redhat.com>
Cc: Laurent Vivier, Kevin Wolf, Thomas Huth, Vladimir Sementsov-Ogievskiy, qemu-block@nongnu.org, "Michael S. Tsirkin", John Snow, Philippe Mathieu-Daudé, Alberto Faria, Markus Armbruster, Yanan Wang, Eduardo Habkost, Hanna Reitz, Stefan Hajnoczi, Paolo Bonzini, Fam Zheng, Eric Blake, sgarzare@redhat.com

Block drivers may optimize I/O requests accessing buffers previously
registered with bdrv_register_buf(). Checking whether all elements of a
request's QEMUIOVector are within previously registered buffers is
expensive, so we need a hint from the user to avoid costly checks.

Add a BDRV_REQ_REGISTERED_BUF request flag to indicate that all
QEMUIOVector elements in an I/O request are known to be within
previously registered buffers.

bdrv_aligned_preadv() is strict in validating supported read flags and
its assertions fail when it sees BDRV_REQ_REGISTERED_BUF. There is no
harm in passing BDRV_REQ_REGISTERED_BUF to block drivers that do not
support it, so update the assertions to ignore BDRV_REQ_REGISTERED_BUF.

Care must be taken to clear the flag when the block layer or filter
drivers replace QEMUIOVector elements with bounce buffers since these
have not been registered with bdrv_register_buf(). A lot of the changes
in this commit deal with clearing the flag in those cases.

Ensuring that the flag is cleared properly is somewhat invasive to
implement across the block layer and it's hard to spot when future code
changes accidentally break it. Another option might be to add a flag to
QEMUIOVector itself and clear it in qemu_iovec_*() functions that modify
elements. That is more robust but somewhat of a layering violation, so I
haven't attempted that.

Signed-off-by: Stefan Hajnoczi
---
 include/block/block-common.h |  9 +++++++++
 block/blkverify.c            |  4 ++--
 block/crypto.c               |  2 ++
 block/io.c                   | 30 +++++++++++++++++++++++-------
 block/mirror.c               |  2 ++
 block/raw-format.c           |  2 ++
 6 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/include/block/block-common.h b/include/block/block-common.h
index fdb7306e78..061606e867 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -80,6 +80,15 @@ typedef enum {
      */
     BDRV_REQ_MAY_UNMAP          = 0x4,
 
+    /*
+     * An optimization hint when all QEMUIOVector elements are within
+     * previously registered bdrv_register_buf() memory ranges.
+     *
+     * Code that replaces the user's QEMUIOVector elements with bounce buffers
+     * must take care to clear this flag.
+     */
+    BDRV_REQ_REGISTERED_BUF     = 0x8,
+
     BDRV_REQ_FUA                = 0x10,
     BDRV_REQ_WRITE_COMPRESSED   = 0x20,
 
diff --git a/block/blkverify.c b/block/blkverify.c
index e4a37af3b2..d624f4fd05 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -235,8 +235,8 @@ blkverify_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
     qemu_iovec_init(&raw_qiov, qiov->niov);
     qemu_iovec_clone(&raw_qiov, qiov, buf);
 
-    ret = blkverify_co_prwv(bs, &r, offset, bytes, qiov, &raw_qiov, flags,
-                            false);
+    ret = blkverify_co_prwv(bs, &r, offset, bytes, qiov, &raw_qiov,
+                            flags & ~BDRV_REQ_REGISTERED_BUF, false);
 
     cmp_offset = qemu_iovec_compare(qiov, &raw_qiov);
     if (cmp_offset != -1) {
diff --git a/block/crypto.c b/block/crypto.c
index 1ba82984ef..c900355adb 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -473,6 +473,8 @@ block_crypto_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes,
     uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
     uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
 
+    flags &= ~BDRV_REQ_REGISTERED_BUF;
+
     assert(!(flags & ~BDRV_REQ_FUA));
     assert(payload_offset < INT64_MAX);
     assert(QEMU_IS_ALIGNED(offset, sector_size));
diff --git a/block/io.c b/block/io.c
index a8a7920e29..139e36c2e1 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1556,11 +1556,14 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
     max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
                                    align);
 
-    /* TODO: We would need a per-BDS .supported_read_flags and
+    /*
+     * TODO: We would need a per-BDS .supported_read_flags and
      * potential fallback support, if we ever implement any read flags
      * to pass through to drivers. For now, there aren't any
-     * passthrough flags. */
-    assert(!(flags & ~(BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH)));
+     * passthrough flags except the BDRV_REQ_REGISTERED_BUF optimization hint.
+     */
+    assert(!(flags & ~(BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH |
+                       BDRV_REQ_REGISTERED_BUF)));
 
     /* Handle Copy on Read and associated serialisation */
     if (flags & BDRV_REQ_COPY_ON_READ) {
@@ -1601,7 +1604,7 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
         goto out;
     }
 
-    assert(!(flags & ~bs->supported_read_flags));
+    assert(!(flags & ~(bs->supported_read_flags | BDRV_REQ_REGISTERED_BUF)));
 
     max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align);
     if (bytes <= max_bytes && bytes <= max_transfer) {
@@ -1790,7 +1793,8 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
 static int bdrv_pad_request(BlockDriverState *bs,
                             QEMUIOVector **qiov, size_t *qiov_offset,
                             int64_t *offset, int64_t *bytes,
-                            BdrvRequestPadding *pad, bool *padded)
+                            BdrvRequestPadding *pad, bool *padded,
+                            BdrvRequestFlags *flags)
 {
     int ret;
 
@@ -1818,6 +1822,10 @@ static int bdrv_pad_request(BlockDriverState *bs,
     if (padded) {
         *padded = true;
     }
+    if (flags) {
+        /* Can't use optimization hint with bounce buffer */
+        *flags &= ~BDRV_REQ_REGISTERED_BUF;
+    }
 
     return 0;
 }
@@ -1872,7 +1880,7 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
     }
 
     ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
-                           NULL);
+                           NULL, &flags);
     if (ret < 0) {
         goto fail;
     }
@@ -1917,6 +1925,11 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
         return -ENOTSUP;
     }
 
+    /* By definition there is no user buffer so this flag doesn't make sense */
+    if (flags & BDRV_REQ_REGISTERED_BUF) {
+        return -EINVAL;
+    }
+
     /* Invalidate the cached block-status data range if this write overlaps */
     bdrv_bsc_invalidate_range(bs, offset, bytes);
 
@@ -2202,6 +2215,9 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child,
     bool padding;
     BdrvRequestPadding pad;
 
+    /* This flag doesn't make sense for padding or zero writes */
+    flags &= ~BDRV_REQ_REGISTERED_BUF;
+
     padding = bdrv_init_padding(bs, offset, bytes, &pad);
     if (padding) {
         assert(!(flags & BDRV_REQ_NO_WAIT));
@@ -2319,7 +2335,7 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
          * alignment only if there is no ZERO flag.
          */
         ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
-                               &padded);
+                               &padded, &flags);
         if (ret < 0) {
             return ret;
         }
diff --git a/block/mirror.c b/block/mirror.c
index d8ecb9efa2..3a0773622d 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1477,6 +1477,8 @@ static int coroutine_fn bdrv_mirror_top_pwritev(BlockDriverState *bs,
         qemu_iovec_init(&bounce_qiov, 1);
         qemu_iovec_add(&bounce_qiov, bounce_buf, bytes);
         qiov = &bounce_qiov;
+
+        flags &= ~BDRV_REQ_REGISTERED_BUF;
     }
 
     ret = bdrv_mirror_top_do_write(bs, MIRROR_METHOD_COPY, offset, bytes, qiov,
diff --git a/block/raw-format.c b/block/raw-format.c
index 69fd650eaf..9bae3dd7f2 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -258,6 +258,8 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, int64_t offset,
         qemu_iovec_add(&local_qiov, buf, 512);
         qemu_iovec_concat(&local_qiov, qiov, 512, qiov->size - 512);
         qiov = &local_qiov;
+
+        flags &= ~BDRV_REQ_REGISTERED_BUF;
     }
 
     ret = raw_adjust_offset(bs, &offset, bytes, true);
-- 
2.35.1