From nobody Mon Feb 9 03:31:27 2026
From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Alberto Faria, Stefan Hajnoczi, Vladimir Sementsov-Ogievskiy,
 "Michael S. Tsirkin", Paolo Bonzini, Laurent Vivier, Eric Blake,
 sgarzare@redhat.com, Marcel Apfelbaum, Philippe Mathieu-Daudé,
 qemu-block@nongnu.org, Eduardo Habkost, Vladimir Sementsov-Ogievskiy,
 John Snow, Thomas Huth, Kevin Wolf, Markus Armbruster, Hanna Reitz,
 Fam Zheng, Yanan Wang
Subject: [RFC v3 1/8] blkio: add io_uring block driver using libblkio
Date: Fri, 8 Jul 2022 05:17:30 +0100
Message-Id: <20220708041737.1768521-2-stefanha@redhat.com>
In-Reply-To: <20220708041737.1768521-1-stefanha@redhat.com>
References: <20220708041737.1768521-1-stefanha@redhat.com>

libblkio (https://gitlab.com/libblkio/libblkio/) is a library for
high-performance disk I/O. It currently supports io_uring and
virtio-blk-vhost-vdpa with additional drivers under development.

One of the reasons for developing libblkio is that other applications
besides QEMU can use it. This will be particularly useful for
vhost-user-blk which applications may wish to use for connecting to
qemu-storage-daemon.

libblkio also gives us an opportunity to develop in Rust behind a C API
that is easy to consume from QEMU.

This commit adds io_uring and virtio-blk-vhost-vdpa BlockDrivers to QEMU
using libblkio. It will be easy to add other libblkio drivers since they
will share the majority of code.

For now I/O buffers are copied through bounce buffers if the libblkio
driver requires it. Later commits add an optimization for
pre-registering guest RAM to avoid bounce buffers.

The syntax is:

  --blockdev io_uring,node-name=drive0,filename=test.img,readonly=on|off,cache.direct=on|off

and:

  --blockdev virtio-blk-vhost-vdpa,node-name=drive0,path=/dev/vdpa...,readonly=on|off

Signed-off-by: Stefan Hajnoczi
---
 MAINTAINERS                   |   6 +
 meson_options.txt             |   2 +
 qapi/block-core.json          |  37 +-
 meson.build                   |   9 +
 block/blkio.c                 | 659 ++++++++++++++++++++++++++++++++++
 tests/qtest/modules-test.c    |   3 +
 block/meson.build             |   1 +
 scripts/meson-buildoptions.sh |   3 +
 8 files changed, 718 insertions(+), 2 deletions(-)
 create mode 100644 block/blkio.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 450abd0252..50f340d9ee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3395,6 +3395,12 @@ L: qemu-block@nongnu.org
 S: Maintained
 F: block/vdi.c
 
+blkio
+M: Stefan Hajnoczi
+L: qemu-block@nongnu.org
+S: Maintained
+F: block/blkio.c
+
 iSCSI
 M: Ronnie Sahlberg
 M: Paolo Bonzini
diff --git a/meson_options.txt b/meson_options.txt
index 97c38109b1..b0b2e0c9b5 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -117,6 +117,8 @@ option('bzip2', type : 'feature', value : 'auto',
        description: 'bzip2 support for DMG images')
 option('cap_ng', type : 'feature', value : 'auto',
        description: 'cap_ng support')
+option('blkio', type : 'feature', value : 'auto',
+       description: 'libblkio block device driver')
 option('bpf', type : 'feature', value : 'auto',
        description: 'eBPF support')
 option('cocoa', type : 'feature', value : 'auto',
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 2173e7734a..aa63d5e9bd 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2951,11 +2951,15 @@
             'file', 'snapshot-access', 'ftp', 'ftps', 'gluster',
             {'name': 'host_cdrom', 'if': 'HAVE_HOST_BLOCK_DEVICE' },
             {'name': 'host_device', 'if': 'HAVE_HOST_BLOCK_DEVICE' },
-            'http', 'https', 'iscsi',
+            'http', 'https',
+            { 'name': 'io_uring', 'if': 'CONFIG_BLKIO' },
+            'iscsi',
             'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels',
             'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
             { 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
-            'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
+            'ssh', 'throttle', 'vdi', 'vhdx',
+            { 'name': 'virtio-blk-vhost-vdpa', 'if': 'CONFIG_BLKIO' },
+            'vmdk', 'vpc', 'vvfat' ] }
 
 ##
 # @BlockdevOptionsFile:
@@ -3678,6 +3682,30 @@
             '*debug': 'int',
             '*logfile': 'str' } }
 
+##
+# @BlockdevOptionsIoUring:
+#
+# Driver specific block device options for the io_uring backend.
+#
+# @filename: path to the image file
+#
+# Since: 7.1
+##
+{ 'struct': 'BlockdevOptionsIoUring',
+  'data': { 'filename': 'str' } }
+
+##
+# @BlockdevOptionsVirtioBlkVhostVdpa:
+#
+# Driver specific block device options for the virtio-blk-vhost-vdpa backend.
+#
+# @path: path to the vhost-vdpa character device.
+#
+# Since: 7.1
+##
+{ 'struct': 'BlockdevOptionsVirtioBlkVhostVdpa',
+  'data': { 'path': 'str' } }
+
 ##
 # @IscsiTransport:
 #
@@ -4305,6 +4333,8 @@
                        'if': 'HAVE_HOST_BLOCK_DEVICE' },
       'http':       'BlockdevOptionsCurlHttp',
       'https':      'BlockdevOptionsCurlHttps',
+      'io_uring':   { 'type': 'BlockdevOptionsIoUring',
+                      'if': 'CONFIG_BLKIO' },
       'iscsi':      'BlockdevOptionsIscsi',
       'luks':       'BlockdevOptionsLUKS',
       'nbd':        'BlockdevOptionsNbd',
@@ -4327,6 +4357,9 @@
       'throttle':   'BlockdevOptionsThrottle',
       'vdi':        'BlockdevOptionsGenericFormat',
       'vhdx':       'BlockdevOptionsGenericFormat',
+      'virtio-blk-vhost-vdpa':
+                    { 'type': 'BlockdevOptionsVirtioBlkVhostVdpa',
+                      'if': 'CONFIG_BLKIO' },
       'vmdk':       'BlockdevOptionsGenericCOWFormat',
       'vpc':        'BlockdevOptionsGenericFormat',
       'vvfat':      'BlockdevOptionsVVFAT'
diff --git a/meson.build b/meson.build
index bc5569ace1..f09b009428 100644
--- a/meson.build
+++ b/meson.build
@@ -713,6 +713,13 @@ if not get_option('virglrenderer').auto() or have_system or have_vhost_user_gpu
                      required: get_option('virglrenderer'),
                      kwargs: static_kwargs)
 endif
+blkio = not_found
+if not get_option('blkio').auto() or have_block
+  blkio = dependency('blkio',
+                     method: 'pkg-config',
+                     required: get_option('blkio'),
+                     kwargs: static_kwargs)
+endif
 curl = not_found
 if not get_option('curl').auto() or have_block
   curl = dependency('libcurl', version: '>=7.29.0',
@@ -1755,6 +1762,7 @@ config_host_data.set('CONFIG_LIBUDEV', libudev.found())
 config_host_data.set('CONFIG_LZO', lzo.found())
 config_host_data.set('CONFIG_MPATH', mpathpersist.found())
 config_host_data.set('CONFIG_MPATH_NEW_API', mpathpersist_new_api)
+config_host_data.set('CONFIG_BLKIO', blkio.found())
 config_host_data.set('CONFIG_CURL', curl.found())
 config_host_data.set('CONFIG_CURSES', curses.found())
 config_host_data.set('CONFIG_GBM', gbm.found())
@@ -3909,6 +3917,7 @@ summary_info += {'PAM': pam}
 summary_info += {'iconv support': iconv}
 summary_info += {'curses support': curses}
 summary_info += {'virgl support': virgl}
+summary_info += {'blkio support': blkio}
 summary_info += {'curl support': curl}
 summary_info += {'Multipath support': mpathpersist}
 summary_info += {'PNG support': png}
diff --git a/block/blkio.c b/block/blkio.c
new file mode 100644
index 0000000000..7fbdbd7fae
--- /dev/null
+++ b/block/blkio.c
@@ -0,0 +1,659 @@
+#include "qemu/osdep.h"
+#include <blkio.h>
+#include "block/block_int.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
+#include "qemu/module.h"
+
+typedef struct BlkAIOCB {
+    BlockAIOCB common;
+    struct blkio_mem_region mem_region;
+    QEMUIOVector qiov;
+    struct iovec bounce_iov;
+} BlkioAIOCB;
+
+typedef struct {
+    /* Protects ->blkio and request submission on ->blkioq */
+    QemuMutex lock;
+
+    struct blkio *blkio;
+    struct blkioq *blkioq; /* this could be multi-queue in the future */
+    int completion_fd;
+
+    /* Polling fetches the next completion into this field */
+    struct blkio_completion poll_completion;
+
+    /* The value of the "mem-region-alignment" property */
+    size_t mem_region_alignment;
+
+    /* Can we skip adding/deleting blkio_mem_regions? */
+    bool needs_mem_regions;
+} BDRVBlkioState;
+
+static void blkio_aiocb_complete(BlkioAIOCB *acb, int ret)
+{
+    /* Copy bounce buffer back to qiov */
+    if (acb->qiov.niov > 0) {
+        qemu_iovec_from_buf(&acb->qiov, 0,
+                acb->bounce_iov.iov_base,
+                acb->bounce_iov.iov_len);
+        qemu_iovec_destroy(&acb->qiov);
+    }
+
+    acb->common.cb(acb->common.opaque, ret);
+
+    if (acb->mem_region.len > 0) {
+        BDRVBlkioState *s = acb->common.bs->opaque;
+
+        WITH_QEMU_LOCK_GUARD(&s->lock) {
+            blkio_free_mem_region(s->blkio, &acb->mem_region);
+        }
+    }
+
+    qemu_aio_unref(&acb->common);
+}
+
+/*
+ * Only the thread that calls aio_poll() invokes fd and poll handlers.
+ * Therefore locks are not necessary except when accessing s->blkio.
+ *
+ * No locking is performed around blkioq_get_completions() although other
+ * threads may submit I/O requests on s->blkioq. We're assuming there is no
+ * interference between blkioq_get_completions() and other s->blkioq APIs.
+ */
+
+static void blkio_completion_fd_read(void *opaque)
+{
+    BlockDriverState *bs = opaque;
+    BDRVBlkioState *s = bs->opaque;
+    struct blkio_completion completion;
+    uint64_t val;
+    ssize_t ret __attribute__((unused));
+
+    /* Polling may have already fetched a completion */
+    if (s->poll_completion.user_data != NULL) {
+        completion = s->poll_completion;
+
+        /* Clear it in case blkio_aiocb_complete() has a nested event loop */
+        s->poll_completion.user_data = NULL;
+
+        blkio_aiocb_complete(completion.user_data, completion.ret);
+    }
+
+    /* Reset completion fd status */
+    ret = read(s->completion_fd, &val, sizeof(val));
+
+    /*
+     * Reading one completion at a time makes nested event loop re-entrancy
+     * simple. Change this loop to get multiple completions in one go if it
+     * becomes a performance bottleneck.
+     */
+    while (blkioq_do_io(s->blkioq, &completion, 0, 1, NULL) == 1) {
+        blkio_aiocb_complete(completion.user_data, completion.ret);
+    }
+}
+
+static bool blkio_completion_fd_poll(void *opaque)
+{
+    BlockDriverState *bs = opaque;
+    BDRVBlkioState *s = bs->opaque;
+
+    /* Just in case we already fetched a completion */
+    if (s->poll_completion.user_data != NULL) {
+        return true;
+    }
+
+    return blkioq_do_io(s->blkioq, &s->poll_completion, 0, 1, NULL) == 1;
+}
+
+static void blkio_completion_fd_poll_ready(void *opaque)
+{
+    blkio_completion_fd_read(opaque);
+}
+
+static void blkio_attach_aio_context(BlockDriverState *bs,
+                                     AioContext *new_context)
+{
+    BDRVBlkioState *s = bs->opaque;
+
+    aio_set_fd_handler(new_context,
+                       s->completion_fd,
+                       false,
+                       blkio_completion_fd_read,
+                       NULL,
+                       blkio_completion_fd_poll,
+                       blkio_completion_fd_poll_ready,
+                       bs);
+}
+
+static void blkio_detach_aio_context(BlockDriverState *bs)
+{
+    BDRVBlkioState *s = bs->opaque;
+
+    aio_set_fd_handler(bdrv_get_aio_context(bs),
+                       s->completion_fd,
+                       false, NULL, NULL, NULL, NULL, NULL);
+}
+
+static const AIOCBInfo blkio_aiocb_info = {
+    .aiocb_size = sizeof(BlkioAIOCB),
+};
+
+/* Create a BlkioAIOCB */
+static BlkioAIOCB *blkio_aiocb_get(BlockDriverState *bs,
+                                   BlockCompletionFunc *cb,
+                                   void *opaque)
+{
+    BlkioAIOCB *acb = qemu_aio_get(&blkio_aiocb_info, bs, cb, opaque);
+
+    /* A few fields need to be initialized, leave the rest... */
+    acb->qiov.niov = 0;
+    acb->mem_region.len = 0;
+    return acb;
+}
+
+/* s->lock must be held */
+static int blkio_aiocb_init_mem_region_locked(BlkioAIOCB *acb, size_t len)
+{
+    BDRVBlkioState *s = acb->common.bs->opaque;
+    size_t mem_region_len = QEMU_ALIGN_UP(len, s->mem_region_alignment);
+    int ret;
+
+    ret = blkio_alloc_mem_region(s->blkio, &acb->mem_region, mem_region_len);
+    if (ret < 0) {
+        return ret;
+    }
+
+    acb->bounce_iov.iov_base = acb->mem_region.addr;
+    acb->bounce_iov.iov_len = len;
+    return 0;
+}
+
+/* Call this to submit I/O after enqueuing a new request */
+static void blkio_submit_io(BlockDriverState *bs)
+{
+    if (qatomic_read(&bs->io_plugged) == 0) {
+        BDRVBlkioState *s = bs->opaque;
+
+        blkioq_do_io(s->blkioq, NULL, 0, 0, NULL);
+    }
+}
+
+static BlockAIOCB *blkio_aio_pdiscard(BlockDriverState *bs, int64_t offset,
+        int bytes, BlockCompletionFunc *cb, void *opaque)
+{
+    BDRVBlkioState *s = bs->opaque;
+    BlkioAIOCB *acb;
+
+    QEMU_LOCK_GUARD(&s->lock);
+
+    acb = blkio_aiocb_get(bs, cb, opaque);
+    blkioq_discard(s->blkioq, offset, bytes, acb, 0);
+    blkio_submit_io(bs);
+    return &acb->common;
+}
+
+static BlockAIOCB *blkio_aio_preadv(BlockDriverState *bs, int64_t offset,
+        int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags,
+        BlockCompletionFunc *cb, void *opaque)
+{
+    BDRVBlkioState *s = bs->opaque;
+    struct iovec *iov = qiov->iov;
+    int iovcnt = qiov->niov;
+    BlkioAIOCB *acb;
+
+    QEMU_LOCK_GUARD(&s->lock);
+
+    acb = blkio_aiocb_get(bs, cb, opaque);
+
+    if (s->needs_mem_regions) {
+        if (blkio_aiocb_init_mem_region_locked(acb, bytes) < 0) {
+            qemu_aio_unref(&acb->common);
+            return NULL;
+        }
+
+        /* Copy qiov because we'll call qemu_iovec_from_buf() on completion */
+        qemu_iovec_init_slice(&acb->qiov, qiov, 0, qiov->size);
+
+        iov = &acb->bounce_iov;
+        iovcnt = 1;
+    }
+
+    blkioq_readv(s->blkioq, offset, iov, iovcnt, acb, 0);
+    blkio_submit_io(bs);
+    return &acb->common;
+}
+
+static BlockAIOCB *blkio_aio_pwritev(BlockDriverState *bs, int64_t offset,
+        int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags,
+        BlockCompletionFunc *cb, void *opaque)
+{
+    uint32_t blkio_flags = (flags & BDRV_REQ_FUA) ? BLKIO_REQ_FUA : 0;
+    BDRVBlkioState *s = bs->opaque;
+    struct iovec *iov = qiov->iov;
+    int iovcnt = qiov->niov;
+    BlkioAIOCB *acb;
+
+    QEMU_LOCK_GUARD(&s->lock);
+
+    acb = blkio_aiocb_get(bs, cb, opaque);
+
+    if (s->needs_mem_regions) {
+        if (blkio_aiocb_init_mem_region_locked(acb, bytes) < 0) {
+            qemu_aio_unref(&acb->common);
+            return NULL;
+        }
+
+        qemu_iovec_to_buf(qiov, 0, acb->bounce_iov.iov_base, bytes);
+
+        iov = &acb->bounce_iov;
+        iovcnt = 1;
+    }
+
+    blkioq_writev(s->blkioq, offset, iov, iovcnt, acb, blkio_flags);
+    blkio_submit_io(bs);
+    return &acb->common;
+}
+
+static BlockAIOCB *blkio_aio_flush(BlockDriverState *bs,
+                                   BlockCompletionFunc *cb,
+                                   void *opaque)
+{
+    BDRVBlkioState *s = bs->opaque;
+    BlkioAIOCB *acb;
+
+    QEMU_LOCK_GUARD(&s->lock);
+
+    acb = blkio_aiocb_get(bs, cb, opaque);
+
+    blkioq_flush(s->blkioq, acb, 0);
+    blkio_submit_io(bs);
+    return &acb->common;
+}
+
+/* For async to .bdrv_co_*() conversion */
+typedef struct {
+    Coroutine *coroutine;
+    int ret;
+} BlkioCoData;
+
+static void blkio_co_pwrite_zeroes_complete(void *opaque, int ret)
+{
+    BlkioCoData *data = opaque;
+
+    data->ret = ret;
+    aio_co_wake(data->coroutine);
+}
+
+static int coroutine_fn blkio_co_pwrite_zeroes(BlockDriverState *bs,
+        int64_t offset, int64_t bytes, BdrvRequestFlags flags)
+{
+    BDRVBlkioState *s = bs->opaque;
+    BlkioCoData data = {
+        .coroutine = qemu_coroutine_self(),
+    };
+    uint32_t blkio_flags = 0;
+
+    if (flags & BDRV_REQ_FUA) {
+        blkio_flags |= BLKIO_REQ_FUA;
+    }
+    if (!(flags & BDRV_REQ_MAY_UNMAP)) {
+        blkio_flags |= BLKIO_REQ_NO_UNMAP;
+    }
+    if (flags & BDRV_REQ_NO_FALLBACK) {
+        blkio_flags |= BLKIO_REQ_NO_FALLBACK;
+    }
+
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        BlkioAIOCB *acb =
+            blkio_aiocb_get(bs, blkio_co_pwrite_zeroes_complete, &data);
+        blkioq_write_zeroes(s->blkioq, offset, bytes, acb, blkio_flags);
+        blkio_submit_io(bs);
+    }
+
+    qemu_coroutine_yield();
+    return data.ret;
+}
+
+static void blkio_io_unplug(BlockDriverState *bs)
+{
+    BDRVBlkioState *s = bs->opaque;
+
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        blkio_submit_io(bs);
+    }
+}
+
+static void blkio_parse_filename_io_uring(const char *filename, QDict *options,
+                                          Error **errp)
+{
+    bdrv_parse_filename_strip_prefix(filename, "io_uring:", options);
+}
+
+static void blkio_parse_filename_virtio_blk_vhost_vdpa(
+        const char *filename,
+        QDict *options,
+        Error **errp)
+{
+    bdrv_parse_filename_strip_prefix(filename, "virtio-blk-vhost-vdpa:", options);
+}
+
+static int blkio_io_uring_open(BlockDriverState *bs, QDict *options, int flags,
+                               Error **errp)
+{
+    const char *filename = qdict_get_try_str(options, "filename");
+    BDRVBlkioState *s = bs->opaque;
+    int ret;
+
+    ret = blkio_set_str(s->blkio, "path", filename);
+    qdict_del(options, "filename");
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to set path: %s",
+                         blkio_get_error_msg());
+        return ret;
+    }
+
+    if (flags & BDRV_O_NOCACHE) {
+        ret = blkio_set_bool(s->blkio, "direct", true);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "failed to set direct: %s",
+                             blkio_get_error_msg());
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+static int blkio_virtio_blk_vhost_vdpa_open(BlockDriverState *bs,
+        QDict *options, int flags, Error **errp)
+{
+    const char *path = qdict_get_try_str(options, "path");
+    BDRVBlkioState *s = bs->opaque;
+    int ret;
+
+    ret = blkio_set_str(s->blkio, "path", path);
+    qdict_del(options, "path");
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to set path: %s",
+                         blkio_get_error_msg());
+        return ret;
+    }
+
+    if (!(flags & BDRV_O_NOCACHE)) {
+        error_setg(errp, "cache.direct=off is not supported");
+        return -EINVAL;
+    }
+    return 0;
+}
+
+static int blkio_file_open(BlockDriverState *bs, QDict *options, int flags,
+                           Error **errp)
+{
+    const char *blkio_driver = bs->drv->protocol_name;
+    BDRVBlkioState *s = bs->opaque;
+    int ret;
+
+    ret = blkio_create(blkio_driver, &s->blkio);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "blkio_create failed: %s",
+                         blkio_get_error_msg());
+        return ret;
+    }
+
+    if (strcmp(blkio_driver, "io_uring") == 0) {
+        ret = blkio_io_uring_open(bs, options, flags, errp);
+    } else if (strcmp(blkio_driver, "virtio-blk-vhost-vdpa") == 0) {
+        ret = blkio_virtio_blk_vhost_vdpa_open(bs, options, flags, errp);
+    }
+    if (ret < 0) {
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
+    if (!(flags & BDRV_O_RDWR)) {
+        ret = blkio_set_bool(s->blkio, "readonly", true);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "failed to set readonly: %s",
+                             blkio_get_error_msg());
+            blkio_destroy(&s->blkio);
+            return ret;
+        }
+    }
+
+    ret = blkio_connect(s->blkio);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "blkio_connect failed: %s",
+                         blkio_get_error_msg());
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
+    ret = blkio_get_bool(s->blkio,
+                         "needs-mem-regions",
+                         &s->needs_mem_regions);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "failed to get needs-mem-regions: %s",
+                         blkio_get_error_msg());
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
+    ret = blkio_get_uint64(s->blkio,
+                           "mem-region-alignment",
+                           &s->mem_region_alignment);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "failed to get mem-region-alignment: %s",
+                         blkio_get_error_msg());
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
+    ret = blkio_start(s->blkio);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "blkio_start failed: %s",
+                         blkio_get_error_msg());
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
+    bs->supported_write_flags = BDRV_REQ_FUA;
+    bs->supported_zero_flags = BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP |
+                               BDRV_REQ_NO_FALLBACK;
+
+    qemu_mutex_init(&s->lock);
+    s->blkioq = blkio_get_queue(s->blkio, 0);
+    s->completion_fd = blkioq_get_completion_fd(s->blkioq);
+
+    blkio_attach_aio_context(bs, bdrv_get_aio_context(bs));
+    return 0;
+}
+
+static void blkio_close(BlockDriverState *bs)
+{
+    BDRVBlkioState *s = bs->opaque;
+
+    qemu_mutex_destroy(&s->lock);
+    blkio_destroy(&s->blkio);
+}
+
+static int64_t blkio_getlength(BlockDriverState *bs)
+{
+    BDRVBlkioState *s = bs->opaque;
+    uint64_t capacity;
+    int ret;
+
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        ret = blkio_get_uint64(s->blkio, "capacity", &capacity);
+    }
+    if (ret < 0) {
+        return ret;
+    }
+
+    return capacity;
+}
+
+static int blkio_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+    return 0;
+}
+
+static void blkio_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    BDRVBlkioState *s = bs->opaque;
+    int value;
+    int ret;
+
+    ret = blkio_get_int(s->blkio,
+                        "request-alignment",
+                        (int *)&bs->bl.request_alignment);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to get \"request-alignment\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    if (bs->bl.request_alignment < 1 ||
+        bs->bl.request_alignment >= INT_MAX ||
+        !is_power_of_2(bs->bl.request_alignment)) {
+        error_setg(errp, "invalid \"request-alignment\" value %d, must be "
+                   "power of 2 less than INT_MAX", bs->bl.request_alignment);
+        return;
+    }
+
+    ret = blkio_get_int(s->blkio,
+                        "optimal-io-size",
+                        (int *)&bs->bl.opt_transfer);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to get \"optimal-io-size\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    if (bs->bl.opt_transfer > INT_MAX ||
+        (bs->bl.opt_transfer % bs->bl.request_alignment)) {
+        error_setg(errp, "invalid \"optimal-io-size\" value %d, must be a "
+                   "multiple of %d", bs->bl.opt_transfer,
+                   bs->bl.request_alignment);
+        return;
+    }
+
+    ret = blkio_get_int(s->blkio,
+                        "max-transfer",
+                        (int *)&bs->bl.max_transfer);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to get \"max-transfer\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    if ((bs->bl.max_transfer % bs->bl.request_alignment) ||
+        (bs->bl.opt_transfer && (bs->bl.max_transfer % bs->bl.opt_transfer))) {
+        error_setg(errp, "invalid \"max-transfer\" value %d, must be a "
+                   "multiple of %d and %d (if non-zero)",
+                   bs->bl.max_transfer, bs->bl.request_alignment,
+                   bs->bl.opt_transfer);
+        return;
+    }
+
+    ret = blkio_get_int(s->blkio, "buf-alignment", &value);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to get \"buf-alignment\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    if (value < 1) {
+        error_setg(errp, "invalid \"buf-alignment\" value %d, must be "
+                   "positive", value);
+        return;
+    }
+    bs->bl.min_mem_alignment = value;
+
+    ret = blkio_get_int(s->blkio, "optimal-buf-alignment", &value);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "failed to get \"optimal-buf-alignment\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    if (value < 1) {
+        error_setg(errp, "invalid \"optimal-buf-alignment\" value %d, "
+                   "must be positive", value);
+        return;
+    }
+    bs->bl.opt_mem_alignment = value;
+
+    ret = blkio_get_int(s->blkio, "max-segments", &bs->bl.max_iov);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to get \"max-segments\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    if (bs->bl.max_iov < 1) {
+        error_setg(errp, "invalid \"max-segments\" value %d, must be positive",
+                   bs->bl.max_iov);
+        return;
+    }
+}
+
+/*
+ * TODO
+ * Missing libblkio APIs:
+ * - write zeroes
+ * - discard
+ * - block_status
+ * - co_invalidate_cache
+ *
+ * Out of scope?
+ * - create
+ * - truncate
+ */
+
+static BlockDriver bdrv_io_uring = {
+    .format_name                = "io_uring",
+    .protocol_name              = "io_uring",
+    .instance_size              = sizeof(BDRVBlkioState),
+    .bdrv_needs_filename        = true,
+    .bdrv_parse_filename        = blkio_parse_filename_io_uring,
+    .bdrv_file_open             = blkio_file_open,
+    .bdrv_close                 = blkio_close,
+    .bdrv_getlength             = blkio_getlength,
+    .bdrv_get_info              = blkio_get_info,
+    .bdrv_attach_aio_context    = blkio_attach_aio_context,
+    .bdrv_detach_aio_context    = blkio_detach_aio_context,
+    .bdrv_aio_pdiscard          = blkio_aio_pdiscard,
+    .bdrv_aio_preadv            = blkio_aio_preadv,
+    .bdrv_aio_pwritev           = blkio_aio_pwritev,
+    .bdrv_aio_flush             = blkio_aio_flush,
+    .bdrv_co_pwrite_zeroes      = blkio_co_pwrite_zeroes,
+    .bdrv_io_unplug             = blkio_io_unplug,
+    .bdrv_refresh_limits        = blkio_refresh_limits,
+};
+
+static BlockDriver bdrv_virtio_blk_vhost_vdpa = {
+    .format_name                = "virtio-blk-vhost-vdpa",
+    .protocol_name              = "virtio-blk-vhost-vdpa",
+    .instance_size              = sizeof(BDRVBlkioState),
+    .bdrv_needs_filename        = true,
+    .bdrv_parse_filename        = blkio_parse_filename_virtio_blk_vhost_vdpa,
+    .bdrv_file_open             = blkio_file_open,
+    .bdrv_close                 = blkio_close,
+    .bdrv_getlength             = blkio_getlength,
+    .bdrv_get_info              = blkio_get_info,
+    .bdrv_attach_aio_context    = blkio_attach_aio_context,
+    .bdrv_detach_aio_context    = blkio_detach_aio_context,
+    .bdrv_aio_pdiscard          = blkio_aio_pdiscard,
+    .bdrv_aio_preadv            = blkio_aio_preadv,
+    .bdrv_aio_pwritev           = blkio_aio_pwritev,
+    .bdrv_aio_flush             = blkio_aio_flush,
+    .bdrv_co_pwrite_zeroes      = blkio_co_pwrite_zeroes,
+    .bdrv_io_unplug             = blkio_io_unplug,
+    .bdrv_refresh_limits        = blkio_refresh_limits,
+};
+
+static void bdrv_blkio_init(void)
+{
+    bdrv_register(&bdrv_io_uring);
+    bdrv_register(&bdrv_virtio_blk_vhost_vdpa);
+}
+
+block_init(bdrv_blkio_init);
diff --git a/tests/qtest/modules-test.c b/tests/qtest/modules-test.c
index 88217686e1..be2575ae6d 100644
--- a/tests/qtest/modules-test.c
+++ b/tests/qtest/modules-test.c
@@ -16,6 +16,9 @@ static void test_modules_load(const void *data)
 int main(int argc, char *argv[])
 {
     const char *modules[] = {
+#ifdef CONFIG_BLKIO
+        "block-", "blkio",
+#endif
 #ifdef CONFIG_CURL
         "block-", "curl",
 #endif
diff --git a/block/meson.build b/block/meson.build
index 0b2a60c99b..787667384a 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -92,6 +92,7 @@ block_modules = {}
 
 modsrc = []
 foreach m : [
+  [blkio, 'blkio', files('blkio.c')],
   [curl, 'curl', files('curl.c')],
   [glusterfs, 'gluster', files('gluster.c')],
   [libiscsi, 'iscsi', [files('iscsi.c'), libm]],
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index d0e14fd6de..fb0d559eb1 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -69,6 +69,7 @@ meson_options_help() {
   printf "%s\n" '  auth-pam        PAM access control'
   printf "%s\n" '  avx2            AVX2 optimizations'
   printf "%s\n" '  avx512f         AVX512F optimizations'
+  printf "%s\n" '  blkio           libblkio block device driver'
   printf "%s\n" '  bochs           bochs image format support'
   printf "%s\n" '  bpf             eBPF support'
   printf "%s\n" '  brlapi          brlapi character device driver'
@@ -198,6 +199,8 @@ _meson_option_parse() {
     --disable-gcov) printf "%s" -Db_coverage=false ;;
     --enable-lto) printf "%s" -Db_lto=true ;;
     --disable-lto) printf "%s" -Db_lto=false ;;
+    --enable-blkio) printf "%s" -Dblkio=enabled ;;
+    --disable-blkio) printf "%s" -Dblkio=disabled ;;
     --block-drv-ro-whitelist=*) quote_sh "-Dblock_drv_ro_whitelist=$2" ;;
     --block-drv-rw-whitelist=*) quote_sh "-Dblock_drv_rw_whitelist=$2" ;;
     --enable-block-drv-whitelist-in-tools) printf "%s" -Dblock_drv_whitelist_in_tools=true ;;
-- 
2.36.1

From nobody Mon Feb 9 03:31:27 2026
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1657254226; cv=none; d=zohomail.com; s=zohoarc; b=kps8BosrIEjBauBgbQrqRIL27Wofa0UnGPF7U1LEFFOVHVI0YQKwte4tM+/h811zTeHgj1avSjC5+puRXW2SDwqSaFql5n40BB1lQgjd3dSKEDsszgahevPUp5TxWb8jq4ffCrVvaV0fGvycacNkaNYJNySHcQghROkecGCsN3U= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1657254226; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=B/CltSIRykMyJ7RW4TmqGM41Hs2rFlkk5NXlMCfqTvc=; b=WQbsbPhWOYPQiDCv1UlBhgCXaCY4rbUrqhxDFpLdCgRPNq+jzskbMABO1BPgxrbsdxOHW4G/8uU6S6RiAp5MeQBJuG/SYJl+ZYE/yEHC2BpMtAwEN5U4bR6Mq5lPWD2cIxmmrx+nx5T1tyI9T3UDRrZOYgRs/0NqX6+/IbkGthQ= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1657254226605992.9746911607673; Thu, 7 Jul 2022 21:23:46 -0700 (PDT) Received: from localhost ([::1]:33246 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o9fWr-0001dk-Dy for importer@patchew.org; Fri, 08 Jul 2022 00:23:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35466) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSn-0003Xq-K4 for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:33 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:43454) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSl-0000LT-SE for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:33 -0400 
Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-322-gGs1kax0NU2adppxY73bZA-1; Fri, 08 Jul 2022 00:19:27 -0400 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 893EB3802AC1; Fri, 8 Jul 2022 04:19:26 +0000 (UTC) Received: from localhost (unknown [10.39.192.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id 26C554010D2A; Fri, 8 Jul 2022 04:19:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1657253971; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=B/CltSIRykMyJ7RW4TmqGM41Hs2rFlkk5NXlMCfqTvc=; b=DbIlWEjnBtC2FgAhZlvhvdWwFVZ9+kmhSSL/jZiQ79ThCE/TS0tASM7eUEsNaY2GS1s8Dz pXIwkcik0iQ8bF2fWp5lYndU5ASdR8mS+AKmRC1Etbk3nc52kY+6zuP4RA2RlU8E1f0Ctl U7Ep7bVAnsVChsaZ+zCFjC1ipNbMiPA= X-MC-Unique: gGs1kax0NU2adppxY73bZA-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Alberto Faria , Stefan Hajnoczi , Vladimir Sementsov-Ogievskiy , "Michael S. 
Tsirkin" , Paolo Bonzini , Laurent Vivier , Eric Blake , sgarzare@redhat.com, Marcel Apfelbaum , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , qemu-block@nongnu.org, Eduardo Habkost , Vladimir Sementsov-Ogievskiy , John Snow , Thomas Huth , Kevin Wolf , Markus Armbruster , Hanna Reitz , Fam Zheng , Yanan Wang , David Hildenbrand Subject: [RFC v3 2/8] numa: call ->ram_block_removed() in ram_block_notifer_remove() Date: Fri, 8 Jul 2022 05:17:31 +0100 Message-Id: <20220708041737.1768521-3-stefanha@redhat.com> In-Reply-To: <20220708041737.1768521-1-stefanha@redhat.com> References: <20220708041737.1768521-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.84 on 10.11.54.2 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1657254227781100001 Content-Type: text/plain; charset="utf-8" When a RAMBlockNotifier is added, ->ram_block_added() is called with all existing RAMBlocks. There is no equivalent ->ram_block_removed() call when a RAMBlockNotifier is removed. 
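
As an illustration only (not part of this patch), the asymmetry can be seen in a toy model. ToyNotifier, ToyBlock, toy_notifier_add() and toy_notifier_remove() below are made-up stand-ins for RAMBlockNotifier and the RAMBlock list, not QEMU APIs. A listener that only counts callbacks ends up balanced only if removal replays ->removed() for each block that is still present:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for RAMBlockNotifier and RAMBlock; not QEMU types. */
typedef struct ToyNotifier ToyNotifier;
struct ToyNotifier {
    void (*added)(ToyNotifier *n, void *host, size_t size);
    void (*removed)(ToyNotifier *n, void *host, size_t size);
    int live; /* blocks this listener currently considers registered */
};

typedef struct {
    void *host;
    size_t size;
} ToyBlock;

static char toy_ram0[4096];
static char toy_ram1[8192];
static ToyBlock toy_blocks[] = {
    { toy_ram0, sizeof(toy_ram0) },
    { toy_ram1, sizeof(toy_ram1) },
};
enum { TOY_NBLOCKS = sizeof(toy_blocks) / sizeof(toy_blocks[0]) };

/* Adding a notifier replays ->added() for every existing block. */
void toy_notifier_add(ToyNotifier *n)
{
    for (size_t i = 0; i < TOY_NBLOCKS; i++) {
        if (n->added) {
            n->added(n, toy_blocks[i].host, toy_blocks[i].size);
        }
    }
}

/* Symmetric removal: replay ->removed() for every block still present. */
void toy_notifier_remove(ToyNotifier *n)
{
    for (size_t i = 0; i < TOY_NBLOCKS; i++) {
        if (n->removed) {
            n->removed(n, toy_blocks[i].host, toy_blocks[i].size);
        }
    }
}

static void toy_count_added(ToyNotifier *n, void *host, size_t size)
{
    (void)host;
    (void)size;
    n->live++;
}

static void toy_count_removed(ToyNotifier *n, void *host, size_t size)
{
    (void)host;
    (void)size;
    n->live--;
}

/* Returns how many blocks the listener still tracks after add + remove. */
int toy_roundtrip(void)
{
    ToyNotifier n = { toy_count_added, toy_count_removed, 0 };

    toy_notifier_add(&n);
    assert(n.live == TOY_NBLOCKS);
    toy_notifier_remove(&n);
    return n.live;
}
```

Without the loop in toy_notifier_remove(), the listener in this sketch would be left believing that every block is still registered.
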
The util/vfio-helpers.c code (the sole user of RAMBlockNotifier) is fine
with this asymmetry because it does not rely on RAMBlockNotifier for
cleanup. It walks its internal list of DMA mappings and unmaps them by
itself.

Future users of RAMBlockNotifier may not have an internal data structure
that records added RAMBlocks so they will need ->ram_block_removed()
callbacks.

This patch makes ram_block_notifier_remove() symmetric with respect to
callbacks. Now util/vfio-helpers.c needs to unmap remaining DMA mappings
after ram_block_notifier_remove() has been called. This is necessary
since users like block/nvme.c may create additional DMA mappings that do
not originate from the RAMBlockNotifier.

Reviewed-by: David Hildenbrand
Signed-off-by: Stefan Hajnoczi
---
 hw/core/numa.c      | 17 +++++++++++++++++
 util/vfio-helpers.c |  5 ++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 26d8e5f616..31e6fe1caa 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -822,6 +822,19 @@ static int ram_block_notify_add_single(RAMBlock *rb, void *opaque)
     return 0;
 }
 
+static int ram_block_notify_remove_single(RAMBlock *rb, void *opaque)
+{
+    const ram_addr_t max_size = qemu_ram_get_max_length(rb);
+    const ram_addr_t size = qemu_ram_get_used_length(rb);
+    void *host = qemu_ram_get_host_addr(rb);
+    RAMBlockNotifier *notifier = opaque;
+
+    if (host) {
+        notifier->ram_block_removed(notifier, host, size, max_size);
+    }
+    return 0;
+}
+
 void ram_block_notifier_add(RAMBlockNotifier *n)
 {
     QLIST_INSERT_HEAD(&ram_list.ramblock_notifiers, n, next);
@@ -835,6 +848,10 @@ void ram_block_notifier_add(RAMBlockNotifier *n)
 void ram_block_notifier_remove(RAMBlockNotifier *n)
 {
     QLIST_REMOVE(n, next);
+
+    if (n->ram_block_removed) {
+        qemu_ram_foreach_block(ram_block_notify_remove_single, n);
+    }
 }
 
 void ram_block_notify_add(void *host, size_t size, size_t max_size)
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 5ba01177bf..0d1520caac 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -847,10 +847,13 @@ void qemu_vfio_close(QEMUVFIOState *s)
     if (!s) {
         return;
     }
+
+    ram_block_notifier_remove(&s->ram_notifier);
+
     for (i = 0; i < s->nr_mappings; ++i) {
         qemu_vfio_undo_mapping(s, &s->mappings[i], NULL);
     }
-    ram_block_notifier_remove(&s->ram_notifier);
+
     g_free(s->usable_iova_ranges);
     s->nb_iova_ranges = 0;
     qemu_vfio_reset(s);
-- 
2.36.1

From nobody Mon Feb 9 03:31:27 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1657254242; cv=none; d=zohomail.com; s=zohoarc; b=P2HWJA4kViI13RGca9HIpllRrOdYENd2DhycsW66UExDgBEmOtG7fC397bHVjfTgLb+XjP2+YAI3ImN1pqX+5XJMDbVYLVOSNhm8U6ZVept4/rNS+qc2OlPBDt7VxveSTyHBK3kU1ZXIdwBZ+NVan+6aX//AhT4lTEEO9IQyImI= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1657254242; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=Fi0bQ6l8vbRSQxvXQVnlbB5PSCBj8OEguP1xKeU1T0Y=; b=bDNMm/TUwHRO49acZpnNe9FQ1/AxN49+pSrGiUR79baig1maKzLKFBs1enFM3noz3VbOM9JBu4BQ07IbG0OuclqFTjA/pQr7fkWevtvf23q7rfSSA5Lp+T0OEmIli39thePIgVXrvl0om086gs4kuGL3cP/jEY7Q0o08BPuQ9QY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1657254242438893.7035983049369; Thu, 7 Jul 2022 21:24:02 -0700 (PDT) Received: from localhost 
([::1]:34278 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o9fX7-0002KK-8N for importer@patchew.org; Fri, 08 Jul 2022 00:24:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35480) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSo-0003Z6-4l for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:34 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:25324) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSm-0000Li-8L for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:33 -0400 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-607-XkZXXdqePcOBC6bvrE8XmQ-1; Fri, 08 Jul 2022 00:19:28 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 0659085A583; Fri, 8 Jul 2022 04:19:28 +0000 (UTC) Received: from localhost (unknown [10.39.192.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id 932FD18EB5; Fri, 8 Jul 2022 04:19:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1657253971; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Fi0bQ6l8vbRSQxvXQVnlbB5PSCBj8OEguP1xKeU1T0Y=; b=OEwTwTQRxCmZvJH2VBoBfNrBoyIuKjfzrbg4v9xXhiIuYnrNpQulY7wF+NC+bP2Nm2yGBy pmgp8Z6WuT7GUHVRwtFwI7sVs2VRTv1wnpvge3jfTOO04mG7DaxVyf1LNVNbqRIULR3kso IhFULWCriK0XNWM1gS26tSgrFrmzOSk= X-MC-Unique: XkZXXdqePcOBC6bvrE8XmQ-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: 
Alberto Faria , Stefan Hajnoczi , Vladimir Sementsov-Ogievskiy , "Michael S. Tsirkin" , Paolo Bonzini , Laurent Vivier , Eric Blake , sgarzare@redhat.com, Marcel Apfelbaum , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , qemu-block@nongnu.org, Eduardo Habkost , Vladimir Sementsov-Ogievskiy , John Snow , Thomas Huth , Kevin Wolf , Markus Armbruster , Hanna Reitz , Fam Zheng , Yanan Wang Subject: [RFC v3 3/8] block: pass size to bdrv_unregister_buf() Date: Fri, 8 Jul 2022 05:17:32 +0100 Message-Id: <20220708041737.1768521-4-stefanha@redhat.com> In-Reply-To: <20220708041737.1768521-1-stefanha@redhat.com> References: <20220708041737.1768521-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1657254243693100001 Content-Type: text/plain; charset="utf-8" The only implementor of bdrv_register_buf() is block/nvme.c, where the size is not needed when unregistering a buffer. 
This is because util/vfio-helpers.c can look up mappings by address.

Future block drivers that implement bdrv_register_buf() may not be able
to do their job given only the buffer address. Add a size argument to
bdrv_unregister_buf().

Also document the assumptions about
bdrv_register_buf()/bdrv_unregister_buf() calls. The same values that
were given to bdrv_register_buf() must be given to
bdrv_unregister_buf().

gcc 11.2.1 emits a spurious warning that img_bench()'s buf_size local
variable might be uninitialized, so it's necessary to silence the
compiler.

Signed-off-by: Stefan Hajnoczi
Reviewed-by: Hanna Reitz
---
 include/block/block-global-state.h          | 5 ++++-
 include/block/block_int-common.h            | 2 +-
 include/sysemu/block-backend-global-state.h | 2 +-
 block/block-backend.c                       | 4 ++--
 block/io.c                                  | 6 +++---
 block/nvme.c                                | 2 +-
 qemu-img.c                                  | 4 ++--
 7 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index 21265e3966..7901f35863 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -243,9 +243,12 @@ void bdrv_del_child(BlockDriverState *parent, BdrvChild *child, Error **errp);
  * Register/unregister a buffer for I/O. For example, VFIO drivers are
  * interested to know the memory areas that would later be used for I/O, so
  * that they can prepare IOMMU mapping etc., to get better performance.
+ *
+ * Buffers must not overlap and they must be unregistered with the same values
+ * that they were registered with.
  */
 void bdrv_register_buf(BlockDriverState *bs, void *host, size_t size);
-void bdrv_unregister_buf(BlockDriverState *bs, void *host);
+void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size);
 
 void bdrv_cancel_in_flight(BlockDriverState *bs);
 
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 8947abab76..b7a7cbd3a5 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -435,7 +435,7 @@ struct BlockDriver {
      * DMA mapping for hot buffers.
      */
     void (*bdrv_register_buf)(BlockDriverState *bs, void *host, size_t size);
-    void (*bdrv_unregister_buf)(BlockDriverState *bs, void *host);
+    void (*bdrv_unregister_buf)(BlockDriverState *bs, void *host, size_t size);
 
     /*
      * This field is modified only under the BQL, and is part of
diff --git a/include/sysemu/block-backend-global-state.h b/include/sysemu/block-backend-global-state.h
index 415f0c91d7..97f7dad2c3 100644
--- a/include/sysemu/block-backend-global-state.h
+++ b/include/sysemu/block-backend-global-state.h
@@ -107,7 +107,7 @@ void blk_io_limits_update_group(BlockBackend *blk, const char *group);
 void blk_set_force_allow_inactivate(BlockBackend *blk);
 
 void blk_register_buf(BlockBackend *blk, void *host, size_t size);
-void blk_unregister_buf(BlockBackend *blk, void *host);
+void blk_unregister_buf(BlockBackend *blk, void *host, size_t size);
 
 const BdrvChild *blk_root(BlockBackend *blk);
 
diff --git a/block/block-backend.c b/block/block-backend.c
index f425b00793..44f7c61e0b 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2581,10 +2581,10 @@ void blk_register_buf(BlockBackend *blk, void *host, size_t size)
     bdrv_register_buf(blk_bs(blk), host, size);
 }
 
-void blk_unregister_buf(BlockBackend *blk, void *host)
+void blk_unregister_buf(BlockBackend *blk, void *host, size_t size)
 {
     GLOBAL_STATE_CODE();
-    bdrv_unregister_buf(blk_bs(blk), host);
+    bdrv_unregister_buf(blk_bs(blk), host, size);
 }
 
 int coroutine_fn blk_co_copy_range(BlockBackend *blk_in, int64_t off_in,
diff --git a/block/io.c b/block/io.c
index 1e9bf09a49..e7f4117fe7 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3350,16 +3350,16 @@ void bdrv_register_buf(BlockDriverState *bs, void *host, size_t size)
     }
 }
 
-void bdrv_unregister_buf(BlockDriverState *bs, void *host)
+void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size)
 {
     BdrvChild *child;
 
     GLOBAL_STATE_CODE();
     if (bs->drv && bs->drv->bdrv_unregister_buf) {
-        bs->drv->bdrv_unregister_buf(bs, host);
+        bs->drv->bdrv_unregister_buf(bs, host, size);
     }
     QLIST_FOREACH(child, &bs->children, next) {
-        bdrv_unregister_buf(child->bs, host);
+        bdrv_unregister_buf(child->bs, host, size);
     }
 }
 
diff --git a/block/nvme.c b/block/nvme.c
index 01fb28aa63..696502acea 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -1592,7 +1592,7 @@ static void nvme_register_buf(BlockDriverState *bs, void *host, size_t size)
     }
 }
 
-static void nvme_unregister_buf(BlockDriverState *bs, void *host)
+static void nvme_unregister_buf(BlockDriverState *bs, void *host, size_t size)
 {
     BDRVNVMeState *s = bs->opaque;
 
diff --git a/qemu-img.c b/qemu-img.c
index 4cf4d2423d..b7ffc37a49 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4368,7 +4368,7 @@ static int img_bench(int argc, char **argv)
     struct timeval t1, t2;
     int i;
     bool force_share = false;
-    size_t buf_size;
+    size_t buf_size = 0;
 
     for (;;) {
         static const struct option long_options[] = {
@@ -4590,7 +4590,7 @@ static int img_bench(int argc, char **argv)
 
 out:
     if (data.buf) {
-        blk_unregister_buf(blk, data.buf);
+        blk_unregister_buf(blk, data.buf, buf_size);
     }
     qemu_vfree(data.buf);
     blk_unref(blk);
-- 
2.36.1

From nobody Mon Feb 9 03:31:27 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; 
dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1657254084; cv=none; d=zohomail.com; s=zohoarc; b=fSyMNbtWsHYhJiKOEVU7vPLMlWpWaYSZ1MFTSGXXlH6ep1WI9JceT3YZp7OZd8cgxsrSuN/1TLDJICyRhTRd2kMPFoTloc8Eac25nKsX++NqvjNGvpyaQyO3LYEFrTU434SRIAHumu5e4xUXJVcV/3QSdI+MupEQLdBqBqbXyRY= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1657254084; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=A1deBwnpNjSq9Xt6LKlX/UdBwzqO6LBadYyKsO3v2PY=; b=AsEvxoBbD3IesVAq4xZFLgpxpFgjaimBdkByUYH8/55yVBy7Lkk+wfJsFIDY5IbTVttTISH9/gC4bflEzs5pvyGrov6jZY0I3qkvVm+IV2osQoI3PsjVCBaQuaJAxQWcnNY0XBo4/TRKVAYkS+2faP77bK0cdy+rjI5hGm+dlck= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1657254084277779.6189619330928; Thu, 7 Jul 2022 21:21:24 -0700 (PDT) Received: from localhost ([::1]:55794 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o9fUZ-0006EC-3g for importer@patchew.org; Fri, 08 Jul 2022 00:21:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35528) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSp-0003d5-W2 for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:36 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:54929) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSo-0000MK-0H for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:35 -0400 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com 
[66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-211-UBVVi__fPpqcNg9djzxLPw-1; Fri, 08 Jul 2022 00:19:30 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9B775811E76; Fri, 8 Jul 2022 04:19:29 +0000 (UTC) Received: from localhost (unknown [10.39.192.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4172FC2811A; Fri, 8 Jul 2022 04:19:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1657253973; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=A1deBwnpNjSq9Xt6LKlX/UdBwzqO6LBadYyKsO3v2PY=; b=G6XoAgDdmMsIKnWN9nJrzKPMpGFAKI5N2oFVJGER80kpTjVgdymxJsy5edDGSRk7OUtJ5O uXJFblxq5FGmWw2GT4O4cEX6OO2OoI4HOH5+/o1MvE75A4JoAoj1vUJU+Gt7A9Kyn6Bgz9 F7YbIn95qdfdyx4I/yfOY2o8bbawmgg= X-MC-Unique: UBVVi__fPpqcNg9djzxLPw-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Alberto Faria , Stefan Hajnoczi , Vladimir Sementsov-Ogievskiy , "Michael S. 
Tsirkin" , Paolo Bonzini , Laurent Vivier , Eric Blake , sgarzare@redhat.com, Marcel Apfelbaum , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , qemu-block@nongnu.org, Eduardo Habkost , Vladimir Sementsov-Ogievskiy , John Snow , Thomas Huth , Kevin Wolf , Markus Armbruster , Hanna Reitz , Fam Zheng , Yanan Wang Subject: [RFC v3 4/8] block: add BDRV_REQ_REGISTERED_BUF request flag Date: Fri, 8 Jul 2022 05:17:33 +0100 Message-Id: <20220708041737.1768521-5-stefanha@redhat.com> In-Reply-To: <20220708041737.1768521-1-stefanha@redhat.com> References: <20220708041737.1768521-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.85 on 10.11.54.8 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1657254085491100001 Content-Type: text/plain; charset="utf-8" Block drivers may optimize I/O requests accessing buffers previously registered with bdrv_register_buf(). 
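
To make the cost concrete, a driver without any hint would have to run something like the following check on every request. This is a sketch under assumptions: ToyIov, ToyRegisteredBuf and toy_qiov_registered() are hypothetical names (QEMU itself uses QEMUIOVector and struct iovec), and the linear scan is only one plausible lookup strategy.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not QEMU's QEMUIOVector. */
typedef struct {
    void *base;
    size_t len;
} ToyIov;

typedef struct {
    uintptr_t start;
    size_t size;
} ToyRegisteredBuf;

/* True if the element lies entirely inside one registered range. */
static bool toy_elem_registered(const ToyRegisteredBuf *bufs, size_t nbufs,
                                const ToyIov *elem)
{
    uintptr_t base = (uintptr_t)elem->base;

    for (size_t i = 0; i < nbufs; i++) {
        /* Written to avoid overflow in base + len */
        if (base >= bufs[i].start &&
            elem->len <= bufs[i].size &&
            base - bufs[i].start <= bufs[i].size - elem->len) {
            return true;
        }
    }
    return false;
}

/* Worst case O(niov * nbufs) work for a single request. */
bool toy_qiov_registered(const ToyRegisteredBuf *bufs, size_t nbufs,
                         const ToyIov *iov, size_t niov)
{
    for (size_t i = 0; i < niov; i++) {
        if (!toy_elem_registered(bufs, nbufs, &iov[i])) {
            return false;
        }
    }
    return true;
}
```
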
Checking whether all elements of a request's QEMUIOVector are within
previously registered buffers is expensive, so we need a hint from the
user to avoid costly checks.

Add a BDRV_REQ_REGISTERED_BUF request flag to indicate that all
QEMUIOVector elements in an I/O request are known to be within
previously registered buffers.

bdrv_aligned_preadv() is strict in validating supported read flags and
its assertions fail when it sees BDRV_REQ_REGISTERED_BUF. There is no
harm in passing BDRV_REQ_REGISTERED_BUF to block drivers that do not
support it, so update the assertions to ignore BDRV_REQ_REGISTERED_BUF.

Care must be taken to clear the flag when the block layer or filter
drivers replace QEMUIOVector elements with bounce buffers since these
have not been registered with bdrv_register_buf(). A lot of the changes
in this commit deal with clearing the flag in those cases.

Ensuring that the flag is cleared properly is somewhat invasive to
implement across the block layer and it's hard to spot when future code
changes accidentally break it. Another option might be to add a flag to
QEMUIOVector itself and clear it in qemu_iovec_*() functions that modify
elements. That is more robust but somewhat of a layering violation, so I
haven't attempted that.

Signed-off-by: Stefan Hajnoczi
---
 include/block/block-common.h |  9 +++++++++
 block/blkverify.c            |  4 ++--
 block/crypto.c               |  2 ++
 block/io.c                   | 30 +++++++++++++++++++++++-------
 block/mirror.c               |  2 ++
 block/raw-format.c           |  2 ++
 6 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/include/block/block-common.h b/include/block/block-common.h
index fdb7306e78..061606e867 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -80,6 +80,15 @@ typedef enum {
      */
     BDRV_REQ_MAY_UNMAP          = 0x4,
 
+    /*
+     * An optimization hint when all QEMUIOVector elements are within
+     * previously registered bdrv_register_buf() memory ranges.
+     *
+     * Code that replaces the user's QEMUIOVector elements with bounce buffers
+     * must take care to clear this flag.
+     */
+    BDRV_REQ_REGISTERED_BUF     = 0x8,
+
     BDRV_REQ_FUA                = 0x10,
     BDRV_REQ_WRITE_COMPRESSED   = 0x20,
 
diff --git a/block/blkverify.c b/block/blkverify.c
index e4a37af3b2..d624f4fd05 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -235,8 +235,8 @@ blkverify_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
     qemu_iovec_init(&raw_qiov, qiov->niov);
     qemu_iovec_clone(&raw_qiov, qiov, buf);
 
-    ret = blkverify_co_prwv(bs, &r, offset, bytes, qiov, &raw_qiov, flags,
-                            false);
+    ret = blkverify_co_prwv(bs, &r, offset, bytes, qiov, &raw_qiov,
+                            flags & ~BDRV_REQ_REGISTERED_BUF, false);
 
     cmp_offset = qemu_iovec_compare(qiov, &raw_qiov);
     if (cmp_offset != -1) {
diff --git a/block/crypto.c b/block/crypto.c
index 1ba82984ef..c900355adb 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -473,6 +473,8 @@ block_crypto_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes,
     uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
     uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
 
+    flags &= ~BDRV_REQ_REGISTERED_BUF;
+
     assert(!(flags & ~BDRV_REQ_FUA));
     assert(payload_offset < INT64_MAX);
     assert(QEMU_IS_ALIGNED(offset, sector_size));
diff --git a/block/io.c b/block/io.c
index e7f4117fe7..83b8259227 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1541,11 +1541,14 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
     max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
                                    align);
 
-    /* TODO: We would need a per-BDS .supported_read_flags and
+    /*
+     * TODO: We would need a per-BDS .supported_read_flags and
      * potential fallback support, if we ever implement any read flags
      * to pass through to drivers. For now, there aren't any
-     * passthrough flags. */
-    assert(!(flags & ~(BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH)));
+     * passthrough flags except the BDRV_REQ_REGISTERED_BUF optimization hint.
+     */
+    assert(!(flags & ~(BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH |
+                       BDRV_REQ_REGISTERED_BUF)));
 
     /* Handle Copy on Read and associated serialisation */
     if (flags & BDRV_REQ_COPY_ON_READ) {
@@ -1586,7 +1589,7 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
         goto out;
     }
 
-    assert(!(flags & ~bs->supported_read_flags));
+    assert(!(flags & ~(bs->supported_read_flags | BDRV_REQ_REGISTERED_BUF)));
 
     max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align);
     if (bytes <= max_bytes && bytes <= max_transfer) {
@@ -1775,7 +1778,8 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
 static int bdrv_pad_request(BlockDriverState *bs,
                             QEMUIOVector **qiov, size_t *qiov_offset,
                             int64_t *offset, int64_t *bytes,
-                            BdrvRequestPadding *pad, bool *padded)
+                            BdrvRequestPadding *pad, bool *padded,
+                            BdrvRequestFlags *flags)
 {
     int ret;
 
@@ -1803,6 +1807,10 @@ static int bdrv_pad_request(BlockDriverState *bs,
     if (padded) {
         *padded = true;
     }
+    if (flags) {
+        /* Can't use optimization hint with bounce buffer */
+        *flags &= ~BDRV_REQ_REGISTERED_BUF;
+    }
 
     return 0;
 }
@@ -1857,7 +1865,7 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
     }
 
     ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
-                           NULL);
+                           NULL, &flags);
     if (ret < 0) {
         goto fail;
     }
@@ -1902,6 +1910,11 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
         return -ENOTSUP;
     }
 
+    /* By definition there is no user buffer so this flag doesn't make sense */
+    if (flags & BDRV_REQ_REGISTERED_BUF) {
+        return -EINVAL;
+    }
+
     /* Invalidate the cached block-status data range if this write overlaps */
     bdrv_bsc_invalidate_range(bs, offset, bytes);
 
@@ -2187,6 +2200,9 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child,
     bool padding;
     BdrvRequestPadding pad;
 
+    /* This flag doesn't make sense for padding or zero writes */
+    flags &= ~BDRV_REQ_REGISTERED_BUF;
+
     padding = bdrv_init_padding(bs, offset, bytes, &pad);
     if (padding) {
         assert(!(flags & BDRV_REQ_NO_WAIT));
@@ -2304,7 +2320,7 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
          * alignment only if there is no ZERO flag.
          */
         ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
-                               &padded);
+                               &padded, &flags);
         if (ret < 0) {
             return ret;
         }
diff --git a/block/mirror.c b/block/mirror.c
index 3c4ab1159d..8d3fc3f19b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1477,6 +1477,8 @@ static int coroutine_fn bdrv_mirror_top_pwritev(BlockDriverState *bs,
         qemu_iovec_init(&bounce_qiov, 1);
         qemu_iovec_add(&bounce_qiov, bounce_buf, bytes);
         qiov = &bounce_qiov;
+
+        flags &= ~BDRV_REQ_REGISTERED_BUF;
     }
 
     ret = bdrv_mirror_top_do_write(bs, MIRROR_METHOD_COPY, offset, bytes, qiov,
diff --git a/block/raw-format.c b/block/raw-format.c
index 69fd650eaf..9bae3dd7f2 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -258,6 +258,8 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, int64_t offset,
         qemu_iovec_add(&local_qiov, buf, 512);
         qemu_iovec_concat(&local_qiov, qiov, 512, qiov->size - 512);
         qiov = &local_qiov;
+
+        flags &= ~BDRV_REQ_REGISTERED_BUF;
     }
 
     ret = raw_adjust_offset(bs, &offset, bytes, true);
-- 
2.36.1

From nobody Mon Feb 9 03:31:27 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1657254238; cv=none; d=zohomail.com; s=zohoarc; b=W7ZjMLkwzjbz8A75kmviorMY/NyCGXusojwThgvukrKyQPHfzcW+b5iytjq1vSP1NNpHdnnusdsu805O4lsDlazdljkP1FEgJcCvl2UXQ1Lmp3qhHkNiQobXE5gBK+ZLUfctUzoQksKBweRGXLyzOKIiQfyEy7D/MI5QFYbd/lA= ARC-Message-Signature: 
i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1657254238; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=q+uLPCMNUisDw58MHbGAjYx3doQt6Uk+rRiNr5fVsqg=; b=mH8ucg8xRoPnn2DG9rsaL2QFIfW/kEXlHM6XagNTGSvpckDu0KFmz34ke2zZWQ/+tEUgQ9rmAckVj2XaVH/TLztFGY90YfMLsb1r83rXNnAOR1OePip7q+FZuupE7JSeCcNdohFrkQngWK/VkQle1s9iasL4ViKMgoy7juF00lA= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1657254238098383.3612848422804; Thu, 7 Jul 2022 21:23:58 -0700 (PDT) Received: from localhost ([::1]:34066 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o9fX3-0002AW-16 for importer@patchew.org; Fri, 08 Jul 2022 00:23:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35548) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSr-0003gU-3b for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:37 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:42890) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSp-0000MT-C8 for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:36 -0400 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-85-fd1yKlhINxOjZUUlXw_NQg-1; Fri, 08 Jul 2022 00:19:31 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher 
AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 10EE1181E06C; Fri, 8 Jul 2022 04:19:31 +0000 (UTC) Received: from localhost (unknown [10.39.192.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9C8F52026D64; Fri, 8 Jul 2022 04:19:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1657253973; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=q+uLPCMNUisDw58MHbGAjYx3doQt6Uk+rRiNr5fVsqg=; b=aUNnp7ZhKq1akkfympnd+14oxO7TQ+1rwOXUuhDb0BubiJN3FI/ZrSaETMvIQUgQlEYjid OTTlxfcR7hGOAT/wuy4nGs8PjII6WcRpCMvqoaY/ulUqSBHURA2cV5Jcb4a42DHx971XgA JW0JIcesQnj+4kejL75TY/UKZCLZqFM= X-MC-Unique: fd1yKlhINxOjZUUlXw_NQg-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Alberto Faria , Stefan Hajnoczi , Vladimir Sementsov-Ogievskiy , "Michael S. 
Tsirkin" , Paolo Bonzini , Laurent Vivier , Eric Blake , sgarzare@redhat.com, Marcel Apfelbaum , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , qemu-block@nongnu.org, Eduardo Habkost , Vladimir Sementsov-Ogievskiy , John Snow , Thomas Huth , Kevin Wolf , Markus Armbruster , Hanna Reitz , Fam Zheng , Yanan Wang Subject: [RFC v3 5/8] block: add BlockRAMRegistrar Date: Fri, 8 Jul 2022 05:17:34 +0100 Message-Id: <20220708041737.1768521-6-stefanha@redhat.com> In-Reply-To: <20220708041737.1768521-1-stefanha@redhat.com> References: <20220708041737.1768521-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1657254239838100003 Content-Type: text/plain; charset="utf-8" Emulated devices and other BlockBackend users wishing to take advantage of blk_register_buf() all have the same repetitive job: register RAMBlocks with the BlockBackend using RAMBlockNotifier. Add a BlockRAMRegistrar API to do this. 
A later commit will use this from hw/block/virtio-blk.c.

Signed-off-by: Stefan Hajnoczi
---
 MAINTAINERS                          |  1 +
 include/sysemu/block-ram-registrar.h | 30 +++++++++++++++++++++
 block/block-ram-registrar.c          | 39 ++++++++++++++++++++++++++++
 block/meson.build                    |  1 +
 4 files changed, 71 insertions(+)
 create mode 100644 include/sysemu/block-ram-registrar.h
 create mode 100644 block/block-ram-registrar.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 50f340d9ee..d16189449f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2490,6 +2490,7 @@ F: block*
 F: block/
 F: hw/block/
 F: include/block/
+F: include/sysemu/block-*.h
 F: qemu-img*
 F: docs/tools/qemu-img.rst
 F: qemu-io*
diff --git a/include/sysemu/block-ram-registrar.h b/include/sysemu/block-ram-registrar.h
new file mode 100644
index 0000000000..09d63f64b2
--- /dev/null
+++ b/include/sysemu/block-ram-registrar.h
@@ -0,0 +1,30 @@
+/*
+ * BlockBackend RAM Registrar
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef BLOCK_RAM_REGISTRAR_H
+#define BLOCK_RAM_REGISTRAR_H
+
+#include "exec/ramlist.h"
+
+/**
+ * struct BlockRAMRegistrar:
+ *
+ * Keeps RAMBlock memory registered with a BlockBackend using
+ * blk_register_buf() including hotplugged memory.
+ *
+ * Emulated devices or other BlockBackend users initialize a BlockRAMRegistrar
+ * with blk_ram_registrar_init() before submitting I/O requests with the
+ * BDRV_REQ_REGISTERED_BUF flag set.
+ */
+typedef struct {
+    BlockBackend *blk;
+    RAMBlockNotifier notifier;
+} BlockRAMRegistrar;
+
+void blk_ram_registrar_init(BlockRAMRegistrar *r, BlockBackend *blk);
+void blk_ram_registrar_destroy(BlockRAMRegistrar *r);
+
+#endif /* BLOCK_RAM_REGISTRAR_H */
diff --git a/block/block-ram-registrar.c b/block/block-ram-registrar.c
new file mode 100644
index 0000000000..32a14b69ae
--- /dev/null
+++ b/block/block-ram-registrar.c
@@ -0,0 +1,39 @@
+/*
+ * BlockBackend RAM Registrar
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/block-backend.h"
+#include "sysemu/block-ram-registrar.h"
+
+static void ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
+                            size_t max_size)
+{
+    BlockRAMRegistrar *r = container_of(n, BlockRAMRegistrar, notifier);
+    blk_register_buf(r->blk, host, max_size);
+}
+
+static void ram_block_removed(RAMBlockNotifier *n, void *host, size_t size,
+                              size_t max_size)
+{
+    BlockRAMRegistrar *r = container_of(n, BlockRAMRegistrar, notifier);
+    blk_unregister_buf(r->blk, host, max_size);
+}
+
+void blk_ram_registrar_init(BlockRAMRegistrar *r, BlockBackend *blk)
+{
+    r->blk = blk;
+    r->notifier = (RAMBlockNotifier){
+        .ram_block_added = ram_block_added,
+        .ram_block_removed = ram_block_removed,
+    };
+
+    ram_block_notifier_add(&r->notifier);
+}
+
+void blk_ram_registrar_destroy(BlockRAMRegistrar *r)
+{
+    ram_block_notifier_remove(&r->notifier);
+}
diff --git a/block/meson.build b/block/meson.build
index 787667384a..b315593054 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -46,6 +46,7 @@ block_ss.add(files(
 ), zstd, zlib, gnutls)
 
 softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
+softmmu_ss.add(files('block-ram-registrar.c'))
 
 if get_option('qcow1').allowed()
   block_ss.add(files('qcow.c'))
-- 
2.36.1

From nobody Mon Feb 9 03:31:27 2026
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain 
of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1657254402; cv=none; d=zohomail.com; s=zohoarc; b=hYcTNZHEaWlhI45h9pQUpa9cjVBHQU/ZSro4XUKXKL8JopMdwvvNjg9vt+/x2Y8/8REUxBdipf0bwYy5S0tbk7yOpSm/IDub0YCI97NUFZo+9Zp0svXtUIT6NPWFlWARmS/ZRADljx7KeW8+JKa+vml+Qigv05DT2bO1XXJC19c= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1657254402; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=W23/TQNesWgtF+AvyiYshwh897niXnOhB3RFFC8PjA8=; b=UNNUmMrrGKMfGZ+IL50Vwl/msBSlUcnS7AacQP90XrMviZtGE4B9FQU5TTdrwfsi6h0E9McDlgaSRXKusVYjy94U52HesIvMFUdB1oZUomsqIWbvJqEDNJ7d6MwtOt4gl5l3dZN/Mj7zmIps5nYnmHDLa7IPmo9dsZ/vaes/jtQ= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1657254402746980.0775982254827; Thu, 7 Jul 2022 21:26:42 -0700 (PDT) Received: from localhost ([::1]:40594 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o9fZh-0006Zx-8K for importer@patchew.org; Fri, 08 Jul 2022 00:26:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35576) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSu-0003oS-8s for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:40 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:52630) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSs-0000Nr-8L 
for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:39 -0400 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-90-qBOOxxoQMhWThCLp8JlcJQ-1; Fri, 08 Jul 2022 00:19:33 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DEB008316FB; Fri, 8 Jul 2022 04:19:32 +0000 (UTC) Received: from localhost (unknown [10.39.192.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id 76EBB1121315; Fri, 8 Jul 2022 04:19:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1657253977; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=W23/TQNesWgtF+AvyiYshwh897niXnOhB3RFFC8PjA8=; b=iLQ3inbMjR1UZaZIZIP1vEincJQmeDoJh3PT8cIbRi7CPVxHP3CvSnuRSSVIrFkijgHipi EhDjnodcINxo3KRibfGqNNJHFhNG6RvyInU6W09nMCkyBv7N7Tb8a7lZL+uOKekCn1g+IZ GAqdJXefuo5NMrNcnrA0jbpYiFmEldk= X-MC-Unique: qBOOxxoQMhWThCLp8JlcJQ-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Alberto Faria , Stefan Hajnoczi , Vladimir Sementsov-Ogievskiy , "Michael S. 
Tsirkin" , Paolo Bonzini , Laurent Vivier , Eric Blake , sgarzare@redhat.com, Marcel Apfelbaum , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , qemu-block@nongnu.org, Eduardo Habkost , Vladimir Sementsov-Ogievskiy , John Snow , Thomas Huth , Kevin Wolf , Markus Armbruster , Hanna Reitz , Fam Zheng , Yanan Wang Subject: [RFC v3 6/8] stubs: add memory_region_from_host() and memory_region_get_fd() Date: Fri, 8 Jul 2022 05:17:35 +0100 Message-Id: <20220708041737.1768521-7-stefanha@redhat.com> In-Reply-To: <20220708041737.1768521-1-stefanha@redhat.com> References: <20220708041737.1768521-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -21 X-Spam_score: -2.2 X-Spam_bar: -- X-Spam_report: (-2.2 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1657254404483100002 Content-Type: text/plain; charset="utf-8" The blkio block driver will need to look up the file descriptor for a given pointer. This is possible in softmmu builds where the memory API is available for querying guest RAM. 
Add stubs so tools like qemu-img that link the block layer still build
successfully. In this case there is no guest RAM but that is fine.
Bounce buffers and their file descriptors will be allocated with
libblkio's blkio_alloc_mem_region() so we won't rely on QEMU's
memory_region_get_fd() in that case.

Signed-off-by: Stefan Hajnoczi
Reviewed-by: Hanna Reitz
---
 stubs/memory.c    | 13 +++++++++++++
 stubs/meson.build |  1 +
 2 files changed, 14 insertions(+)
 create mode 100644 stubs/memory.c

diff --git a/stubs/memory.c b/stubs/memory.c
new file mode 100644
index 0000000000..e9ec4e384b
--- /dev/null
+++ b/stubs/memory.c
@@ -0,0 +1,13 @@
+#include "qemu/osdep.h"
+#include "exec/memory.h"
+
+MemoryRegion *memory_region_from_host(void *host, ram_addr_t *offset)
+{
+    return NULL;
+}
+
+int memory_region_get_fd(MemoryRegion *mr)
+{
+    return -1;
+}
+
diff --git a/stubs/meson.build b/stubs/meson.build
index d8f3fd5c44..fbd3dfa7b4 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -25,6 +25,7 @@ stub_ss.add(files('is-daemonized.c'))
 if libaio.found()
   stub_ss.add(files('linux-aio.c'))
 endif
+stub_ss.add(files('memory.c'))
 stub_ss.add(files('migr-blocker.c'))
 stub_ss.add(files('module-opts.c'))
 stub_ss.add(files('monitor.c'))
-- 
2.36.1

From nobody Mon Feb 9 03:31:27 2026
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com
ARC-Seal: i=1; a=rsa-sha256; t=1657254403; cv=none; d=zohomail.com; s=zohoarc; b=IlPYVN1q1x1AwH2Em9SdnhYwVrCthlCTd2tUYhZKF+5LIBbmGjMpMlbT+51Jea1kNSTbotzIHFLASFd7+wEvIndbO/nHnEBdIcwVkKJTV69Uw7XF2FKrlmgN6u5fKwNdbxVJvJtuuoAL4K1VLOGuOqVZ4zlY9uhDSTA7csaXnCA=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1657254403; 
h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=JROyeDCuqRPYheOCOIiVjDQFrHQ+HPgpGEYy8Gg1Q3A=; b=JSL3BcVM9XGj3NBKX9asWDrxpO2284z2j0mXXLaSZquKPSqyqPo0zYcUpR2Mgz8X3XEHw2F1XgjBUh0/UAkD4derJRpVYrF/dIMpS0+mixPK45WhE5G8mljt2q7C5p/HZIoIjyq8jdVez/FAousrYRjXdTYY3BiWJdude6xAo6Q= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 165725440327914.972679741265551; Thu, 7 Jul 2022 21:26:43 -0700 (PDT) Received: from localhost ([::1]:40566 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o9fZi-0006Yw-2Z for importer@patchew.org; Fri, 08 Jul 2022 00:26:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35646) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSy-0003vL-MH for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:45 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:54160) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSw-0000OT-Jc for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:44 -0400 Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-93-YmTOPGAYMb2jdRK9_KjDBA-1; Fri, 08 Jul 2022 00:19:35 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by 
mimecast-mx02.redhat.com (Postfix) with ESMTPS id A49603C0E22A; Fri, 8 Jul 2022 04:19:34 +0000 (UTC) Received: from localhost (unknown [10.39.192.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3AB48C23DBF; Fri, 8 Jul 2022 04:19:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1657253981; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JROyeDCuqRPYheOCOIiVjDQFrHQ+HPgpGEYy8Gg1Q3A=; b=CHxf6NC7xm991b+4mujU4xObyMmPfcnqyrIZKX+3gnzBnEl/s20jxJhqzLdd1C5m3+aB4D hwvKyLPDVTawF2EBSoXRAl1yt2TFVBsk9oz6vEhHnTTV2Dl6/hvSgoJHRqkXKZ1imE+mFe uE881BcnExnhRsucI+U58CCL0EN5ZXY= X-MC-Unique: YmTOPGAYMb2jdRK9_KjDBA-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Alberto Faria , Stefan Hajnoczi , Vladimir Sementsov-Ogievskiy , "Michael S. Tsirkin" , Paolo Bonzini , Laurent Vivier , Eric Blake , sgarzare@redhat.com, Marcel Apfelbaum , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , qemu-block@nongnu.org, Eduardo Habkost , Vladimir Sementsov-Ogievskiy , John Snow , Thomas Huth , Kevin Wolf , Markus Armbruster , Hanna Reitz , Fam Zheng , Yanan Wang Subject: [RFC v3 7/8] blkio: implement BDRV_REQ_REGISTERED_BUF optimization Date: Fri, 8 Jul 2022 05:17:36 +0100 Message-Id: <20220708041737.1768521-8-stefanha@redhat.com> In-Reply-To: <20220708041737.1768521-1-stefanha@redhat.com> References: <20220708041737.1768521-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.85 on 10.11.54.8 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com 
X-Spam_score_int: -28
X-Spam_score: -2.9
X-Spam_bar: --
X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: pass (identity @redhat.com)
X-ZM-MESSAGEID: 1657254404537100005
Content-Type: text/plain; charset="utf-8"

Avoid bounce buffers when QEMUIOVector elements are within previously
registered bdrv_register_buf() buffers.

The idea is that emulated storage controllers will register guest RAM
using bdrv_register_buf() and set the BDRV_REQ_REGISTERED_BUF flag on I/O
requests. Therefore no blkio_map_mem_region() calls are necessary in the
performance-critical I/O code path.

This optimization doesn't apply if the I/O buffer is internally
allocated by QEMU (e.g. qcow2 metadata). There we still take the slow
path because BDRV_REQ_REGISTERED_BUF is not set.

Signed-off-by: Stefan Hajnoczi
---
 block/blkio.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 101 insertions(+), 3 deletions(-)

diff --git a/block/blkio.c b/block/blkio.c
index 7fbdbd7fae..37d593a20c 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -1,7 +1,9 @@
 #include "qemu/osdep.h"
 #include <blkio.h>
 #include "block/block_int.h"
+#include "exec/memory.h"
 #include "qapi/error.h"
+#include "qemu/error-report.h"
 #include "qapi/qmp/qdict.h"
 #include "qemu/module.h"
 
@@ -28,6 +30,9 @@ typedef struct {
 
     /* Can we skip adding/deleting blkio_mem_regions? */
     bool needs_mem_regions;
+
+    /* Are file descriptors necessary for blkio_mem_regions? 
*/
+    bool needs_mem_region_fd;
 } BDRVBlkioState;
 
 static void blkio_aiocb_complete(BlkioAIOCB *acb, int ret)
@@ -198,6 +203,8 @@ static BlockAIOCB *blkio_aio_preadv(BlockDriverState *bs, int64_t offset,
         BlockCompletionFunc *cb, void *opaque)
 {
     BDRVBlkioState *s = bs->opaque;
+    bool needs_mem_regions =
+        s->needs_mem_regions && !(flags & BDRV_REQ_REGISTERED_BUF);
     struct iovec *iov = qiov->iov;
     int iovcnt = qiov->niov;
     BlkioAIOCB *acb;
@@ -206,7 +213,7 @@ static BlockAIOCB *blkio_aio_preadv(BlockDriverState *bs, int64_t offset,
 
     acb = blkio_aiocb_get(bs, cb, opaque);
 
-    if (s->needs_mem_regions) {
+    if (needs_mem_regions) {
         if (blkio_aiocb_init_mem_region_locked(acb, bytes) < 0) {
             qemu_aio_unref(&acb->common);
             return NULL;
@@ -230,6 +237,8 @@ static BlockAIOCB *blkio_aio_pwritev(BlockDriverState *bs, int64_t offset,
 {
     uint32_t blkio_flags = (flags & BDRV_REQ_FUA) ? BLKIO_REQ_FUA : 0;
     BDRVBlkioState *s = bs->opaque;
+    bool needs_mem_regions =
+        s->needs_mem_regions && !(flags & BDRV_REQ_REGISTERED_BUF);
     struct iovec *iov = qiov->iov;
     int iovcnt = qiov->niov;
     BlkioAIOCB *acb;
@@ -238,7 +247,7 @@ static BlockAIOCB *blkio_aio_pwritev(BlockDriverState *bs, int64_t offset,
 
     acb = blkio_aiocb_get(bs, cb, opaque);
 
-    if (s->needs_mem_regions) {
+    if (needs_mem_regions) {
         if (blkio_aiocb_init_mem_region_locked(acb, bytes) < 0) {
             qemu_aio_unref(&acb->common);
             return NULL;
@@ -324,6 +333,80 @@ static void blkio_io_unplug(BlockDriverState *bs)
     }
 }
 
+static void blkio_register_buf(BlockDriverState *bs, void *host, size_t size)
+{
+    BDRVBlkioState *s = bs->opaque;
+    int ret;
+    struct blkio_mem_region region = (struct blkio_mem_region){
+        .addr = host,
+        .len = size,
+        .fd = -1,
+    };
+
+    if (((uintptr_t)host | size) % s->mem_region_alignment) {
+        error_report_once("%s: skipping unaligned buf %p with size %zu",
+                          __func__, host, size);
+        return; /* skip unaligned */
+    }
+
+    /* Attempt to find the fd for a MemoryRegion */
+
    if (s->needs_mem_region_fd) {
+        int fd = -1;
+        ram_addr_t offset;
+        MemoryRegion *mr;
+
+        /*
+         * bdrv_register_buf() is called with the BQL held so mr lives at least
+         * until this function returns.
+         */
+        mr = memory_region_from_host(host, &offset);
+        if (mr) {
+            fd = memory_region_get_fd(mr);
+        }
+        if (fd == -1) {
+            error_report_once("%s: skipping fd-less buf %p with size %zu",
+                              __func__, host, size);
+            return; /* skip if there is no fd */
+        }
+
+        region.fd = fd;
+        region.fd_offset = offset;
+    }
+
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        ret = blkio_map_mem_region(s->blkio, &region);
+    }
+
+    if (ret < 0) {
+        error_report_once("Failed to add blkio mem region %p with size %zu: %s",
+                          host, size, blkio_get_error_msg());
+    }
+}
+
+static void blkio_unregister_buf(BlockDriverState *bs, void *host, size_t size)
+{
+    BDRVBlkioState *s = bs->opaque;
+    int ret;
+    struct blkio_mem_region region = (struct blkio_mem_region){
+        .addr = host,
+        .len = size,
+        .fd = -1,
+    };
+
+    if (((uintptr_t)host | size) % s->mem_region_alignment) {
+        return; /* skip unaligned */
+    }
+
+    WITH_QEMU_LOCK_GUARD(&s->lock) {
+        ret = blkio_unmap_mem_region(s->blkio, &region);
+    }
+
+    if (ret < 0) {
+        error_report_once("Failed to delete blkio mem region %p with size %zu: %s",
+                          host, size, blkio_get_error_msg());
+    }
+}
+
 static void blkio_parse_filename_io_uring(const char *filename, QDict *options,
                                           Error **errp)
 {
@@ -440,6 +523,17 @@ static int blkio_file_open(BlockDriverState *bs, QDict *options, int flags,
         return ret;
     }
 
+    ret = blkio_get_bool(s->blkio,
+                         "needs-mem-region-fd",
+                         &s->needs_mem_region_fd);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "failed to get needs-mem-region-fd: %s",
+                         blkio_get_error_msg());
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
     ret = blkio_get_uint64(s->blkio,
                            "mem-region-alignment",
                            &s->mem_region_alignment);
@@ -459,7 +553,7 @@ static int blkio_file_open(BlockDriverState *bs, QDict *options, int flags,
         return ret;
     }
 
-    bs->supported_write_flags = BDRV_REQ_FUA;
+    bs->supported_write_flags = BDRV_REQ_FUA | BDRV_REQ_REGISTERED_BUF;
     bs->supported_zero_flags = BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP |
                                BDRV_REQ_NO_FALLBACK;
 
@@ -627,6 +721,8 @@ static BlockDriver bdrv_io_uring = {
     .bdrv_co_pwrite_zeroes = blkio_co_pwrite_zeroes,
     .bdrv_io_unplug = blkio_io_unplug,
     .bdrv_refresh_limits = blkio_refresh_limits,
+    .bdrv_register_buf = blkio_register_buf,
+    .bdrv_unregister_buf = blkio_unregister_buf,
 };
 
 static BlockDriver bdrv_virtio_blk_vhost_vdpa = {
@@ -648,6 +744,8 @@ static BlockDriver bdrv_virtio_blk_vhost_vdpa = {
     .bdrv_co_pwrite_zeroes = blkio_co_pwrite_zeroes,
     .bdrv_io_unplug = blkio_io_unplug,
     .bdrv_refresh_limits = blkio_refresh_limits,
+    .bdrv_register_buf = blkio_register_buf,
+    .bdrv_unregister_buf = blkio_unregister_buf,
 };
 
 static void bdrv_blkio_init(void)
-- 
2.36.1

From nobody Mon Feb 9 03:31:27 2026
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com
ARC-Seal: i=1; a=rsa-sha256; t=1657254402; cv=none; d=zohomail.com; s=zohoarc; b=ZQ6Ln23KV8a7U/YE9JVNLNNT1LncxSUXOlcYkyK3QSCsemcsy2VQcIFu77bJzQJLp9RRQoOzu7sVgOrKv35z3mCZY7kHpkkW9JzWTUtLDbwjFJ1X8zVY7SaSucZxMn+diBe5n+rIZRH/2ExXLalXbYiWqxz+sHn8vwLcUnJjofU=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1657254402; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=fX055+lti1fPP0GUE+G7NEy7xdox3yfVVq795BYrgvA=; 
b=in8g3KbxG8y0nvQVcTfF11D/Hlx14ZlclXsMLQAONKaq0l8ZEa8t66d0GzVoPPt3B+zGj3fFIvDvc1gIuwWIX2wAS9fjqPByRtKYlGF/Ohq/ueeNtYo5iDXk23oqW3l4X5Ee0iWfdRI5PgLyGaBjBsK+SrphypaB8BBUYJZu1y8= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1657254402728309.3092411490662; Thu, 7 Jul 2022 21:26:42 -0700 (PDT) Received: from localhost ([::1]:40548 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o9fZh-0006YQ-7L for importer@patchew.org; Fri, 08 Jul 2022 00:26:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:35652) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSy-0003vO-NY for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:45 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:44081) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o9fSw-0000OY-LR for qemu-devel@nongnu.org; Fri, 08 Jul 2022 00:19:44 -0400 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-297-tFV1qfa3NrSPLSKdKTV1Ag-1; Fri, 08 Jul 2022 00:19:37 -0400 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5F450801231; Fri, 8 Jul 2022 04:19:36 +0000 (UTC) Received: from localhost (unknown [10.39.192.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id D76D7492C3B; Fri, 8 Jul 2022 04:19:35 +0000 
(UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1657253982; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fX055+lti1fPP0GUE+G7NEy7xdox3yfVVq795BYrgvA=; b=Jg8QJkW5sWRy5kKaDOQ2SbxhjRWkhBMBVoSRTy5jCBAKfqP4zRwcnlHprvnADcpR3FCA3b Kd1zKqRCIP+MJJ1f7TUtYPTtVNF9LZ+SJlLXT564uLjhtrHINGspqx0TX1mP7NCYjZC4ZO p0bNpL5GW/ObnmSS9pf4RHdBup8Pf3M= X-MC-Unique: tFV1qfa3NrSPLSKdKTV1Ag-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Alberto Faria , Stefan Hajnoczi , Vladimir Sementsov-Ogievskiy , "Michael S. Tsirkin" , Paolo Bonzini , Laurent Vivier , Eric Blake , sgarzare@redhat.com, Marcel Apfelbaum , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , qemu-block@nongnu.org, Eduardo Habkost , Vladimir Sementsov-Ogievskiy , John Snow , Thomas Huth , Kevin Wolf , Markus Armbruster , Hanna Reitz , Fam Zheng , Yanan Wang Subject: [RFC v3 8/8] virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint Date: Fri, 8 Jul 2022 05:17:37 +0100 Message-Id: <20220708041737.1768521-9-stefanha@redhat.com> In-Reply-To: <20220708041737.1768521-1-stefanha@redhat.com> References: <20220708041737.1768521-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.85 on 10.11.54.9 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, 
SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: pass (identity @redhat.com)
X-ZM-MESSAGEID: 1657254404453100001
Content-Type: text/plain; charset="utf-8"

Register guest RAM using BlockRAMRegistrar and set the
BDRV_REQ_REGISTERED_BUF flag so block drivers can optimize memory
accesses in I/O requests.

This is for vdpa-blk, vhost-user-blk, and other I/O interfaces that rely
on DMA mapping/unmapping.

Signed-off-by: Stefan Hajnoczi
---
 include/hw/virtio/virtio-blk.h |  2 ++
 hw/block/virtio-blk.c          | 13 +++++++++----
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index d311c57cca..7f589b4146 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -19,6 +19,7 @@
 #include "hw/block/block.h"
 #include "sysemu/iothread.h"
 #include "sysemu/block-backend.h"
+#include "sysemu/block-ram-registrar.h"
 #include "qom/object.h"
 
 #define TYPE_VIRTIO_BLK "virtio-blk-device"
@@ -64,6 +65,7 @@ struct VirtIOBlock {
     struct VirtIOBlockDataPlane *dataplane;
     uint64_t host_features;
     size_t config_size;
+    BlockRAMRegistrar blk_ram_registrar;
 };
 
 typedef struct VirtIOBlockReq {
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index e9ba752f6b..41f8c73453 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -21,6 +21,7 @@
 #include "hw/block/block.h"
 #include "hw/qdev-properties.h"
 #include "sysemu/blockdev.h"
+#include "sysemu/block-ram-registrar.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/runstate.h"
 #include "hw/virtio/virtio-blk.h"
@@ -421,11 +422,13 @@ static inline void submit_requests(BlockBackend *blk, MultiReqBuffer *mrb,
     }
 
     if 
(is_write) {
-        blk_aio_pwritev(blk, sector_num << BDRV_SECTOR_BITS, qiov, 0,
-                        virtio_blk_rw_complete, mrb->reqs[start]);
+        blk_aio_pwritev(blk, sector_num << BDRV_SECTOR_BITS, qiov,
+                        BDRV_REQ_REGISTERED_BUF, virtio_blk_rw_complete,
+                        mrb->reqs[start]);
     } else {
-        blk_aio_preadv(blk, sector_num << BDRV_SECTOR_BITS, qiov, 0,
-                       virtio_blk_rw_complete, mrb->reqs[start]);
+        blk_aio_preadv(blk, sector_num << BDRV_SECTOR_BITS, qiov,
+                       BDRV_REQ_REGISTERED_BUF, virtio_blk_rw_complete,
+                       mrb->reqs[start]);
     }
 }
 
@@ -1227,6 +1230,7 @@ static void virtio_blk_device_realize(DeviceState *dev, Error **errp)
     }
 
     s->change = qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s);
+    blk_ram_registrar_init(&s->blk_ram_registrar, s->blk);
     blk_set_dev_ops(s->blk, &virtio_block_ops, s);
 
     blk_iostatus_enable(s->blk);
@@ -1252,6 +1256,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev)
         virtio_del_queue(vdev, i);
     }
     qemu_coroutine_dec_pool_size(conf->num_queues * conf->queue_size / 2);
+    blk_ram_registrar_destroy(&s->blk_ram_registrar);
     qemu_del_vm_change_state_handler(s->change);
     blockdev_mark_auto_del(s->blk);
     virtio_cleanup(vdev);
-- 
2.36.1
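
Illustration (not part of the series): the registrar added in patch 5/8 and
used here embeds a notifier inside a larger struct and recovers the owner with
container_of() inside the callbacks. The sketch below mimics that shape outside
QEMU so it can be compiled on its own. Notifier, Backend, Registrar and
registrar_init() are hypothetical stand-ins for RAMBlockNotifier, BlockBackend,
BlockRAMRegistrar and blk_ram_registrar_init(); the real code calls
blk_register_buf()/blk_unregister_buf() and hooks into the RAM block notifier
list, which this sketch does not attempt to model. As in the patch, the
callbacks act on max_size rather than size.

```c
#include <assert.h>
#include <stddef.h>

/* Recover the enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for RAMBlockNotifier: a pair of callbacks. */
typedef struct Notifier Notifier;
struct Notifier {
    void (*added)(Notifier *n, void *host, size_t size, size_t max_size);
    void (*removed)(Notifier *n, void *host, size_t size, size_t max_size);
};

/* Hypothetical stand-in for BlockBackend: only counts what is registered. */
typedef struct {
    size_t registered_bytes;
    int nregions;
} Backend;

/* Same layout idea as BlockRAMRegistrar: backend pointer plus notifier. */
typedef struct {
    Backend *blk;
    Notifier notifier;
} Registrar;

static void on_added(Notifier *n, void *host, size_t size, size_t max_size)
{
    Registrar *r = container_of(n, Registrar, notifier);

    (void)host;
    (void)size;
    /* Account for the whole max_size range, mirroring the patch. */
    r->blk->registered_bytes += max_size;
    r->blk->nregions++;
}

static void on_removed(Notifier *n, void *host, size_t size, size_t max_size)
{
    Registrar *r = container_of(n, Registrar, notifier);

    (void)host;
    (void)size;
    r->blk->registered_bytes -= max_size;
    r->blk->nregions--;
}

/* Wire the callbacks up; a real registrar would also join a notifier list. */
void registrar_init(Registrar *r, Backend *blk)
{
    assert(r != NULL && blk != NULL);
    r->blk = blk;
    r->notifier = (Notifier){
        .added = on_added,
        .removed = on_removed,
    };
}
```

A caller only ever hands &r->notifier to whoever fires the events, which is
why the callbacks need container_of() to get back to the Registrar and its
backend.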