From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Fam Zheng, Markus Armbruster, Stefan Hajnoczi, Laurent Vivier,
 Marcel Apfelbaum, "Michael S. Tsirkin", Wen Congyang, Kevin Wolf,
 Philippe Mathieu-Daudé, Richard Henderson, David Hildenbrand,
 Hanna Reitz, Xie Changlong, Eduardo Habkost, qemu-block@nongnu.org,
 Eric Blake, John Snow, afaria@redhat.com, Jeff Cody, Yanan Wang,
 Paolo Bonzini, Raphael Norwitz, sgarzare@redhat.com,
 integration@gluster.org, Peter Xu, "Richard W.M. Jones", Thomas Huth,
 Vladimir Sementsov-Ogievskiy, "Denis V. Lunev"
Subject: [PATCH v7 01/13] coroutine: add flag to re-queue at front of CoQueue
Date: Thu, 13 Oct 2022 14:58:56 -0400
Message-Id: <20221013185908.1297568-2-stefanha@redhat.com>
In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com>
References: <20221013185908.1297568-1-stefanha@redhat.com>

When a coroutine wakes up it may determine that it must re-queue.
Normally coroutines are pushed onto the back of the CoQueue, but for
fairness it may be necessary to push it onto the front of the CoQueue.

Add a flag to specify that the coroutine should be pushed onto the front
of the CoQueue. A later patch will use this to ensure fairness in the
bounce buffer CoQueue used by the blkio BlockDriver.

Signed-off-by: Stefan Hajnoczi
---
 include/qemu/coroutine.h   | 15 +++++++++++++--
 util/qemu-coroutine-lock.c |  9 +++++++--
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index aae33cce17..608fe45dcf 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -198,14 +198,25 @@ typedef struct CoQueue {
  */
 void qemu_co_queue_init(CoQueue *queue);
 
+typedef enum {
+    /*
+     * Enqueue at front instead of back. Use this to re-queue a request when
+     * its wait condition is not satisfied after being woken up.
+     */
+    CO_QUEUE_WAIT_FRONT = 0x1,
+} CoQueueWaitFlags;
+
 /**
  * Adds the current coroutine to the CoQueue and transfers control to the
  * caller of the coroutine. The mutex is unlocked during the wait and
  * locked again afterwards.
  */
 #define qemu_co_queue_wait(queue, lock) \
-    qemu_co_queue_wait_impl(queue, QEMU_MAKE_LOCKABLE(lock))
-void coroutine_fn qemu_co_queue_wait_impl(CoQueue *queue, QemuLockable *lock);
+    qemu_co_queue_wait_impl(queue, QEMU_MAKE_LOCKABLE(lock), 0)
+#define qemu_co_queue_wait_flags(queue, lock, flags) \
+    qemu_co_queue_wait_impl(queue, QEMU_MAKE_LOCKABLE(lock), (flags))
+void coroutine_fn qemu_co_queue_wait_impl(CoQueue *queue, QemuLockable *lock,
+                                          CoQueueWaitFlags flags);
 
 /**
  * Removes the next coroutine from the CoQueue, and queue it to run after
diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
index 15c82d9348..45c6b57374 100644
--- a/util/qemu-coroutine-lock.c
+++ b/util/qemu-coroutine-lock.c
@@ -39,10 +39,15 @@ void qemu_co_queue_init(CoQueue *queue)
     QSIMPLEQ_INIT(&queue->entries);
 }
 
-void coroutine_fn qemu_co_queue_wait_impl(CoQueue *queue, QemuLockable *lock)
+void coroutine_fn qemu_co_queue_wait_impl(CoQueue *queue, QemuLockable *lock,
+                                          CoQueueWaitFlags flags)
 {
     Coroutine *self = qemu_coroutine_self();
-    QSIMPLEQ_INSERT_TAIL(&queue->entries, self, co_queue_next);
+    if (flags & CO_QUEUE_WAIT_FRONT) {
+        QSIMPLEQ_INSERT_HEAD(&queue->entries, self, co_queue_next);
+    } else {
+        QSIMPLEQ_INSERT_TAIL(&queue->entries, self, co_queue_next);
+    }
 
     if (lock) {
         qemu_lockable_unlock(lock);
-- 
2.37.3

From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Fam Zheng, Markus Armbruster, Stefan Hajnoczi, Laurent Vivier,
 Marcel Apfelbaum, "Michael S. Tsirkin", Wen Congyang, Kevin Wolf,
 Philippe Mathieu-Daudé, Richard Henderson, David Hildenbrand,
 Hanna Reitz, Xie Changlong, Eduardo Habkost, qemu-block@nongnu.org,
 Eric Blake, John Snow, afaria@redhat.com, Jeff Cody, Yanan Wang,
 Paolo Bonzini, Raphael Norwitz, sgarzare@redhat.com,
 integration@gluster.org, Peter Xu, "Richard W.M. Jones", Thomas Huth,
 Vladimir Sementsov-Ogievskiy, "Denis V. Lunev"
Subject: [PATCH v7 02/13] blkio: add libblkio block driver
Date: Thu, 13 Oct 2022 14:58:57 -0400
Message-Id: <20221013185908.1297568-3-stefanha@redhat.com>
In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com>
References: <20221013185908.1297568-1-stefanha@redhat.com>

libblkio (https://gitlab.com/libblkio/libblkio/) is a library for
high-performance disk I/O. It currently supports io_uring,
virtio-blk-vhost-user, and virtio-blk-vhost-vdpa with additional drivers
under development.

One of the reasons for developing libblkio is that other applications
besides QEMU can use it. This will be particularly useful for
virtio-blk-vhost-user which applications may wish to use for connecting
to qemu-storage-daemon.

libblkio also gives us an opportunity to develop in Rust behind a C API
that is easy to consume from QEMU.
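
For context on the CoQueue flag from patch 01 that this driver's bounce buffer queue relies on, the fairness argument can be sketched in isolation. The following is a standalone model in plain C, not QEMU code: `WaitQueue`, `push_back`, `push_front`, `pop_front` and `next_after_requeue` are hypothetical names made up for this illustration, and waiters are represented by integer ids rather than coroutines.

```c
#include <assert.h>
#include <string.h>

#define MAX_WAITERS 8

/* Hypothetical stand-in for a CoQueue: a small array-backed queue of ids */
typedef struct {
    int ids[MAX_WAITERS];
    int len;
} WaitQueue;

/* Models the default behaviour: join the back of the queue */
static void push_back(WaitQueue *q, int id)
{
    assert(q->len < MAX_WAITERS);
    q->ids[q->len++] = id;
}

/* Models CO_QUEUE_WAIT_FRONT: join the front of the queue */
static void push_front(WaitQueue *q, int id)
{
    assert(q->len < MAX_WAITERS);
    memmove(&q->ids[1], &q->ids[0], q->len * sizeof(q->ids[0]));
    q->ids[0] = id;
    q->len++;
}

/* Models waking the next waiter */
static int pop_front(WaitQueue *q)
{
    assert(q->len > 0);
    int id = q->ids[0];
    q->len--;
    memmove(&q->ids[0], &q->ids[1], q->len * sizeof(q->ids[0]));
    return id;
}

/*
 * Waiters 1, 2, 3 queue up in that order. Waiter 1 is woken, finds its
 * wait condition still unsatisfied and re-queues. Returns the id of the
 * waiter that would be woken next.
 */
static int next_after_requeue(int requeue_at_front)
{
    WaitQueue q = { .len = 0 };

    push_back(&q, 1);
    push_back(&q, 2);
    push_back(&q, 3);

    int woken = pop_front(&q);
    if (requeue_at_front) {
        push_front(&q, woken);
    } else {
        push_back(&q, woken);
    }
    return q.ids[0];
}
```

In this model `next_after_requeue(0)` yields waiter 2, meaning waiter 1 lost its place after a wakeup that did not help it, while `next_after_requeue(1)` yields waiter 1 again, which is the behaviour the new flag is meant to allow.
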
This commit adds io_uring, nvme-io_uring, virtio-blk-vhost-user, and
virtio-blk-vhost-vdpa BlockDrivers to QEMU using libblkio. It will be
easy to add other libblkio drivers since they will share the majority of
code.

For now I/O buffers are copied through bounce buffers if the libblkio
driver requires it. Later commits add an optimization for
pre-registering guest RAM to avoid bounce buffers.

The syntax is:

  --blockdev io_uring,node-name=drive0,filename=test.img,readonly=on|off,cache.direct=on|off

  --blockdev nvme-io_uring,node-name=drive0,filename=/dev/ng0n1,readonly=on|off,cache.direct=on

  --blockdev virtio-blk-vhost-vdpa,node-name=drive0,path=/dev/vdpa...,readonly=on|off,cache.direct=on

  --blockdev virtio-blk-vhost-user,node-name=drive0,path=vhost-user-blk.sock,readonly=on|off,cache.direct=on

Signed-off-by: Stefan Hajnoczi
Acked-by: Markus Armbruster
Reviewed-by: Stefano Garzarella
---
 MAINTAINERS                   |   6 +
 meson_options.txt             |   2 +
 qapi/block-core.json          |  77 +++-
 meson.build                   |   9 +
 block/blkio.c                 | 831 ++++++++++++++++++++++++++++++++++
 tests/qtest/modules-test.c    |   3 +
 block/meson.build             |   1 +
 scripts/meson-buildoptions.sh |   3 +
 8 files changed, 928 insertions(+), 4 deletions(-)
 create mode 100644 block/blkio.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 8ae2e43c83..da30a8b93f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3415,6 +3415,12 @@ L: qemu-block@nongnu.org
 S: Maintained
 F: block/vdi.c
 
+blkio
+M: Stefan Hajnoczi
+L: qemu-block@nongnu.org
+S: Maintained
+F: block/blkio.c
+
 iSCSI
 M: Ronnie Sahlberg
 M: Paolo Bonzini
diff --git a/meson_options.txt b/meson_options.txt
index 79c6af18d5..66128178bf 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -117,6 +117,8 @@ option('bzip2', type : 'feature', value : 'auto',
        description: 'bzip2 support for DMG images')
 option('cap_ng', type : 'feature', value : 'auto',
        description: 'cap_ng support')
+option('blkio', type : 'feature', value : 'auto',
+       description: 'libblkio block device driver')
 option('bpf', type : 'feature', value : 'auto',
        description: 'eBPF support')
 option('cocoa', type : 'feature', value : 'auto',
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 882b266532..cb5079e645 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2951,11 +2951,18 @@
             'file', 'snapshot-access', 'ftp', 'ftps', 'gluster',
             {'name': 'host_cdrom', 'if': 'HAVE_HOST_BLOCK_DEVICE' },
             {'name': 'host_device', 'if': 'HAVE_HOST_BLOCK_DEVICE' },
-            'http', 'https', 'iscsi',
-            'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels',
-            'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
+            'http', 'https',
+            { 'name': 'io_uring', 'if': 'CONFIG_BLKIO' },
+            'iscsi',
+            'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme',
+            { 'name': 'nvme-io_uring', 'if': 'CONFIG_BLKIO' },
+            'parallels', 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum',
+            'raw', 'rbd',
             { 'name': 'replication', 'if': 'CONFIG_REPLICATION' },
-            'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] }
+            'ssh', 'throttle', 'vdi', 'vhdx',
+            { 'name': 'virtio-blk-vhost-user', 'if': 'CONFIG_BLKIO' },
+            { 'name': 'virtio-blk-vhost-vdpa', 'if': 'CONFIG_BLKIO' },
+            'vmdk', 'vpc', 'vvfat' ] }
 
 ##
 # @BlockdevOptionsFile:
@@ -3678,6 +3685,58 @@
             '*debug': 'int',
             '*logfile': 'str' } }
 
+##
+# @BlockdevOptionsIoUring:
+#
+# Driver specific block device options for the io_uring backend.
+#
+# @filename: path to the image file
+#
+# Since: 7.2
+##
+{ 'struct': 'BlockdevOptionsIoUring',
+  'data': { 'filename': 'str' },
+  'if': 'CONFIG_BLKIO' }
+
+##
+# @BlockdevOptionsNvmeIoUring:
+#
+# Driver specific block device options for the nvme-io_uring backend.
+#
+# @filename: path to the image file
+#
+# Since: 7.2
+##
+{ 'struct': 'BlockdevOptionsNvmeIoUring',
+  'data': { 'filename': 'str' },
+  'if': 'CONFIG_BLKIO' }
+
+##
+# @BlockdevOptionsVirtioBlkVhostUser:
+#
+# Driver specific block device options for the virtio-blk-vhost-user backend.
+#
+# @path: path to the vhost-user UNIX domain socket.
+#
+# Since: 7.2
+##
+{ 'struct': 'BlockdevOptionsVirtioBlkVhostUser',
+  'data': { 'path': 'str' },
+  'if': 'CONFIG_BLKIO' }
+
+##
+# @BlockdevOptionsVirtioBlkVhostVdpa:
+#
+# Driver specific block device options for the virtio-blk-vhost-vdpa backend.
+#
+# @path: path to the vhost-vdpa character device.
+#
+# Since: 7.2
+##
+{ 'struct': 'BlockdevOptionsVirtioBlkVhostVdpa',
+  'data': { 'path': 'str' },
+  'if': 'CONFIG_BLKIO' }
+
 ##
 # @IscsiTransport:
 #
@@ -4305,6 +4364,8 @@
                        'if': 'HAVE_HOST_BLOCK_DEVICE' },
       'http':       'BlockdevOptionsCurlHttp',
       'https':      'BlockdevOptionsCurlHttps',
+      'io_uring':   { 'type': 'BlockdevOptionsIoUring',
+                      'if': 'CONFIG_BLKIO' },
       'iscsi':      'BlockdevOptionsIscsi',
       'luks':       'BlockdevOptionsLUKS',
       'nbd':        'BlockdevOptionsNbd',
@@ -4312,6 +4373,8 @@
       'null-aio':   'BlockdevOptionsNull',
       'null-co':    'BlockdevOptionsNull',
       'nvme':       'BlockdevOptionsNVMe',
+      'nvme-io_uring': { 'type': 'BlockdevOptionsNvmeIoUring',
+                         'if': 'CONFIG_BLKIO' },
       'parallels':  'BlockdevOptionsGenericFormat',
       'preallocate':'BlockdevOptionsPreallocate',
       'qcow2':      'BlockdevOptionsQcow2',
@@ -4327,6 +4390,12 @@
       'throttle':   'BlockdevOptionsThrottle',
       'vdi':        'BlockdevOptionsGenericFormat',
       'vhdx':       'BlockdevOptionsGenericFormat',
+      'virtio-blk-vhost-user':
+                    { 'type': 'BlockdevOptionsVirtioBlkVhostUser',
+                      'if': 'CONFIG_BLKIO' },
+      'virtio-blk-vhost-vdpa':
+                    { 'type': 'BlockdevOptionsVirtioBlkVhostVdpa',
+                      'if': 'CONFIG_BLKIO' },
       'vmdk':       'BlockdevOptionsGenericCOWFormat',
       'vpc':        'BlockdevOptionsGenericFormat',
       'vvfat':      'BlockdevOptionsVVFAT'
diff --git a/meson.build b/meson.build
index b686dfef75..1e9068adf6 100644
--- a/meson.build
+++ b/meson.build
@@ -771,6 +771,13 @@ if not get_option('virglrenderer').auto() or have_system or have_vhost_user_gpu
                      required: get_option('virglrenderer'),
                      kwargs: static_kwargs)
 endif
+blkio = not_found
+if not get_option('blkio').auto() or have_block
+  blkio = dependency('blkio',
+                     method: 'pkg-config',
+                     required: get_option('blkio'),
+                     kwargs: static_kwargs)
+endif
 curl = not_found
 if not get_option('curl').auto() or have_block
   curl = dependency('libcurl', version: '>=7.29.0',
@@ -1815,6 +1822,7 @@ config_host_data.set('CONFIG_LIBUDEV', libudev.found())
 config_host_data.set('CONFIG_LZO', lzo.found())
 config_host_data.set('CONFIG_MPATH', mpathpersist.found())
 config_host_data.set('CONFIG_MPATH_NEW_API', mpathpersist_new_api)
+config_host_data.set('CONFIG_BLKIO', blkio.found())
 config_host_data.set('CONFIG_CURL', curl.found())
 config_host_data.set('CONFIG_CURSES', curses.found())
 config_host_data.set('CONFIG_GBM', gbm.found())
@@ -3869,6 +3877,7 @@ summary_info += {'PAM': pam}
 summary_info += {'iconv support': iconv}
 summary_info += {'curses support': curses}
 summary_info += {'virgl support': virgl}
+summary_info += {'blkio support': blkio}
 summary_info += {'curl support': curl}
 summary_info += {'Multipath support': mpathpersist}
 summary_info += {'PNG support': png}
diff --git a/block/blkio.c b/block/blkio.c
new file mode 100644
index 0000000000..b0cfd74b36
--- /dev/null
+++ b/block/blkio.c
@@ -0,0 +1,831 @@
+/* SPDX-License-Identifier: LGPL-2.1-or-later */
+/*
+ * libblkio BlockDriver
+ *
+ * Copyright Red Hat, Inc.
+ *
+ * Author:
+ *   Stefan Hajnoczi
+ */
+
+#include "qemu/osdep.h"
+#include <blkio.h>
+#include "block/block_int.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
+#include "qemu/module.h"
+
+/*
+ * Keep the QEMU BlockDriver names identical to the libblkio driver names.
+ * Using macros instead of typing out the string literals avoids typos.
+ */
+#define DRIVER_IO_URING "io_uring"
+#define DRIVER_NVME_IO_URING "nvme-io_uring"
+#define DRIVER_VIRTIO_BLK_VHOST_USER "virtio-blk-vhost-user"
+#define DRIVER_VIRTIO_BLK_VHOST_VDPA "virtio-blk-vhost-vdpa"
+
+/*
+ * Allocated bounce buffers are kept in a list sorted by buffer address.
+ */
+typedef struct BlkioBounceBuf {
+    QLIST_ENTRY(BlkioBounceBuf) next;
+
+    /* The bounce buffer */
+    struct iovec buf;
+} BlkioBounceBuf;
+
+typedef struct {
+    /*
+     * libblkio is not thread-safe so this lock protects ->blkio and
+     * ->blkioq.
+     */
+    QemuMutex blkio_lock;
+    struct blkio *blkio;
+    struct blkioq *blkioq; /* make this multi-queue in the future... */
+    int completion_fd;
+
+    /*
+     * Polling fetches the next completion into this field.
+     *
+     * No lock is necessary since only one thread calls aio_poll() and invokes
+     * fd and poll handlers.
+     */
+    struct blkio_completion poll_completion;
+
+    /*
+     * Protects ->bounce_pool, ->bounce_bufs, ->bounce_available.
+     *
+     * Lock ordering: ->bounce_lock before ->blkio_lock.
+     */
+    CoMutex bounce_lock;
+
+    /* Bounce buffer pool */
+    struct blkio_mem_region bounce_pool;
+
+    /* Sorted list of allocated bounce buffers */
+    QLIST_HEAD(, BlkioBounceBuf) bounce_bufs;
+
+    /* Queue for coroutines waiting for bounce buffer space */
+    CoQueue bounce_available;
+
+    /* The value of the "mem-region-alignment" property */
+    size_t mem_region_alignment;
+
+    /* Can we skip adding/deleting blkio_mem_regions? */
+    bool needs_mem_regions;
+} BDRVBlkioState;
+
+/* Called with s->bounce_lock held */
+static int blkio_resize_bounce_pool(BDRVBlkioState *s, int64_t bytes)
+{
+    /* There can be no allocated bounce buffers during resize */
+    assert(QLIST_EMPTY(&s->bounce_bufs));
+
+    /* Pad size to reduce frequency of resize calls */
+    bytes += 128 * 1024;
+
+    WITH_QEMU_LOCK_GUARD(&s->blkio_lock) {
+        int ret;
+
+        if (s->bounce_pool.addr) {
+            blkio_unmap_mem_region(s->blkio, &s->bounce_pool);
+            blkio_free_mem_region(s->blkio, &s->bounce_pool);
+            memset(&s->bounce_pool, 0, sizeof(s->bounce_pool));
+        }
+
+        /* Automatically freed when s->blkio is destroyed */
+        ret = blkio_alloc_mem_region(s->blkio, &s->bounce_pool, bytes);
+        if (ret < 0) {
+            return ret;
+        }
+
+        ret = blkio_map_mem_region(s->blkio, &s->bounce_pool);
+        if (ret < 0) {
+            blkio_free_mem_region(s->blkio, &s->bounce_pool);
+            memset(&s->bounce_pool, 0, sizeof(s->bounce_pool));
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+/* Called with s->bounce_lock held */
+static bool
+blkio_do_alloc_bounce_buffer(BDRVBlkioState *s, BlkioBounceBuf *bounce,
+                             int64_t bytes)
+{
+    void *addr = s->bounce_pool.addr;
+    BlkioBounceBuf *cur = NULL;
+    BlkioBounceBuf *prev = NULL;
+    ptrdiff_t space;
+
+    /*
+     * This is just a linear search over the holes between requests. An
+     * efficient allocator would be nice.
+     */
+    QLIST_FOREACH(cur, &s->bounce_bufs, next) {
+        space = cur->buf.iov_base - addr;
+        if (bytes <= space) {
+            QLIST_INSERT_BEFORE(cur, bounce, next);
+            bounce->buf.iov_base = addr;
+            bounce->buf.iov_len = bytes;
+            return true;
+        }
+
+        addr = cur->buf.iov_base + cur->buf.iov_len;
+        prev = cur;
+    }
+
+    /* Is there space after the last request? */
+    space = s->bounce_pool.addr + s->bounce_pool.len - addr;
+    if (bytes > space) {
+        return false;
+    }
+    if (prev) {
+        QLIST_INSERT_AFTER(prev, bounce, next);
+    } else {
+        QLIST_INSERT_HEAD(&s->bounce_bufs, bounce, next);
+    }
+    bounce->buf.iov_base = addr;
+    bounce->buf.iov_len = bytes;
+    return true;
+}
+
+static int coroutine_fn
+blkio_alloc_bounce_buffer(BDRVBlkioState *s, BlkioBounceBuf *bounce,
+                          int64_t bytes)
+{
+    /*
+     * Ensure fairness: first time around we join the back of the queue,
+     * subsequently we join the front so we don't lose our place.
+     */
+    CoQueueWaitFlags wait_flags = 0;
+
+    QEMU_LOCK_GUARD(&s->bounce_lock);
+
+    /* Ensure fairness: don't even try if other requests are already waiting */
+    if (!qemu_co_queue_empty(&s->bounce_available)) {
+        qemu_co_queue_wait_flags(&s->bounce_available, &s->bounce_lock,
+                                 wait_flags);
+        wait_flags = CO_QUEUE_WAIT_FRONT;
+    }
+
+    while (true) {
+        if (blkio_do_alloc_bounce_buffer(s, bounce, bytes)) {
+            /* Kick the next queued request since there may be space */
+            qemu_co_queue_next(&s->bounce_available);
+            return 0;
+        }
+
+        /*
+         * If there are no in-flight requests then the pool was simply too
+         * small.
+         */
+        if (QLIST_EMPTY(&s->bounce_bufs)) {
+            bool ok;
+            int ret;
+
+            ret = blkio_resize_bounce_pool(s, bytes);
+            if (ret < 0) {
+                /* Kick the next queued request since that may fail too */
+                qemu_co_queue_next(&s->bounce_available);
+                return ret;
+            }
+
+            ok = blkio_do_alloc_bounce_buffer(s, bounce, bytes);
+            assert(ok); /* must have space this time */
+            return 0;
+        }
+
+        qemu_co_queue_wait_flags(&s->bounce_available, &s->bounce_lock,
+                                 wait_flags);
+        wait_flags = CO_QUEUE_WAIT_FRONT;
+    }
+}
+
+static void coroutine_fn blkio_free_bounce_buffer(BDRVBlkioState *s,
+                                                  BlkioBounceBuf *bounce)
+{
+    QEMU_LOCK_GUARD(&s->bounce_lock);
+
+    QLIST_REMOVE(bounce, next);
+
+    /* Wake up waiting coroutines since space may now be available */
+    qemu_co_queue_next(&s->bounce_available);
+}
+
+/* For async to .bdrv_co_*() conversion */
+typedef struct {
+    Coroutine *coroutine;
+    int ret;
+} BlkioCoData;
+
+static void blkio_completion_fd_read(void *opaque)
+{
+    BlockDriverState *bs = opaque;
+    BDRVBlkioState *s = bs->opaque;
+    uint64_t val;
+    int ret;
+
+    /* Polling may have already fetched a completion */
+    if (s->poll_completion.user_data != NULL) {
+        BlkioCoData *cod = s->poll_completion.user_data;
+        cod->ret = s->poll_completion.ret;
+
+        /* Clear it in case aio_co_wake() enters a nested event loop */
+        s->poll_completion.user_data = NULL;
+
+        aio_co_wake(cod->coroutine);
+    }
+
+    /* Reset completion fd status */
+    ret = read(s->completion_fd, &val, sizeof(val));
+
+    /* Ignore errors, there's nothing we can do */
+    (void)ret;
+
+    /*
+     * Reading one completion at a time makes nested event loop re-entrancy
+     * simple. Change this loop to get multiple completions in one go if it
+     * becomes a performance bottleneck.
+ */ + while (true) { + struct blkio_completion completion; + + WITH_QEMU_LOCK_GUARD(&s->blkio_lock) { + ret =3D blkioq_do_io(s->blkioq, &completion, 0, 1, NULL); + } + if (ret !=3D 1) { + break; + } + + BlkioCoData *cod =3D completion.user_data; + cod->ret =3D completion.ret; + aio_co_wake(cod->coroutine); + } +} + +static bool blkio_completion_fd_poll(void *opaque) +{ + BlockDriverState *bs =3D opaque; + BDRVBlkioState *s =3D bs->opaque; + int ret; + + /* Just in case we already fetched a completion */ + if (s->poll_completion.user_data !=3D NULL) { + return true; + } + + WITH_QEMU_LOCK_GUARD(&s->blkio_lock) { + ret =3D blkioq_do_io(s->blkioq, &s->poll_completion, 0, 1, NULL); + } + return ret =3D=3D 1; +} + +static void blkio_completion_fd_poll_ready(void *opaque) +{ + blkio_completion_fd_read(opaque); +} + +static void blkio_attach_aio_context(BlockDriverState *bs, + AioContext *new_context) +{ + BDRVBlkioState *s =3D bs->opaque; + + aio_set_fd_handler(new_context, + s->completion_fd, + false, + blkio_completion_fd_read, + NULL, + blkio_completion_fd_poll, + blkio_completion_fd_poll_ready, + bs); +} + +static void blkio_detach_aio_context(BlockDriverState *bs) +{ + BDRVBlkioState *s =3D bs->opaque; + + aio_set_fd_handler(bdrv_get_aio_context(bs), + s->completion_fd, + false, NULL, NULL, NULL, NULL, NULL); +} + +/* Call with s->blkio_lock held to submit I/O after enqueuing a new reques= t */ +static void blkio_submit_io(BlockDriverState *bs) +{ + if (qatomic_read(&bs->io_plugged) =3D=3D 0) { + BDRVBlkioState *s =3D bs->opaque; + + blkioq_do_io(s->blkioq, NULL, 0, 0, NULL); + } +} + +static int coroutine_fn +blkio_co_pdiscard(BlockDriverState *bs, int64_t offset, int64_t bytes) +{ + BDRVBlkioState *s =3D bs->opaque; + BlkioCoData cod =3D { + .coroutine =3D qemu_coroutine_self(), + }; + + WITH_QEMU_LOCK_GUARD(&s->blkio_lock) { + blkioq_discard(s->blkioq, offset, bytes, &cod, 0); + blkio_submit_io(bs); + } + + qemu_coroutine_yield(); + return cod.ret; +} + +static 
int coroutine_fn +blkio_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, + QEMUIOVector *qiov, BdrvRequestFlags flags) +{ + BlkioCoData cod =3D { + .coroutine =3D qemu_coroutine_self(), + }; + BDRVBlkioState *s =3D bs->opaque; + bool use_bounce_buffer =3D s->needs_mem_regions; + BlkioBounceBuf bounce; + struct iovec *iov =3D qiov->iov; + int iovcnt =3D qiov->niov; + + if (use_bounce_buffer) { + int ret =3D blkio_alloc_bounce_buffer(s, &bounce, bytes); + if (ret < 0) { + return ret; + } + + iov =3D &bounce.buf; + iovcnt =3D 1; + } + + WITH_QEMU_LOCK_GUARD(&s->blkio_lock) { + blkioq_readv(s->blkioq, offset, iov, iovcnt, &cod, 0); + blkio_submit_io(bs); + } + + qemu_coroutine_yield(); + + if (use_bounce_buffer) { + if (cod.ret =3D=3D 0) { + qemu_iovec_from_buf(qiov, 0, + bounce.buf.iov_base, + bounce.buf.iov_len); + } + + blkio_free_bounce_buffer(s, &bounce); + } + + return cod.ret; +} + +static int coroutine_fn blkio_co_pwritev(BlockDriverState *bs, int64_t off= set, + int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags) +{ + uint32_t blkio_flags =3D (flags & BDRV_REQ_FUA) ? 
BLKIO_REQ_FUA : 0; + BlkioCoData cod =3D { + .coroutine =3D qemu_coroutine_self(), + }; + BDRVBlkioState *s =3D bs->opaque; + bool use_bounce_buffer =3D s->needs_mem_regions; + BlkioBounceBuf bounce; + struct iovec *iov =3D qiov->iov; + int iovcnt =3D qiov->niov; + + if (use_bounce_buffer) { + int ret =3D blkio_alloc_bounce_buffer(s, &bounce, bytes); + if (ret < 0) { + return ret; + } + + qemu_iovec_to_buf(qiov, 0, bounce.buf.iov_base, bytes); + iov =3D &bounce.buf; + iovcnt =3D 1; + } + + WITH_QEMU_LOCK_GUARD(&s->blkio_lock) { + blkioq_writev(s->blkioq, offset, iov, iovcnt, &cod, blkio_flags); + blkio_submit_io(bs); + } + + qemu_coroutine_yield(); + + if (use_bounce_buffer) { + blkio_free_bounce_buffer(s, &bounce); + } + + return cod.ret; +} + +static int coroutine_fn blkio_co_flush(BlockDriverState *bs) +{ + BDRVBlkioState *s =3D bs->opaque; + BlkioCoData cod =3D { + .coroutine =3D qemu_coroutine_self(), + }; + + WITH_QEMU_LOCK_GUARD(&s->blkio_lock) { + blkioq_flush(s->blkioq, &cod, 0); + blkio_submit_io(bs); + } + + qemu_coroutine_yield(); + return cod.ret; +} + +static int coroutine_fn blkio_co_pwrite_zeroes(BlockDriverState *bs, + int64_t offset, int64_t bytes, BdrvRequestFlags flags) +{ + BDRVBlkioState *s =3D bs->opaque; + BlkioCoData cod =3D { + .coroutine =3D qemu_coroutine_self(), + }; + uint32_t blkio_flags =3D 0; + + if (flags & BDRV_REQ_FUA) { + blkio_flags |=3D BLKIO_REQ_FUA; + } + if (!(flags & BDRV_REQ_MAY_UNMAP)) { + blkio_flags |=3D BLKIO_REQ_NO_UNMAP; + } + if (flags & BDRV_REQ_NO_FALLBACK) { + blkio_flags |=3D BLKIO_REQ_NO_FALLBACK; + } + + WITH_QEMU_LOCK_GUARD(&s->blkio_lock) { + blkioq_write_zeroes(s->blkioq, offset, bytes, &cod, blkio_flags); + blkio_submit_io(bs); + } + + qemu_coroutine_yield(); + return cod.ret; +} + +static void blkio_io_unplug(BlockDriverState *bs) +{ + BDRVBlkioState *s =3D bs->opaque; + + WITH_QEMU_LOCK_GUARD(&s->blkio_lock) { + blkio_submit_io(bs); + } +} + +static int blkio_io_uring_open(BlockDriverState *bs, QDict 
*options, int flags,
+                               Error **errp)
+{
+    const char *filename = qdict_get_str(options, "filename");
+    BDRVBlkioState *s = bs->opaque;
+    int ret;
+
+    ret = blkio_set_str(s->blkio, "path", filename);
+    qdict_del(options, "filename");
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to set path: %s",
+                         blkio_get_error_msg());
+        return ret;
+    }
+
+    if (flags & BDRV_O_NOCACHE) {
+        ret = blkio_set_bool(s->blkio, "direct", true);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "failed to set direct: %s",
+                             blkio_get_error_msg());
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+static int blkio_nvme_io_uring(BlockDriverState *bs, QDict *options, int flags,
+                               Error **errp)
+{
+    const char *filename = qdict_get_str(options, "filename");
+    BDRVBlkioState *s = bs->opaque;
+    int ret;
+
+    ret = blkio_set_str(s->blkio, "path", filename);
+    qdict_del(options, "filename");
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to set path: %s",
+                         blkio_get_error_msg());
+        return ret;
+    }
+
+    if (!(flags & BDRV_O_NOCACHE)) {
+        error_setg(errp, "cache.direct=off is not supported");
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static int blkio_virtio_blk_common_open(BlockDriverState *bs,
+        QDict *options, int flags, Error **errp)
+{
+    const char *path = qdict_get_try_str(options, "path");
+    BDRVBlkioState *s = bs->opaque;
+    int ret;
+
+    if (!path) {
+        error_setg(errp, "missing 'path' option");
+        return -EINVAL;
+    }
+
+    ret = blkio_set_str(s->blkio, "path", path);
+    qdict_del(options, "path");
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to set path: %s",
+                         blkio_get_error_msg());
+        return ret;
+    }
+
+    if (!(flags & BDRV_O_NOCACHE)) {
+        error_setg(errp, "cache.direct=off is not supported");
+        return -EINVAL;
+    }
+    return 0;
+}
+
+static int blkio_file_open(BlockDriverState *bs, QDict *options, int flags,
+                           Error **errp)
+{
+    const char *blkio_driver = bs->drv->protocol_name;
+    BDRVBlkioState *s = bs->opaque;
+    int ret;
+
+    ret =
blkio_create(blkio_driver, &s->blkio);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "blkio_create failed: %s",
+                         blkio_get_error_msg());
+        return ret;
+    }
+
+    if (strcmp(blkio_driver, DRIVER_IO_URING) == 0) {
+        ret = blkio_io_uring_open(bs, options, flags, errp);
+    } else if (strcmp(blkio_driver, DRIVER_NVME_IO_URING) == 0) {
+        ret = blkio_nvme_io_uring(bs, options, flags, errp);
+    } else if (strcmp(blkio_driver, DRIVER_VIRTIO_BLK_VHOST_USER) == 0) {
+        ret = blkio_virtio_blk_common_open(bs, options, flags, errp);
+    } else if (strcmp(blkio_driver, DRIVER_VIRTIO_BLK_VHOST_VDPA) == 0) {
+        ret = blkio_virtio_blk_common_open(bs, options, flags, errp);
+    } else {
+        g_assert_not_reached();
+    }
+    if (ret < 0) {
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
+    if (!(flags & BDRV_O_RDWR)) {
+        ret = blkio_set_bool(s->blkio, "read-only", true);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "failed to set read-only: %s",
+                             blkio_get_error_msg());
+            blkio_destroy(&s->blkio);
+            return ret;
+        }
+    }
+
+    ret = blkio_connect(s->blkio);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "blkio_connect failed: %s",
+                         blkio_get_error_msg());
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
+    ret = blkio_get_bool(s->blkio,
+                         "needs-mem-regions",
+                         &s->needs_mem_regions);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "failed to get needs-mem-regions: %s",
+                         blkio_get_error_msg());
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
+    ret = blkio_get_uint64(s->blkio,
+                           "mem-region-alignment",
+                           &s->mem_region_alignment);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "failed to get mem-region-alignment: %s",
+                         blkio_get_error_msg());
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
+    ret = blkio_start(s->blkio);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "blkio_start failed: %s",
+                         blkio_get_error_msg());
+        blkio_destroy(&s->blkio);
+        return ret;
+    }
+
+    bs->supported_write_flags = BDRV_REQ_FUA;
+    bs->supported_zero_flags = BDRV_REQ_FUA |
BDRV_REQ_MAY_UNMAP |
+                               BDRV_REQ_NO_FALLBACK;
+
+    qemu_mutex_init(&s->blkio_lock);
+    qemu_co_mutex_init(&s->bounce_lock);
+    qemu_co_queue_init(&s->bounce_available);
+    QLIST_INIT(&s->bounce_bufs);
+    s->blkioq = blkio_get_queue(s->blkio, 0);
+    s->completion_fd = blkioq_get_completion_fd(s->blkioq);
+
+    blkio_attach_aio_context(bs, bdrv_get_aio_context(bs));
+    return 0;
+}
+
+static void blkio_close(BlockDriverState *bs)
+{
+    BDRVBlkioState *s = bs->opaque;
+
+    /* There is no destroy() API for s->bounce_lock */
+
+    qemu_mutex_destroy(&s->blkio_lock);
+    blkio_detach_aio_context(bs);
+    blkio_destroy(&s->blkio);
+}
+
+static int64_t blkio_getlength(BlockDriverState *bs)
+{
+    BDRVBlkioState *s = bs->opaque;
+    uint64_t capacity;
+    int ret;
+
+    WITH_QEMU_LOCK_GUARD(&s->blkio_lock) {
+        ret = blkio_get_uint64(s->blkio, "capacity", &capacity);
+    }
+    if (ret < 0) {
+        return -ret;
+    }
+
+    return capacity;
+}
+
+static int blkio_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
+{
+    return 0;
+}
+
+static void blkio_refresh_limits(BlockDriverState *bs, Error **errp)
+{
+    BDRVBlkioState *s = bs->opaque;
+    QEMU_LOCK_GUARD(&s->blkio_lock);
+    int value;
+    int ret;
+
+    ret = blkio_get_int(s->blkio, "request-alignment", &value);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to get \"request-alignment\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    bs->bl.request_alignment = value;
+    if (bs->bl.request_alignment < 1 ||
+        bs->bl.request_alignment >= INT_MAX ||
+        !is_power_of_2(bs->bl.request_alignment)) {
+        error_setg(errp, "invalid \"request-alignment\" value %" PRIu32 ", "
+                   "must be a power of 2 less than INT_MAX",
+                   bs->bl.request_alignment);
+        return;
+    }
+
+    ret = blkio_get_int(s->blkio, "optimal-io-size", &value);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to get \"optimal-io-size\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    bs->bl.opt_transfer = value;
+    if (bs->bl.opt_transfer > INT_MAX ||
+        (bs->bl.opt_transfer %
bs->bl.request_alignment)) {
+        error_setg(errp, "invalid \"optimal-io-size\" value %" PRIu32 ", must "
+                   "be a multiple of %" PRIu32, bs->bl.opt_transfer,
+                   bs->bl.request_alignment);
+        return;
+    }
+
+    ret = blkio_get_int(s->blkio, "max-transfer", &value);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to get \"max-transfer\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    bs->bl.max_transfer = value;
+    if ((bs->bl.max_transfer % bs->bl.request_alignment) ||
+        (bs->bl.opt_transfer && (bs->bl.max_transfer % bs->bl.opt_transfer))) {
+        error_setg(errp, "invalid \"max-transfer\" value %" PRIu32 ", must be "
+                   "a multiple of %" PRIu32 " and %" PRIu32 " (if non-zero)",
+                   bs->bl.max_transfer, bs->bl.request_alignment,
+                   bs->bl.opt_transfer);
+        return;
+    }
+
+    ret = blkio_get_int(s->blkio, "buf-alignment", &value);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to get \"buf-alignment\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    if (value < 1) {
+        error_setg(errp, "invalid \"buf-alignment\" value %d, must be "
+                   "positive", value);
+        return;
+    }
+    bs->bl.min_mem_alignment = value;
+
+    ret = blkio_get_int(s->blkio, "optimal-buf-alignment", &value);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret,
+                         "failed to get \"optimal-buf-alignment\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    if (value < 1) {
+        error_setg(errp, "invalid \"optimal-buf-alignment\" value %d, "
+                   "must be positive", value);
+        return;
+    }
+    bs->bl.opt_mem_alignment = value;
+
+    ret = blkio_get_int(s->blkio, "max-segments", &value);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "failed to get \"max-segments\": %s",
+                         blkio_get_error_msg());
+        return;
+    }
+    if (value < 1) {
+        error_setg(errp, "invalid \"max-segments\" value %d, must be positive",
+                   value);
+        return;
+    }
+    bs->bl.max_iov = value;
+}
+
+/*
+ * TODO
+ * Missing libblkio APIs:
+ * - block_status
+ * - co_invalidate_cache
+ *
+ * Out of scope?
+ * - create
+ * - truncate
+ */
+
+#define BLKIO_DRIVER(name, ...) \
+    { \
+        .format_name             = name, \
+        .protocol_name           = name, \
+        .instance_size           = sizeof(BDRVBlkioState), \
+        .bdrv_file_open          = blkio_file_open, \
+        .bdrv_close              = blkio_close, \
+        .bdrv_getlength          = blkio_getlength, \
+        .bdrv_get_info           = blkio_get_info, \
+        .bdrv_attach_aio_context = blkio_attach_aio_context, \
+        .bdrv_detach_aio_context = blkio_detach_aio_context, \
+        .bdrv_co_pdiscard        = blkio_co_pdiscard, \
+        .bdrv_co_preadv          = blkio_co_preadv, \
+        .bdrv_co_pwritev         = blkio_co_pwritev, \
+        .bdrv_co_flush_to_disk   = blkio_co_flush, \
+        .bdrv_co_pwrite_zeroes   = blkio_co_pwrite_zeroes, \
+        .bdrv_io_unplug          = blkio_io_unplug, \
+        .bdrv_refresh_limits     = blkio_refresh_limits, \
+        __VA_ARGS__ \
+    }
+
+static BlockDriver bdrv_io_uring = BLKIO_DRIVER(
+    DRIVER_IO_URING,
+    .bdrv_needs_filename = true,
+);
+
+static BlockDriver bdrv_nvme_io_uring = BLKIO_DRIVER(
+    DRIVER_NVME_IO_URING,
+    .bdrv_needs_filename = true,
+);
+
+static BlockDriver bdrv_virtio_blk_vhost_user = BLKIO_DRIVER(
+    DRIVER_VIRTIO_BLK_VHOST_USER
+);
+
+static BlockDriver bdrv_virtio_blk_vhost_vdpa = BLKIO_DRIVER(
+    DRIVER_VIRTIO_BLK_VHOST_VDPA
+);
+
+static void bdrv_blkio_init(void)
+{
+    bdrv_register(&bdrv_io_uring);
+    bdrv_register(&bdrv_nvme_io_uring);
+    bdrv_register(&bdrv_virtio_blk_vhost_user);
+    bdrv_register(&bdrv_virtio_blk_vhost_vdpa);
+}
+
+block_init(bdrv_blkio_init);
diff --git a/tests/qtest/modules-test.c b/tests/qtest/modules-test.c
index 88217686e1..be2575ae6d 100644
--- a/tests/qtest/modules-test.c
+++ b/tests/qtest/modules-test.c
@@ -16,6 +16,9 @@ static void test_modules_load(const void *data)
 int main(int argc, char *argv[])
 {
     const char *modules[] = {
+#ifdef CONFIG_BLKIO
+        "block-", "blkio",
+#endif
 #ifdef CONFIG_CURL
         "block-", "curl",
 #endif
diff --git a/block/meson.build b/block/meson.build
index 60bc305597..500878f082 100644
--- a/block/meson.build
+++ b/block/meson.build
@@
-92,6 +92,7 @@ block_modules = {}
 
 modsrc = []
 foreach m : [
+  [blkio, 'blkio', files('blkio.c')],
   [curl, 'curl', files('curl.c')],
   [glusterfs, 'gluster', files('gluster.c')],
   [libiscsi, 'iscsi', [files('iscsi.c'), libm]],
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index eb3267bef5..2cb0de5601 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -67,6 +67,7 @@ meson_options_help() {
   printf "%s\n" ' auth-pam PAM access control'
   printf "%s\n" ' avx2 AVX2 optimizations'
   printf "%s\n" ' avx512f AVX512F optimizations'
+  printf "%s\n" ' blkio libblkio block device driver'
   printf "%s\n" ' bochs bochs image format support'
   printf "%s\n" ' bpf eBPF support'
   printf "%s\n" ' brlapi brlapi character device driver'
@@ -198,6 +199,8 @@ _meson_option_parse() {
     --disable-gcov) printf "%s" -Db_coverage=false ;;
     --enable-lto) printf "%s" -Db_lto=true ;;
     --disable-lto) printf "%s" -Db_lto=false ;;
+    --enable-blkio) printf "%s" -Dblkio=enabled ;;
+    --disable-blkio) printf "%s" -Dblkio=disabled ;;
     --block-drv-ro-whitelist=*) quote_sh "-Dblock_drv_ro_whitelist=$2" ;;
     --block-drv-rw-whitelist=*) quote_sh "-Dblock_drv_rw_whitelist=$2" ;;
     --enable-block-drv-whitelist-in-tools) printf "%s" -Dblock_drv_whitelist_in_tools=true ;;
-- 
2.37.3

From nobody Sun May 12 14:33:58 2024
From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Fam Zheng, Markus Armbruster, Stefan Hajnoczi, Laurent Vivier,
 Marcel Apfelbaum, "Michael S. Tsirkin", Wen Congyang, Kevin Wolf,
 Philippe Mathieu-Daudé, Richard Henderson, David Hildenbrand,
 Hanna Reitz, Xie Changlong, Eduardo Habkost, qemu-block@nongnu.org,
 Eric Blake, John Snow, afaria@redhat.com, Jeff Cody, Yanan Wang,
 Paolo Bonzini, Raphael Norwitz, sgarzare@redhat.com,
 integration@gluster.org, Peter Xu, "Richard W.M. Jones", Thomas Huth,
 Vladimir Sementsov-Ogievskiy, "Denis V. Lunev"
Subject: [PATCH v7 03/13] numa: call ->ram_block_removed() in ram_block_notifer_remove()
Date: Thu, 13 Oct 2022 14:58:58 -0400
Message-Id: <20221013185908.1297568-4-stefanha@redhat.com>
In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com>
References: <20221013185908.1297568-1-stefanha@redhat.com>

When a RAMBlockNotifier is added, ->ram_block_added() is called with all
existing RAMBlocks. There is no equivalent ->ram_block_removed() call
when a RAMBlockNotifier is removed.

The util/vfio-helpers.c code (the sole user of RAMBlockNotifier) is fine
with this asymmetry because it does not rely on RAMBlockNotifier for
cleanup. It walks its internal list of DMA mappings and unmaps them by
itself.

Future users of RAMBlockNotifier may not have an internal data structure
that records added RAMBlocks so they will need ->ram_block_removed()
callbacks.

This patch makes ram_block_notifier_remove() symmetric with respect to
callbacks. Now util/vfio-helpers.c needs to unmap remaining DMA mappings
after ram_block_notifier_remove() has been called. This is necessary
since users like block/nvme.c may create additional DMA mappings that do
not originate from the RAMBlockNotifier.

Reviewed-by: David Hildenbrand
Signed-off-by: Stefan Hajnoczi
---
 hw/core/numa.c      | 17 +++++++++++++++++
 util/vfio-helpers.c |  5 ++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 26d8e5f616..31e6fe1caa 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -822,6 +822,19 @@ static int ram_block_notify_add_single(RAMBlock *rb, void *opaque)
     return 0;
 }
 
+static int ram_block_notify_remove_single(RAMBlock *rb, void *opaque)
+{
+    const ram_addr_t max_size = qemu_ram_get_max_length(rb);
+    const ram_addr_t size = qemu_ram_get_used_length(rb);
+    void *host = qemu_ram_get_host_addr(rb);
+    RAMBlockNotifier *notifier = opaque;
+
+    if (host) {
+        notifier->ram_block_removed(notifier, host, size, max_size);
+    }
+    return 0;
+}
+
 void ram_block_notifier_add(RAMBlockNotifier *n)
 {
     QLIST_INSERT_HEAD(&ram_list.ramblock_notifiers, n, next);
@@ -835,6 +848,10 @@ void ram_block_notifier_add(RAMBlockNotifier *n)
 void ram_block_notifier_remove(RAMBlockNotifier *n)
 {
     QLIST_REMOVE(n, next);
+
+    if (n->ram_block_removed) {
+        qemu_ram_foreach_block(ram_block_notify_remove_single, n);
+    }
 }
 
 void ram_block_notify_add(void *host, size_t size, size_t max_size)
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 5ba01177bf..0d1520caac 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -847,10 +847,13 @@ void qemu_vfio_close(QEMUVFIOState *s)
     if (!s) {
         return;
     }
+
+    ram_block_notifier_remove(&s->ram_notifier);
+
     for (i = 0; i < s->nr_mappings; ++i) {
         qemu_vfio_undo_mapping(s, &s->mappings[i], NULL);
     }
-    ram_block_notifier_remove(&s->ram_notifier);
+
     g_free(s->usable_iova_ranges);
     s->nb_iova_ranges = 0;
     qemu_vfio_reset(s);
-- 
2.37.3

From nobody Sun May 12 14:33:58 2024
From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Fam Zheng, Markus Armbruster, Stefan Hajnoczi, Laurent Vivier,
 Marcel Apfelbaum, "Michael S. Tsirkin", Wen Congyang, Kevin Wolf,
 Philippe Mathieu-Daudé, Richard Henderson, David Hildenbrand,
 Hanna Reitz, Xie Changlong, Eduardo Habkost, qemu-block@nongnu.org,
 Eric Blake, John Snow, afaria@redhat.com, Jeff Cody, Yanan Wang,
 Paolo Bonzini, Raphael Norwitz, sgarzare@redhat.com,
 integration@gluster.org, Peter Xu, "Richard W.M. Jones", Thomas Huth,
 Vladimir Sementsov-Ogievskiy, "Denis V. Lunev"
Subject: [PATCH v7 04/13] block: pass size to bdrv_unregister_buf()
Date: Thu, 13 Oct 2022 14:58:59 -0400
Message-Id: <20221013185908.1297568-5-stefanha@redhat.com>
In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com>
References: <20221013185908.1297568-1-stefanha@redhat.com>

The only implementor of bdrv_register_buf() is block/nvme.c, where the
size is not needed when unregistering a buffer.
This is because util/vfio-helpers.c can look up mappings by address.

Future block drivers that implement bdrv_register_buf() may not be able
to do their job given only the buffer address. Add a size argument to
bdrv_unregister_buf().

Also document the assumptions about
bdrv_register_buf()/bdrv_unregister_buf() calls. The same values that
were given to bdrv_register_buf() must be given to
bdrv_unregister_buf().

gcc 11.2.1 emits a spurious warning that img_bench()'s buf_size local
variable might be uninitialized, so it's necessary to silence the
compiler.

Signed-off-by: Stefan Hajnoczi
Reviewed-by: Stefano Garzarella
---
 include/block/block-global-state.h          | 5 ++++-
 include/block/block_int-common.h            | 2 +-
 include/sysemu/block-backend-global-state.h | 2 +-
 block/block-backend.c                       | 4 ++--
 block/io.c                                  | 6 +++---
 block/nvme.c                                | 2 +-
 qemu-img.c                                  | 4 ++--
 7 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index 21265e3966..7901f35863 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -243,9 +243,12 @@ void bdrv_del_child(BlockDriverState *parent, BdrvChild *child, Error **errp);
  * Register/unregister a buffer for I/O. For example, VFIO drivers are
  * interested to know the memory areas that would later be used for I/O, so
  * that they can prepare IOMMU mapping etc., to get better performance.
+ *
+ * Buffers must not overlap and they must be unregistered with the same
+ * values that they were registered with.
  */
 void bdrv_register_buf(BlockDriverState *bs, void *host, size_t size);
-void bdrv_unregister_buf(BlockDriverState *bs, void *host);
+void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size);
 
 void bdrv_cancel_in_flight(BlockDriverState *bs);
 
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 8947abab76..b7a7cbd3a5 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -435,7 +435,7 @@ struct BlockDriver {
      * DMA mapping for hot buffers.
      */
     void (*bdrv_register_buf)(BlockDriverState *bs, void *host, size_t size);
-    void (*bdrv_unregister_buf)(BlockDriverState *bs, void *host);
+    void (*bdrv_unregister_buf)(BlockDriverState *bs, void *host, size_t size);
 
     /*
      * This field is modified only under the BQL, and is part of
diff --git a/include/sysemu/block-backend-global-state.h b/include/sysemu/block-backend-global-state.h
index 415f0c91d7..97f7dad2c3 100644
--- a/include/sysemu/block-backend-global-state.h
+++ b/include/sysemu/block-backend-global-state.h
@@ -107,7 +107,7 @@ void blk_io_limits_update_group(BlockBackend *blk, const char *group);
 void blk_set_force_allow_inactivate(BlockBackend *blk);
 
 void blk_register_buf(BlockBackend *blk, void *host, size_t size);
-void blk_unregister_buf(BlockBackend *blk, void *host);
+void blk_unregister_buf(BlockBackend *blk, void *host, size_t size);
 
 const BdrvChild *blk_root(BlockBackend *blk);
 
diff --git a/block/block-backend.c b/block/block-backend.c
index aa4adf06ae..ae42474891 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2551,10 +2551,10 @@ void blk_register_buf(BlockBackend *blk, void *host, size_t size)
     bdrv_register_buf(blk_bs(blk), host, size);
 }
 
-void blk_unregister_buf(BlockBackend *blk, void *host)
+void blk_unregister_buf(BlockBackend *blk, void *host, size_t size)
 {
     GLOBAL_STATE_CODE();
-    bdrv_unregister_buf(blk_bs(blk), host);
+    bdrv_unregister_buf(blk_bs(blk), host, size);
} =20 int coroutine_fn blk_co_copy_range(BlockBackend *blk_in, int64_t off_in, diff --git a/block/io.c b/block/io.c index d30073036e..cca402bf7b 100644 --- a/block/io.c +++ b/block/io.c @@ -3275,16 +3275,16 @@ void bdrv_register_buf(BlockDriverState *bs, void *= host, size_t size) } } =20 -void bdrv_unregister_buf(BlockDriverState *bs, void *host) +void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size) { BdrvChild *child; =20 GLOBAL_STATE_CODE(); if (bs->drv && bs->drv->bdrv_unregister_buf) { - bs->drv->bdrv_unregister_buf(bs, host); + bs->drv->bdrv_unregister_buf(bs, host, size); } QLIST_FOREACH(child, &bs->children, next) { - bdrv_unregister_buf(child->bs, host); + bdrv_unregister_buf(child->bs, host, size); } } =20 diff --git a/block/nvme.c b/block/nvme.c index 2b24f95164..94b76b16f2 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -1602,7 +1602,7 @@ static void nvme_register_buf(BlockDriverState *bs, v= oid *host, size_t size) } } =20 -static void nvme_unregister_buf(BlockDriverState *bs, void *host) +static void nvme_unregister_buf(BlockDriverState *bs, void *host, size_t s= ize) { BDRVNVMeState *s =3D bs->opaque; =20 diff --git a/qemu-img.c b/qemu-img.c index ace3adf8ae..9fe94df650 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -4371,7 +4371,7 @@ static int img_bench(int argc, char **argv) struct timeval t1, t2; int i; bool force_share =3D false; - size_t buf_size; + size_t buf_size =3D 0; =20 for (;;) { static const struct option long_options[] =3D { @@ -4593,7 +4593,7 @@ static int img_bench(int argc, char **argv) =20 out: if (data.buf) { - blk_unregister_buf(blk, data.buf); + blk_unregister_buf(blk, data.buf, buf_size); } qemu_vfree(data.buf); blk_unref(blk); --=20 2.37.3 From nobody Sun May 12 14:33:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1665687809; cv=none; d=zohomail.com; s=zohoarc; b=RSighNZb4skU7IJlp+FBA/C7ZJCtIo43qFCFA9F/eMK2U2rYbKfhMYGMduCxAUrPwbzy6pnHXODW5EoXx+RU6cdFButiVJ77feWKKQ5pdiBtzXtH1BouKqZ+cJZ4WCeB6o3Re143wci7W0LmsaoeD9Q1lWI41BbJte7mMRzVqAs= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1665687809; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=Si3ngGKtQ37Tc9moyJg9LdNUwHwty4UA7DMk5607KiM=; b=BLCQNyVrwJcGiwYzh2qzoSwfnfXD+5rm4aRosAD6S0PolT+kFEM7968V1oBvObgCFB60kOihkRJuOQSSJtcnwVWYUvZPPSumKtfrJPZaMHbtpPVsWaoifeL1Sos8+Imi2M+tTCeNDxKVjhZW4qXLohBQT79xORVzm77B9Flz570= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1665687809295893.158599774307; Thu, 13 Oct 2022 12:03:29 -0700 (PDT) Received: from localhost ([::1]:36094 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oj3UN-00016q-FS for importer@patchew.org; Thu, 13 Oct 2022 15:03:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42488) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3Qk-0006c4-7D for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:43 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:47853) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3Qi-00044z-Mp for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:41 -0400 
From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Fam Zheng, Markus Armbruster, Stefan Hajnoczi, Laurent Vivier,
 Marcel Apfelbaum, "Michael S. Tsirkin", Wen Congyang, Kevin Wolf,
 Philippe Mathieu-Daudé, Richard Henderson, David Hildenbrand,
 Hanna Reitz, Xie Changlong, Eduardo Habkost, qemu-block@nongnu.org,
 Eric Blake, John Snow, afaria@redhat.com, Jeff Cody, Yanan Wang,
 Paolo Bonzini, Raphael Norwitz, sgarzare@redhat.com,
 integration@gluster.org, Peter Xu, "Richard W.M. Jones", Thomas Huth,
 Vladimir Sementsov-Ogievskiy, "Denis V. Lunev"
Subject: [PATCH v7 05/13] block: use BdrvRequestFlags type for supported flag fields
Date: Thu, 13 Oct 2022 14:59:00 -0400
Message-Id: <20221013185908.1297568-6-stefanha@redhat.com>
In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com>
References: <20221013185908.1297568-1-stefanha@redhat.com>

Use the enum type so GDB displays the enum members instead of printing a
numeric constant.
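
To make the motivation concrete, here is a minimal standalone sketch. It is
not QEMU code: DemoRequestFlags and DemoState are hypothetical stand-ins for
BdrvRequestFlags and BlockDriverState, and only the two flag values are taken
from this series. The field holds the same bits as before, but because it is
declared with the enum type a debugger can show the member names.

```c
#include <assert.h>

/* Hypothetical stand-in for BdrvRequestFlags; the values mirror the series. */
typedef enum {
    DEMO_REQ_MAY_UNMAP = 0x4,
    DEMO_REQ_FUA = 0x10,
} DemoRequestFlags;

/* Hypothetical stand-in for the relevant part of BlockDriverState. */
typedef struct {
    DemoRequestFlags supported_write_flags; /* previously: unsigned int */
} DemoState;

/* Returns nonzero if every bit set in flags is supported by the node. */
static int demo_flags_supported(const DemoState *s, DemoRequestFlags flags)
{
    return (flags & ~s->supported_write_flags) == 0;
}
```

Bitwise code is unaffected by the type change, since C converts freely
between enum and integer types.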

Signed-off-by: Stefan Hajnoczi
Reviewed-by: Stefano Garzarella
---
 include/block/block_int-common.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index b7a7cbd3a5..19798d0e77 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -1051,7 +1051,7 @@ struct BlockDriverState {
     /*
      * Flags honored during pread
      */
-    unsigned int supported_read_flags;
+    BdrvRequestFlags supported_read_flags;
     /*
      * Flags honored during pwrite (so far: BDRV_REQ_FUA,
      * BDRV_REQ_WRITE_UNCHANGED).
@@ -1069,12 +1069,12 @@ struct BlockDriverState {
      * flag), or they have to explicitly take the WRITE permission for
      * their children.
      */
-    unsigned int supported_write_flags;
+    BdrvRequestFlags supported_write_flags;
     /*
      * Flags honored during pwrite_zeroes (so far: BDRV_REQ_FUA,
      * BDRV_REQ_MAY_UNMAP, BDRV_REQ_WRITE_UNCHANGED)
      */
-    unsigned int supported_zero_flags;
+    BdrvRequestFlags supported_zero_flags;
     /*
      * Flags honoured during truncate (so far: BDRV_REQ_ZERO_WRITE).
      *
@@ -1082,7 +1082,7 @@ struct BlockDriverState {
      * that any added space reads as all zeros. If this can't be guaranteed,
      * the operation must fail.
      */
-    unsigned int supported_truncate_flags;
+    BdrvRequestFlags supported_truncate_flags;
 
     /* the following member gives a name to every node on the bs graph. */
     char node_name[32];
-- 
2.37.3

From nobody Sun May 12 14:33:58 2024
From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Fam Zheng, Markus Armbruster, Stefan Hajnoczi, Laurent Vivier,
 Marcel Apfelbaum, "Michael S. Tsirkin", Wen Congyang, Kevin Wolf,
 Philippe Mathieu-Daudé, Richard Henderson, David Hildenbrand,
 Hanna Reitz, Xie Changlong, Eduardo Habkost, qemu-block@nongnu.org,
 Eric Blake, John Snow, afaria@redhat.com, Jeff Cody, Yanan Wang,
 Paolo Bonzini, Raphael Norwitz, sgarzare@redhat.com,
 integration@gluster.org, Peter Xu, "Richard W.M. Jones", Thomas Huth,
 Vladimir Sementsov-Ogievskiy, "Denis V. Lunev"
Subject: [PATCH v7 06/13] block: add BDRV_REQ_REGISTERED_BUF request flag
Date: Thu, 13 Oct 2022 14:59:01 -0400
Message-Id: <20221013185908.1297568-7-stefanha@redhat.com>
In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com>
References: <20221013185908.1297568-1-stefanha@redhat.com>

Block drivers may optimize I/O requests accessing buffers previously
registered with bdrv_register_buf(). Checking whether all elements of a
request's QEMUIOVector are within previously registered buffers is
expensive, so we need a hint from the user to avoid costly checks.

Add a BDRV_REQ_REGISTERED_BUF request flag to indicate that all
QEMUIOVector elements in an I/O request are known to be within
previously registered buffers.

Always pass the flag through to driver read/write functions. There is
little harm in passing the flag to a driver that does not use it.
Passing the flag to drivers avoids changes across many block drivers.
Filter drivers would need to explicitly support the flag and pass
through to their children when the children support it. That's a lot of
code changes and it's hard to remember to do that everywhere, leading to
silent reduced performance when the flag is accidentally dropped.

The only problematic scenario with the approach in this patch is when a
driver passes the flag through to internal I/O requests that don't use
the same I/O buffer. In that case the hint may be set when it should
actually be clear. This is a rare case though so the risk is low.

Some drivers have assert(!flags), which no longer works when
BDRV_REQ_REGISTERED_BUF is passed in. These assertions aren't very
useful anyway since the functions are called almost exclusively by
bdrv_driver_preadv/pwritev() so if we get flags handling right there
then the assertion is not needed.
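
As a rough illustration of the rule described above (code that swaps in a
bounce buffer must clear the hint, while everything else may pass it through
untouched), here is a minimal standalone sketch. It is not QEMU code:
DemoRequest, demo_driver_write() and demo_filter_write() are hypothetical
names, and only the flag values 0x8 and 0x10 are taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags; the values mirror BDRV_REQ_REGISTERED_BUF and
 * BDRV_REQ_FUA from this patch. */
typedef enum {
    DEMO_REQ_REGISTERED_BUF = 0x8,
    DEMO_REQ_FUA = 0x10,
} DemoRequestFlags;

/* Hypothetical single-buffer request (QEMU uses a QEMUIOVector). */
typedef struct {
    void *buf;
    size_t len;
} DemoRequest;

/* Records the flags the hypothetical driver saw on its last request. */
static DemoRequestFlags demo_last_driver_flags;

/* Hypothetical driver entry point; it accepts the hint and may ignore it. */
static int demo_driver_write(const DemoRequest *req, DemoRequestFlags flags)
{
    (void)req;
    demo_last_driver_flags = flags;
    return 0;
}

/*
 * Hypothetical filter layer. When it substitutes its own bounce buffer
 * (the data copy is omitted for brevity), the caller's registration no
 * longer describes the memory being written, so the hint is cleared
 * before the request is passed down. All other flags pass through.
 */
static int demo_filter_write(DemoRequest *req, DemoRequestFlags flags,
                             bool needs_bounce)
{
    static char bounce[512];

    if (needs_bounce) {
        req->buf = bounce;
        flags &= ~DEMO_REQ_REGISTERED_BUF;
    }
    return demo_driver_write(req, flags);
}
```

Clearing the flag at the point where the buffer is replaced keeps the
decision local, which is the same choice the patch makes in
bdrv_pad_request(), mirror and raw-format.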

Signed-off-by: Stefan Hajnoczi
---
 include/block/block-common.h |  9 ++++++
 block.c                      | 14 +++++++++
 block/blkverify.c            |  4 +--
 block/crypto.c               |  4 +--
 block/file-posix.c           |  1 -
 block/gluster.c              |  1 -
 block/io.c                   | 61 ++++++++++++++++++++++--------------
 block/mirror.c               |  2 ++
 block/nbd.c                  |  1 -
 block/parallels.c            |  1 -
 block/qcow.c                 |  2 --
 block/qed.c                  |  1 -
 block/raw-format.c           |  2 ++
 block/replication.c          |  1 -
 block/ssh.c                  |  1 -
 block/vhdx.c                 |  1 -
 16 files changed, 69 insertions(+), 37 deletions(-)

diff --git a/include/block/block-common.h b/include/block/block-common.h
index fdb7306e78..061606e867 100644
--- a/include/block/block-common.h
+++ b/include/block/block-common.h
@@ -80,6 +80,15 @@ typedef enum {
      */
     BDRV_REQ_MAY_UNMAP          = 0x4,
 
+    /*
+     * An optimization hint when all QEMUIOVector elements are within
+     * previously registered bdrv_register_buf() memory ranges.
+     *
+     * Code that replaces the user's QEMUIOVector elements with bounce buffers
+     * must take care to clear this flag.
+     */
+    BDRV_REQ_REGISTERED_BUF     = 0x8,
+
     BDRV_REQ_FUA                = 0x10,
     BDRV_REQ_WRITE_COMPRESSED   = 0x20,
 
diff --git a/block.c b/block.c
index 1fbf6b9e69..c69be2cfe3 100644
--- a/block.c
+++ b/block.c
@@ -1641,6 +1641,20 @@ static int bdrv_open_driver(BlockDriverState *bs, BlockDriver *drv,
         goto open_failed;
     }
 
+    assert(!(bs->supported_read_flags & ~BDRV_REQ_MASK));
+    assert(!(bs->supported_write_flags & ~BDRV_REQ_MASK));
+
+    /*
+     * Always allow the BDRV_REQ_REGISTERED_BUF optimization hint. This saves
+     * drivers that pass read/write requests through to a child the trouble of
+     * declaring support explicitly.
+     *
+     * Drivers must not propagate this flag accidentally when they initiate I/O
+     * to a bounce buffer. That case should be rare though.
+     */
+    bs->supported_read_flags |= BDRV_REQ_REGISTERED_BUF;
+    bs->supported_write_flags |= BDRV_REQ_REGISTERED_BUF;
+
     ret = refresh_total_sectors(bs, bs->total_sectors);
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Could not refresh total sector count");
diff --git a/block/blkverify.c b/block/blkverify.c
index 020b1ae7b6..f36fd6aeb2 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -235,8 +235,8 @@ blkverify_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
     qemu_iovec_init(&raw_qiov, qiov->niov);
     qemu_iovec_clone(&raw_qiov, qiov, buf);
 
-    ret = blkverify_co_prwv(bs, &r, offset, bytes, qiov, &raw_qiov, flags,
-                            false);
+    ret = blkverify_co_prwv(bs, &r, offset, bytes, qiov, &raw_qiov,
+                            flags & ~BDRV_REQ_REGISTERED_BUF, false);
 
     cmp_offset = qemu_iovec_compare(qiov, &raw_qiov);
     if (cmp_offset != -1) {
diff --git a/block/crypto.c b/block/crypto.c
index 7a57774b76..c7365598a7 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -410,7 +410,6 @@ block_crypto_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes,
     uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
     uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
 
-    assert(!flags);
     assert(payload_offset < INT64_MAX);
     assert(QEMU_IS_ALIGNED(offset, sector_size));
     assert(QEMU_IS_ALIGNED(bytes, sector_size));
@@ -473,7 +472,8 @@ block_crypto_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes,
     uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
     uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
 
-    assert(!(flags & ~BDRV_REQ_FUA));
+    flags &= ~BDRV_REQ_REGISTERED_BUF;
+
     assert(payload_offset < INT64_MAX);
     assert(QEMU_IS_ALIGNED(offset, sector_size));
     assert(QEMU_IS_ALIGNED(bytes, sector_size));
diff --git a/block/file-posix.c b/block/file-posix.c
index 23acffb9a4..b9647c5ffc 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2133,7 +2133,6 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, int64_t offset,
                                        int64_t bytes, QEMUIOVector *qiov,
                                        BdrvRequestFlags flags)
 {
-    assert(flags == 0);
     return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_WRITE);
 }
 
diff --git a/block/gluster.c b/block/gluster.c
index bb1144cf6a..7c90f7ba4b 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -1236,7 +1236,6 @@ static coroutine_fn int qemu_gluster_co_writev(BlockDriverState *bs,
                                                QEMUIOVector *qiov,
                                                int flags)
 {
-    assert(!flags);
     return qemu_gluster_co_rw(bs, sector_num, nb_sectors, qiov, 1);
 }
 
diff --git a/block/io.c b/block/io.c
index cca402bf7b..4207648db6 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1130,8 +1130,7 @@ static int coroutine_fn bdrv_driver_preadv(BlockDriverState *bs,
     int ret;
 
     bdrv_check_qiov_request(offset, bytes, qiov, qiov_offset, &error_abort);
-    assert(!(flags & ~BDRV_REQ_MASK));
-    assert(!(flags & BDRV_REQ_NO_FALLBACK));
+    assert(!(flags & ~bs->supported_read_flags));
 
     if (!drv) {
         return -ENOMEDIUM;
@@ -1195,23 +1194,29 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
                                             BdrvRequestFlags flags)
 {
     BlockDriver *drv = bs->drv;
+    bool emulate_fua = false;
     int64_t sector_num;
     unsigned int nb_sectors;
     QEMUIOVector local_qiov;
     int ret;
 
     bdrv_check_qiov_request(offset, bytes, qiov, qiov_offset, &error_abort);
-    assert(!(flags & ~BDRV_REQ_MASK));
-    assert(!(flags & BDRV_REQ_NO_FALLBACK));
 
     if (!drv) {
         return -ENOMEDIUM;
     }
 
+    if ((flags & BDRV_REQ_FUA) &&
+        (~bs->supported_write_flags & BDRV_REQ_FUA)) {
+        flags &= ~BDRV_REQ_FUA;
+        emulate_fua = true;
+    }
+
+    flags &= bs->supported_write_flags;
+
     if (drv->bdrv_co_pwritev_part) {
         ret = drv->bdrv_co_pwritev_part(bs, offset, bytes, qiov, qiov_offset,
-                                        flags & bs->supported_write_flags);
-        flags &= ~bs->supported_write_flags;
+                                        flags);
         goto emulate_flags;
     }
 
@@ -1221,9 +1226,7 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
     }
 
     if (drv->bdrv_co_pwritev) {
-        ret = drv->bdrv_co_pwritev(bs, offset, bytes, qiov,
-                                   flags & bs->supported_write_flags);
-        flags &= ~bs->supported_write_flags;
+        ret = drv->bdrv_co_pwritev(bs, offset, bytes, qiov, flags);
         goto emulate_flags;
     }
 
@@ -1233,10 +1236,8 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
             .coroutine = qemu_coroutine_self(),
         };
 
-        acb = drv->bdrv_aio_pwritev(bs, offset, bytes, qiov,
-                                    flags & bs->supported_write_flags,
+        acb = drv->bdrv_aio_pwritev(bs, offset, bytes, qiov, flags,
                                     bdrv_co_io_em_complete, &co);
-        flags &= ~bs->supported_write_flags;
         if (acb == NULL) {
             ret = -EIO;
         } else {
@@ -1254,12 +1255,10 @@ static int coroutine_fn bdrv_driver_pwritev(BlockDriverState *bs,
     assert(bytes <= BDRV_REQUEST_MAX_BYTES);
 
     assert(drv->bdrv_co_writev);
-    ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov,
-                              flags & bs->supported_write_flags);
-    flags &= ~bs->supported_write_flags;
+    ret = drv->bdrv_co_writev(bs, sector_num, nb_sectors, qiov, flags);
 
 emulate_flags:
-    if (ret == 0 && (flags & BDRV_REQ_FUA)) {
+    if (ret == 0 && emulate_fua) {
         ret = bdrv_co_flush(bs);
     }
 
@@ -1487,11 +1486,14 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
     max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
                                    align);
 
-    /* TODO: We would need a per-BDS .supported_read_flags and
+    /*
+     * TODO: We would need a per-BDS .supported_read_flags and
      * potential fallback support, if we ever implement any read flags
      * to pass through to drivers. For now, there aren't any
-     * passthrough flags. */
-    assert(!(flags & ~(BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH)));
+     * passthrough flags except the BDRV_REQ_REGISTERED_BUF optimization hint.
+     */
+    assert(!(flags & ~(BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH |
+                       BDRV_REQ_REGISTERED_BUF)));
 
     /* Handle Copy on Read and associated serialisation */
     if (flags & BDRV_REQ_COPY_ON_READ) {
@@ -1532,7 +1534,7 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
         goto out;
     }
 
-    assert(!(flags & ~bs->supported_read_flags));
+    assert(!(flags & ~(bs->supported_read_flags | BDRV_REQ_REGISTERED_BUF)));
 
     max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align);
     if (bytes <= max_bytes && bytes <= max_transfer) {
@@ -1721,7 +1723,8 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
 static int bdrv_pad_request(BlockDriverState *bs,
                             QEMUIOVector **qiov, size_t *qiov_offset,
                             int64_t *offset, int64_t *bytes,
-                            BdrvRequestPadding *pad, bool *padded)
+                            BdrvRequestPadding *pad, bool *padded,
+                            BdrvRequestFlags *flags)
 {
     int ret;
 
@@ -1749,6 +1752,10 @@ static int bdrv_pad_request(BlockDriverState *bs,
     if (padded) {
         *padded = true;
     }
+    if (flags) {
+        /* Can't use optimization hint with bounce buffer */
+        *flags &= ~BDRV_REQ_REGISTERED_BUF;
+    }
 
     return 0;
 }
@@ -1803,7 +1810,7 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
     }
 
     ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
-                           NULL);
+                           NULL, &flags);
     if (ret < 0) {
         goto fail;
     }
@@ -1848,6 +1855,11 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
         return -ENOTSUP;
     }
 
+    /* By definition there is no user buffer so this flag doesn't make sense */
+    if (flags & BDRV_REQ_REGISTERED_BUF) {
+        return -EINVAL;
+    }
+
     /* Invalidate the cached block-status data range if this write overlaps */
     bdrv_bsc_invalidate_range(bs, offset, bytes);
 
@@ -2133,6 +2145,9 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child,
     bool padding;
     BdrvRequestPadding pad;
 
+    /* This flag doesn't make sense for padding or zero writes */
+    flags &= ~BDRV_REQ_REGISTERED_BUF;
+
     padding = bdrv_init_padding(bs, offset, bytes, &pad);
     if (padding) {
         assert(!(flags & BDRV_REQ_NO_WAIT));
@@ -2250,7 +2265,7 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
          * alignment only if there is no ZERO flag.
          */
         ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
-                               &padded);
+                               &padded, &flags);
         if (ret < 0) {
             return ret;
         }
diff --git a/block/mirror.c b/block/mirror.c
index 80c0109d39..bed089d2e0 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1486,6 +1486,8 @@ static int coroutine_fn bdrv_mirror_top_pwritev(BlockDriverState *bs,
         qemu_iovec_init(&bounce_qiov, 1);
         qemu_iovec_add(&bounce_qiov, bounce_buf, bytes);
         qiov = &bounce_qiov;
+
+        flags &= ~BDRV_REQ_REGISTERED_BUF;
     }
 
     ret = bdrv_mirror_top_do_write(bs, MIRROR_METHOD_COPY, offset, bytes, qiov,
diff --git a/block/nbd.c b/block/nbd.c
index 494b9d683e..7d485c86d2 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1222,7 +1222,6 @@ static int coroutine_fn nbd_client_co_preadv(BlockDriverState *bs, int64_t offse
     };
 
     assert(bytes <= NBD_MAX_BUFFER_SIZE);
-    assert(!flags);
 
     if (!bytes) {
         return 0;
diff --git a/block/parallels.c b/block/parallels.c
index c1523e7dab..dd15a44100 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -329,7 +329,6 @@ static coroutine_fn int parallels_co_writev(BlockDriverState *bs,
     QEMUIOVector hd_qiov;
     int ret = 0;
 
-    assert(!flags);
     qemu_iovec_init(&hd_qiov, qiov->niov);
 
     while (nb_sectors > 0) {
diff --git a/block/qcow.c b/block/qcow.c
index 311aaa8705..e9180c7b61 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -628,7 +628,6 @@ static coroutine_fn int qcow_co_preadv(BlockDriverState *bs, int64_t offset,
     uint8_t *buf;
     void *orig_buf;
 
-    assert(!flags);
     if (qiov->niov > 1) {
         buf = orig_buf = qemu_try_blockalign(bs, qiov->size);
         if (buf == NULL) {
@@ -725,7 +724,6 @@ static coroutine_fn int qcow_co_pwritev(BlockDriverState *bs, int64_t offset,
     uint8_t *buf;
     void *orig_buf;
 
-    assert(!flags);
     s->cluster_cache_offset = -1; /* disable compressed cache */
 
     /* We must always copy the iov when encrypting, so we
diff --git a/block/qed.c b/block/qed.c
index bda00e6257..99a9ec9b57 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -1395,7 +1395,6 @@ static int coroutine_fn bdrv_qed_co_writev(BlockDriverState *bs,
                                            int64_t sector_num, int nb_sectors,
                                            QEMUIOVector *qiov, int flags)
 {
-    assert(!flags);
     return qed_co_request(bs, sector_num, qiov, nb_sectors, QED_AIOCB_WRITE);
 }
 
diff --git a/block/raw-format.c b/block/raw-format.c
index f337ac7569..c8dc9bc850 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -258,6 +258,8 @@ static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, int64_t offset,
         qemu_iovec_add(&local_qiov, buf, 512);
         qemu_iovec_concat(&local_qiov, qiov, 512, qiov->size - 512);
         qiov = &local_qiov;
+
+        flags &= ~BDRV_REQ_REGISTERED_BUF;
     }
 
     ret = raw_adjust_offset(bs, &offset, bytes, true);
diff --git a/block/replication.c b/block/replication.c
index c67f931f37..13f1d39571 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -261,7 +261,6 @@ static coroutine_fn int replication_co_writev(BlockDriverState *bs,
     int ret;
     int64_t n;
 
-    assert(!flags);
     ret = replication_get_io_status(s);
     if (ret < 0) {
         goto out;
diff --git a/block/ssh.c b/block/ssh.c
index a2dc646536..a3cddc392c 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -1196,7 +1196,6 @@ static coroutine_fn int ssh_co_writev(BlockDriverState *bs,
     BDRVSSHState *s = bs->opaque;
     int ret;
 
-    assert(!flags);
     qemu_co_mutex_lock(&s->lock);
     ret = ssh_write(s, bs, sector_num * BDRV_SECTOR_SIZE,
                     nb_sectors * BDRV_SECTOR_SIZE, qiov);
diff --git a/block/vhdx.c b/block/vhdx.c
index e10e78ebfd..e2344ee0b7 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1342,7 +1342,6 @@ static coroutine_fn int vhdx_co_writev(BlockDriverState *bs, int64_t sector_num,
     uint64_t bat_prior_offset = 0;
     bool bat_update = false;
 
-    assert(!flags);
     qemu_iovec_init(&hd_qiov, qiov->niov);
 
     qemu_co_mutex_lock(&s->lock);
-- 
2.37.3

From nobody Sun May 12 14:33:58 2024
From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Fam Zheng, Markus Armbruster, Stefan Hajnoczi, Laurent Vivier,
 Marcel Apfelbaum, "Michael S. Tsirkin", Wen Congyang, Kevin Wolf,
 Philippe Mathieu-Daudé, Richard Henderson, David Hildenbrand,
 Hanna Reitz, Xie Changlong, Eduardo Habkost, qemu-block@nongnu.org,
 Eric Blake, John Snow, afaria@redhat.com, Jeff Cody, Yanan Wang,
 Paolo Bonzini, Raphael Norwitz, sgarzare@redhat.com,
 integration@gluster.org, Peter Xu, "Richard W.M. Jones", Thomas Huth,
 Vladimir Sementsov-Ogievskiy, "Denis V. Lunev"
Subject: [PATCH v7 07/13] block: return errors from bdrv_register_buf()
Date: Thu, 13 Oct 2022 14:59:02 -0400
Message-Id: <20221013185908.1297568-8-stefanha@redhat.com>
In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com>
References: <20221013185908.1297568-1-stefanha@redhat.com>

Registering an I/O buffer is only a performance optimization hint but it
is still necessary to return errors when it fails.
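
The error-returning shape can be sketched in isolation as follows. This is a
simplified model under stated assumptions, not the code in this patch:
DemoNode, demo_register_buf() and demo_unregister_buf() are hypothetical,
and a plain `const char **errmsg` stands in for QEMU's `Error **errp`. It
shows registration walking a tree of nodes and undoing its partial work
when one node fails, so the caller sees either full success or no change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_CHILDREN 4

/* Hypothetical node type; QEMU's BlockDriverState is far richer. */
typedef struct DemoNode {
    struct DemoNode *children[DEMO_MAX_CHILDREN];
    size_t nb_children;
    bool fail_register; /* simulate a driver that cannot map the buffer */
    int registered;     /* number of live registrations on this node */
} DemoNode;

/* Unregisters the buffer from this node and its whole subtree. */
static void demo_unregister_buf(DemoNode *node, void *host, size_t size)
{
    node->registered--;
    for (size_t i = 0; i < node->nb_children; i++) {
        demo_unregister_buf(node->children[i], host, size);
    }
}

/*
 * Returns true on success. On failure, *errmsg describes the problem and
 * no node in the subtree is left registered.
 */
static bool demo_register_buf(DemoNode *node, void *host, size_t size,
                              const char **errmsg)
{
    if (node->fail_register) {
        *errmsg = "demo driver could not map buffer";
        return false;
    }
    node->registered++;

    for (size_t i = 0; i < node->nb_children; i++) {
        if (!demo_register_buf(node->children[i], host, size, errmsg)) {
            /* Roll back the children registered so far, then this node. */
            for (size_t j = 0; j < i; j++) {
                demo_unregister_buf(node->children[j], host, size);
            }
            node->registered--;
            return false;
        }
    }
    return true;
}
```

Returning a bool alongside the error out-parameter lets callers test the
result directly, which matches the "Returns: true on success, false on
failure" contract documented in the headers changed by this patch.
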
Later patches will need to detect errors when registering buffers but an
immediate advantage is that error_report() calls are no longer needed in
block driver .bdrv_register_buf() functions.

Signed-off-by: Stefan Hajnoczi
---
 include/block/block-global-state.h          |  5 ++-
 include/block/block_int-common.h            |  5 ++-
 include/sysemu/block-backend-global-state.h |  2 +-
 block/block-backend.c                       |  4 +--
 block/io.c                                  | 34 +++++++++++++++++++--
 block/nvme.c                                | 18 +++++------
 qemu-img.c                                  |  2 +-
 7 files changed, 52 insertions(+), 18 deletions(-)

diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index 7901f35863..eba4ed23b4 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -246,8 +246,11 @@ void bdrv_del_child(BlockDriverState *parent, BdrvChild *child, Error **errp);
  *
  * Buffers must not overlap and they must be unregistered with the same values that they were registered with.
+ *
+ * Returns: true on success, false on failure
  */
-void bdrv_register_buf(BlockDriverState *bs, void *host, size_t size);
+bool bdrv_register_buf(BlockDriverState *bs, void *host, size_t size,
+                       Error **errp);
 void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size);
 
 void bdrv_cancel_in_flight(BlockDriverState *bs);
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 19798d0e77..9c569be162 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -433,8 +433,11 @@ struct BlockDriver {
      * that it can do IOMMU mapping with VFIO etc., in order to get better
      * performance. In the case of VFIO drivers, this callback is used to do
      * DMA mapping for hot buffers.
+     *
+     * Returns: true on success, false on failure
      */
-    void (*bdrv_register_buf)(BlockDriverState *bs, void *host, size_t size);
+    bool (*bdrv_register_buf)(BlockDriverState *bs, void *host, size_t size,
+                              Error **errp);
     void (*bdrv_unregister_buf)(BlockDriverState *bs, void *host, size_t size);
 
     /*
diff --git a/include/sysemu/block-backend-global-state.h b/include/sysemu/block-backend-global-state.h
index 97f7dad2c3..6858e39cb6 100644
--- a/include/sysemu/block-backend-global-state.h
+++ b/include/sysemu/block-backend-global-state.h
@@ -106,7 +106,7 @@ void blk_io_limits_enable(BlockBackend *blk, const char *group);
 void blk_io_limits_update_group(BlockBackend *blk, const char *group);
 void blk_set_force_allow_inactivate(BlockBackend *blk);
 
-void blk_register_buf(BlockBackend *blk, void *host, size_t size);
+bool blk_register_buf(BlockBackend *blk, void *host, size_t size, Error **errp);
 void blk_unregister_buf(BlockBackend *blk, void *host, size_t size);
 
 const BdrvChild *blk_root(BlockBackend *blk);
diff --git a/block/block-backend.c b/block/block-backend.c
index ae42474891..4f59664397 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2545,10 +2545,10 @@ static void blk_root_drained_end(BdrvChild *child, int *drained_end_counter)
     }
 }
 
-void blk_register_buf(BlockBackend *blk, void *host, size_t size)
+bool blk_register_buf(BlockBackend *blk, void *host, size_t size, Error **errp)
 {
     GLOBAL_STATE_CODE();
-    bdrv_register_buf(blk_bs(blk), host, size);
+    return bdrv_register_buf(blk_bs(blk), host, size, errp);
 }
 
 void blk_unregister_buf(BlockBackend *blk, void *host, size_t size)
diff --git a/block/io.c b/block/io.c
index 4207648db6..a9673465dd 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3277,17 +3277,45 @@ void bdrv_io_unplug(BlockDriverState *bs)
     }
 }
 
-void bdrv_register_buf(BlockDriverState *bs, void *host, size_t size)
+/* Helper that undoes bdrv_register_buf() when it fails partway through */
+static void bdrv_register_buf_rollback(BlockDriverState *bs,
+                                       void *host,
+                                       size_t size,
+                                       BdrvChild *final_child)
+{
+    BdrvChild *child;
+
+    QLIST_FOREACH(child, &bs->children, next) {
+        if (child == final_child) {
+            break;
+        }
+
+        bdrv_unregister_buf(child->bs, host, size);
+    }
+
+    if (bs->drv && bs->drv->bdrv_unregister_buf) {
+        bs->drv->bdrv_unregister_buf(bs, host, size);
+    }
+}
+
+bool bdrv_register_buf(BlockDriverState *bs, void *host, size_t size,
+                       Error **errp)
 {
     BdrvChild *child;
 
     GLOBAL_STATE_CODE();
     if (bs->drv && bs->drv->bdrv_register_buf) {
-        bs->drv->bdrv_register_buf(bs, host, size);
+        if (!bs->drv->bdrv_register_buf(bs, host, size, errp)) {
+            return false;
+        }
     }
     QLIST_FOREACH(child, &bs->children, next) {
-        bdrv_register_buf(child->bs, host, size);
+        if (!bdrv_register_buf(child->bs, host, size, errp)) {
+            bdrv_register_buf_rollback(bs, host, size, child);
+            return false;
+        }
     }
+    return true;
 }
 
 void bdrv_unregister_buf(BlockDriverState *bs, void *host, size_t size)
diff --git a/block/nvme.c b/block/nvme.c
index 94b76b16f2..656624c585 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -1587,19 +1587,19 @@ static void nvme_aio_unplug(BlockDriverState *bs)
     }
 }
 
-static void nvme_register_buf(BlockDriverState *bs, void *host, size_t size)
+static bool nvme_register_buf(BlockDriverState *bs, void *host, size_t size,
+                              Error **errp)
 {
     int ret;
-    Error *local_err = NULL;
     BDRVNVMeState *s = bs->opaque;
 
-    ret = qemu_vfio_dma_map(s->vfio, host, size, false, NULL, &local_err);
-    if (ret) {
-        /* FIXME: we may run out of IOVA addresses after repeated
-         * bdrv_register_buf/bdrv_unregister_buf, because nvme_vfio_dma_unmap
-         * doesn't reclaim addresses for fixed mappings. */
-        error_reportf_err(local_err, "nvme_register_buf failed: ");
-    }
+    /*
+     * FIXME: we may run out of IOVA addresses after repeated
+     * bdrv_register_buf/bdrv_unregister_buf, because nvme_vfio_dma_unmap
+     * doesn't reclaim addresses for fixed mappings.
+     */
+    ret = qemu_vfio_dma_map(s->vfio, host, size, false, NULL, errp);
+    return ret == 0;
 }
 
 static void nvme_unregister_buf(BlockDriverState *bs, void *host, size_t size)
diff --git a/qemu-img.c b/qemu-img.c
index 9fe94df650..a3b64c88af 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -4570,7 +4570,7 @@ static int img_bench(int argc, char **argv)
     data.buf = blk_blockalign(blk, buf_size);
     memset(data.buf, pattern, data.nrreq * data.bufsize);
 
-    blk_register_buf(blk, data.buf, buf_size);
+    blk_register_buf(blk, data.buf, buf_size, &error_fatal);
 
     data.qiov = g_new(QEMUIOVector, data.nrreq);
     for (i = 0; i < data.nrreq; i++) {
-- 
2.37.3

From nobody Sun May 12 14:33:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1665689195; cv=none; d=zohomail.com; s=zohoarc; b=kcANyJz2bYgvhkoe34yCpIZIcRhuQUpSfemTlc79SXjFou4foe+kGRYg1gxgOfoHH7bJ56Pd6roYxLOMtihZoBYAC+V5mC09k1olU8OAIBClYKtYXhGUfaFEjJxqQQShlPpMJo70vH7Q9zaOqhOTLGPJRskqNFU9+1RdMoRbZaA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1665689195; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=ckx8MLOJfZFv8z3iKrpQW9pb0/6EvGKP4ckVJF/JoW8=; b=U27xKSL7qpCgCfY7xDm9R2GbeA5tjYbwbqlnLgekwrWdkSCNM7A/XimoeNX1bfmzJnY5tEzxrHdaV+X+Dd0yyWbzyvlY8lgZv+3s3HSWvxoyH6ZlSXGx7CZSahKsTytUm5aUwa99UMm/YjjQUwd2Ekcxvssg0wfpDmYIXBzCPuw= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none
dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1665689195956532.626096884554; Thu, 13 Oct 2022 12:26:35 -0700 (PDT) Received: from localhost ([::1]:40938 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oj3qk-00070n-OB for importer@patchew.org; Thu, 13 Oct 2022 15:26:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40398) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3Qz-0006eS-30 for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:58 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:22331) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3Qv-00047L-SX for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:56 -0400 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-363-xmnZ1UNCNu6rS98iAGf79g-1; Thu, 13 Oct 2022 14:59:43 -0400 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id F1D27811E75; Thu, 13 Oct 2022 18:59:42 +0000 (UTC) Received: from localhost (unknown [10.39.194.71]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4CAD74B3FCF; Thu, 13 Oct 2022 18:59:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665687589; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ckx8MLOJfZFv8z3iKrpQW9pb0/6EvGKP4ckVJF/JoW8=; 
b=Fp9yi55UbQKGiMvhSw+US0Q/EFKTeUcuqIVSTS9120D5pNdE4lT8b7x4HlZGVbwP3Lp4XF j49KGWOz92nqPGhqMt6i//vMuP9KNNf/2qRCsGGckXLduC2xW/rMabH/gnnBYl9CAd6/ee 1T8Jp7J/0QtNAZSFeRm7Dm0CvJSBHWk= X-MC-Unique: xmnZ1UNCNu6rS98iAGf79g-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Fam Zheng , Markus Armbruster , Stefan Hajnoczi , Laurent Vivier , Marcel Apfelbaum , "Michael S. Tsirkin" , Wen Congyang , Kevin Wolf , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , Richard Henderson , David Hildenbrand , Hanna Reitz , Xie Changlong , Eduardo Habkost , qemu-block@nongnu.org, Eric Blake , John Snow , afaria@redhat.com, Jeff Cody , Yanan Wang , Paolo Bonzini , Raphael Norwitz , sgarzare@redhat.com, integration@gluster.org, Peter Xu , "Richard W.M. Jones" , Thomas Huth , Vladimir Sementsov-Ogievskiy , "Denis V. Lunev" Subject: [PATCH v7 08/13] numa: use QLIST_FOREACH_SAFE() for RAM block notifiers Date: Thu, 13 Oct 2022 14:59:03 -0400 Message-Id: <20221013185908.1297568-9-stefanha@redhat.com> In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com> References: <20221013185908.1297568-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1665689197343100001 Content-Type: text/plain; charset="utf-8"

Make list traversal work when a callback removes a notifier
mid-traversal. This is a cleanup to prevent bugs in the future.

Signed-off-by: Stefan Hajnoczi
Reviewed-by: David Hildenbrand
---
 hw/core/numa.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index 31e6fe1caa..ea24a5fa8c 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -857,8 +857,9 @@ void ram_block_notifier_remove(RAMBlockNotifier *n)
 void ram_block_notify_add(void *host, size_t size, size_t max_size)
 {
     RAMBlockNotifier *notifier;
+    RAMBlockNotifier *next;
 
-    QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
+    QLIST_FOREACH_SAFE(notifier, &ram_list.ramblock_notifiers, next, next) {
         if (notifier->ram_block_added) {
             notifier->ram_block_added(notifier, host, size, max_size);
         }
@@ -868,8 +869,9 @@ void ram_block_notify_add(void *host, size_t size, size_t max_size)
 void ram_block_notify_remove(void *host, size_t size, size_t max_size)
 {
     RAMBlockNotifier *notifier;
+    RAMBlockNotifier *next;
 
-    QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
+    QLIST_FOREACH_SAFE(notifier, &ram_list.ramblock_notifiers, next, next) {
         if (notifier->ram_block_removed) {
             notifier->ram_block_removed(notifier, host, size, max_size);
         }
@@ -879,8 +881,9 @@ void ram_block_notify_remove(void *host, size_t size, size_t max_size)
 void ram_block_notify_resize(void *host, size_t old_size, size_t new_size)
 {
     RAMBlockNotifier *notifier;
+    RAMBlockNotifier *next;
 
-    QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
+    QLIST_FOREACH_SAFE(notifier, &ram_list.ramblock_notifiers, next, next) {
         if (notifier->ram_block_resized) {
             notifier->ram_block_resized(notifier, host, old_size, new_size);
         }
-- 
2.37.3 From nobody Sun May 12 14:33:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1665689086; cv=none; d=zohomail.com; s=zohoarc; b=f+zh8S2laRR6ipu0Bg3L2XwkCjMfBGGNOHZ9MI11ZtnIS4xCF5j6A57jJsfKiPqgxlUrlXHBgboXYZySP8SGRWYfvGFTeLZFxIi7e408vm9yPgyvxuhVE8/aUwNFZoaXdiR9KetGQA0Udlvpa8PDnAp3qZapyp0NenGAd/xaV54= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1665689086; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=gHEQP7D04zIdoQ39SnnN53P7AHDL1TQ7+p1CdBeUMBc=; b=TaKqZOPgFjFHmiRDyIfB4bozbTZkGzAsb8sJW1rB7LO18kh1Ode9ZPm59b046uDojVHqthzpEt1mnK0tZkZ6uT6+ElUlxqRRPNXr0CZ+hwgY/n2mD9fZUhNYIblGL2wx65DEYTRLrknxelVMRoA8sp/7b2m9R29wFnTkAp0WKj8= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1665689086044626.1056143531891; Thu, 13 Oct 2022 12:24:46 -0700 (PDT) Received: from localhost ([::1]:34294 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oj3oy-0005VY-QZ for importer@patchew.org; Thu, 13 Oct 2022 15:24:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40396) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3Qy-0006eO-76 for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:58 -0400 Received: from 
us-smtp-delivery-124.mimecast.com ([170.10.133.124]:51209) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3Qv-00047F-8c for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:54 -0400 Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-220-mOW0hFKMOZqFr-NdDCeBfg-1; Thu, 13 Oct 2022 14:59:46 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A170F382C969; Thu, 13 Oct 2022 18:59:45 +0000 (UTC) Received: from localhost (unknown [10.39.194.71]) by smtp.corp.redhat.com (Postfix) with ESMTP id E18472166BA3; Thu, 13 Oct 2022 18:59:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665687589; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gHEQP7D04zIdoQ39SnnN53P7AHDL1TQ7+p1CdBeUMBc=; b=Jxm1jgFx0h2Fa/oq/MOCiKC9Pzz/0PVZ4vzeQlCXi6pjx1zmzP8A+rWUqXmMSU4iKCV+dg KEqP9lo9S+KGVO3u3hmsk4zhHPSfahoiNPqALmM8sARZFsPqgJQNcK8Ejtsawrqek0xxSx gmYlvXl4XL/y29LUZ6/B3XH74lTqpdE= X-MC-Unique: mOW0hFKMOZqFr-NdDCeBfg-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Fam Zheng , Markus Armbruster , Stefan Hajnoczi , Laurent Vivier , Marcel Apfelbaum , "Michael S. 
Tsirkin" , Wen Congyang , Kevin Wolf , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , Richard Henderson , David Hildenbrand , Hanna Reitz , Xie Changlong , Eduardo Habkost , qemu-block@nongnu.org, Eric Blake , John Snow , afaria@redhat.com, Jeff Cody , Yanan Wang , Paolo Bonzini , Raphael Norwitz , sgarzare@redhat.com, integration@gluster.org, Peter Xu , "Richard W.M. Jones" , Thomas Huth , Vladimir Sementsov-Ogievskiy , "Denis V. Lunev" Subject: [PATCH v7 09/13] block: add BlockRAMRegistrar Date: Thu, 13 Oct 2022 14:59:04 -0400 Message-Id: <20221013185908.1297568-10-stefanha@redhat.com> In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com> References: <20221013185908.1297568-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1665689088215100001 Content-Type: text/plain; charset="utf-8" Emulated devices and other BlockBackend users wishing to take advantage of blk_register_buf() all have the same repetitive job: register RAMBlocks with the 
BlockBackend using RAMBlockNotifier. Add a BlockRAMRegistrar API to do
this. A later commit will use this from hw/block/virtio-blk.c.

Signed-off-by: Stefan Hajnoczi
Reviewed-by: Stefano Garzarella
---
 MAINTAINERS                          |  1 +
 include/sysemu/block-ram-registrar.h | 37 ++++++++++++++++++
 block/block-ram-registrar.c          | 58 ++++++++++++++++++++++++++++
 block/meson.build                    |  1 +
 4 files changed, 97 insertions(+)
 create mode 100644 include/sysemu/block-ram-registrar.h
 create mode 100644 block/block-ram-registrar.c

diff --git a/MAINTAINERS b/MAINTAINERS
index da30a8b93f..c4622a3109 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2510,6 +2510,7 @@ F: block*
 F: block/
 F: hw/block/
 F: include/block/
+F: include/sysemu/block-*.h
 F: qemu-img*
 F: docs/tools/qemu-img.rst
 F: qemu-io*
diff --git a/include/sysemu/block-ram-registrar.h b/include/sysemu/block-ram-registrar.h
new file mode 100644
index 0000000000..d8b2f7942b
--- /dev/null
+++ b/include/sysemu/block-ram-registrar.h
@@ -0,0 +1,37 @@
+/*
+ * BlockBackend RAM Registrar
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef BLOCK_RAM_REGISTRAR_H
+#define BLOCK_RAM_REGISTRAR_H
+
+#include "exec/ramlist.h"
+
+/**
+ * struct BlockRAMRegistrar:
+ *
+ * Keeps RAMBlock memory registered with a BlockBackend using
+ * blk_register_buf() including hotplugged memory.
+ *
+ * Emulated devices or other BlockBackend users initialize a BlockRAMRegistrar
+ * with blk_ram_registrar_init() before submitting I/O requests with the
+ * BDRV_REQ_REGISTERED_BUF flag set.
+ */
+typedef struct {
+    BlockBackend *blk;
+    RAMBlockNotifier notifier;
+    bool ok;
+} BlockRAMRegistrar;
+
+void blk_ram_registrar_init(BlockRAMRegistrar *r, BlockBackend *blk);
+void blk_ram_registrar_destroy(BlockRAMRegistrar *r);
+
+/* Have all RAMBlocks been registered successfully? */
+static inline bool blk_ram_registrar_ok(BlockRAMRegistrar *r)
+{
+    return r->ok;
+}
+
+#endif /* BLOCK_RAM_REGISTRAR_H */
diff --git a/block/block-ram-registrar.c b/block/block-ram-registrar.c
new file mode 100644
index 0000000000..25dbafa789
--- /dev/null
+++ b/block/block-ram-registrar.c
@@ -0,0 +1,58 @@
+/*
+ * BlockBackend RAM Registrar
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#include "sysemu/block-backend.h"
+#include "sysemu/block-ram-registrar.h"
+#include "qapi/error.h"
+
+static void ram_block_added(RAMBlockNotifier *n, void *host, size_t size,
+                            size_t max_size)
+{
+    BlockRAMRegistrar *r = container_of(n, BlockRAMRegistrar, notifier);
+    Error *err = NULL;
+
+    if (!r->ok) {
+        return; /* don't try again if we've already failed */
+    }
+
+    if (!blk_register_buf(r->blk, host, max_size, &err)) {
+        error_report_err(err);
+        ram_block_notifier_remove(&r->notifier);
+        r->ok = false;
+    }
+}
+
+static void ram_block_removed(RAMBlockNotifier *n, void *host, size_t size,
+                              size_t max_size)
+{
+    BlockRAMRegistrar *r = container_of(n, BlockRAMRegistrar, notifier);
+    blk_unregister_buf(r->blk, host, max_size);
+}
+
+void blk_ram_registrar_init(BlockRAMRegistrar *r, BlockBackend *blk)
+{
+    r->blk = blk;
+    r->notifier = (RAMBlockNotifier){
+        .ram_block_added = ram_block_added,
+        .ram_block_removed = ram_block_removed,
+
+        /*
+         * .ram_block_resized() is not necessary because we use the max_size
+         * value that does not change across resize.
+         */
+    };
+    r->ok = true;
+
+    ram_block_notifier_add(&r->notifier);
+}
+
+void blk_ram_registrar_destroy(BlockRAMRegistrar *r)
+{
+    if (r->ok) {
+        ram_block_notifier_remove(&r->notifier);
+    }
+}
diff --git a/block/meson.build b/block/meson.build
index 500878f082..b7c68b83a3 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -46,6 +46,7 @@ block_ss.add(files(
 ), zstd, zlib, gnutls)
 
 softmmu_ss.add(when: 'CONFIG_TCG', if_true: files('blkreplay.c'))
+softmmu_ss.add(files('block-ram-registrar.c'))
 
 if get_option('qcow1').allowed()
   block_ss.add(files('qcow.c'))
-- 
2.37.3

From nobody Sun May 12 14:33:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1665688041; cv=none; d=zohomail.com; s=zohoarc; b=eXtygJPP+ZsI7G4i6p1AUeE5VpH4jJqkm/MJp5k/cJKkezMXwX/EooyIuqOeP3fuL5O1kppEFuuElgD++2rWrzHzEOCJ/cWJbb2rtFCy69zBWDth/w+Gaj3L67QAmOMKsPvxXu8pr2k/qG1nY7XW5S+kiEfKzZKhfcUsiVtZrxU= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1665688041; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=2uoNyr/XSKf+OD83Nh+bVP2/xYaSIaJlR6XmlFpdjFM=; b=FPC7YdILbldezxnzwNZRpb8WTIiUbVFQ8qZsom+YT6VgyoSGyZE0Ufc1bZQLQAoxyulBP0UspBsFA2CiFnfd+8xJ/nlSIN8A5wGofoL8IiM9xvHoCUJWLwSKhQnQnaaoiiqWjyvhNT33uoRbkaBB9TyvFdr086jy0uoo/K/A8t0= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org
[209.51.188.17]) by mx.zohomail.com with SMTPS id 1665688041435225.71392385758975; Thu, 13 Oct 2022 12:07:21 -0700 (PDT) Received: from localhost ([::1]:51262 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oj3Y7-0004I2-TD for importer@patchew.org; Thu, 13 Oct 2022 15:07:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40402) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3Qz-0006eV-9b for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:58 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:54489) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3Qv-00047V-S0 for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:57 -0400 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-533-GUuuTvOmNaGvvtQsoKaFqg-1; Thu, 13 Oct 2022 14:59:48 -0400 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1AC8D185A78B; Thu, 13 Oct 2022 18:59:48 +0000 (UTC) Received: from localhost (unknown [10.39.194.71]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5D21840268E0; Thu, 13 Oct 2022 18:59:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665687590; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2uoNyr/XSKf+OD83Nh+bVP2/xYaSIaJlR6XmlFpdjFM=; b=E6NSKmvVkY5jD7QtQzgiajl+FY6HbApuNdjRfV6aZ2zd4KLS0y787l/SnjLi8829Ecx+9Y 
imcjzPloFA95UBascpHB+tElJkdN9Jxkes0tMb/ptInaIfsNXaW7uxRN0pyI39k+pPAhBa 4rU09TqohKO3pBisCj+esNuA6jLk1Fg= X-MC-Unique: GUuuTvOmNaGvvtQsoKaFqg-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Fam Zheng , Markus Armbruster , Stefan Hajnoczi , Laurent Vivier , Marcel Apfelbaum , "Michael S. Tsirkin" , Wen Congyang , Kevin Wolf , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , Richard Henderson , David Hildenbrand , Hanna Reitz , Xie Changlong , Eduardo Habkost , qemu-block@nongnu.org, Eric Blake , John Snow , afaria@redhat.com, Jeff Cody , Yanan Wang , Paolo Bonzini , Raphael Norwitz , sgarzare@redhat.com, integration@gluster.org, Peter Xu , "Richard W.M. Jones" , Thomas Huth , Vladimir Sementsov-Ogievskiy , "Denis V. Lunev" Subject: [PATCH v7 10/13] exec/cpu-common: add qemu_ram_get_fd() Date: Thu, 13 Oct 2022 14:59:05 -0400 Message-Id: <20221013185908.1297568-11-stefanha@redhat.com> In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com> References: <20221013185908.1297568-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org 
Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1665688043051100001 Content-Type: text/plain; charset="utf-8"

Add a function to get the file descriptor for a RAMBlock. Device
emulation code typically uses the MemoryRegion APIs but vhost-style code
may use RAMBlock directly for sharing guest memory with another process.
This new API will be used by the libblkio block driver so it can share
guest memory via .bdrv_register_buf().

Signed-off-by: Stefan Hajnoczi
---
 include/exec/cpu-common.h | 1 +
 softmmu/physmem.c         | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index c493510ee9..6feaa40ca7 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -92,6 +92,7 @@ void qemu_ram_set_uf_zeroable(RAMBlock *rb);
 bool qemu_ram_is_migratable(RAMBlock *rb);
 void qemu_ram_set_migratable(RAMBlock *rb);
 void qemu_ram_unset_migratable(RAMBlock *rb);
+int qemu_ram_get_fd(RAMBlock *rb);
 
 size_t qemu_ram_pagesize(RAMBlock *block);
 size_t qemu_ram_pagesize_largest(void);
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 56e03e07b5..d9578ccfd4 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1748,6 +1748,11 @@ void qemu_ram_unset_migratable(RAMBlock *rb)
     rb->flags &= ~RAM_MIGRATABLE;
 }
 
+int qemu_ram_get_fd(RAMBlock *rb)
+{
+    return rb->fd;
+}
+
 /* Called with iothread lock held. */
 void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
 {
-- 
2.37.3

From nobody Sun May 12 14:33:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1665688275; cv=none; d=zohomail.com; s=zohoarc; b=Zz2q0yAqFNQJQl9VBd5Y2hi6HbajrEvcfqSEnkUtyOkCIL2wZn9TU6YuybjavBqmgLJUbentmFqNnq0htPC6XeFnPfVlxXRpogZhobC2eHKhC0qPbbKSYm8BGdbfbQI+ateJle/ONZ7K9Jj/SvDa0fRnu9c0UHAIR4aMPBkzlhg= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1665688275; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=LO1Yk9Zyzrv9MQU42UWrwS5AGoOuZeWvHnsE7ewd4ko=; b=kGinncfBVxhxadGQkaAcVQ6yPvmmkPXPGlhykLYglbHkCqu4fyaVHGFNc+oCe0bqqRPxAWrFCCBInNJyL9y/wa8WoBo8uQG0zVqlJwSbyJ1XqtVjhSNfT+G3G2eTTIFvc1WBnou0Ltt1p5Bsz4rN7q9C203mnJML+EtHkeR9enE= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1665688275591251.34658517119112; Thu, 13 Oct 2022 12:11:15 -0700 (PDT) Received: from localhost ([::1]:38616 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oj3bs-0007RP-Jz for importer@patchew.org; Thu, 13 Oct 2022 15:11:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40406) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3R0-0006ev-J6
for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:58 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:54728) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3Qx-00048h-KE for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:58 -0400 Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-649-JGyezIxpMXmQSgKgyZWChg-1; Thu, 13 Oct 2022 14:59:51 -0400 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5CED7382C967; Thu, 13 Oct 2022 18:59:50 +0000 (UTC) Received: from localhost (unknown [10.39.194.71]) by smtp.corp.redhat.com (Postfix) with ESMTP id B74D01410F38; Thu, 13 Oct 2022 18:59:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665687594; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LO1Yk9Zyzrv9MQU42UWrwS5AGoOuZeWvHnsE7ewd4ko=; b=Y8dHOr5gJQTwSRRPSKLQDU/uAu5+v1I8OSHg4rwZolhNfzEwLyq/C6OHgZFDSh78uQNq/x tgslrbunxwGHz9V3FSuNphf/e2F6VsQmN5DNc4j2FO7z0eBVxHiPbFQ2UnI8blE/ccrxnJ MQpF/0G1V9FllY4Lxllb+cxBKxRz2Uk= X-MC-Unique: JGyezIxpMXmQSgKgyZWChg-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Fam Zheng , Markus Armbruster , Stefan Hajnoczi , Laurent Vivier , Marcel Apfelbaum , "Michael S. 
Tsirkin" , Wen Congyang , Kevin Wolf , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , Richard Henderson , David Hildenbrand , Hanna Reitz , Xie Changlong , Eduardo Habkost , qemu-block@nongnu.org, Eric Blake , John Snow , afaria@redhat.com, Jeff Cody , Yanan Wang , Paolo Bonzini , Raphael Norwitz , sgarzare@redhat.com, integration@gluster.org, Peter Xu , "Richard W.M. Jones" , Thomas Huth , Vladimir Sementsov-Ogievskiy , "Denis V. Lunev" Subject: [PATCH v7 11/13] stubs: add qemu_ram_block_from_host() and qemu_ram_get_fd() Date: Thu, 13 Oct 2022 14:59:06 -0400 Message-Id: <20221013185908.1297568-12-stefanha@redhat.com> In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com> References: <20221013185908.1297568-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1665688276522100001 Content-Type: text/plain; charset="utf-8" The blkio block driver will need to look up the file descriptor for a given pointer. 
This is possible in softmmu builds where the RAMBlock API is available for querying guest RAM. Add stubs so tools like qemu-img that link the block layer still build successfully. In this case there is no guest RAM but that is fine. Bounce buffers and their file descriptors will be allocated with libblkio's blkio_alloc_mem_region() so we won't rely on QEMU's qemu_ram_get_fd() in that case. Signed-off-by: Stefan Hajnoczi --- stubs/physmem.c | 13 +++++++++++++ stubs/meson.build | 1 + 2 files changed, 14 insertions(+) create mode 100644 stubs/physmem.c diff --git a/stubs/physmem.c b/stubs/physmem.c new file mode 100644 index 0000000000..1fc5f2df29 --- /dev/null +++ b/stubs/physmem.c @@ -0,0 +1,13 @@ +#include "qemu/osdep.h" +#include "exec/cpu-common.h" + +RAMBlock *qemu_ram_block_from_host(void *ptr, bool round_offset, + ram_addr_t *offset) +{ + return NULL; +} + +int qemu_ram_get_fd(RAMBlock *rb) +{ + return -1; +} diff --git a/stubs/meson.build b/stubs/meson.build index d8f3fd5c44..4314161f5f 100644 --- a/stubs/meson.build +++ b/stubs/meson.build @@ -29,6 +29,7 @@ stub_ss.add(files('migr-blocker.c')) stub_ss.add(files('module-opts.c')) stub_ss.add(files('monitor.c')) stub_ss.add(files('monitor-core.c')) +stub_ss.add(files('physmem.c')) stub_ss.add(files('qemu-timer-notify-cb.c')) stub_ss.add(files('qmp_memory_device.c')) stub_ss.add(files('qmp-command-available.c')) --=20 2.37.3 From nobody Sun May 12 14:33:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1665688477; cv=none; d=zohomail.com; s=zohoarc; b=b1ZymFmgJHRG0vMpxNJ6OLzTwr3jSN3gwfDljLGNzdJQKHGw9VGVgXVHRoHFFGLRPzysK+iaLhKLFULA5Zip7Fz2npxcuwD8Rxy+lcg+LIQ5xavivtbb3fYflS3NF1SvhOXeL8u1yrmmbvx/hipWVCE2R6X6EyIfxrXChkaJklQ= 
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1665688477; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=xEdWwvxeK0o+vyxh5TkWNs/E97vysjDvJ35MWYC1VUY=; b=hxMYDtPXoIIjjjZ4sg8SaDgJpmyPSp6PBPp2xBHmTLUJOBwvyvQrEOnfoDmHd0UGZMGlDuK6arjKDVAaaSbqF01tZyOeR60Bdya9nLk9kG4SLXmnXX1WJgc7j5RPWL8tyN3sl0llYu15gBLlaVaZKL6UBgwxaHGmialSiPP3gqc= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1665688477153928.1089371714625; Thu, 13 Oct 2022 12:14:37 -0700 (PDT) Received: from localhost ([::1]:40780 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oj3f9-0003T8-Sz for importer@patchew.org; Thu, 13 Oct 2022 15:14:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:40412) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3R2-0006gK-2t for qemu-devel@nongnu.org; Thu, 13 Oct 2022 15:00:00 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:57728) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3Qz-0004AH-Lq for qemu-devel@nongnu.org; Thu, 13 Oct 2022 14:59:59 -0400 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-157-2nbA9e_PPD-dbRWimBOigg-1; Thu, 13 Oct 2022 14:59:54 -0400 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using 
TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 17CF2185A78B; Thu, 13 Oct 2022 18:59:53 +0000 (UTC) Received: from localhost (unknown [10.39.194.71]) by smtp.corp.redhat.com (Postfix) with ESMTP id 85F454B3FCE; Thu, 13 Oct 2022 18:59:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665687597; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xEdWwvxeK0o+vyxh5TkWNs/E97vysjDvJ35MWYC1VUY=; b=QGylJjBewHVa5r0yMYQQyItwkBjxCAJ1qWnO51k0fXNPd514MLxQMI4dWEO5QgmDxs/F3+ Ko5m+YD1MKXSUnm8dCGVUDFrMQv32fFpN+rAMd1+Bbc8mYfD+2GTEr7BK/KI0allVEHMI3 y60IaCxJ9OsHCoAJgLbGBmUQQ3snNnE= X-MC-Unique: 2nbA9e_PPD-dbRWimBOigg-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Fam Zheng , Markus Armbruster , Stefan Hajnoczi , Laurent Vivier , Marcel Apfelbaum , "Michael S. Tsirkin" , Wen Congyang , Kevin Wolf , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , Richard Henderson , David Hildenbrand , Hanna Reitz , Xie Changlong , Eduardo Habkost , qemu-block@nongnu.org, Eric Blake , John Snow , afaria@redhat.com, Jeff Cody , Yanan Wang , Paolo Bonzini , Raphael Norwitz , sgarzare@redhat.com, integration@gluster.org, Peter Xu , "Richard W.M. Jones" , Thomas Huth , Vladimir Sementsov-Ogievskiy , "Denis V. 
Lunev" Subject: [PATCH v7 12/13] blkio: implement BDRV_REQ_REGISTERED_BUF optimization Date: Thu, 13 Oct 2022 14:59:07 -0400 Message-Id: <20221013185908.1297568-13-stefanha@redhat.com> In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com> References: <20221013185908.1297568-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1665688478126100001 Content-Type: text/plain; charset="utf-8" Avoid bounce buffers when QEMUIOVector elements are within previously registered bdrv_register_buf() buffers. The idea is that emulated storage controllers will register guest RAM using bdrv_register_buf() and set the BDRV_REQ_REGISTERED_BUF on I/O requests. Therefore no blkio_map_mem_region() calls are necessary in the performance-critical I/O code path. This optimization doesn't apply if the I/O buffer is internally allocated by QEMU (e.g. qcow2 metadata). There we still take the slow path because BDRV_REQ_REGISTERED_BUF is not set. 
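As a rough illustration of the fast-path decision described above (not part of the patch itself), the bounce-buffer check can be modeled in isolation. The sketch below is plain C under stated assumptions: EX_REQ_REGISTERED_BUF, ExBlkioState and ex_use_bounce_buffer() are hypothetical stand-ins for BDRV_REQ_REGISTERED_BUF, BDRVBlkioState and the expression used in blkio_co_preadv()/blkio_co_pwritev(); the flag value is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for QEMU's BDRV_REQ_REGISTERED_BUF. The real flag
 * and its value are defined in QEMU's block layer headers.
 */
#define EX_REQ_REGISTERED_BUF (1u << 0)

/* Hypothetical reduced driver state: only the field the decision reads. */
typedef struct {
    bool needs_mem_regions; /* driver requires buffers to be mapped first */
} ExBlkioState;

/*
 * Mirrors the expression in the patch: a bounce buffer is only needed when
 * the driver requires mapped memory regions and the caller did not promise
 * that the I/O buffer lies within a previously registered buffer.
 */
static bool ex_use_bounce_buffer(const ExBlkioState *s, uint32_t flags)
{
    assert(s != NULL);
    return s->needs_mem_regions && !(flags & EX_REQ_REGISTERED_BUF);
}
```

With needs_mem_regions set, a request without the flag takes the bounce-buffer path and a request with the flag skips it; when needs_mem_regions is clear, no bounce buffer is used either way.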
Signed-off-by: Stefan Hajnoczi Reviewed-by: Stefano Garzarella --- block/blkio.c | 183 +++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 180 insertions(+), 3 deletions(-) diff --git a/block/blkio.c b/block/blkio.c index b0cfd74b36..82f26eedd2 100644 --- a/block/blkio.c +++ b/block/blkio.c @@ -11,9 +11,13 @@ #include "qemu/osdep.h" #include <blkio.h> #include "block/block_int.h" +#include "exec/memory.h" +#include "exec/cpu-common.h" /* for qemu_ram_get_fd() */ #include "qapi/error.h" +#include "qemu/error-report.h" #include "qapi/qmp/qdict.h" #include "qemu/module.h" +#include "exec/memory.h" /* for ram_block_discard_disable() */ /* * Keep the QEMU BlockDriver names identical to the libblkio driver names. @@ -73,6 +77,12 @@ typedef struct { /* Can we skip adding/deleting blkio_mem_regions? */ bool needs_mem_regions; + + /* Are file descriptors necessary for blkio_mem_regions? */ + bool needs_mem_region_fd; + + /* Are madvise(MADV_DONTNEED)-style operations unavailable? */ + bool may_pin_mem_regions; } BDRVBlkioState; /* Called with s->bounce_lock held */ @@ -347,7 +357,8 @@ blkio_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, .coroutine = qemu_coroutine_self(), }; BDRVBlkioState *s = bs->opaque; - bool use_bounce_buffer = s->needs_mem_regions; + bool use_bounce_buffer = + s->needs_mem_regions && !(flags & BDRV_REQ_REGISTERED_BUF); BlkioBounceBuf bounce; struct iovec *iov = qiov->iov; int iovcnt = qiov->niov; @@ -390,7 +401,8 @@ static int coroutine_fn blkio_co_pwritev(BlockDriverState *bs, int64_t offset, .coroutine = qemu_coroutine_self(), }; BDRVBlkioState *s = bs->opaque; - bool use_bounce_buffer = s->needs_mem_regions; + bool use_bounce_buffer = + s->needs_mem_regions && !(flags & BDRV_REQ_REGISTERED_BUF); BlkioBounceBuf bounce; struct iovec *iov = qiov->iov; int iovcnt = qiov->niov; @@ -473,6 +485,130 @@ static void blkio_io_unplug(BlockDriverState *bs) } } +typedef enum { + BMRR_OK, +
BMRR_SKIP, + BMRR_FAIL, +} BlkioMemRegionResult; + +/* + * Produce a struct blkio_mem_region for a given address and size. + * + * This function produces identical results when called multiple times with the + * same arguments. This property is necessary because blkio_unmap_mem_region() + * must receive the same struct blkio_mem_region field values that were passed + * to blkio_map_mem_region(). + */ +static BlkioMemRegionResult +blkio_mem_region_from_host(BlockDriverState *bs, + void *host, size_t size, + struct blkio_mem_region *region, + Error **errp) +{ + BDRVBlkioState *s = bs->opaque; + int fd = -1; + ram_addr_t fd_offset = 0; + + if (((uintptr_t)host | size) % s->mem_region_alignment) { + error_setg(errp, "unaligned buf %p with size %zu", host, size); + return BMRR_FAIL; + } + + /* Attempt to find the fd for the underlying memory */ + if (s->needs_mem_region_fd) { + RAMBlock *ram_block; + RAMBlock *end_block; + ram_addr_t offset; + + /* + * bdrv_register_buf() is called with the BQL held so mr lives at least + * until this function returns. + */ + ram_block = qemu_ram_block_from_host(host, false, &fd_offset); + if (ram_block) { + fd = qemu_ram_get_fd(ram_block); + } + if (fd == -1) { + /* + * Ideally every RAMBlock would have an fd. pc-bios and other + * things don't. Luckily they are usually not I/O buffers and we + * can just ignore them.
+ */ + return BMRR_SKIP; + } + + /* Make sure the fd covers the entire range */ + end_block = qemu_ram_block_from_host(host + size - 1, false, &offset); + if (ram_block != end_block) { + error_setg(errp, "registered buffer at %p with size %zu extends " + "beyond RAMBlock", host, size); + return BMRR_FAIL; + } + } + + *region = (struct blkio_mem_region){ + .addr = host, + .len = size, + .fd = fd, + .fd_offset = fd_offset, + }; + return BMRR_OK; +} + +static bool blkio_register_buf(BlockDriverState *bs, void *host, size_t size, + Error **errp) +{ + BDRVBlkioState *s = bs->opaque; + struct blkio_mem_region region; + BlkioMemRegionResult region_result; + int ret; + + /* + * Mapping memory regions conflicts with RAM discard (virtio-mem) when + * there is pinning, so only do it when necessary. + */ + if (!s->needs_mem_regions && s->may_pin_mem_regions) { + return true; + } + + region_result = blkio_mem_region_from_host(bs, host, size, &region, errp); + if (region_result == BMRR_SKIP) { + return true; + } else if (region_result != BMRR_OK) { + return false; + } + + WITH_QEMU_LOCK_GUARD(&s->blkio_lock) { + ret = blkio_map_mem_region(s->blkio, &region); + } + + if (ret < 0) { + error_setg(errp, "Failed to add blkio mem region %p with size %zu: %s", + host, size, blkio_get_error_msg()); + return false; + } + return true; +} + +static void blkio_unregister_buf(BlockDriverState *bs, void *host, size_t size) +{ + BDRVBlkioState *s = bs->opaque; + struct blkio_mem_region region; + + /* See blkio_register_buf() */ + if (!s->needs_mem_regions && s->may_pin_mem_regions) { + return; + } + + if (blkio_mem_region_from_host(bs, host, size, &region, NULL) != BMRR_OK) { + return; + } + + WITH_QEMU_LOCK_GUARD(&s->blkio_lock) { + blkio_unmap_mem_region(s->blkio, &region); + } +} + static int blkio_io_uring_open(BlockDriverState *bs, QDict *options, int flags, Error **errp) { @@ -609,6 +745,17 @@ static int blkio_file_open(BlockDriverState *bs, QDict
*options, int flags, return ret; } + ret = blkio_get_bool(s->blkio, + "needs-mem-region-fd", + &s->needs_mem_region_fd); + if (ret < 0) { + error_setg_errno(errp, -ret, + "failed to get needs-mem-region-fd: %s", + blkio_get_error_msg()); + blkio_destroy(&s->blkio); + return ret; + } + ret = blkio_get_uint64(s->blkio, "mem-region-alignment", &s->mem_region_alignment); @@ -620,15 +767,39 @@ static int blkio_file_open(BlockDriverState *bs, QDict *options, int flags, return ret; } + ret = blkio_get_bool(s->blkio, + "may-pin-mem-regions", + &s->may_pin_mem_regions); + if (ret < 0) { + /* Be conservative (assume pinning) if the property is not supported */ + s->may_pin_mem_regions = s->needs_mem_regions; + } + + /* + * Notify if libblkio drivers pin memory and prevent features like + * virtio-mem from working. + */ + if (s->may_pin_mem_regions) { + ret = ram_block_discard_disable(true); + if (ret < 0) { + error_setg_errno(errp, -ret, "ram_block_discard_disable() failed"); + blkio_destroy(&s->blkio); + return ret; + } + } + ret = blkio_start(s->blkio); if (ret < 0) { error_setg_errno(errp, -ret, "blkio_start failed: %s", blkio_get_error_msg()); blkio_destroy(&s->blkio); + if (s->may_pin_mem_regions) { + ram_block_discard_disable(false); + } return ret; } - bs->supported_write_flags = BDRV_REQ_FUA; + bs->supported_write_flags = BDRV_REQ_FUA | BDRV_REQ_REGISTERED_BUF; bs->supported_zero_flags = BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK; @@ -652,6 +823,10 @@ static void blkio_close(BlockDriverState *bs) qemu_mutex_destroy(&s->blkio_lock); blkio_detach_aio_context(bs); blkio_destroy(&s->blkio); + + if (s->may_pin_mem_regions) { + ram_block_discard_disable(false); + } } static int64_t blkio_getlength(BlockDriverState *bs) @@ -799,6 +974,8 @@ static void blkio_refresh_limits(BlockDriverState *bs, Error **errp) .bdrv_co_pwrite_zeroes = blkio_co_pwrite_zeroes, \ .bdrv_io_unplug = blkio_io_unplug, \
.bdrv_refresh_limits = blkio_refresh_limits, \ + .bdrv_register_buf = blkio_register_buf, \ + .bdrv_unregister_buf = blkio_unregister_buf, \ __VA_ARGS__ \ } -- 2.37.3 From nobody Sun May 12 14:33:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1665688745; cv=none; d=zohomail.com; s=zohoarc; b=k2S/Sc8Lc6NkhA6KzxBBWZvUxMf87eeE+xWvmthKYkPPBUPpmCGkdhR+xwNmyq8GGJpK6ODIPDHjEmdoS8gvN7S3LuKpwSrUXRA4rmiTH6W9//U7Ww5LjqUW+l3yCCTx8AICtIqA36gWK8Ord4mM9ZjaWa+/Bg1onXC+hcbbg5U= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1665688745; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=mZ66uAehH80bMDrqLMDdESnTQc3CKm9Lg541YpF/pnE=; b=OrXKKmyK2Zw8CF/DCgHvbgGo1xwobkvveplhG875ZUdKUw+Lt+DLvxqMcNNA0fqaqHck3glTlZNnTYJClBxXY2dVpL9tNDybMPPB0Np0bknJ7VsO2rMfefa1v0qFGa+N8+Ml+G5pvjJN657CMg+kcCIyLQ+aa5I+yhYsu0yGmnI= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1665688745026414.6585992769009; Thu, 13 Oct 2022 12:19:05 -0700 (PDT) Received: from localhost ([::1]:60228 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oj3jT-0007pz-Fv for importer@patchew.org; Thu, 13 Oct 2022 15:19:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:34132) by lists.gnu.org with esmtps
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3R8-0006iw-1N for qemu-devel@nongnu.org; Thu, 13 Oct 2022 15:00:06 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:38717) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oj3R4-0004BD-HN for qemu-devel@nongnu.org; Thu, 13 Oct 2022 15:00:04 -0400 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-21-ZaarLV7SOqaJqDR4UWN2iA-1; Thu, 13 Oct 2022 14:59:56 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5CF4C800B30; Thu, 13 Oct 2022 18:59:55 +0000 (UTC) Received: from localhost (unknown [10.39.194.71]) by smtp.corp.redhat.com (Postfix) with ESMTP id 90F682166BA3; Thu, 13 Oct 2022 18:59:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665687600; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mZ66uAehH80bMDrqLMDdESnTQc3CKm9Lg541YpF/pnE=; b=ZxYSZ3obySrA3nELRP4zrk0CwLbCOAj1fiaczH6BJhB8jpLnmuu5LeItG+IFCYaWG+l6UD EdEyxSrbk3akvP7zK0QKv6VAkFxbOTH0IZSndw2G2FXFzVUVkKkPh4ZSPvZKL/sz5RBGAr bv3bOV76NomLEm8l9e7dyrao6WdAY9U= X-MC-Unique: ZaarLV7SOqaJqDR4UWN2iA-1 From: Stefan Hajnoczi To: qemu-devel@nongnu.org Cc: Fam Zheng , Markus Armbruster , Stefan Hajnoczi , Laurent Vivier , Marcel Apfelbaum , "Michael S. 
Tsirkin" , Wen Congyang , Kevin Wolf , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , Richard Henderson , David Hildenbrand , Hanna Reitz , Xie Changlong , Eduardo Habkost , qemu-block@nongnu.org, Eric Blake , John Snow , afaria@redhat.com, Jeff Cody , Yanan Wang , Paolo Bonzini , Raphael Norwitz , sgarzare@redhat.com, integration@gluster.org, Peter Xu , "Richard W.M. Jones" , Thomas Huth , Vladimir Sementsov-Ogievskiy , "Denis V. Lunev" Subject: [PATCH v7 13/13] virtio-blk: use BDRV_REQ_REGISTERED_BUF optimization hint Date: Thu, 13 Oct 2022 14:59:08 -0400 Message-Id: <20221013185908.1297568-14-stefanha@redhat.com> In-Reply-To: <20221013185908.1297568-1-stefanha@redhat.com> References: <20221013185908.1297568-1-stefanha@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1665688745676100003 Content-Type: text/plain; charset="utf-8" Register guest RAM using BlockRAMRegistrar and set the BDRV_REQ_REGISTERED_BUF flag so block drivers can optimize memory 
accesses in I/O requests. This is for vdpa-blk, vhost-user-blk, and other I/O interfaces that rely on DMA mapping/unmapping. Signed-off-by: Stefan Hajnoczi Reviewed-by: Stefano Garzarella --- include/hw/virtio/virtio-blk.h | 2 ++ hw/block/virtio-blk.c | 39 ++++++++++++++++++++++------------ 2 files changed, 27 insertions(+), 14 deletions(-) diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h index d311c57cca..7f589b4146 100644 --- a/include/hw/virtio/virtio-blk.h +++ b/include/hw/virtio/virtio-blk.h @@ -19,6 +19,7 @@ #include "hw/block/block.h" #include "sysemu/iothread.h" #include "sysemu/block-backend.h" +#include "sysemu/block-ram-registrar.h" #include "qom/object.h" #define TYPE_VIRTIO_BLK "virtio-blk-device" @@ -64,6 +65,7 @@ struct VirtIOBlock { struct VirtIOBlockDataPlane *dataplane; uint64_t host_features; size_t config_size; + BlockRAMRegistrar blk_ram_registrar; }; typedef struct VirtIOBlockReq { diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index 8131ec2dbc..f717550fdc 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -21,6 +21,7 @@ #include "hw/block/block.h" #include "hw/qdev-properties.h" #include "sysemu/blockdev.h" +#include "sysemu/block-ram-registrar.h" #include "sysemu/sysemu.h" #include "sysemu/runstate.h" #include "hw/virtio/virtio-blk.h" @@ -362,12 +363,14 @@ static void virtio_blk_handle_scsi(VirtIOBlockReq *req) } } -static inline void submit_requests(BlockBackend *blk, MultiReqBuffer *mrb, +static inline void submit_requests(VirtIOBlock *s, MultiReqBuffer *mrb, int start, int num_reqs, int niov) { + BlockBackend *blk = s->blk; QEMUIOVector *qiov = &mrb->reqs[start]->qiov; int64_t sector_num = mrb->reqs[start]->sector_num; bool is_write = mrb->is_write; + BdrvRequestFlags flags = 0; if (num_reqs > 1) { int i; @@ -398,12 +401,18 @@ static inline void submit_requests(BlockBackend *blk, MultiReqBuffer *mrb, num_reqs - 1); } + if
(blk_ram_registrar_ok(&s->blk_ram_registrar)) { + flags |= BDRV_REQ_REGISTERED_BUF; + } + if (is_write) { - blk_aio_pwritev(blk, sector_num << BDRV_SECTOR_BITS, qiov, 0, - virtio_blk_rw_complete, mrb->reqs[start]); + blk_aio_pwritev(blk, sector_num << BDRV_SECTOR_BITS, qiov, + flags, virtio_blk_rw_complete, + mrb->reqs[start]); } else { - blk_aio_preadv(blk, sector_num << BDRV_SECTOR_BITS, qiov, 0, - virtio_blk_rw_complete, mrb->reqs[start]); + blk_aio_preadv(blk, sector_num << BDRV_SECTOR_BITS, qiov, + flags, virtio_blk_rw_complete, + mrb->reqs[start]); } } @@ -425,14 +434,14 @@ static int multireq_compare(const void *a, const void *b) } } -static void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb) +static void virtio_blk_submit_multireq(VirtIOBlock *s, MultiReqBuffer *mrb) { int i = 0, start = 0, num_reqs = 0, niov = 0, nb_sectors = 0; uint32_t max_transfer; int64_t sector_num = 0; if (mrb->num_reqs == 1) { - submit_requests(blk, mrb, 0, 1, -1); + submit_requests(s, mrb, 0, 1, -1); mrb->num_reqs = 0; return; } @@ -452,11 +461,11 @@ static void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb) * 3.
merge would exceed maximum transfer length of backend device */ if (sector_num + nb_sectors != req->sector_num || - niov > blk_get_max_iov(blk) - req->qiov.niov || + niov > blk_get_max_iov(s->blk) - req->qiov.niov || req->qiov.size > max_transfer || nb_sectors > (max_transfer - req->qiov.size) / BDRV_SECTOR_SIZE) { - submit_requests(blk, mrb, start, num_reqs, niov); + submit_requests(s, mrb, start, num_reqs, niov); num_reqs = 0; } } @@ -472,7 +481,7 @@ static void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb) num_reqs++; } - submit_requests(blk, mrb, start, num_reqs, niov); + submit_requests(s, mrb, start, num_reqs, niov); mrb->num_reqs = 0; } @@ -487,7 +496,7 @@ static void virtio_blk_handle_flush(VirtIOBlockReq *req, MultiReqBuffer *mrb) * Make sure all outstanding writes are posted to the backing device. */ if (mrb->is_write && mrb->num_reqs > 0) { - virtio_blk_submit_multireq(s->blk, mrb); + virtio_blk_submit_multireq(s, mrb); } blk_aio_flush(s->blk, virtio_blk_flush_complete, req); } @@ -667,7 +676,7 @@ static int virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb) if (mrb->num_reqs > 0 && (mrb->num_reqs == VIRTIO_BLK_MAX_MERGE_REQS || is_write != mrb->is_write || !s->conf.request_merging)) { - virtio_blk_submit_multireq(s->blk, mrb); + virtio_blk_submit_multireq(s, mrb); } assert(mrb->num_reqs < VIRTIO_BLK_MAX_MERGE_REQS); @@ -774,7 +783,7 @@ void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq) } while (!virtio_queue_empty(vq)); if (mrb.num_reqs) { - virtio_blk_submit_multireq(s->blk, &mrb); + virtio_blk_submit_multireq(s, &mrb); } blk_io_unplug(s->blk); @@ -823,7 +832,7 @@ void virtio_blk_process_queued_requests(VirtIOBlock *s, bool is_bh) } if (mrb.num_reqs) { - virtio_blk_submit_multireq(s->blk, &mrb); + virtio_blk_submit_multireq(s, &mrb); } if (is_bh) { blk_dec_in_flight(s->conf.conf.blk); @@ -1205,6 +1214,7 @@ static void virtio_blk_device_realize(DeviceState *dev,
Error **errp) } s->change = qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s); + blk_ram_registrar_init(&s->blk_ram_registrar, s->blk); blk_set_dev_ops(s->blk, &virtio_block_ops, s); blk_iostatus_enable(s->blk); @@ -1230,6 +1240,7 @@ static void virtio_blk_device_unrealize(DeviceState *dev) virtio_del_queue(vdev, i); } qemu_coroutine_dec_pool_size(conf->num_queues * conf->queue_size / 2); + blk_ram_registrar_destroy(&s->blk_ram_registrar); qemu_del_vm_change_state_handler(s->change); blockdev_mark_auto_del(s->blk); virtio_cleanup(vdev); -- 2.37.3