From: Stefan Hajnoczi
To: qemu-devel@nongnu.org
Cc: Aarushi Mehta, "Michael S. Tsirkin", Stefano Garzarella, Julia Suvorova, Paolo Bonzini, Philippe Mathieu-Daudé, Stefano Stabellini, Paul Durrant, Hanna Reitz, Kevin Wolf, Fam Zheng, Stefan Hajnoczi, xen-devel@lists.xenproject.org, eblake@redhat.com, Anthony Perard, qemu-block@nongnu.org
Subject: [PATCH v2 1/6] block: add blk_io_plug_call() API
Date: Tue, 23 May 2023 13:12:55 -0400
Message-Id: <20230523171300.132347-2-stefanha@redhat.com>
In-Reply-To: <20230523171300.132347-1-stefanha@redhat.com>
References: <20230523171300.132347-1-stefanha@redhat.com>

Introduce a new API for thread-local blk_io_plug() that does not
traverse the block graph. The goal is to make blk_io_plug()
multi-queue friendly.

Instead of having block drivers track whether or not we're in a
plugged section, provide an API that allows them to defer a function
call until we're unplugged: blk_io_plug_call(fn, opaque). If
blk_io_plug_call() is called multiple times with the same fn/opaque
pair, then fn() is only called once at the end of the plugged
section, resulting in batching.

This patch introduces the API and changes blk_io_plug()/
blk_io_unplug(). They no longer take a BlockBackend argument because
the plug state is now thread-local.

Later patches convert block drivers to blk_io_plug_call(). Once all
block drivers have been converted, .bdrv_co_io_plug() can finally be
removed.
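As a rough illustration (not part of this patch), here is a minimal
sketch of how a block driver's request-handling loop could use the new
API. MyQueue and my_queue_submit() are hypothetical stand-ins for a
driver's real submission state and submit/doorbell function:

    /* Hypothetical driver state; stands in for e.g. an io_uring context */
    typedef struct {
        int fd;
    } MyQueue;

    /* Deferred callback: submit all requests queued so far in one go */
    static void my_queue_submit(void *opaque)
    {
        MyQueue *q = opaque;
        /* ... one syscall submitting everything queued on q->fd ... */
    }

    static void my_driver_handle_requests(MyQueue *q, int nreqs)
    {
        blk_io_plug(); /* start of plugged region; thread-local, may nest */

        for (int i = 0; i < nreqs; i++) {
            /* queue request i on q, then defer the actual submission */
            blk_io_plug_call(my_queue_submit, q);
        }

        /*
         * End of plugged region: my_queue_submit(q) runs exactly once
         * here even though it was deferred nreqs times above.
         */
        blk_io_unplug();
    }

Note that blk_io_plug_call() invoked outside any plugged section calls
fn(opaque) immediately, so code paths that never plug keep working
unchanged.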
Signed-off-by: Stefan Hajnoczi
Reviewed-by: Eric Blake
Reviewed-by: Stefano Garzarella
---
v2:
- "is not be freed" -> "is not freed" [Eric]
---
 MAINTAINERS                       |   1 +
 include/sysemu/block-backend-io.h |  13 +--
 block/block-backend.c             |  22 -----
 block/plug.c                      | 159 ++++++++++++++++++++++++++++++
 hw/block/dataplane/xen-block.c    |   8 +-
 hw/block/virtio-blk.c             |   4 +-
 hw/scsi/virtio-scsi.c             |   6 +-
 block/meson.build                 |   1 +
 8 files changed, 173 insertions(+), 41 deletions(-)
 create mode 100644 block/plug.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 1b6466496d..2be6f0c26b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2646,6 +2646,7 @@ F: util/aio-*.c
 F: util/aio-*.h
 F: util/fdmon-*.c
 F: block/io.c
+F: block/plug.c
 F: migration/block*
 F: include/block/aio.h
 F: include/block/aio-wait.h
diff --git a/include/sysemu/block-backend-io.h b/include/sysemu/block-backend-io.h
index d62a7ee773..be4dcef59d 100644
--- a/include/sysemu/block-backend-io.h
+++ b/include/sysemu/block-backend-io.h
@@ -100,16 +100,9 @@ void blk_iostatus_set_err(BlockBackend *blk, int error);
 int blk_get_max_iov(BlockBackend *blk);
 int blk_get_max_hw_iov(BlockBackend *blk);
 
-/*
- * blk_io_plug/unplug are thread-local operations. This means that multiple
- * IOThreads can simultaneously call plug/unplug, but the caller must ensure
- * that each unplug() is called in the same IOThread of the matching plug().
- */
-void coroutine_fn blk_co_io_plug(BlockBackend *blk);
-void co_wrapper blk_io_plug(BlockBackend *blk);
-
-void coroutine_fn blk_co_io_unplug(BlockBackend *blk);
-void co_wrapper blk_io_unplug(BlockBackend *blk);
+void blk_io_plug(void);
+void blk_io_unplug(void);
+void blk_io_plug_call(void (*fn)(void *), void *opaque);
 
 AioContext *blk_get_aio_context(BlockBackend *blk);
 BlockAcctStats *blk_get_stats(BlockBackend *blk);
diff --git a/block/block-backend.c b/block/block-backend.c
index ca537cd0ad..1f1d226ba6 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2568,28 +2568,6 @@ void blk_add_insert_bs_notifier(BlockBackend *blk, Notifier *notify)
     notifier_list_add(&blk->insert_bs_notifiers, notify);
 }
 
-void coroutine_fn blk_co_io_plug(BlockBackend *blk)
-{
-    BlockDriverState *bs = blk_bs(blk);
-    IO_CODE();
-    GRAPH_RDLOCK_GUARD();
-
-    if (bs) {
-        bdrv_co_io_plug(bs);
-    }
-}
-
-void coroutine_fn blk_co_io_unplug(BlockBackend *blk)
-{
-    BlockDriverState *bs = blk_bs(blk);
-    IO_CODE();
-    GRAPH_RDLOCK_GUARD();
-
-    if (bs) {
-        bdrv_co_io_unplug(bs);
-    }
-}
-
 BlockAcctStats *blk_get_stats(BlockBackend *blk)
 {
     IO_CODE();
diff --git a/block/plug.c b/block/plug.c
new file mode 100644
index 0000000000..98a155d2f4
--- /dev/null
+++ b/block/plug.c
@@ -0,0 +1,159 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Block I/O plugging
+ *
+ * Copyright Red Hat.
+ *
+ * This API defers a function call within a blk_io_plug()/blk_io_unplug()
+ * section, allowing multiple calls to batch up. This is a performance
+ * optimization that is used in the block layer to submit several I/O requests
+ * at once instead of individually:
+ *
+ *   blk_io_plug(); <-- start of plugged region
+ *   ...
+ *   blk_io_plug_call(my_func, my_obj); <-- deferred my_func(my_obj) call
+ *   blk_io_plug_call(my_func, my_obj); <-- another
+ *   blk_io_plug_call(my_func, my_obj); <-- another
+ *   ...
+ *   blk_io_unplug(); <-- end of plugged region, my_func(my_obj) is called once
+ *
+ * This code is actually generic and not tied to the block layer. If another
+ * subsystem needs this functionality, it could be renamed.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/coroutine-tls.h"
+#include "qemu/notify.h"
+#include "qemu/thread.h"
+#include "sysemu/block-backend.h"
+
+/* A function call that has been deferred until unplug() */
+typedef struct {
+    void (*fn)(void *);
+    void *opaque;
+} UnplugFn;
+
+/* Per-thread state */
+typedef struct {
+    unsigned count;       /* how many times has plug() been called? */
+    GArray *unplug_fns;   /* functions to call at unplug time */
+} Plug;
+
+/* Use get_ptr_plug() to fetch this thread-local value */
+QEMU_DEFINE_STATIC_CO_TLS(Plug, plug);
+
+/* Called at thread cleanup time */
+static void blk_io_plug_atexit(Notifier *n, void *value)
+{
+    Plug *plug = get_ptr_plug();
+    g_array_free(plug->unplug_fns, TRUE);
+}
+
+/* This won't involve coroutines, so use __thread */
+static __thread Notifier blk_io_plug_atexit_notifier;
+
+/**
+ * blk_io_plug_call:
+ * @fn: a function pointer to be invoked
+ * @opaque: a user-defined argument to @fn()
+ *
+ * Call @fn(@opaque) immediately if not within a blk_io_plug()/blk_io_unplug()
+ * section.
+ *
+ * Otherwise defer the call until the end of the outermost
+ * blk_io_plug()/blk_io_unplug() section in this thread. If the same
+ * @fn/@opaque pair has already been deferred, it will only be called once upon
+ * blk_io_unplug() so that accumulated calls are batched into a single call.
+ *
+ * The caller must ensure that @opaque is not freed before @fn() is invoked.
+ */
+void blk_io_plug_call(void (*fn)(void *), void *opaque)
+{
+    Plug *plug = get_ptr_plug();
+
+    /* Call immediately if we're not plugged */
+    if (plug->count == 0) {
+        fn(opaque);
+        return;
+    }
+
+    GArray *array = plug->unplug_fns;
+    if (!array) {
+        array = g_array_new(FALSE, FALSE, sizeof(UnplugFn));
+        plug->unplug_fns = array;
+        blk_io_plug_atexit_notifier.notify = blk_io_plug_atexit;
+        qemu_thread_atexit_add(&blk_io_plug_atexit_notifier);
+    }
+
+    UnplugFn *fns = (UnplugFn *)array->data;
+    UnplugFn new_fn = {
+        .fn = fn,
+        .opaque = opaque,
+    };
+
+    /*
+     * There won't be many, so do a linear search. If this becomes a bottleneck
+     * then a binary search (glib 2.62+) or different data structure could be
+     * used.
+     */
+    for (guint i = 0; i < array->len; i++) {
+        if (memcmp(&fns[i], &new_fn, sizeof(new_fn)) == 0) {
+            return; /* already exists */
+        }
+    }
+
+    g_array_append_val(array, new_fn);
+}
+
+/**
+ * blk_io_plug: Defer blk_io_plug_call() functions until blk_io_unplug()
+ *
+ * blk_io_plug/unplug are thread-local operations. This means that multiple
+ * threads can simultaneously call plug/unplug, but the caller must ensure that
+ * each unplug() is called in the same thread of the matching plug().
+ *
+ * Nesting is supported. blk_io_plug_call() functions are only called at the
+ * outermost blk_io_unplug().
+ */
+void blk_io_plug(void)
+{
+    Plug *plug = get_ptr_plug();
+
+    assert(plug->count < UINT32_MAX);
+
+    plug->count++;
+}
+
+/**
+ * blk_io_unplug: Run any pending blk_io_plug_call() functions
+ *
+ * There must have been a matching blk_io_plug() call in the same thread prior
+ * to this blk_io_unplug() call.
+ */
+void blk_io_unplug(void)
+{
+    Plug *plug = get_ptr_plug();
+
+    assert(plug->count > 0);
+
+    if (--plug->count > 0) {
+        return;
+    }
+
+    GArray *array = plug->unplug_fns;
+    if (!array) {
+        return;
+    }
+
+    UnplugFn *fns = (UnplugFn *)array->data;
+
+    for (guint i = 0; i < array->len; i++) {
+        fns[i].fn(fns[i].opaque);
+    }
+
+    /*
+     * This resets the array without freeing memory so that appending is cheap
+     * in the future.
+     */
+    g_array_set_size(array, 0);
+}
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index d8bc39d359..e49c24f63d 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -537,7 +537,7 @@ static bool xen_block_handle_requests(XenBlockDataPlane *dataplane)
      * is below us.
      */
     if (inflight_atstart > IO_PLUG_THRESHOLD) {
-        blk_io_plug(dataplane->blk);
+        blk_io_plug();
     }
     while (rc != rp) {
         /* pull request from ring */
@@ -577,12 +577,12 @@ static bool xen_block_handle_requests(XenBlockDataPlane *dataplane)
 
         if (inflight_atstart > IO_PLUG_THRESHOLD &&
             batched >= inflight_atstart) {
-            blk_io_unplug(dataplane->blk);
+            blk_io_unplug();
         }
         xen_block_do_aio(request);
         if (inflight_atstart > IO_PLUG_THRESHOLD) {
             if (batched >= inflight_atstart) {
-                blk_io_plug(dataplane->blk);
+                blk_io_plug();
                 batched = 0;
             } else {
                 batched++;
@@ -590,7 +590,7 @@ static bool xen_block_handle_requests(XenBlockDataPlane *dataplane)
         }
     }
     if (inflight_atstart > IO_PLUG_THRESHOLD) {
-        blk_io_unplug(dataplane->blk);
+        blk_io_unplug();
     }
 
     return done_something;
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 8f65ea4659..b4286424c1 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1134,7 +1134,7 @@ void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
     bool suppress_notifications = virtio_queue_get_notification(vq);
 
     aio_context_acquire(blk_get_aio_context(s->blk));
-    blk_io_plug(s->blk);
+    blk_io_plug();
 
     do {
         if (suppress_notifications) {
@@ -1158,7 +1158,7 @@ void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
         virtio_blk_submit_multireq(s, &mrb);
     }
 
-    blk_io_unplug(s->blk);
+    blk_io_unplug();
     aio_context_release(blk_get_aio_context(s->blk));
 }
 
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 612c525d9d..534a44ee07 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -799,7 +799,7 @@ static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
         return -ENOBUFS;
     }
     scsi_req_ref(req->sreq);
-    blk_io_plug(d->conf.blk);
+    blk_io_plug();
     object_unref(OBJECT(d));
     return 0;
 }
@@ -810,7 +810,7 @@ static void virtio_scsi_handle_cmd_req_submit(VirtIOSCSI *s, VirtIOSCSIReq *req)
     if (scsi_req_enqueue(sreq)) {
         scsi_req_continue(sreq);
     }
-    blk_io_unplug(sreq->dev->conf.blk);
+    blk_io_unplug();
     scsi_req_unref(sreq);
 }
 
@@ -836,7 +836,7 @@ static void virtio_scsi_handle_cmd_vq(VirtIOSCSI *s, VirtQueue *vq)
     while (!QTAILQ_EMPTY(&reqs)) {
         req = QTAILQ_FIRST(&reqs);
         QTAILQ_REMOVE(&reqs, req, next);
-        blk_io_unplug(req->sreq->dev->conf.blk);
+        blk_io_unplug();
         scsi_req_unref(req->sreq);
         virtqueue_detach_element(req->vq, &req->elem, 0);
         virtio_scsi_free_req(req);
diff --git a/block/meson.build b/block/meson.build
index 486dda8b85..fb4332bd66 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -23,6 +23,7 @@ block_ss.add(files(
   'mirror.c',
   'nbd.c',
   'null.c',
+  'plug.c',
   'qapi.c',
   'qcow2-bitmap.c',
   'qcow2-cache.c',
-- 
2.40.1