From nobody Mon Feb 9 00:30:59 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1668793455; cv=none; d=zohomail.com; s=zohoarc; b=Ff9ycRKMTGQdLkWDqlZNC3/5tJJjBLrmvaAYG9PSsJupQhQ5dP//JrO8/xnYYKWkgpri+as1WeuCtkuuV1k5BrEj2AtqbyR55pdqc/iDgx6YmItvQ0YnKQRYkHHcyJS6Gqvfm/3q8YEO17+6v7ohadMeYFB2d84hk8QppWj33P8= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1668793455; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=FeLRHFrl5U3synO7C+VkluQdib4uKOFPDq83oDtcplI=; b=ZpEGoeGk4kwi1eDDwRmBSmH0fHlIOmAUzLqLq9vBumPe3euAzdHljF3ADZUmxdX+ixKknN3roH/DGB7l6mD+mXv6WqVdFPYrzR1RarBOTepxqRNsaAhDzY/zbuyHlteKb3wYHT8P4APC/z7/0Bxqg0UA386JRGsrAGGtQWx4X8g= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1668793455718319.5937279799539; Fri, 18 Nov 2022 09:44:15 -0800 (PST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ow5NZ-0007ED-N8; Fri, 18 Nov 2022 12:42:18 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ow5NG-0006jh-BU for qemu-devel@nongnu.org; Fri, 18 Nov 2022 12:41:58 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ow5N5-0002YO-Dr for qemu-devel@nongnu.org; Fri, 18 Nov 2022 12:41:58 -0500 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-610-PUmKJmyxND21eF069yqwag-1; Fri, 18 Nov 2022 12:41:43 -0500 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 3803D811E67; Fri, 18 Nov 2022 17:41:43 +0000 (UTC) Received: from merkur.fritz.box (unknown [10.39.194.142]) by smtp.corp.redhat.com (Postfix) with ESMTP id F2EEE492B04; Fri, 18 Nov 2022 17:41:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668793306; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FeLRHFrl5U3synO7C+VkluQdib4uKOFPDq83oDtcplI=; b=PxzZw697tXZgX1tFWc7dwqZDjE637Tfu9RCIHmUXJcTYj5MqmfnE6eHenwl8iu0Jty5qix VAy5tISrhgjrH/fri6icF1PCPfmr/1Lr7mYRE0SM1TAXYx94EilRX2dZSaCXvMrE/uLqpg EL0MgYu+KzKKMPUJZNhXO7UU6/Bhujc= X-MC-Unique: PUmKJmyxND21eF069yqwag-1 From: Kevin Wolf To: qemu-block@nongnu.org Cc: kwolf@redhat.com, eesposit@redhat.com, stefanha@redhat.com, hreitz@redhat.com, pbonzini@redhat.com, vsementsov@yandex-team.ru, qemu-devel@nongnu.org Subject: [PATCH v2 14/15] block: Don't poll in bdrv_replace_child_noperm() Date: Fri, 18 Nov 2022 18:41:09 +0100 Message-Id: <20221118174110.55183-15-kwolf@redhat.com> In-Reply-To: <20221118174110.55183-1-kwolf@redhat.com> References: <20221118174110.55183-1-kwolf@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=kwolf@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1668793456900100001 Content-Type: text/plain; charset="utf-8" In order to make sure that bdrv_replace_child_noperm() doesn't have to poll any more, get rid of the bdrv_parent_drained_begin_single() call. This is possible now because we can require that the parent is already drained through the child in question when the function is called and we don't call the parent drain callbacks more than once. The additional drain calls needed in callers cause the test case to run its code in the drain handler too early (bdrv_attach_child() drains now), so modify it to only enable the code after the test setup has completed. Signed-off-by: Kevin Wolf Reviewed-by: Vladimir Sementsov-Ogievskiy --- include/block/block-io.h | 8 +++ block.c | 103 ++++++++++++++++++++++++++++++----- block/io.c | 2 +- tests/unit/test-bdrv-drain.c | 10 ++++ 4 files changed, 108 insertions(+), 15 deletions(-) diff --git a/include/block/block-io.h b/include/block/block-io.h index 8f5e75756a..65e6d2569b 100644 --- a/include/block/block-io.h +++ b/include/block/block-io.h @@ -292,6 +292,14 @@ bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector= *qiov, int64_t pos); */ void bdrv_parent_drained_begin_single(BdrvChild *c, bool poll); =20 +/** + * bdrv_parent_drained_poll_single: + * + * Returns true if there is any pending activity to cease before @c can be + * called quiesced, false otherwise. + */ +bool bdrv_parent_drained_poll_single(BdrvChild *c); + /** * bdrv_parent_drained_end_single: * diff --git a/block.c b/block.c index d18512944d..3f12aba6ce 100644 --- a/block.c +++ b/block.c @@ -2409,6 +2409,20 @@ static void bdrv_replace_child_abort(void *opaque) =20 GLOBAL_STATE_CODE(); /* old_bs reference is transparently moved from @s to @s->child */ + if (!s->child->bs) { + /* + * The parents were undrained when removing old_bs from the child.= New + * requests can't have been made, though, because the child was em= pty. + * + * TODO Make bdrv_replace_child_noperm() transactionable to avoid + * undraining the parent in the first place. Once this is done, ha= ving + * new_bs drained when calling bdrv_replace_child_tran() is not a + * requirement any more. + */ + bdrv_parent_drained_begin_single(s->child, false); + assert(!bdrv_parent_drained_poll_single(s->child)); + } + assert(s->child->quiesced_parent); bdrv_replace_child_noperm(s->child, s->old_bs); bdrv_unref(new_bs); } @@ -2424,12 +2438,19 @@ static TransactionActionDrv bdrv_replace_child_drv = =3D { * * Note: real unref of old_bs is done only on commit. * + * Both @child->bs and @new_bs (if non-NULL) must be drained. @new_bs must= be + * kept drained until the transaction is completed. + * * The function doesn't update permissions, caller is responsible for this. */ static void bdrv_replace_child_tran(BdrvChild *child, BlockDriverState *ne= w_bs, Transaction *tran) { BdrvReplaceChildState *s =3D g_new(BdrvReplaceChildState, 1); + + assert(child->quiesced_parent); + assert(!new_bs || new_bs->quiesce_counter); + *s =3D (BdrvReplaceChildState) { .child =3D child, .old_bs =3D child->bs, @@ -2821,6 +2842,14 @@ uint64_t bdrv_qapi_perm_to_blk_perm(BlockPermission = qapi_perm) return permissions[qapi_perm]; } =20 +/* + * Replaces the node that a BdrvChild points to without updating permissio= ns. + * + * If @new_bs is non-NULL, the parent of @child must already be drained th= rough + * @child. + * + * This function does not poll. + */ static void bdrv_replace_child_noperm(BdrvChild *child, BlockDriverState *new_bs) { @@ -2828,6 +2857,28 @@ static void bdrv_replace_child_noperm(BdrvChild *chi= ld, int new_bs_quiesce_counter; =20 assert(!child->frozen); + + /* + * If we want to change the BdrvChild to point to a drained node as it= s new + * child->bs, we need to make sure that its new parent is drained, too= . In + * other words, either child->quiesce_parent must already be true or w= e must + * be able to set it and keep the parent's quiesce_counter consistent = with + * that, but without polling or starting new requests (this function + * guarantees that it doesn't poll, and starting new requests would be + * against the invariants of drain sections). + * + * To keep things simple, we pick the first option (child->quiesce_par= ent + * must already be true). We also generalise the rule a bit to make it + * easier to verify in callers and more likely to be covered in test c= ases: + * The parent must be quiesced through this child even if new_bs isn't + * currently drained. + * + * The only exception is for callers that always pass new_bs =3D=3D NU= LL. In + * this case, we obviously never need to consider the case of a drained + * new_bs, so we can keep the callers simpler by allowing them not to = drain + * the parent. + */ + assert(!new_bs || child->quiesced_parent); assert(old_bs !=3D new_bs); GLOBAL_STATE_CODE(); =20 @@ -2835,15 +2886,6 @@ static void bdrv_replace_child_noperm(BdrvChild *chi= ld, assert(bdrv_get_aio_context(old_bs) =3D=3D bdrv_get_aio_context(ne= w_bs)); } =20 - /* - * If the new child node is drained but the old one was not, flush - * all outstanding requests to the old child node. - */ - new_bs_quiesce_counter =3D (new_bs ? new_bs->quiesce_counter : 0); - if (new_bs_quiesce_counter && !child->quiesced_parent) { - bdrv_parent_drained_begin_single(child, true); - } - if (old_bs) { if (child->klass->detach) { child->klass->detach(child); @@ -2863,11 +2905,9 @@ static void bdrv_replace_child_noperm(BdrvChild *chi= ld, } =20 /* - * If the old child node was drained but the new one is not, allow - * requests to come in only after the new node has been attached. - * - * Update new_bs_quiesce_counter because bdrv_parent_drained_begin_sin= gle() - * polls, which could have changed the value. + * If the parent was drained through this BdrvChild previously, but ne= w_bs + * is not drained, allow requests to come in only after the new node h= as + * been attached. */ new_bs_quiesce_counter =3D (new_bs ? new_bs->quiesce_counter : 0); if (!new_bs_quiesce_counter && child->quiesced_parent) { @@ -3004,6 +3044,24 @@ static BdrvChild *bdrv_attach_child_common(BlockDriv= erState *child_bs, } =20 bdrv_ref(child_bs); + /* + * Let every new BdrvChild start with a drained parent. Inserting the = child + * in the graph with bdrv_replace_child_noperm() will undrain it if + * @child_bs is not drained. + * + * The child was only just created and is not yet visible in global st= ate + * until bdrv_replace_child_noperm() inserts it into the graph, so nob= ody + * could have sent requests and polling is not necessary. + * + * Note that this means that the parent isn't fully drained yet, we on= ly + * stop new requests from coming in. This is fine, we don't care about= the + * old requests here, they are not for this child. If another place en= ters a + * drain section for the same parent, but wants it to be fully quiesce= d, it + * will not run most of the the code in .drained_begin() again (which = is not + * a problem, we already did this), but it will still poll until the p= arent + * is fully quiesced, so it will not be negatively affected either. + */ + bdrv_parent_drained_begin_single(new_child, false); bdrv_replace_child_noperm(new_child, child_bs); =20 BdrvAttachChildCommonState *s =3D g_new(BdrvAttachChildCommonState, 1); @@ -5061,7 +5119,10 @@ static void bdrv_remove_child(BdrvChild *child, Tran= saction *tran) } =20 if (child->bs) { + BlockDriverState *bs =3D child->bs; + bdrv_drained_begin(bs); bdrv_replace_child_tran(child, NULL, tran); + bdrv_drained_end(bs); } =20 tran_add(tran, &bdrv_remove_child_drv, child); @@ -5078,6 +5139,15 @@ static void bdrv_remove_filter_or_cow_child(BlockDri= verState *bs, bdrv_remove_child(bdrv_filter_or_cow_child(bs), tran); } =20 +static void undrain_on_clean_cb(void *opaque) +{ + bdrv_drained_end(opaque); +} + +static TransactionActionDrv undrain_on_clean =3D { + .clean =3D undrain_on_clean_cb, +}; + static int bdrv_replace_node_noperm(BlockDriverState *from, BlockDriverState *to, bool auto_skip, Transaction *tran, @@ -5087,6 +5157,11 @@ static int bdrv_replace_node_noperm(BlockDriverState= *from, =20 GLOBAL_STATE_CODE(); =20 + bdrv_drained_begin(from); + bdrv_drained_begin(to); + tran_add(tran, &undrain_on_clean, from); + tran_add(tran, &undrain_on_clean, to); + QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) { assert(c->bs =3D=3D from); if (!should_update_child(c, to)) { diff --git a/block/io.c b/block/io.c index 5e9150d92c..ae64830eac 100644 --- a/block/io.c +++ b/block/io.c @@ -81,7 +81,7 @@ static void bdrv_parent_drained_end(BlockDriverState *bs,= BdrvChild *ignore) } } =20 -static bool bdrv_parent_drained_poll_single(BdrvChild *c) +bool bdrv_parent_drained_poll_single(BdrvChild *c) { if (c->klass->drained_poll) { return c->klass->drained_poll(c); diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c index 172bc6debc..2686a8acee 100644 --- a/tests/unit/test-bdrv-drain.c +++ b/tests/unit/test-bdrv-drain.c @@ -1654,6 +1654,7 @@ static void test_drop_intermediate_poll(void) =20 =20 typedef struct BDRVReplaceTestState { + bool setup_completed; bool was_drained; bool was_undrained; bool has_read; @@ -1738,6 +1739,10 @@ static void bdrv_replace_test_drain_begin(BlockDrive= rState *bs) { BDRVReplaceTestState *s =3D bs->opaque; =20 + if (!s->setup_completed) { + return; + } + if (!s->drain_count) { s->drain_co =3D qemu_coroutine_create(bdrv_replace_test_drain_co, = bs); bdrv_inc_in_flight(bs); @@ -1769,6 +1774,10 @@ static void bdrv_replace_test_drain_end(BlockDriverS= tate *bs) { BDRVReplaceTestState *s =3D bs->opaque; =20 + if (!s->setup_completed) { + return; + } + g_assert(s->drain_count > 0); if (!--s->drain_count) { s->was_undrained =3D true; @@ -1867,6 +1876,7 @@ static void do_test_replace_child_mid_drain(int old_d= rain_count, bdrv_ref(old_child_bs); bdrv_attach_child(parent_bs, old_child_bs, "child", &child_of_bds, BDRV_CHILD_COW, &error_abort); + parent_s->setup_completed =3D true; =20 for (i =3D 0; i < old_drain_count; i++) { bdrv_drained_begin(old_child_bs); --=20 2.38.1