From nobody Sun Feb 8 06:22:49 2026
From: Yu Kuai
To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de
Cc: linux-kernel@vger.kernel.org, Yu Kuai
Subject: [PATCH v2 01/19] blk-cgroup: protect iterating blkgs with blkcg->lock in blkcg_print_stat()
Date: Fri, 10 Oct 2025 17:14:26 +0800
Message-ID: <20251010091446.3048529-2-yukuai@kernel.org>
In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org>
References: <20251010091446.3048529-1-yukuai@kernel.org>

From: Yu Kuai

blkcg_print_one_stat() is called for each blkg and will:

- access blkg->iostat, which is freed from the rcu callback
  blkg_free_workfn();
- access policy data from pd_stat_fn(), which is freed from pd_free_fn();
  pd_free_fn() can be called when removing the blkcg or when deactivating
  the policy.

Holding blkcg->lock ensures that the iterated blkgs stay online, and that
neither blkg->iostat nor the policy data of activated policies can be
freed underneath us.

Prepare to convert protecting blkgs with q->blkcg_mutex instead of
queue_lock.
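For reference, guard(spinlock)(&blkcg->lock) in the hunk below is the kernel's
scope-based lock guard (linux/cleanup.h): it takes the lock at the point of
declaration and releases it automatically when the enclosing scope ends, so
the loop is roughly equivalent to the following explicit form (illustrative
sketch only, not part of the patch):

	spin_lock(&blkcg->lock);
	hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node)
		blkcg_print_one_stat(blkg, sf);
	spin_unlock(&blkcg->lock);
	return 0;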
Signed-off-by: Yu Kuai
---
 block/blk-cgroup.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index f93de34fe87d..0f6039d468a6 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1242,13 +1242,10 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
 	else
 		css_rstat_flush(&blkcg->css);
 
-	rcu_read_lock();
-	hlist_for_each_entry_rcu(blkg, &blkcg->blkg_list, blkcg_node) {
-		spin_lock_irq(&blkg->q->queue_lock);
+	guard(spinlock)(&blkcg->lock);
+	hlist_for_each_entry(blkg, &blkcg->blkg_list, blkcg_node)
 		blkcg_print_one_stat(blkg, sf);
-		spin_unlock_irq(&blkg->q->queue_lock);
-	}
-	rcu_read_unlock();
+
 	return 0;
 }
 
-- 
2.51.0

From nobody Sun Feb 8 06:22:49 2026
From: Yu Kuai
To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de
Cc: linux-kernel@vger.kernel.org, Yu Kuai
Subject: [PATCH v2 02/19] blk-cgroup: delay freeing policy data after rcu grace period
Date: Fri, 10 Oct 2025 17:14:27 +0800
Message-ID: <20251010091446.3048529-3-yukuai@kernel.org>
In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org>
References: <20251010091446.3048529-1-yukuai@kernel.org>
charset="utf-8" From: Yu Kuai Currently blkcg_print_blkgs() must be protected by rcu to iterate blkgs from blkcg, and then prfill() must be protected by queue_lock to prevent policy_data to be freed by deactivating policy. For consequence, queue_lock have to be nested under rcu from blkcg_print_blkgs(). This patch delay freeing policy_data after rcu grace period, so that it's possible to protect prfill() just with rcu lock held. Signed-off-by: Yu Kuai --- block/bfq-cgroup.c | 10 ++++++++-- block/blk-cgroup.h | 2 ++ block/blk-iocost.c | 14 ++++++++++++-- block/blk-iolatency.c | 10 +++++++++- block/blk-throttle.c | 13 +++++++++++-- 5 files changed, 42 insertions(+), 7 deletions(-) diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index 9fb9f3533150..a7e705d98751 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -548,14 +548,20 @@ static void bfq_pd_init(struct blkg_policy_data *pd) bfqg->rq_pos_tree =3D RB_ROOT; } =20 -static void bfq_pd_free(struct blkg_policy_data *pd) +static void bfqg_release(struct rcu_head *rcu) { + struct blkg_policy_data *pd =3D + container_of(rcu, struct blkg_policy_data, rcu_head); struct bfq_group *bfqg =3D pd_to_bfqg(pd); =20 - bfqg_stats_exit(&bfqg->stats); bfqg_put(bfqg); } =20 +static void bfq_pd_free(struct blkg_policy_data *pd) +{ + call_rcu(&pd->rcu_head, bfqg_release); +} + static void bfq_pd_reset_stats(struct blkg_policy_data *pd) { struct bfq_group *bfqg =3D pd_to_bfqg(pd); diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h index 1cce3294634d..fd206d1fa3c9 100644 --- a/block/blk-cgroup.h +++ b/block/blk-cgroup.h @@ -140,6 +140,8 @@ struct blkg_policy_data { struct blkcg_gq *blkg; int plid; bool online; + + struct rcu_head rcu_head; }; =20 /* diff --git a/block/blk-iocost.c b/block/blk-iocost.c index 5bfd70311359..3593547930cc 100644 --- a/block/blk-iocost.c +++ b/block/blk-iocost.c @@ -3017,6 +3017,16 @@ static void ioc_pd_init(struct blkg_policy_data *pd) spin_unlock_irqrestore(&ioc->lock, flags); } =20 +static void iocg_release(struct rcu_head *rcu) +{ + struct blkg_policy_data *pd =3D + container_of(rcu, struct blkg_policy_data, rcu_head); + struct ioc_gq *iocg =3D pd_to_iocg(pd); + + free_percpu(iocg->pcpu_stat); + kfree(iocg); +} + static void ioc_pd_free(struct blkg_policy_data *pd) { struct ioc_gq *iocg =3D pd_to_iocg(pd); @@ -3041,8 +3051,8 @@ static void ioc_pd_free(struct blkg_policy_data *pd) =20 hrtimer_cancel(&iocg->waitq_timer); } - free_percpu(iocg->pcpu_stat); - kfree(iocg); + + call_rcu(&pd->rcu_head, iocg_release); } =20 static void ioc_pd_stat(struct blkg_policy_data *pd, struct seq_file *s) diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c index 45bd18f68541..ce25fbb8aaf6 100644 --- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -1031,13 +1031,21 @@ static void iolatency_pd_offline(struct blkg_policy= _data *pd) iolatency_clear_scaling(blkg); } =20 -static void iolatency_pd_free(struct blkg_policy_data *pd) +static void iolat_release(struct rcu_head *rcu) { + struct blkg_policy_data *pd =3D + container_of(rcu, struct blkg_policy_data, rcu_head); struct iolatency_grp *iolat =3D pd_to_lat(pd); + free_percpu(iolat->stats); kfree(iolat); } =20 +static void iolatency_pd_free(struct blkg_policy_data *pd) +{ + call_rcu(&pd->rcu_head, iolat_release); +} + static struct cftype iolatency_files[] =3D { { .name =3D "latency", diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 2c5b64b1a724..cb3bfdb4684a 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -360,16 +360,25 @@ static void 
 	tg_update_has_rules(tg);
 }
 
-static void throtl_pd_free(struct blkg_policy_data *pd)
+static void tg_release(struct rcu_head *rcu)
 {
+	struct blkg_policy_data *pd =
+		container_of(rcu, struct blkg_policy_data, rcu_head);
 	struct throtl_grp *tg = pd_to_tg(pd);
 
-	timer_delete_sync(&tg->service_queue.pending_timer);
 	blkg_rwstat_exit(&tg->stat_bytes);
 	blkg_rwstat_exit(&tg->stat_ios);
 	kfree(tg);
 }
 
+static void throtl_pd_free(struct blkg_policy_data *pd)
+{
+	struct throtl_grp *tg = pd_to_tg(pd);
+
+	timer_delete_sync(&tg->service_queue.pending_timer);
+	call_rcu(&pd->rcu_head, tg_release);
+}
+
 static struct throtl_grp *
 throtl_rb_first(struct throtl_service_queue *parent_sq)
 {
-- 
2.51.0

From nobody Sun Feb 8 06:22:49 2026
From: Yu Kuai
To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de
Cc: linux-kernel@vger.kernel.org, Yu Kuai
Subject: [PATCH v2 03/19] blk-cgroup: don't nest queue_lock under rcu in blkcg_print_blkgs()
Date: Fri, 10 Oct 2025 17:14:28 +0800
Message-ID: <20251010091446.3048529-4-yukuai@kernel.org>
In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org>
References: <20251010091446.3048529-1-yukuai@kernel.org>
From: Yu Kuai

With the previous change that delays freeing policy data until after an
rcu grace period, it is now safe to protect prfill() with rcu alone: the
policy data is guaranteed not to be freed by a concurrent policy
deactivation.

Signed-off-by: Yu Kuai
---
 block/blk-cgroup-rwstat.c | 4 +---
 block/blk-cgroup.c        | 2 --
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/block/blk-cgroup-rwstat.c b/block/blk-cgroup-rwstat.c
index a55fb0c53558..b8ab8c0063a3 100644
--- a/block/blk-cgroup-rwstat.c
+++ b/block/blk-cgroup-rwstat.c
@@ -101,10 +101,9 @@ void blkg_rwstat_recursive_sum(struct blkcg_gq *blkg, struct blkcg_policy *pol,
 	struct cgroup_subsys_state *pos_css;
 	unsigned int i;
 
-	lockdep_assert_held(&blkg->q->queue_lock);
+	WARN_ON_ONCE(!rcu_read_lock_held());
 
 	memset(sum, 0, sizeof(*sum));
-	rcu_read_lock();
 	blkg_for_each_descendant_pre(pos_blkg, pos_css, blkg) {
 		struct blkg_rwstat *rwstat;
 
@@ -119,6 +118,5 @@ void blkg_rwstat_recursive_sum(struct blkcg_gq *blkg, struct blkcg_policy *pol,
 		for (i = 0; i < BLKG_RWSTAT_NR; i++)
 			sum->cnt[i] += blkg_rwstat_read_counter(rwstat, i);
 	}
-	rcu_read_unlock();
 }
 EXPORT_SYMBOL_GPL(blkg_rwstat_recursive_sum);
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 0f6039d468a6..fb40262971c9 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -713,10 +713,8 @@ void blkcg_print_blkgs(struct seq_file *sf, struct blkcg *blkcg,
 
 	rcu_read_lock();
 	hlist_for_each_entry_rcu(blkg, &blkcg->blkg_list, blkcg_node) {
-		spin_lock_irq(&blkg->q->queue_lock);
 		if (blkcg_policy_enabled(blkg->q, pol))
 			total += prfill(sf, blkg->pd[pol->plid], data);
-		spin_unlock_irq(&blkg->q->queue_lock);
 	}
 	rcu_read_unlock();
 
-- 
2.51.0
From nobody Sun Feb 8 06:22:49 2026
From: Yu Kuai
To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de
Cc: linux-kernel@vger.kernel.org, Yu Kuai
Subject: [PATCH v2 04/19] blk-cgroup: don't nest queue_lock under rcu in blkg_lookup_create()
Date: Fri, 10 Oct 2025 17:14:29 +0800
Message-ID: <20251010091446.3048529-5-yukuai@kernel.org>
In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org>
References: <20251010091446.3048529-1-yukuai@kernel.org>

From: Yu Kuai

Split the lookup into two steps:

1) the fast path holds the rcu lock and does blkg_lookup();
2) the slow path holds queue_lock directly, without nesting it under the
   rcu lock.

Prepare to convert protecting blkcg with blkcg_mutex instead of queue_lock.

Signed-off-by: Yu Kuai
---
 block/blk-cgroup.c | 57 +++++++++++++++++++++++++++++-----------------
 1 file changed, 36 insertions(+), 21 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index fb40262971c9..3363d2476fed 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -467,22 +467,17 @@ static struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
 {
 	struct request_queue *q = disk->queue;
 	struct blkcg_gq *blkg;
-	unsigned long flags;
-
-	WARN_ON_ONCE(!rcu_read_lock_held());
 
-	blkg = blkg_lookup(blkcg, q);
-	if (blkg)
-		return blkg;
-
-	spin_lock_irqsave(&q->queue_lock, flags);
+	rcu_read_lock();
 	blkg = blkg_lookup(blkcg, q);
 	if (blkg) {
 		if (blkcg != &blkcg_root &&
 		    blkg != rcu_dereference(blkcg->blkg_hint))
 			rcu_assign_pointer(blkcg->blkg_hint, blkg);
-		goto found;
+		rcu_read_unlock();
+		return blkg;
 	}
+	rcu_read_unlock();
 
 	/*
 	 * Create blkgs walking down from blkcg_root to @blkcg, so that all
@@ -514,8 +509,6 @@ static struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
 		break;
 	}
 
-found:
-	spin_unlock_irqrestore(&q->queue_lock, flags);
 	return blkg;
 }
 
@@ -2078,6 +2071,18 @@ void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta)
 	atomic64_add(delta, &blkg->delay_nsec);
 }
 
+static inline struct blkcg_gq *blkg_lookup_tryget(struct blkcg_gq *blkg)
+{
+retry:
+	if (blkg_tryget(blkg))
+		return blkg;
+
+	blkg = blkg->parent;
+	if (blkg)
+		goto retry;
+
+	return NULL;
+}
 /**
  * blkg_tryget_closest - try and get a blkg ref on the closet blkg
  * @bio: target bio
@@ -2090,20 +2095,30 @@ void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta)
 static inline struct blkcg_gq *blkg_tryget_closest(struct bio *bio,
 		struct cgroup_subsys_state *css)
 {
-	struct blkcg_gq *blkg, *ret_blkg = NULL;
+	struct request_queue *q = bio->bi_bdev->bd_queue;
+	struct blkcg *blkcg = css_to_blkcg(css);
+	struct blkcg_gq *blkg;
 
 	rcu_read_lock();
-	blkg = blkg_lookup_create(css_to_blkcg(css), bio->bi_bdev->bd_disk);
-	while (blkg) {
-		if (blkg_tryget(blkg)) {
-			ret_blkg = blkg;
-			break;
-		}
-		blkg = blkg->parent;
-	}
+	blkg = blkg_lookup(blkcg, q);
+	if (likely(blkg))
+		blkg = blkg_lookup_tryget(blkg);
 	rcu_read_unlock();
 
-	return ret_blkg;
+	if (blkg)
+		return blkg;
+
+	/*
+	 * Fast path failed, we're probably issuing IO in this cgroup the first
+	 * time, hold lock to create new blkg.
+	 */
+	spin_lock_irq(&q->queue_lock);
+	blkg = blkg_lookup_create(blkcg, bio->bi_bdev->bd_disk);
+	if (blkg)
+		blkg = blkg_lookup_tryget(blkg);
+	spin_unlock_irq(&q->queue_lock);
+
+	return blkg;
 }
 
 /**
-- 
2.51.0

From nobody Sun Feb 8 06:22:49 2026
From: Yu Kuai
To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de
Cc: linux-kernel@vger.kernel.org, Yu Kuai
Subject: [PATCH v2 05/19] blk-cgroup: don't nest queue_lock under rcu in bio_associate_blkg()
Date: Fri, 10 Oct 2025 17:14:30 +0800
Message-ID: <20251010091446.3048529-6-yukuai@kernel.org>
In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org>
References: <20251010091446.3048529-1-yukuai@kernel.org>

From: Yu Kuai

If the bio is already associated with a blkg, the blkcg is already pinned
until the bio completes, so no rcu protection is needed; otherwise protect
blkcg_css() with rcu independently.

Prepare to convert protecting blkcg with blkcg_mutex instead of queue_lock.
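The change follows the usual css pinning pattern: blkcg_css() is only stable
while the rcu read lock is held, so a reference has to be taken with
css_tryget_online() before rcu_read_unlock() and dropped again once the bio
holds its own blkg reference. Roughly (illustrative sketch, not part of the
patch):

	rcu_read_lock();
	css = blkcg_css();			/* stable only under rcu */
	if (!css_tryget_online(css))		/* pin it beyond the rcu section */
		css = NULL;
	rcu_read_unlock();

	bio_associate_blkg_from_css(bio, css);	/* bio now holds its own reference */
	if (css)
		css_put(css);			/* drop the temporary pin */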
Signed-off-by: Yu Kuai
---
 block/blk-cgroup.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 3363d2476fed..2234ff2b2b8b 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -2166,16 +2166,20 @@ void bio_associate_blkg(struct bio *bio)
 	if (blk_op_is_passthrough(bio->bi_opf))
 		return;
 
-	rcu_read_lock();
-
-	if (bio->bi_blkg)
+	if (bio->bi_blkg) {
 		css = bio_blkcg_css(bio);
-	else
+		bio_associate_blkg_from_css(bio, css);
+	} else {
+		rcu_read_lock();
 		css = blkcg_css();
+		if (!css_tryget_online(css))
+			css = NULL;
+		rcu_read_unlock();
 
-	bio_associate_blkg_from_css(bio, css);
-
-	rcu_read_unlock();
+		bio_associate_blkg_from_css(bio, css);
+		if (css)
+			css_put(css);
+	}
 }
 EXPORT_SYMBOL_GPL(bio_associate_blkg);
 
-- 
2.51.0

From nobody Sun Feb 8 06:22:49 2026
From: Yu Kuai
To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de
Cc: linux-kernel@vger.kernel.org, Yu Kuai
Subject: [PATCH v2 06/19] blk-cgroup: don't nest queue_lock under blkcg->lock in blkcg_destroy_blkgs()
Date: Fri, 10 Oct 2025 17:14:31 +0800
Message-ID: <20251010091446.3048529-7-yukuai@kernel.org>
In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org>
References: <20251010091446.3048529-1-yukuai@kernel.org>
From: Yu Kuai

The correct lock order is q->queue_lock before blkcg->lock. To avoid a
deadlock, blkcg_destroy_blkgs() currently trylocks q->queue_lock while
blkcg->lock is already held, which is hacky.

Hence refactor blkcg_destroy_blkgs(): hold blkcg->lock only to grab a
reference on the first blkg and drop the lock again, then take
q->queue_lock and blkcg->lock in the correct order to destroy the blkg.
This is a very cold path, so repeatedly grabbing and releasing the locks
is fine.

Also prepare to convert protecting blkcg with blkcg_mutex instead of
queue_lock.

Signed-off-by: Yu Kuai
---
 block/blk-cgroup.c | 45 ++++++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 19 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 2234ff2b2b8b..99edf15ce525 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1284,6 +1284,21 @@ struct list_head *blkcg_get_cgwb_list(struct cgroup_subsys_state *css)
  * This finally frees the blkcg.
  */
 
+static struct blkcg_gq *blkcg_get_first_blkg(struct blkcg *blkcg)
+{
+	struct blkcg_gq *blkg = NULL;
+
+	spin_lock_irq(&blkcg->lock);
+	if (!hlist_empty(&blkcg->blkg_list)) {
+		blkg = hlist_entry(blkcg->blkg_list.first, struct blkcg_gq,
+				   blkcg_node);
+		blkg_get(blkg);
+	}
+	spin_unlock_irq(&blkcg->lock);
+
+	return blkg;
+}
+
 /**
  * blkcg_destroy_blkgs - responsible for shooting down blkgs
  * @blkcg: blkcg of interest
@@ -1297,32 +1312,24 @@ struct list_head *blkcg_get_cgwb_list(struct cgroup_subsys_state *css)
  */
 static void blkcg_destroy_blkgs(struct blkcg *blkcg)
 {
-	might_sleep();
+	struct blkcg_gq *blkg;
 
-	spin_lock_irq(&blkcg->lock);
+	might_sleep();
 
-	while (!hlist_empty(&blkcg->blkg_list)) {
-		struct blkcg_gq *blkg = hlist_entry(blkcg->blkg_list.first,
-						struct blkcg_gq, blkcg_node);
+	while ((blkg = blkcg_get_first_blkg(blkcg))) {
 		struct request_queue *q = blkg->q;
 
-		if (need_resched() || !spin_trylock(&q->queue_lock)) {
-			/*
-			 * Given that the system can accumulate a huge number
-			 * of blkgs in pathological cases, check to see if we
-			 * need to rescheduling to avoid softlockup.
-			 */
-			spin_unlock_irq(&blkcg->lock);
-			cond_resched();
-			spin_lock_irq(&blkcg->lock);
-			continue;
-		}
+		spin_lock_irq(&q->queue_lock);
+		spin_lock(&blkcg->lock);
 
 		blkg_destroy(blkg);
-		spin_unlock(&q->queue_lock);
-	}
 
-	spin_unlock_irq(&blkcg->lock);
+		spin_unlock(&blkcg->lock);
+		spin_unlock_irq(&q->queue_lock);
+
+		blkg_put(blkg);
+		cond_resched();
+	}
 }
 
 /**
-- 
2.51.0

From nobody Sun Feb 8 06:22:49 2026
From: Yu Kuai
To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de
Cc: linux-kernel@vger.kernel.org, Yu Kuai
Subject: [PATCH v2 07/19] mm/page_io: don't nest queue_lock under rcu in bio_associate_blkg_from_page()
Date: Fri, 10 Oct 2025 17:14:32 +0800
Message-ID: <20251010091446.3048529-8-yukuai@kernel.org>
In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org>
References: <20251010091446.3048529-1-yukuai@kernel.org>

From: Yu Kuai

Prepare to convert protecting blkcg with blkcg_mutex instead of queue_lock.
Signed-off-by: Yu Kuai
---
 mm/page_io.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index a2056a5ecb13..4f4cc9370573 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -313,8 +313,13 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct folio *folio)
 
 	rcu_read_lock();
 	css = cgroup_e_css(memcg->css.cgroup, &io_cgrp_subsys);
-	bio_associate_blkg_from_css(bio, css);
+	if (!css || !css_tryget_online(css))
+		css = NULL;
 	rcu_read_unlock();
+
+	bio_associate_blkg_from_css(bio, css);
+	if (css)
+		css_put(css);
 }
 #else
 #define bio_associate_blkg_from_page(bio, folio)	do { } while (0)
-- 
2.51.0

From nobody Sun Feb 8 06:22:49 2026
From: Yu Kuai
To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de
Cc: linux-kernel@vger.kernel.org, Yu Kuai
Subject: [PATCH v2 08/19] block, bfq: don't grab queue_lock to initialize bfq
Date: Fri, 10 Oct 2025 17:14:33 +0800
Message-ID: <20251010091446.3048529-9-yukuai@kernel.org>
In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org>
References: <20251010091446.3048529-1-yukuai@kernel.org>

From: Yu Kuai

The request_queue is frozen and quiesced
during the elevator's init_sched() method, so there is no point in holding
queue_lock for protection.

Signed-off-by: Yu Kuai
---
 block/bfq-iosched.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 9e0eee9aba5c..86309828e235 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -7203,10 +7203,7 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_queue *eq)
 		return -ENOMEM;
 
 	eq->elevator_data = bfqd;
-
-	spin_lock_irq(&q->queue_lock);
 	q->elevator = eq;
-	spin_unlock_irq(&q->queue_lock);
 
 	/*
 	 * Our fallback bfqq if bfq_find_alloc_queue() runs into OOM issues.
@@ -7239,7 +7236,6 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_queue *eq)
 	 * If the disk supports multiple actuators, copy independent
 	 * access ranges from the request queue structure.
 	 */
-	spin_lock_irq(&q->queue_lock);
 	if (ia_ranges) {
 		/*
 		 * Check if the disk ia_ranges size exceeds the current bfq
@@ -7265,7 +7261,6 @@ static int bfq_init_queue(struct request_queue *q, struct elevator_queue *eq)
 		bfqd->sector[0] = 0;
 		bfqd->nr_sectors[0] = get_capacity(q->disk);
 	}
-	spin_unlock_irq(&q->queue_lock);
 
 	INIT_LIST_HEAD(&bfqd->dispatch);
 
-- 
2.51.0

From nobody Sun Feb 8 06:22:49 2026
From: Yu Kuai
To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de
Cc: linux-kernel@vger.kernel.org, Yu Kuai
Subject: [PATCH v2 09/19] blk-cgroup: convert to protect blkgs with blkcg_mutex
Date: Fri, 10 Oct 2025 17:14:34 +0800
Message-ID: <20251010091446.3048529-10-yukuai@kernel.org>
In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org>
References: <20251010091446.3048529-1-yukuai@kernel.org>

From: Yu Kuai

With the previous modifications, queue_lock is no longer grabbed under
other spinlocks or rcu to protect blkgs, so it is now safe to convert the
protection to blkcg_mutex directly.

Signed-off-by: Yu Kuai
---
 block/bfq-cgroup.c  |   6 +--
 block/bfq-iosched.c |   8 ++--
 block/blk-cgroup.c  | 104 ++++++++++++++------------------------------
 block/blk-cgroup.h  |   6 +--
 4 files changed, 42 insertions(+), 82 deletions(-)

diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index a7e705d98751..43790ae91b57 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -405,7 +405,7 @@ static void bfqg_stats_xfer_dead(struct bfq_group *bfqg)
 
 	parent = bfqg_parent(bfqg);
 
-	lockdep_assert_held(&bfqg_to_blkg(bfqg)->q->queue_lock);
+	lockdep_assert_held(&bfqg_to_blkg(bfqg)->q->blkcg_mutex);
 
 	if (unlikely(!parent))
 		return;
@@ -872,7 +872,7 @@ static void bfq_reparent_active_queues(struct bfq_data *bfqd,
  * and reparent its children entities.
  * @pd: descriptor of the policy going offline.
  *
- * blkio already grabs the queue_lock for us, so no need to use
+ * blkio already grabs the blkcg_mutex for us, so no need to use
  * RCU-based magic
  */
 static void bfq_pd_offline(struct blkg_policy_data *pd)
@@ -1145,7 +1145,7 @@ static u64 bfqg_prfill_stat_recursive(struct seq_file *sf,
 	struct cgroup_subsys_state *pos_css;
 	u64 sum = 0;
 
-	lockdep_assert_held(&blkg->q->queue_lock);
+	lockdep_assert_held(&blkg->q->blkcg_mutex);
 
 	rcu_read_lock();
 	blkg_for_each_descendant_pre(pos_blkg, pos_css, blkg) {
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 86309828e235..3350c9b22eb4 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -5266,7 +5266,7 @@ static void bfq_update_dispatch_stats(struct request_queue *q,
 	 * In addition, the following queue lock guarantees that
 	 * bfqq_group(bfqq) exists as well.
 	 */
-	spin_lock_irq(&q->queue_lock);
+	mutex_lock(&q->blkcg_mutex);
 	if (idle_timer_disabled)
 		/*
 		 * Since the idle timer has been disabled,
@@ -5285,7 +5285,7 @@ static void bfq_update_dispatch_stats(struct request_queue *q,
 		bfqg_stats_set_start_empty_time(bfqg);
 		bfqg_stats_update_io_remove(bfqg, rq->cmd_flags);
 	}
-	spin_unlock_irq(&q->queue_lock);
+	mutex_unlock(&q->blkcg_mutex);
 }
 #else
 static inline void bfq_update_dispatch_stats(struct request_queue *q,
@@ -6218,11 +6218,11 @@ static void bfq_update_insert_stats(struct request_queue *q,
 	 * In addition, the following queue lock guarantees that
 	 * bfqq_group(bfqq) exists as well.
*/ - spin_lock_irq(&q->queue_lock); + mutex_lock(&q->blkcg_mutex); bfqg_stats_update_io_add(bfqq_group(bfqq), bfqq, cmd_flags); if (idle_timer_disabled) bfqg_stats_update_idle_time(bfqq_group(bfqq)); - spin_unlock_irq(&q->queue_lock); + mutex_unlock(&q->blkcg_mutex); } #else static inline void bfq_update_insert_stats(struct request_queue *q, diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 99edf15ce525..b8bb2f3506aa 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -130,9 +130,7 @@ static void blkg_free_workfn(struct work_struct *work) blkcg_policy[i]->pd_free_fn(blkg->pd[i]); if (blkg->parent) blkg_put(blkg->parent); - spin_lock_irq(&q->queue_lock); list_del_init(&blkg->q_node); - spin_unlock_irq(&q->queue_lock); mutex_unlock(&q->blkcg_mutex); =20 blk_put_queue(q); @@ -372,7 +370,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg= , struct gendisk *disk, struct blkcg_gq *blkg; int i, ret; =20 - lockdep_assert_held(&disk->queue->queue_lock); + lockdep_assert_held(&disk->queue->blkcg_mutex); =20 /* request_queue is dying, do not create/recreate a blkg */ if (blk_queue_dying(disk->queue)) { @@ -457,7 +455,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg= , struct gendisk *disk, * Lookup blkg for the @blkcg - @disk pair. If it doesn't exist, try to * create one. blkg creation is performed recursively from blkcg_root such * that all non-root blkg's have access to the parent blkg. This function - * should be called under RCU read lock and takes @disk->queue->queue_lock. + * should be called under RCU read lock and takes @disk->queue->blkcg_mute= x. * * Returns the blkg or the closest blkg if blkg_create() fails as it walks * down from root. @@ -517,7 +515,7 @@ static void blkg_destroy(struct blkcg_gq *blkg) struct blkcg *blkcg =3D blkg->blkcg; int i; =20 - lockdep_assert_held(&blkg->q->queue_lock); + lockdep_assert_held(&blkg->q->blkcg_mutex); lockdep_assert_held(&blkcg->lock); =20 /* @@ -546,8 +544,8 @@ static void blkg_destroy(struct blkcg_gq *blkg) =20 /* * Both setting lookup hint to and clearing it from @blkg are done - * under queue_lock. If it's not pointing to @blkg now, it never - * will. Hint assignment itself can race safely. + * under q->blkcg_mutex and blkcg->lock. If it's not pointing to @blkg + * now, it never will. Hint assignment itself can race safely. */ if (rcu_access_pointer(blkcg->blkg_hint) =3D=3D blkg) rcu_assign_pointer(blkcg->blkg_hint, NULL); @@ -567,25 +565,20 @@ static void blkg_destroy_all(struct gendisk *disk) int i; =20 restart: - spin_lock_irq(&q->queue_lock); + mutex_lock(&q->blkcg_mutex); list_for_each_entry(blkg, &q->blkg_list, q_node) { struct blkcg *blkcg =3D blkg->blkcg; =20 if (hlist_unhashed(&blkg->blkcg_node)) continue; =20 - spin_lock(&blkcg->lock); + spin_lock_irq(&blkcg->lock); blkg_destroy(blkg); - spin_unlock(&blkcg->lock); + spin_unlock_irq(&blkcg->lock); =20 - /* - * in order to avoid holding the spin lock for too long, release - * it when a batch of blkgs are destroyed. 
- */ if (!(--count)) { count =3D BLKG_DESTROY_BATCH_SIZE; - spin_unlock_irq(&q->queue_lock); - cond_resched(); + mutex_unlock(&q->blkcg_mutex); goto restart; } } @@ -603,7 +596,7 @@ static void blkg_destroy_all(struct gendisk *disk) } =20 q->root_blkg =3D NULL; - spin_unlock_irq(&q->queue_lock); + mutex_unlock(&q->blkcg_mutex); } =20 static void blkg_iostat_set(struct blkg_iostat *dst, struct blkg_iostat *s= rc) @@ -854,7 +847,7 @@ unsigned long __must_check blkg_conf_open_bdev_frozen(s= truct blkg_conf_ctx *ctx) */ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol, struct blkg_conf_ctx *ctx) - __acquires(&bdev->bd_queue->queue_lock) + __acquires(&bdev->bd_queue->blkcg_mutex) { struct gendisk *disk; struct request_queue *q; @@ -870,7 +863,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct bl= kcg_policy *pol, =20 /* Prevent concurrent with blkcg_deactivate_policy() */ mutex_lock(&q->blkcg_mutex); - spin_lock_irq(&q->queue_lock); =20 if (!blkcg_policy_enabled(q, pol)) { ret =3D -EOPNOTSUPP; @@ -896,23 +888,18 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct = blkcg_policy *pol, parent =3D blkcg_parent(parent); } =20 - /* Drop locks to do new blkg allocation with GFP_KERNEL. */ - spin_unlock_irq(&q->queue_lock); - new_blkg =3D blkg_alloc(pos, disk, GFP_NOIO); if (unlikely(!new_blkg)) { ret =3D -ENOMEM; - goto fail_exit; + goto fail_unlock; } =20 if (radix_tree_preload(GFP_KERNEL)) { blkg_free(new_blkg); ret =3D -ENOMEM; - goto fail_exit; + goto fail_unlock; } =20 - spin_lock_irq(&q->queue_lock); - if (!blkcg_policy_enabled(q, pol)) { blkg_free(new_blkg); ret =3D -EOPNOTSUPP; @@ -936,15 +923,12 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct = blkcg_policy *pol, goto success; } success: - mutex_unlock(&q->blkcg_mutex); ctx->blkg =3D blkg; return 0; =20 fail_preloaded: radix_tree_preload_end(); fail_unlock: - spin_unlock_irq(&q->queue_lock); -fail_exit: mutex_unlock(&q->blkcg_mutex); /* * If queue was bypassing, we should retry. Do so after a @@ -968,11 +952,11 @@ EXPORT_SYMBOL_GPL(blkg_conf_prep); * blkg_conf_ctx's initialized with blkg_conf_init(). */ void blkg_conf_exit(struct blkg_conf_ctx *ctx) - __releases(&ctx->bdev->bd_queue->queue_lock) + __releases(&ctx->bdev->bd_queue->blkcg_mutex) __releases(&ctx->bdev->bd_queue->rq_qos_mutex) { if (ctx->blkg) { - spin_unlock_irq(&bdev_get_queue(ctx->bdev)->queue_lock); + mutex_unlock(&bdev_get_queue(ctx->bdev)->blkcg_mutex); ctx->blkg =3D NULL; } =20 @@ -1319,13 +1303,13 @@ static void blkcg_destroy_blkgs(struct blkcg *blkcg) while ((blkg =3D blkcg_get_first_blkg(blkcg))) { struct request_queue *q =3D blkg->q; =20 - spin_lock_irq(&q->queue_lock); - spin_lock(&blkcg->lock); + mutex_lock(&q->blkcg_mutex); + spin_lock_irq(&blkcg->lock); =20 blkg_destroy(blkg); =20 - spin_unlock(&blkcg->lock); - spin_unlock_irq(&q->queue_lock); + spin_unlock_irq(&blkcg->lock); + mutex_unlock(&q->blkcg_mutex); =20 blkg_put(blkg); cond_resched(); @@ -1502,24 +1486,23 @@ int blkcg_init_disk(struct gendisk *disk) if (!new_blkg) return -ENOMEM; =20 - preloaded =3D !radix_tree_preload(GFP_KERNEL); + mutex_lock(&q->blkcg_mutex); + preloaded =3D !radix_tree_preload(GFP_NOIO); =20 /* Make sure the root blkg exists. */ - /* spin_lock_irq can serve as RCU read-side critical section. 
*/ - spin_lock_irq(&q->queue_lock); blkg =3D blkg_create(&blkcg_root, disk, new_blkg); if (IS_ERR(blkg)) goto err_unlock; q->root_blkg =3D blkg; - spin_unlock_irq(&q->queue_lock); =20 if (preloaded) radix_tree_preload_end(); + mutex_unlock(&q->blkcg_mutex); =20 return 0; =20 err_unlock: - spin_unlock_irq(&q->queue_lock); + mutex_unlock(&q->blkcg_mutex); if (preloaded) radix_tree_preload_end(); return PTR_ERR(blkg); @@ -1596,8 +1579,7 @@ int blkcg_activate_policy(struct gendisk *disk, const= struct blkcg_policy *pol) =20 if (queue_is_mq(q)) memflags =3D blk_mq_freeze_queue(q); -retry: - spin_lock_irq(&q->queue_lock); + mutex_lock(&q->blkcg_mutex); =20 /* blkg_list is pushed at the head, reverse walk to initialize parents fi= rst */ list_for_each_entry_reverse(blkg, &q->blkg_list, q_node) { @@ -1606,36 +1588,17 @@ int blkcg_activate_policy(struct gendisk *disk, con= st struct blkcg_policy *pol) if (blkg->pd[pol->plid]) continue; =20 - /* If prealloc matches, use it; otherwise try GFP_NOWAIT */ + /* If prealloc matches, use it */ if (blkg =3D=3D pinned_blkg) { pd =3D pd_prealloc; pd_prealloc =3D NULL; } else { pd =3D pol->pd_alloc_fn(disk, blkg->blkcg, - GFP_NOWAIT); + GFP_NOIO); } =20 - if (!pd) { - /* - * GFP_NOWAIT failed. Free the existing one and - * prealloc for @blkg w/ GFP_KERNEL. - */ - if (pinned_blkg) - blkg_put(pinned_blkg); - blkg_get(blkg); - pinned_blkg =3D blkg; - - spin_unlock_irq(&q->queue_lock); - - if (pd_prealloc) - pol->pd_free_fn(pd_prealloc); - pd_prealloc =3D pol->pd_alloc_fn(disk, blkg->blkcg, - GFP_KERNEL); - if (pd_prealloc) - goto retry; - else - goto enomem; - } + if (!pd) + goto enomem; =20 spin_lock(&blkg->blkcg->lock); =20 @@ -1656,8 +1619,8 @@ int blkcg_activate_policy(struct gendisk *disk, const= struct blkcg_policy *pol) __set_bit(pol->plid, q->blkcg_pols); ret =3D 0; =20 - spin_unlock_irq(&q->queue_lock); out: + mutex_unlock(&q->blkcg_mutex); if (queue_is_mq(q)) blk_mq_unfreeze_queue(q, memflags); if (pinned_blkg) @@ -1668,7 +1631,6 @@ int blkcg_activate_policy(struct gendisk *disk, const= struct blkcg_policy *pol) =20 enomem: /* alloc failed, take down everything */ - spin_lock_irq(&q->queue_lock); list_for_each_entry(blkg, &q->blkg_list, q_node) { struct blkcg *blkcg =3D blkg->blkcg; struct blkg_policy_data *pd; @@ -1684,7 +1646,7 @@ int blkcg_activate_policy(struct gendisk *disk, const= struct blkcg_policy *pol) } spin_unlock(&blkcg->lock); } - spin_unlock_irq(&q->queue_lock); + ret =3D -ENOMEM; goto out; } @@ -1712,7 +1674,6 @@ void blkcg_deactivate_policy(struct gendisk *disk, memflags =3D blk_mq_freeze_queue(q); =20 mutex_lock(&q->blkcg_mutex); - spin_lock_irq(&q->queue_lock); =20 __clear_bit(pol->plid, q->blkcg_pols); =20 @@ -1729,7 +1690,6 @@ void blkcg_deactivate_policy(struct gendisk *disk, spin_unlock(&blkcg->lock); } =20 - spin_unlock_irq(&q->queue_lock); mutex_unlock(&q->blkcg_mutex); =20 if (queue_is_mq(q)) @@ -2119,11 +2079,11 @@ static inline struct blkcg_gq *blkg_tryget_closest(= struct bio *bio, * Fast path failed, we're probably issuing IO in this cgroup the first * time, hold lock to create new blkg. 
 	 */
-	spin_lock_irq(&q->queue_lock);
+	mutex_lock(&q->blkcg_mutex);
 	blkg = blkg_lookup_create(blkcg, bio->bi_bdev->bd_disk);
 	if (blkg)
 		blkg = blkg_lookup_tryget(blkg);
-	spin_unlock_irq(&q->queue_lock);
+	mutex_unlock(&q->blkcg_mutex);
 
 	return blkg;
 }
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index fd206d1fa3c9..540be30aebcd 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -263,7 +263,7 @@ static inline struct blkcg_gq *blkg_lookup(struct blkcg *blkcg,
 		return q->root_blkg;
 
 	blkg = rcu_dereference_check(blkcg->blkg_hint,
-					lockdep_is_held(&q->queue_lock));
+					lockdep_is_held(&q->blkcg_mutex));
 	if (blkg && blkg->q == q)
 		return blkg;
 
@@ -347,8 +347,8 @@ static inline void blkg_put(struct blkcg_gq *blkg)
  * @p_blkg: target blkg to walk descendants of
  *
  * Walk @c_blkg through the descendants of @p_blkg. Must be used with RCU
- * read locked. If called under either blkcg or queue lock, the iteration
- * is guaranteed to include all and only online blkgs. The caller may
+ * read locked. If called under either blkcg->lock or q->blkcg_mutex, the
+ * iteration is guaranteed to include all and only online blkgs. The caller may
  * update @pos_css by calling css_rightmost_descendant() to skip subtree.
  * @p_blkg is included in the iteration and the first node to be visited.
  */
-- 
2.51.0

From nobody Sun Feb 8 06:22:49 2026
From: Yu Kuai
To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de
Cc: linux-kernel@vger.kernel.org, Yu Kuai
Subject: [PATCH v2 10/19] blk-cgroup: remove radix_tree_preload()
Date: Fri, 10 Oct 2025 17:14:35 +0800
Message-ID: <20251010091446.3048529-11-yukuai@kernel.org>
In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org>
References: <20251010091446.3048529-1-yukuai@kernel.org>

From: Yu Kuai

Now that blkcg_mutex is used to protect blkgs, memory allocations no
longer need to be non-blocking, so the radix tree preloading is not
needed anymore.

Signed-off-by: Yu Kuai
---
 block/blk-cgroup.c | 20 ++------------------
 1 file changed, 2 insertions(+), 18 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index b8bb2f3506aa..030499d70543 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -894,16 +894,10 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 			goto fail_unlock;
 		}
 
-		if (radix_tree_preload(GFP_KERNEL)) {
-			blkg_free(new_blkg);
-			ret = -ENOMEM;
-			goto fail_unlock;
-		}
-
 		if (!blkcg_policy_enabled(q, pol)) {
 			blkg_free(new_blkg);
 			ret = -EOPNOTSUPP;
-			goto fail_preloaded;
+			goto fail_unlock;
 		}
 
 		blkg = blkg_lookup(pos, q);
@@ -913,12 +907,10 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 			blkg = blkg_create(pos, disk, new_blkg);
 			if (IS_ERR(blkg)) {
 				ret = PTR_ERR(blkg);
-				goto fail_preloaded;
+				goto fail_unlock;
 			}
 		}
 
-		radix_tree_preload_end();
-
 		if (pos == blkcg)
 			goto success;
 	}
@@ -926,8 +918,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol,
 	ctx->blkg = blkg;
 	return 0;
 
-fail_preloaded:
-	radix_tree_preload_end();
 fail_unlock:
 	mutex_unlock(&q->blkcg_mutex);
 	/*
@@ -1480,14 +1470,12 @@ int blkcg_init_disk(struct gendisk *disk)
 {
 	struct request_queue *q = disk->queue;
 	struct blkcg_gq *new_blkg, *blkg;
-	bool preloaded;
 
 	new_blkg = blkg_alloc(&blkcg_root, disk, GFP_KERNEL);
 	if (!new_blkg)
 		return -ENOMEM;
 
 	mutex_lock(&q->blkcg_mutex);
-	preloaded = !radix_tree_preload(GFP_NOIO);
 
 	/* Make sure the root blkg exists. */
*/ blkg =3D blkg_create(&blkcg_root, disk, new_blkg); @@ -1495,16 +1483,12 @@ int blkcg_init_disk(struct gendisk *disk) goto err_unlock; q->root_blkg =3D blkg; =20 - if (preloaded) - radix_tree_preload_end(); mutex_unlock(&q->blkcg_mutex); =20 return 0; =20 err_unlock: mutex_unlock(&q->blkcg_mutex); - if (preloaded) - radix_tree_preload_end(); return PTR_ERR(blkg); } =20 --=20 2.51.0 From nobody Sun Feb 8 06:22:49 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2077D2765E6; Fri, 10 Oct 2025 09:15:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087724; cv=none; b=f6JfJ/KR6U0mUvXpRg4BtJTp5eO9Zf0rwVX6KOpJMiEWcM1qKdTKqMRiog62nNJSispHfhAyDV4MQ3ZCIcYf4T7DXb3L0D0UQpANC2zIkAgRVQtJB9tf5w8aO+JcDo1moTqxb7FsYG5kRE2ELZthF9Th9vXpql2eYww0cotrXFo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087724; c=relaxed/simple; bh=71t1rc86LBoLpp/AZ7tkC/o2XX//I5mO8PuzK7GqrtA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=DHvU+tPieHLVXqJV3N9T+e0FG+2xzm6QBlH1lUAY2oyli48rigj0AinpFmKz32kBL5WABXvzmo819a2CG/Jc8mpoBe3WWpyj4dXyZi8ElIYbJ99Wi90FIfDN9xRQeG1mUVVwiiLJvW7ddXB/SHa/TQUCvYP2lMkyano0Ogf1jjY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=mWeTXK9U; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="mWeTXK9U" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B958DC4CEF8; Fri, 10 Oct 2025 09:15:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1760087724; bh=71t1rc86LBoLpp/AZ7tkC/o2XX//I5mO8PuzK7GqrtA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=mWeTXK9URzTJtZkBTn/CaM7Kf/qkbsu6fXJWF6j6Sbsw27MnWVf4CjHSlW5/rGzoq Grt/KJFg3+OBYCqYKrKn3GNYoR2c6FMaPrbGNZvWEKyUOnMPbnCiUs+sp+50Ht9YQT wrL3xSPARIK4lwh2yg7UaF3dlDIMyWdJdARdKRwKnYfN15SA3TJj4G8BK7Wp0Aps8a 8trw3Qx4+YDddSOwN0ZRp1Mtnc+eNW5MduuW6ptBQq3+ueuzSlIoZ9/g5gE1XYuZsz uY7NA3sriMnTmZIdmj5rpFIsVQ0oElVS5wBor/LA8ADs8WZQkKdecOhCjFShCB0Mm7 B96uGHcz168gA== From: Yu Kuai To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de Cc: linux-kernel@vger.kernel.org, Yu Kuai Subject: [PATCH v2 11/19] blk-cgroup: remove preallocate blkg for blkg_create() Date: Fri, 10 Oct 2025 17:14:36 +0800 Message-ID: <20251010091446.3048529-12-yukuai@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org> References: <20251010091446.3048529-1-yukuai@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yu Kuai Now that blkg_create is protected with blkcg_mutex, there is no need to preallocate blkg, remove related code. 
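For context, the calling convention this simplification leaves behind reduces to roughly the sketch below. It is condensed from the diff that follows and is not compile-tested; example_get_or_create() is a made-up wrapper name for illustration, while blkg_lookup() and blkg_create() are the real functions touched by this patch:

	/*
	 * Minimal sketch (not part of the patch): with q->blkcg_mutex held,
	 * blkg_create() may now block and allocates its own blkg with
	 * GFP_NOIO, so callers no longer juggle a preallocated one.
	 */
	static struct blkcg_gq *example_get_or_create(struct blkcg *blkcg,
						      struct gendisk *disk)
	{
		struct blkcg_gq *blkg;

		lockdep_assert_held(&disk->queue->blkcg_mutex);

		blkg = blkg_lookup(blkcg, disk->queue);
		if (!blkg)
			blkg = blkg_create(blkcg, disk);

		return blkg;	/* ERR_PTR() on allocation/insert failure */
	}
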
Signed-off-by: Yu Kuai --- block/blk-cgroup.c | 91 +++++++++++++++++----------------------------- 1 file changed, 33 insertions(+), 58 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 030499d70543..3c23d2d1e237 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -364,10 +364,9 @@ static struct blkcg_gq *blkg_alloc(struct blkcg *blkcg= , struct gendisk *disk, * If @new_blkg is %NULL, this function tries to allocate a new one as * necessary using %GFP_NOWAIT. @new_blkg is always consumed on return. */ -static struct blkcg_gq *blkg_create(struct blkcg *blkcg, struct gendisk *d= isk, - struct blkcg_gq *new_blkg) +static struct blkcg_gq *blkg_create(struct blkcg *blkcg, struct gendisk *d= isk) { - struct blkcg_gq *blkg; + struct blkcg_gq *blkg =3D NULL; int i, ret; =20 lockdep_assert_held(&disk->queue->blkcg_mutex); @@ -384,15 +383,11 @@ static struct blkcg_gq *blkg_create(struct blkcg *blk= cg, struct gendisk *disk, goto err_free_blkg; } =20 - /* allocate */ - if (!new_blkg) { - new_blkg =3D blkg_alloc(blkcg, disk, GFP_NOWAIT); - if (unlikely(!new_blkg)) { - ret =3D -ENOMEM; - goto err_put_css; - } + blkg =3D blkg_alloc(blkcg, disk, GFP_NOIO); + if (unlikely(!blkg)) { + ret =3D -ENOMEM; + goto err_put_css; } - blkg =3D new_blkg; =20 /* link parent */ if (blkcg_parent(blkcg)) { @@ -415,35 +410,34 @@ static struct blkcg_gq *blkg_create(struct blkcg *blk= cg, struct gendisk *disk, /* insert */ spin_lock(&blkcg->lock); ret =3D radix_tree_insert(&blkcg->blkg_tree, disk->queue->id, blkg); - if (likely(!ret)) { - hlist_add_head_rcu(&blkg->blkcg_node, &blkcg->blkg_list); - list_add(&blkg->q_node, &disk->queue->blkg_list); + if (unlikely(ret)) { + spin_unlock(&blkcg->lock); + blkg_put(blkg); + return ERR_PTR(ret); + } =20 - for (i =3D 0; i < BLKCG_MAX_POLS; i++) { - struct blkcg_policy *pol =3D blkcg_policy[i]; + hlist_add_head_rcu(&blkg->blkcg_node, &blkcg->blkg_list); + list_add(&blkg->q_node, &disk->queue->blkg_list); =20 - if (blkg->pd[i]) { - if (pol->pd_online_fn) - pol->pd_online_fn(blkg->pd[i]); - blkg->pd[i]->online =3D true; - } + for (i =3D 0; i < BLKCG_MAX_POLS; i++) { + struct blkcg_policy *pol =3D blkcg_policy[i]; + + if (blkg->pd[i]) { + if (pol->pd_online_fn) + pol->pd_online_fn(blkg->pd[i]); + blkg->pd[i]->online =3D true; } } + blkg->online =3D true; spin_unlock(&blkcg->lock); - - if (!ret) - return blkg; - - /* @blkg failed fully initialized, use the usual release path */ - blkg_put(blkg); - return ERR_PTR(ret); + return blkg; =20 err_put_css: css_put(&blkcg->css); err_free_blkg: - if (new_blkg) - blkg_free(new_blkg); + if (blkg) + blkg_free(blkg); return ERR_PTR(ret); } =20 @@ -498,7 +492,7 @@ static struct blkcg_gq *blkg_lookup_create(struct blkcg= *blkcg, parent =3D blkcg_parent(parent); } =20 - blkg =3D blkg_create(pos, disk, NULL); + blkg =3D blkg_create(pos, disk); if (IS_ERR(blkg)) { blkg =3D ret_blkg; break; @@ -880,7 +874,6 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct bl= kcg_policy *pol, while (true) { struct blkcg *pos =3D blkcg; struct blkcg *parent; - struct blkcg_gq *new_blkg; =20 parent =3D blkcg_parent(blkcg); while (parent && !blkg_lookup(parent, q)) { @@ -888,23 +881,14 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct = blkcg_policy *pol, parent =3D blkcg_parent(parent); } =20 - new_blkg =3D blkg_alloc(pos, disk, GFP_NOIO); - if (unlikely(!new_blkg)) { - ret =3D -ENOMEM; - goto fail_unlock; - } - if (!blkcg_policy_enabled(q, pol)) { - blkg_free(new_blkg); ret =3D -EOPNOTSUPP; goto fail_unlock; } =20 blkg =3D 
blkg_lookup(pos, q); - if (blkg) { - blkg_free(new_blkg); - } else { - blkg =3D blkg_create(pos, disk, new_blkg); + if (!blkg) { + blkg =3D blkg_create(pos, disk); if (IS_ERR(blkg)) { ret =3D PTR_ERR(blkg); goto fail_unlock; @@ -1469,27 +1453,18 @@ void blkg_init_queue(struct request_queue *q) int blkcg_init_disk(struct gendisk *disk) { struct request_queue *q =3D disk->queue; - struct blkcg_gq *new_blkg, *blkg; - - new_blkg =3D blkg_alloc(&blkcg_root, disk, GFP_KERNEL); - if (!new_blkg) - return -ENOMEM; + struct blkcg_gq *blkg; =20 + /* Make sure the root blkg exists. */ mutex_lock(&q->blkcg_mutex); + blkg =3D blkg_create(&blkcg_root, disk); + mutex_unlock(&q->blkcg_mutex); =20 - /* Make sure the root blkg exists. */ - blkg =3D blkg_create(&blkcg_root, disk, new_blkg); if (IS_ERR(blkg)) - goto err_unlock; - q->root_blkg =3D blkg; - - mutex_unlock(&q->blkcg_mutex); + return PTR_ERR(blkg); =20 + q->root_blkg =3D blkg; return 0; - -err_unlock: - mutex_unlock(&q->blkcg_mutex); - return PTR_ERR(blkg); } =20 void blkcg_exit_disk(struct gendisk *disk) --=20 2.51.0 From nobody Sun Feb 8 06:22:49 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7660727AC44; Fri, 10 Oct 2025 09:15:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087727; cv=none; b=POaZ1dNOcdpUJQN5DOJwCZGeaLOqhEhDyN4t3mwNFyQGd7hyGR9dAXfhZ6YHdYeReNg22aJTzHHsk8LFoEwDEQzpMNvffDryCgfAdlr68zE9IN0K42QPkIHajby5fToREDf/NrxXjMM9AtZHA6s66sOcZKWsirW7egzFlGny2p4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087727; c=relaxed/simple; bh=5UU4sesvV9+dhG0cfl65lKFBzq/uovJuWrNs1r3kNr0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JgaH5628vtpnq7gmQofbJyV9BM+67LeeXhrJP400emSXLfK0OOqzeCY7uIjNXrFx1Crw+Sr74tzLHuZEnNhnhJ3xXoPrx6iHXT0FAjbJgycCbT71u9ii/rG6SlzohbbNmHr+x6KffoIA58sN6aB6FEXLBmCFXcU32J+FSRGnDOg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=AhMkkYeR; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="AhMkkYeR" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 71013C4CEFE; Fri, 10 Oct 2025 09:15:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1760087727; bh=5UU4sesvV9+dhG0cfl65lKFBzq/uovJuWrNs1r3kNr0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=AhMkkYeR82SsEDde1Ipap7NRuMllA9iW6c5Y+KgG7VjxsSBwLnGlJEQFwmxcxu4s1 pGrqFNO3lPsN8j68ey9s1Of/SYn1W/Vho05Ktfbe2B9mhs8CUNzL+BkMKrHGA3Cgnu cj+STcglNWjr81v7Q/z2Kt6OpEmsqMPTEdx0pzynwRhJ1P8uWn+74V8R06OZPHJ1uq TCCJ8wIIfNSBvwPFmyOLUIoJ54oPAHqGSRdcl1+Jz+NzfXhXi1cdOBIHoQsEVQlzQz 4z5laSZiLxcQ3ICivl9pgHVrp/TmI1GydlfxN8vSNAqKjBcABFIN99zhR5Gnrj4fTc oOv8rPn5ow0uA== From: Yu Kuai To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de Cc: linux-kernel@vger.kernel.org, Yu Kuai Subject: [PATCH v2 12/19] blk-throttle: fix possible deadlock due to queue_lock in timer Date: Fri, 10 Oct 2025 17:14:37 +0800 Message-ID: 
<20251010091446.3048529-13-yukuai@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org> References: <20251010091446.3048529-1-yukuai@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yu Kuai Abusing queue_lock to protect blk-throttle can cause deadlock: 1) throtl_pending_timer_fn() will hold the lock, while throtl_pd_free() will flush the timer, this is fixed by protecting blkgs with blkcg_mutex instead of queue_lock by previous patches. 2) queue_lock can be held from hardirq context, hence if throtl_pending_timer_fn() is interrupted by hardirq, deadlock can be triggered as well. Stop abusing queue_lock to protect blk-throttle, and intorduce a new internal lock td->lock for protection. And now that the new lock won't be grabbed from hardirq context, it's safe to use spin_lock_bh() from thread context and spin_lock() directly from softirq context. Fixes: 6e1a5704cbbd ("blk-throttle: dispatch from throtl_pending_timer_fn()= ") Signed-off-by: Yu Kuai --- block/blk-throttle.c | 31 +++++++++++++------------------ 1 file changed, 13 insertions(+), 18 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index cb3bfdb4684a..7feaa2ef0a6b 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -33,6 +33,7 @@ static struct workqueue_struct *kthrotld_workqueue; =20 struct throtl_data { + spinlock_t lock; /* service tree for active throtl groups */ struct throtl_service_queue service_queue; =20 @@ -1149,7 +1150,7 @@ static void throtl_pending_timer_fn(struct timer_list= *t) else q =3D td->queue; =20 - spin_lock_irq(&q->queue_lock); + spin_lock(&td->lock); =20 if (!q->root_blkg) goto out_unlock; @@ -1175,9 +1176,9 @@ static void throtl_pending_timer_fn(struct timer_list= *t) break; =20 /* this dispatch windows is still open, relax and repeat */ - spin_unlock_irq(&q->queue_lock); + spin_unlock(&td->lock); cpu_relax(); - spin_lock_irq(&q->queue_lock); + spin_lock(&td->lock); } =20 if (!dispatched) @@ -1200,7 +1201,7 @@ static void throtl_pending_timer_fn(struct timer_list= *t) queue_work(kthrotld_workqueue, &td->dispatch_work); } out_unlock: - spin_unlock_irq(&q->queue_lock); + spin_unlock(&td->lock); } =20 /** @@ -1216,7 +1217,6 @@ static void blk_throtl_dispatch_work_fn(struct work_s= truct *work) struct throtl_data *td =3D container_of(work, struct throtl_data, dispatch_work); struct throtl_service_queue *td_sq =3D &td->service_queue; - struct request_queue *q =3D td->queue; struct bio_list bio_list_on_stack; struct bio *bio; struct blk_plug plug; @@ -1224,11 +1224,11 @@ static void blk_throtl_dispatch_work_fn(struct work= _struct *work) =20 bio_list_init(&bio_list_on_stack); =20 - spin_lock_irq(&q->queue_lock); + spin_lock_bh(&td->lock); for (rw =3D READ; rw <=3D WRITE; rw++) while ((bio =3D throtl_pop_queued(td_sq, NULL, rw))) bio_list_add(&bio_list_on_stack, bio); - spin_unlock_irq(&q->queue_lock); + spin_unlock_bh(&td->lock); =20 if (!bio_list_empty(&bio_list_on_stack)) { blk_start_plug(&plug); @@ -1306,7 +1306,7 @@ static void tg_conf_updated(struct throtl_grp *tg, bo= ol global) rcu_read_unlock(); =20 /* - * We're already holding queue_lock and know @tg is valid. Let's + * We're already holding td->lock and know @tg is valid. Let's * apply the new config directly. * * Restart the slices for both READ and WRITES. 
It might happen @@ -1333,6 +1333,7 @@ static int blk_throtl_init(struct gendisk *disk) if (!td) return -ENOMEM; =20 + spin_lock_init(&td->lock); INIT_WORK(&td->dispatch_work, blk_throtl_dispatch_work_fn); throtl_service_queue_init(&td->service_queue); =20 @@ -1703,12 +1704,7 @@ void blk_throtl_cancel_bios(struct gendisk *disk) if (!blk_throtl_activated(q)) return; =20 - spin_lock_irq(&q->queue_lock); - /* - * queue_lock is held, rcu lock is not needed here technically. - * However, rcu lock is still held to emphasize that following - * path need RCU protection and to prevent warning from lockdep. - */ + spin_lock_bh(&q->td->lock); rcu_read_lock(); blkg_for_each_descendant_post(blkg, pos_css, q->root_blkg) { /* @@ -1722,7 +1718,7 @@ void blk_throtl_cancel_bios(struct gendisk *disk) tg_flush_bios(blkg_to_tg(blkg)); } rcu_read_unlock(); - spin_unlock_irq(&q->queue_lock); + spin_unlock_bh(&q->td->lock); } =20 static bool tg_within_limit(struct throtl_grp *tg, struct bio *bio, bool r= w) @@ -1755,7 +1751,6 @@ static bool tg_within_limit(struct throtl_grp *tg, st= ruct bio *bio, bool rw) =20 bool __blk_throtl_bio(struct bio *bio) { - struct request_queue *q =3D bdev_get_queue(bio->bi_bdev); struct blkcg_gq *blkg =3D bio->bi_blkg; struct throtl_qnode *qn =3D NULL; struct throtl_grp *tg =3D blkg_to_tg(blkg); @@ -1765,7 +1760,7 @@ bool __blk_throtl_bio(struct bio *bio) struct throtl_data *td =3D tg->td; =20 rcu_read_lock(); - spin_lock_irq(&q->queue_lock); + spin_lock_bh(&td->lock); sq =3D &tg->service_queue; =20 while (true) { @@ -1841,7 +1836,7 @@ bool __blk_throtl_bio(struct bio *bio) } =20 out_unlock: - spin_unlock_irq(&q->queue_lock); + spin_unlock_bh(&td->lock); =20 rcu_read_unlock(); return throttled; --=20 2.51.0 From nobody Sun Feb 8 06:22:49 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D038F274B53; Fri, 10 Oct 2025 09:15:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087729; cv=none; b=QldSlcdBUG7ITHYdH2daajBNIS3GgJKBxrYliN/vHJAN8Qj0vIBpgSIIHe5J0QqOBLrTdmmDmDGj5YKbm4yRfgjwek4315jEtPfOW5xCyVaDBpMpnUhbG5cLYDeyNtdw8WOst+H5mcsSoj9ZbFJ0NmkPCUT3ytgNvkLC6Dj76Xk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087729; c=relaxed/simple; bh=32Ngjeq6kVC94iP3qBSCEC2DfH54K3j4MhShjpr2/Wc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ju/xBcRtfmxjBARxgICNgc6a43di7J3y2v7TYIpSSGRKA6tepXid44EwsPQLDqo1yjIZo7F/zQ6AD/nMsixzBBCaSxaN2bkOHUbJuoVxINuqsyyO3iwt4UfxLVkbcnyVFOjK8YHCYvay8v51/5VPPf1ERDLfph+4wG/L6vrbQok= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=UiO1TqJi; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="UiO1TqJi" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 819D8C4CEF8; Fri, 10 Oct 2025 09:15:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1760087729; bh=32Ngjeq6kVC94iP3qBSCEC2DfH54K3j4MhShjpr2/Wc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UiO1TqJiladMMYbFS5lPrCL+ogJQKeMfrxvfYp7mmZIuEpn5elzgbd2w2ZBrXW84F 
VtfpfBZQnTDl032+0TxDgjgFf6v1MUi5aAf9VFwVJmWSqf59kwg1kfDQD5AUaNQgPf /ntqMLA5bzZa/Bdp3Wh+T8IvMnxW2rUauj7Wl/Y0qjwri+5jgFtS3XBvTsVSav7rt7 yaEvhkoP2V3S6IvhZ/I/qsgLrplj8zuOAwr3KXfO8wryeAM5XxzAakKHwsXXl5Q1h3 P4O/lbWYW12zjVtkMCrNyRlpyiMjHm+C7h1uByIzBIs6ppZoW2ZWzLvyw29jp7461h xVMYLjSqSyIOg== From: Yu Kuai To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de Cc: linux-kernel@vger.kernel.org, Yu Kuai Subject: [PATCH v2 13/19] blk-cgroup: add new blkg configuration helpers Date: Fri, 10 Oct 2025 17:14:38 +0800 Message-ID: <20251010091446.3048529-14-yukuai@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org> References: <20251010091446.3048529-1-yukuai@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yu Kuai Currently there are many helpers to be used in different cases: - blkg_conf_open_bdev() - blkg_conf_open_bdev_frozen() - blkg_conf_prep() - blkg_conf_exit() - blkg_conf_exit_frozen() This patch introduce two new helpers: - blkg_conf_start() - blkg_conf_end() And following patches will convert all blkcg policy to use this two helpers. Signed-off-by: Yu Kuai --- block/blk-cgroup.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++ block/blk-cgroup.h | 3 +++ 2 files changed, 64 insertions(+) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 3c23d2d1e237..63089ae269cb 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -824,6 +824,67 @@ unsigned long __must_check blkg_conf_open_bdev_frozen(= struct blkg_conf_ctx *ctx) return memflags; } =20 +void blkg_conf_end(struct blkg_conf_ctx *ctx) +{ + struct request_queue *q =3D bdev_get_queue(ctx->bdev); + + mutex_unlock(&q->blkcg_mutex); + mutex_unlock(&q->rq_qos_mutex); + mutex_unlock(&q->elevator_lock); + blk_mq_unfreeze_queue(q, ctx->memflags); + blkdev_put_no_open(ctx->bdev); +} +EXPORT_SYMBOL_GPL(blkg_conf_end); + +int blkg_conf_start(struct blkcg *blkcg, struct blkg_conf_ctx *ctx) +{ + char *input =3D ctx->input; + unsigned int major, minor; + struct block_device *bdev; + struct request_queue *q; + int key_len; + + if (sscanf(input, "%u:%u%n", &major, &minor, &key_len) !=3D 2) + return -EINVAL; + + input +=3D key_len; + if (!isspace(*input)) + return -EINVAL; + + input =3D skip_spaces(input); + bdev =3D blkdev_get_no_open(MKDEV(major, minor), false); + if (!bdev) + return -ENODEV; + + if (bdev_is_partition(bdev)) { + blkdev_put_no_open(bdev); + return -ENODEV; + } + + if (!disk_live(bdev->bd_disk)) { + blkdev_put_no_open(bdev); + return -ENODEV; + } + + ctx->body =3D input; + ctx->bdev =3D bdev; + ctx->memflags =3D blk_mq_freeze_queue(ctx->bdev->bd_queue); +=09 + q =3D bdev->bd_queue; + mutex_lock(&q->elevator_lock); + mutex_lock(&q->rq_qos_mutex); + mutex_lock(&q->blkcg_mutex); + + ctx->blkg =3D blkg_lookup_create(blkcg, bdev->bd_disk); + if (!ctx->blkg) { + blkg_conf_end(ctx); + return -ENOMEM; + } + + return 0; +} +EXPORT_SYMBOL_GPL(blkg_conf_start); + /** * blkg_conf_prep - parse and prepare for per-blkg config update * @blkcg: target block cgroup diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h index 540be30aebcd..e7868989befb 100644 --- a/block/blk-cgroup.h +++ b/block/blk-cgroup.h @@ -217,6 +217,7 @@ struct blkg_conf_ctx { char *body; struct block_device *bdev; struct blkcg_gq *blkg; + unsigned 
long memflags; }; =20 void blkg_conf_init(struct blkg_conf_ctx *ctx, char *input); @@ -226,6 +227,8 @@ int blkg_conf_prep(struct blkcg *blkcg, const struct bl= kcg_policy *pol, struct blkg_conf_ctx *ctx); void blkg_conf_exit(struct blkg_conf_ctx *ctx); void blkg_conf_exit_frozen(struct blkg_conf_ctx *ctx, unsigned long memfla= gs); +void blkg_conf_end(struct blkg_conf_ctx *ctx); +int blkg_conf_start(struct blkcg *blkcg, struct blkg_conf_ctx *ctx); =20 /** * bio_issue_as_root_blkg - see if this bio needs to be issued as root blkg --=20 2.51.0 From nobody Sun Feb 8 06:22:49 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E2F28274B53; Fri, 10 Oct 2025 09:15:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087733; cv=none; b=OIxJNOsaXgs/0g3LsrcfkoUs13rgdKI+wU7U809e3cJBj2zEt7lIpgqFaenDOl5ghzdcPDmh9TuwJZEEu7TgNYqUNNbVGaT/dKf4Td2dS9MZ4jU7OvypSFXK0xR/189D7Qq4rbqgsrphI62ri3XfRvh0MqUQKfz8RSiusWi0Y4E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087733; c=relaxed/simple; bh=gKnwCpOoqD8n4UeyhgqRcCPLvd7xwvT4lg/c/NGhF6Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jGkfAEJNXzYsIAXCpi6GPlANrUnbcJlV5IHyPuQUOB4hldtQcHDVUxV6c9N1v05F/Ed9PjR487G1KuRzoTbnowMqx4K6O6VlAPODWdKbpovTCzKJsDVd2K6uaqchCZ5sRHlQbWJ1DyUJ+SK6zNJC4wSItNQmBCXUto+j/nNqurI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=furPqr22; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="furPqr22" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 32D2AC4CEF1; Fri, 10 Oct 2025 09:15:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1760087732; bh=gKnwCpOoqD8n4UeyhgqRcCPLvd7xwvT4lg/c/NGhF6Y=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=furPqr22+3mA4la5f2HPFFCM6upF21kpWnV2YKOIoKBMFfPtL8vpcwCOvanJE9KPm MgjJY5187rnC1xd1786BaDM8N6la0OrFEied4EUzlXJUU6V/xbNA1dfFj7wY8s8M4i 2SpK3Yy1sc7srD0MhB1kkMS4dgNKFAMihx+BwSjEhMIIdqjlejKV3hdVa+wOOarYiX tIuQL+jSAay1+EM/pVp6zqPbgR+ABBIRBebZ5nuxRBUlQj5AkKEE+Z4ivgMryKQP3B cPhGhGoNZGFlAwGI4G2eOuWEgMgBEZJPxbVORpNcL6OCN4eOLacUFxKuO4wAFDwDpa XrhmxxKBrQGlA== From: Yu Kuai To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de Cc: linux-kernel@vger.kernel.org, Yu Kuai Subject: [PATCH v2 14/19] blk-cgroup: factor out a helper __blkg_activate_policy() Date: Fri, 10 Oct 2025 17:14:39 +0800 Message-ID: <20251010091446.3048529-15-yukuai@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org> References: <20251010091446.3048529-1-yukuai@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yu Kuai Currently bfq policy is activated by initializing elevator, while others are activated by cgroupfs configuration. 
factor out a helper that blkcg_mutex is alread held to prepare use new helpers blkg_conf{start, end} for policys other than bfq. Signed-off-by: Yu Kuai --- block/blk-cgroup.c | 31 +++---------------------------- block/blk-cgroup.h | 34 +++++++++++++++++++++++++++++++++- 2 files changed, 36 insertions(+), 29 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 63089ae269cb..4b7324c1d0d5 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -1562,32 +1562,14 @@ struct cgroup_subsys io_cgrp_subsys =3D { }; EXPORT_SYMBOL_GPL(io_cgrp_subsys); =20 -/** - * blkcg_activate_policy - activate a blkcg policy on a gendisk - * @disk: gendisk of interest - * @pol: blkcg policy to activate - * - * Activate @pol on @disk. Requires %GFP_KERNEL context. @disk goes thro= ugh - * bypass mode to populate its blkgs with policy_data for @pol. - * - * Activation happens with @disk bypassed, so nobody would be accessing bl= kgs - * from IO path. Update of each blkg is protected by both queue and blkcg - * locks so that holding either lock and testing blkcg_policy_enabled() is - * always enough for dereferencing policy data. - * - * The caller is responsible for synchronizing [de]activations and policy - * [un]registerations. Returns 0 on success, -errno on failure. - */ -int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy = *pol) +int __blkcg_activate_policy(struct gendisk *disk, const struct blkcg_polic= y *pol) { struct request_queue *q =3D disk->queue; struct blkg_policy_data *pd_prealloc =3D NULL; struct blkcg_gq *blkg, *pinned_blkg =3D NULL; - unsigned int memflags; int ret; =20 - if (blkcg_policy_enabled(q, pol)) - return 0; + lockdep_assert_held(&q->blkcg_mutex); =20 /* * Policy is allowed to be registered without pd_alloc_fn/pd_free_fn, @@ -1597,10 +1579,6 @@ int blkcg_activate_policy(struct gendisk *disk, cons= t struct blkcg_policy *pol) if (WARN_ON_ONCE(!pol->pd_alloc_fn || !pol->pd_free_fn)) return -EINVAL; =20 - if (queue_is_mq(q)) - memflags =3D blk_mq_freeze_queue(q); - mutex_lock(&q->blkcg_mutex); - /* blkg_list is pushed at the head, reverse walk to initialize parents fi= rst */ list_for_each_entry_reverse(blkg, &q->blkg_list, q_node) { struct blkg_policy_data *pd; @@ -1640,9 +1618,6 @@ int blkcg_activate_policy(struct gendisk *disk, const= struct blkcg_policy *pol) ret =3D 0; =20 out: - mutex_unlock(&q->blkcg_mutex); - if (queue_is_mq(q)) - blk_mq_unfreeze_queue(q, memflags); if (pinned_blkg) blkg_put(pinned_blkg); if (pd_prealloc) @@ -1670,7 +1645,7 @@ int blkcg_activate_policy(struct gendisk *disk, const= struct blkcg_policy *pol) ret =3D -ENOMEM; goto out; } -EXPORT_SYMBOL_GPL(blkcg_activate_policy); +EXPORT_SYMBOL_GPL(__blkcg_activate_policy); =20 /** * blkcg_deactivate_policy - deactivate a blkcg policy on a gendisk diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h index e7868989befb..c3d16d52c275 100644 --- a/block/blk-cgroup.h +++ b/block/blk-cgroup.h @@ -200,7 +200,7 @@ void blkcg_exit_disk(struct gendisk *disk); /* Blkio controller policy registration */ int blkcg_policy_register(struct blkcg_policy *pol); void blkcg_policy_unregister(struct blkcg_policy *pol); -int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy = *pol); +int __blkcg_activate_policy(struct gendisk *disk, const struct blkcg_polic= y *pol); void blkcg_deactivate_policy(struct gendisk *disk, const struct blkcg_policy *pol); =20 @@ -465,6 +465,38 @@ static inline bool blkcg_policy_enabled(struct request= _queue *q, return pol && test_bit(pol->plid, 
q->blkcg_pols); } =20 +/** + * blkcg_activate_policy - activate a blkcg policy on a gendisk + * @disk: gendisk of interest + * @pol: blkcg policy to activate + * + * Activate @pol on @disk. Requires %GFP_KERNEL context. @disk goes thro= ugh + * bypass mode to populate its blkgs with policy_data for @pol. + * + * Activation happens with @disk bypassed, so nobody would be accessing bl= kgs + * from IO path. Update of each blkg is protected by both queue and blkcg + * locks so that holding either lock and testing blkcg_policy_enabled() is + * always enough for dereferencing policy data. + * + * The caller is responsible for synchronizing [de]activations and policy + * [un]registerations. Returns 0 on success, -errno on failure. + */ +static inline int blkcg_activate_policy(struct gendisk *disk, + const struct blkcg_policy *pol) +{ + struct request_queue *q =3D disk->queue; + int ret; + + if (blkcg_policy_enabled(q, pol)) + return 0; + + mutex_lock(&q->blkcg_mutex); + ret =3D __blkcg_activate_policy(disk, pol); + mutex_unlock(&q->blkcg_mutex); + + return ret; +} + void blk_cgroup_bio_start(struct bio *bio); void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta); #else /* CONFIG_BLK_CGROUP */ --=20 2.51.0 From nobody Sun Feb 8 06:22:49 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5D3D727E05E; Fri, 10 Oct 2025 09:15:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087735; cv=none; b=aCckvdRJLU+jZvamzvvrhmkMSM7mFqPuRgfXWy/w8dMQfjNk8LGN/Z05YN30trc60/g5yiQLnXTitIuXFemIRuUdSXblrS/fC/qWm6C6v7Vjzm//huqRXehC2q8B+wfHzplfs5I5uSqiytJHX9+pqHJ8EmeNN//VD4VFdnjJtD8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087735; c=relaxed/simple; bh=YmIrw+qqi3d4FLmLcRKt5MeMqkVsyFUH1PLrM/RP3/0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=k6kii8D1AELOGUWwwrpexuQwoO8cw0SV8Pf8FkMq0REDGbePQEffB2La/AY0zntgpvdJ36DfzpBIFsd2KNPr1mxnBRTPXHhqQL/DIKQvASahlECxNppBtYMXt0RtpP9BLB3k80HCMgeXrVEWbLawXGpw3lpU3hKpM5ddnSuaQJE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=HfULuDq4; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="HfULuDq4" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0D0EFC4CEF8; Fri, 10 Oct 2025 09:15:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1760087735; bh=YmIrw+qqi3d4FLmLcRKt5MeMqkVsyFUH1PLrM/RP3/0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HfULuDq4zdrUW3jqwerVDBxyp7PyhXsT+t3m+v7971AxCj6X/B3MNrAA0fax+OMP2 SNIj4rmQrXpxZGRVfa1I30dWVtNGj+3tIjTnvowTypk86JCQ+d365qA0y5FU4QjSil qdJqpRTYXLF3V/S0yqPQs97X8KaCPMCK6W7YHjSUI7xulMArD3ulf8lx3LEV9+X3Mi yna/9tRYeOxpXhpFPyLOyLaQKDyN7A5pLTWbbsscPvBXJJeHzRhvfELxg0bYbHJEui q93unoniByosnpax+Yu63m0r6c1o1SL0e91M/2WPadRVDYXVmlBLbTjZuiCzjNSOFJ 48pA2ORfUuw+A== From: Yu Kuai To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de Cc: linux-kernel@vger.kernel.org, Yu Kuai Subject: [PATCH v2 15/19] 
blk-throttle: convert to use blkg_conf_{start, end} Date: Fri, 10 Oct 2025 17:14:40 +0800 Message-ID: <20251010091446.3048529-16-yukuai@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org> References: <20251010091446.3048529-1-yukuai@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yu Kuai No functional changes are intended, make code cleaner. Signed-off-by: Yu Kuai --- block/blk-throttle.c | 29 +++++++++-------------------- 1 file changed, 9 insertions(+), 20 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c index 7feaa2ef0a6b..761499feed5e 100644 --- a/block/blk-throttle.c +++ b/block/blk-throttle.c @@ -1326,7 +1326,6 @@ static int blk_throtl_init(struct gendisk *disk) { struct request_queue *q =3D disk->queue; struct throtl_data *td; - unsigned int memflags; int ret; =20 td =3D kzalloc_node(sizeof(*td), GFP_KERNEL, q->node); @@ -1337,14 +1336,13 @@ static int blk_throtl_init(struct gendisk *disk) INIT_WORK(&td->dispatch_work, blk_throtl_dispatch_work_fn); throtl_service_queue_init(&td->service_queue); =20 - memflags =3D blk_mq_freeze_queue(disk->queue); blk_mq_quiesce_queue(disk->queue); =20 q->td =3D td; td->queue =3D q; =20 /* activate policy, blk_throtl_activated() will return true */ - ret =3D blkcg_activate_policy(disk, &blkcg_policy_throtl); + ret =3D __blkcg_activate_policy(disk, &blkcg_policy_throtl); if (ret) { q->td =3D NULL; kfree(td); @@ -1361,7 +1359,6 @@ static int blk_throtl_init(struct gendisk *disk) =20 out: blk_mq_unquiesce_queue(disk->queue); - blk_mq_unfreeze_queue(disk->queue, memflags); =20 return ret; } @@ -1377,10 +1374,9 @@ static ssize_t tg_set_conf(struct kernfs_open_file *= of, u64 v; =20 blkg_conf_init(&ctx, buf); - - ret =3D blkg_conf_open_bdev(&ctx); + ret =3D blkg_conf_start(blkcg, &ctx); if (ret) - goto out_finish; + return ret; =20 if (!blk_throtl_activated(ctx.bdev->bd_queue)) { ret =3D blk_throtl_init(ctx.bdev->bd_disk); @@ -1388,10 +1384,6 @@ static ssize_t tg_set_conf(struct kernfs_open_file *= of, goto out_finish; } =20 - ret =3D blkg_conf_prep(blkcg, &blkcg_policy_throtl, &ctx); - if (ret) - goto out_finish; - ret =3D -EINVAL; if (sscanf(ctx.body, "%llu", &v) !=3D 1) goto out_finish; @@ -1408,8 +1400,9 @@ static ssize_t tg_set_conf(struct kernfs_open_file *o= f, =20 tg_conf_updated(tg, false); ret =3D 0; + out_finish: - blkg_conf_exit(&ctx); + blkg_conf_end(&ctx); return ret ?: nbytes; } =20 @@ -1561,10 +1554,9 @@ static ssize_t tg_set_limit(struct kernfs_open_file = *of, int ret; =20 blkg_conf_init(&ctx, buf); - - ret =3D blkg_conf_open_bdev(&ctx); + ret =3D blkg_conf_start(blkcg, &ctx); if (ret) - goto out_finish; + return ret; =20 if (!blk_throtl_activated(ctx.bdev->bd_queue)) { ret =3D blk_throtl_init(ctx.bdev->bd_disk); @@ -1572,10 +1564,6 @@ static ssize_t tg_set_limit(struct kernfs_open_file = *of, goto out_finish; } =20 - ret =3D blkg_conf_prep(blkcg, &blkcg_policy_throtl, &ctx); - if (ret) - goto out_finish; - tg =3D blkg_to_tg(ctx.blkg); tg_update_carryover(tg); =20 @@ -1626,8 +1614,9 @@ static ssize_t tg_set_limit(struct kernfs_open_file *= of, =20 tg_conf_updated(tg, false); ret =3D 0; + out_finish: - blkg_conf_exit(&ctx); + blkg_conf_end(&ctx); return ret ?: nbytes; } =20 --=20 2.51.0 From nobody Sun Feb 8 06:22:49 2026 Received: from smtp.kernel.org 
(aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B5CDE26F2BC; Fri, 10 Oct 2025 09:15:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087738; cv=none; b=k8yWxGZY5YpcUhkbMMxqARec31LMiXXfa6kyVGgE2kLCFpkdwICTH5sAMWXP1Z5d2N0v2lP9ss4dsnbPPg25nm085Gkq8jKoyb9M/VVWcZbWRNGfXhYXBFdDyX0g2EvOnFoRSa92AkFKvauFJq4VOQYEOF427YqXUJJ3HIH53f8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087738; c=relaxed/simple; bh=HR7CFnUsZ2dNXHDx2BPlySPGR24lHcrJ1SInaPl9BH8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=YXXmJ0V35vbEa3qi9JAT1Uay9lTF1XsbEeLpMAZNldcrw9sToneeT3ZqKxm2IcssUNF44Oj8wkviiFjarX9v/K8Hvcik155J7sWA60g+mlY7CPWlOw2b69JwQSLN8b0y9fhQOq+q9Dv42LeOkYs329NWD1bjy5ki8Mh/ETt/haA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=pkNz+mrA; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="pkNz+mrA" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B24A7C4CEF1; Fri, 10 Oct 2025 09:15:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1760087738; bh=HR7CFnUsZ2dNXHDx2BPlySPGR24lHcrJ1SInaPl9BH8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=pkNz+mrAAi9cH4SSeTgDb7vJkYjB31dwvRo/5LEEpKKVYELUaqoV0XqETOKpR+398 I+KoU/GwV/EWgXvyZ0U/BghZ20pUTx3yTFl5UqyDRGmA42KF4dneW5Z1aHRKsRtx9x HtxLDWjJY4JvF7Ifa8UQ2a5ww+LhM93BvSK5mRHLcSjag3X0p5TFRCb5qIyiKtCeX6 FLbtP3RBB0+s+utZHyNhxtvlE0caLkrXej442SCcaz1C9N9Tqwiuy/tR9YS01BEGtk NcI6AJIVfZXuMoIRumdrqEgZbdv+BXgp1S2z3+XU+dEIeDWlVnWyofHuv8+X7oQ88/ 4vYSyjp2nLVtw== From: Yu Kuai To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de Cc: linux-kernel@vger.kernel.org, Yu Kuai Subject: [PATCH v2 16/19] block, bfq: convert to use blkg_conf_{start, end} Date: Fri, 10 Oct 2025 17:14:41 +0800 Message-ID: <20251010091446.3048529-17-yukuai@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org> References: <20251010091446.3048529-1-yukuai@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yu Kuai Make code cleaner, we'll create new blkg and then return error if bfq is not enabled for the device, this is fine because this is super cold path. 
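To make the resulting shape concrete, the per-policy write handlers in this and the neighbouring conversion patches all reduce to roughly the sketch below (condensed, not compile-tested). example_write() and blkg_to_example() are placeholder names; the blkg_conf_*() helpers are the ones introduced earlier in the series:

	static ssize_t example_write(struct kernfs_open_file *of, char *buf,
				     size_t nbytes, loff_t off)
	{
		struct blkcg *blkcg = css_to_blkcg(of_css(of));
		struct blkg_conf_ctx ctx;
		int ret;

		blkg_conf_init(&ctx, buf);
		/* parse MAJ:MIN, freeze the queue, take the locks, create the blkg */
		ret = blkg_conf_start(blkcg, &ctx);
		if (ret)
			return ret;

		/* policy not enabled on this queue: no policy data on the blkg */
		if (!blkg_to_example(ctx.blkg)) {
			ret = -EOPNOTSUPP;
			goto out;
		}

		/* ... parse ctx.body and apply the configuration ... */
		ret = 0;
	out:
		blkg_conf_end(&ctx);	/* drop the locks, unfreeze, put the bdev */
		return ret ?: nbytes;
	}
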
Signed-off-by: Yu Kuai --- block/bfq-cgroup.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index 43790ae91b57..d39c7a5db35d 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -1056,10 +1056,9 @@ static ssize_t bfq_io_set_device_weight(struct kernf= s_open_file *of, u64 v; =20 blkg_conf_init(&ctx, buf); - - ret =3D blkg_conf_prep(blkcg, &blkcg_policy_bfq, &ctx); + ret =3D blkg_conf_start(blkcg, &ctx); if (ret) - goto out; + return ret; =20 if (sscanf(ctx.body, "%llu", &v) =3D=3D 1) { /* require "default" on dfl */ @@ -1074,6 +1073,10 @@ static ssize_t bfq_io_set_device_weight(struct kernf= s_open_file *of, } =20 bfqg =3D blkg_to_bfqg(ctx.blkg); + if (!bfqg) { + ret =3D -EOPNOTSUPP; + goto out; + } =20 ret =3D -ERANGE; if (!v || (v >=3D BFQ_MIN_WEIGHT && v <=3D BFQ_MAX_WEIGHT)) { @@ -1081,7 +1084,7 @@ static ssize_t bfq_io_set_device_weight(struct kernfs= _open_file *of, ret =3D 0; } out: - blkg_conf_exit(&ctx); + blkg_conf_end(&ctx); return ret ?: nbytes; } =20 --=20 2.51.0 From nobody Sun Feb 8 06:22:49 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 29AA9217723; Fri, 10 Oct 2025 09:15:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087741; cv=none; b=pSZIwQa/8UAhR9q6xqUvINVke6iolw2OYZU/KRKTDIhEeIJXhaHCxfCdLMW7Wqmd9kIrsU2+XjagvXU8JvsTRg5loIX/3u1i9DXEH20hK4D8PpHUc3cZu1YTvwfDekKUCvRujLkKZrTm1aBiC+q77hXxMdy1YfwWNRRlC3UUy/k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087741; c=relaxed/simple; bh=R9L4Dfzvxc48pIpRSH2GwrKBjgbQXWcbaCisT1DQcVI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KCplrfp6APU0Udo1faeqG5M7T2gQoJGtJ8fzzz4fs8Qwwd4PAyXVKsrOKb7LgWPPlKYyi2Jd+/KgB6Pu5G7qSQnti2OW9ivbpeEQGq8wlx8NFs1zi+IU6mLmlM6hvueRF862QptNvlUycnlJrxg6XXc71TIQm00PkY/wxt6DpLI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=B5SV/D+Q; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="B5SV/D+Q" Received: by smtp.kernel.org (Postfix) with ESMTPSA id CD6B8C4CEF8; Fri, 10 Oct 2025 09:15:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1760087741; bh=R9L4Dfzvxc48pIpRSH2GwrKBjgbQXWcbaCisT1DQcVI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=B5SV/D+QlEUmh3ZuhCIqcng4zG0GCCkFGbiEg8ie+ryPtdnPNLIkW+8KBUnNUZCii RkFgDTjJ9lPxsaXWv7U0WEUoK5IsxwSQiWMGC+uL0grK92MGaYSwv7gpcr1U55JrSf h3GiT1Z+sPds8cBCcb2SaAFgpZ1wZCKUz8F9TR+S7i9F4rLIEwJy04GO6s12SQC/lU wAXzOo0wPxINFL7paX3e4zfUwz/QyoAh3iqcvSWSp/FE3SPbNUIS+CYUpnWlRkRODi mmfqIr5qRJAIfZ9gfSjmUP+tVzoQNJkNv94sxr9UEpwld2hz0JKaKNymBVyutwL9DB UjthDQnAtZtuw== From: Yu Kuai To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de Cc: linux-kernel@vger.kernel.org, Yu Kuai Subject: [PATCH v2 17/19] blk-iolatency: convert to use blkg_conf_{start, end} Date: Fri, 10 Oct 2025 17:14:42 +0800 Message-ID: <20251010091446.3048529-18-yukuai@kernel.org> X-Mailer: 
git-send-email 2.51.0 In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org> References: <20251010091446.3048529-1-yukuai@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yu Kuai No functional changes are intended, make code cleaner. Signed-off-by: Yu Kuai --- block/blk-iolatency.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c index ce25fbb8aaf6..1199447a2a33 100644 --- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -768,7 +768,7 @@ static int blk_iolatency_init(struct gendisk *disk) &blkcg_iolatency_ops); if (ret) goto err_free; - ret =3D blkcg_activate_policy(disk, &blkcg_policy_iolatency); + ret =3D __blkcg_activate_policy(disk, &blkcg_policy_iolatency); if (ret) goto err_qos_del; =20 @@ -837,10 +837,9 @@ static ssize_t iolatency_set_limit(struct kernfs_open_= file *of, char *buf, int ret; =20 blkg_conf_init(&ctx, buf); - - ret =3D blkg_conf_open_bdev(&ctx); + ret =3D blkg_conf_start(blkcg, &ctx); if (ret) - goto out; + return ret; =20 /* * blk_iolatency_init() may fail after rq_qos_add() succeeds which can @@ -852,10 +851,6 @@ static ssize_t iolatency_set_limit(struct kernfs_open_= file *of, char *buf, if (ret) goto out; =20 - ret =3D blkg_conf_prep(blkcg, &blkcg_policy_iolatency, &ctx); - if (ret) - goto out; - iolat =3D blkg_to_lat(ctx.blkg); p =3D ctx.body; =20 @@ -890,7 +885,7 @@ static ssize_t iolatency_set_limit(struct kernfs_open_f= ile *of, char *buf, iolatency_clear_scaling(blkg); ret =3D 0; out: - blkg_conf_exit(&ctx); + blkg_conf_end(&ctx); return ret ?: nbytes; } =20 --=20 2.51.0 From nobody Sun Feb 8 06:22:49 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D0CEC275AF3; Fri, 10 Oct 2025 09:15:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087743; cv=none; b=bfgD34A2NkEYumAQ/rmSNjY7jHQjVtHMhY0g827atzF9WefKASMC7YwvMBAqG2KkatOkUbuoybCMjpl0GRjLKS94J1B6jIIMeAOyt7YpM0WcgXUrDq49fjkiMdVzLBchEVUyqOZtoipNI2MoXpHEWFlHZA3vpKvUgc0fJFXaMdw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087743; c=relaxed/simple; bh=oKPt0bnqcsiByTXgVTcFXXKhtQ5JAVyT2nx9cy0X76Q=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jsoy0Vn3Es6SH5nw1M3ZrabSO7S1mFTVvvjd500fTomfp5NTM7xVxnNbSo5bSLR0hmF9VSl0fooHxgCFUYMbCzRznCmGlFBcrf+3ZZu1a9gi74284H6ZEtBYUkaDizE8Ly/3qZ0XM2Nj8MzvOZO6vLEAkhlLd3PJ9lhJrsy9PRM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SraVnF5N; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SraVnF5N" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7E737C4CEF8; Fri, 10 Oct 2025 09:15:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1760087743; bh=oKPt0bnqcsiByTXgVTcFXXKhtQ5JAVyT2nx9cy0X76Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=SraVnF5NyLo2wRyBwtzb2jLADe9dt/JNLTGOovcsgNAx39qzYE6BVqPBlxbgq2kZn FUArqWYrqUHeeW8hgK48jHbV5R1qOtBmqR3BggSVL7w2SiDS/ipk3/VGg++POXTn1D pSDP7Uog9NKX8piecLSkFb4tXShtJPO6a9ITeoThjcLdex6W6AwT2nekIVkBm7J4l2 BQbNRY5aO5qQMEyxLZDOj/IcbVvJ6jjf0gW6iFZhRCqeV/25SZE+we4Eriuj5Aasdh UVnwLVAlFp7RMokD8gUVo89G4ZFwkW4OBh7y37an0JApPReVGcBWvakw6xY9cSKbhw p/J4odtoS9UTQ== From: Yu Kuai To: axboe@kernel.dk, tj@kernel.org, linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de Cc: linux-kernel@vger.kernel.org, Yu Kuai Subject: [PATCH v2 18/19] blk-iocost: convert to use blkg_conf_{start, end} Date: Fri, 10 Oct 2025 17:14:43 +0800 Message-ID: <20251010091446.3048529-19-yukuai@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org> References: <20251010091446.3048529-1-yukuai@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yu Kuai No functional changes are intended, make code cleaner. Signed-off-by: Yu Kuai --- block/blk-iocost.c | 47 +++++++++++++++++++++------------------------- 1 file changed, 21 insertions(+), 26 deletions(-) diff --git a/block/blk-iocost.c b/block/blk-iocost.c index 3593547930cc..de3862acb297 100644 --- a/block/blk-iocost.c +++ b/block/blk-iocost.c @@ -2931,7 +2931,7 @@ static int blk_iocost_init(struct gendisk *disk) if (ret) goto err_free_ioc; =20 - ret =3D blkcg_activate_policy(disk, &blkcg_policy_iocost); + ret =3D __blkcg_activate_policy(disk, &blkcg_policy_iocost); if (ret) goto err_del_qos; return 0; @@ -3140,12 +3140,15 @@ static ssize_t ioc_weight_write(struct kernfs_open_= file *of, char *buf, } =20 blkg_conf_init(&ctx, buf); - - ret =3D blkg_conf_prep(blkcg, &blkcg_policy_iocost, &ctx); + ret =3D blkg_conf_start(blkcg, &ctx); if (ret) - goto err; + return ret; =20 iocg =3D blkg_to_iocg(ctx.blkg); + if (!iocg) { + ret =3D -EOPNOTSUPP; + goto err; + } =20 if (!strncmp(ctx.body, "default", 7)) { v =3D 0; @@ -3162,13 +3165,13 @@ static ssize_t ioc_weight_write(struct kernfs_open_= file *of, char *buf, weight_updated(iocg, &now); spin_unlock(&iocg->ioc->lock); =20 - blkg_conf_exit(&ctx); + blkg_conf_end(&ctx); return nbytes; =20 einval: ret =3D -EINVAL; err: - blkg_conf_exit(&ctx); + blkg_conf_end(&ctx); return ret; } =20 @@ -3226,22 +3229,19 @@ static const match_table_t qos_tokens =3D { static ssize_t ioc_qos_write(struct kernfs_open_file *of, char *input, size_t nbytes, loff_t off) { + struct blkcg *blkcg =3D css_to_blkcg(of_css(of)); struct blkg_conf_ctx ctx; struct gendisk *disk; struct ioc *ioc; u32 qos[NR_QOS_PARAMS]; bool enable, user; char *body, *p; - unsigned long memflags; int ret; =20 blkg_conf_init(&ctx, input); - - memflags =3D blkg_conf_open_bdev_frozen(&ctx); - if (IS_ERR_VALUE(memflags)) { - ret =3D memflags; - goto err; - } + ret =3D blkg_conf_start(blkcg, &ctx); + if (ret) + return ret; =20 body =3D ctx.body; disk =3D ctx.bdev->bd_disk; @@ -3358,14 +3358,14 @@ static ssize_t ioc_qos_write(struct kernfs_open_fil= e *of, char *input, =20 blk_mq_unquiesce_queue(disk->queue); =20 - blkg_conf_exit_frozen(&ctx, memflags); + blkg_conf_end(&ctx); return nbytes; einval: spin_unlock_irq(&ioc->lock); blk_mq_unquiesce_queue(disk->queue); ret =3D -EINVAL; err: - blkg_conf_exit_frozen(&ctx, memflags); + blkg_conf_end(&ctx); return ret; } =20 @@ -3418,9 +3418,9 @@ static const match_table_t 
i_lcoef_tokens =3D { static ssize_t ioc_cost_model_write(struct kernfs_open_file *of, char *inp= ut, size_t nbytes, loff_t off) { + struct blkcg *blkcg =3D css_to_blkcg(of_css(of)); struct blkg_conf_ctx ctx; struct request_queue *q; - unsigned int memflags; struct ioc *ioc; u64 u[NR_I_LCOEFS]; bool user; @@ -3428,10 +3428,9 @@ static ssize_t ioc_cost_model_write(struct kernfs_op= en_file *of, char *input, int ret; =20 blkg_conf_init(&ctx, input); - - ret =3D blkg_conf_open_bdev(&ctx); + ret =3D blkg_conf_start(blkcg, &ctx); if (ret) - goto err; + return ret; =20 body =3D ctx.body; q =3D bdev_get_queue(ctx.bdev); @@ -3448,10 +3447,9 @@ static ssize_t ioc_cost_model_write(struct kernfs_op= en_file *of, char *input, ioc =3D q_to_ioc(q); } =20 - memflags =3D blk_mq_freeze_queue(q); blk_mq_quiesce_queue(q); - spin_lock_irq(&ioc->lock); + memcpy(u, ioc->params.i_lcoefs, sizeof(u)); user =3D ioc->user_cost_model; =20 @@ -3500,20 +3498,17 @@ static ssize_t ioc_cost_model_write(struct kernfs_o= pen_file *of, char *input, spin_unlock_irq(&ioc->lock); =20 blk_mq_unquiesce_queue(q); - blk_mq_unfreeze_queue(q, memflags); =20 - blkg_conf_exit(&ctx); + blkg_conf_end(&ctx); return nbytes; =20 einval: spin_unlock_irq(&ioc->lock); - blk_mq_unquiesce_queue(q); - blk_mq_unfreeze_queue(q, memflags); =20 ret =3D -EINVAL; err: - blkg_conf_exit(&ctx); + blkg_conf_end(&ctx); return ret; } =20 --=20 2.51.0 From nobody Sun Feb 8 06:22:49 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E27A428153A; Fri, 10 Oct 2025 09:15:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087747; cv=none; b=SF/lkeYZQpxULZVZaYZwtpV/0Tk9lYVdGCrz3UHV4a3/yjH8gT1Z1jmGY0SYl723VrwtIQHyVpocPi/+twSrtLStr1i1St2SpMZt0+N5ByAcxL/3K4Qvb87MQLVMmEQV/Ef39bv8Bzn/P6XS17WDlNSeuTxRea1e8GvsEANKSxY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760087747; c=relaxed/simple; bh=cJvhoPRiXOlxs+zH/XVaJ4God3JTKZGfA+vj8Ps5Zvc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=cxFBwBQ997lFr6AXS0cTQT4uQh59+9unSuJf3dcICH1dXowUzs1GErzHqM8YO8T47V0yYbSTT2c1O5/ZSQ/V+8tccYbNPT//vOYMYQHcnyb2VRnSMk6vTKSFYns8HvhX6NsGgxvdKnwSExNecGc4cX8yhHce1OUkh+eRSZyPF+s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=tGveJ0K3; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="tGveJ0K3" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 919E7C4CEF1; Fri, 10 Oct 2025 09:15:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1760087746; bh=cJvhoPRiXOlxs+zH/XVaJ4God3JTKZGfA+vj8Ps5Zvc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tGveJ0K3n9rjtc9cLpwOOkTxehk838nZLm/2mI3EY2pKs8/pdEzRwM1lST9GyRequ Z9Cglgz0UXnPaTS8DNnpXmgjB9nroM3FMi0Ry9OZW61u8ufYcZAVIBfjtU6/URgOHs BcJJA/xXtUTstv5ota6E0RsByeGbsVF/RDGHfR4UUkvZRmbupT3ebNwZg11P1OKLKf 05QYAz+9Z7mtIkWhUwHFX52gnNTNTD4X/eUaRnhKIXDhiEpgULHajRIvLCCLy2Hid2 hSbjoKqHP86kLB7bvNLzS/8yyqrzGL7eTBuushkecsl0cQUTN0OThYkxiojnv/S6Qz 4vT1hKFRznFiA== From: Yu Kuai To: axboe@kernel.dk, tj@kernel.org, 
linux-block@vger.kernel.org, cgroups@vger.kernel.org, nilay@linux.ibm.com, bvanassche@acm.org, ming.lei@redhat.com, hch@lst.de Cc: linux-kernel@vger.kernel.org, Yu Kuai Subject: [PATCH v2 19/19] blk-cgroup: remove unsed blkg configuration helpers Date: Fri, 10 Oct 2025 17:14:44 +0800 Message-ID: <20251010091446.3048529-20-yukuai@kernel.org> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20251010091446.3048529-1-yukuai@kernel.org> References: <20251010091446.3048529-1-yukuai@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable From: Yu Kuai Remove followings helpers that is now unused: - blkg_conf_open_bdev() - blkg_conf_open_bdev_frozen() - blkg_conf_prep() - blkg_conf_exit() - blkg_conf_exit_frozen() Signed-off-by: Yu Kuai --- block/blk-cgroup.c | 224 +-------------------------------------------- block/blk-cgroup.h | 6 -- 2 files changed, 1 insertion(+), 229 deletions(-) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 4b7324c1d0d5..d93654334854 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -729,8 +729,7 @@ EXPORT_SYMBOL_GPL(__blkg_prfill_u64); * @input: input string * * Initialize @ctx which can be used to parse blkg config input string @in= put. - * Once initialized, @ctx can be used with blkg_conf_open_bdev() and - * blkg_conf_prep(), and must be cleaned up with blkg_conf_exit(). + * Once initialized, @ctx can be used with blkg_conf_start(). */ void blkg_conf_init(struct blkg_conf_ctx *ctx, char *input) { @@ -738,92 +737,6 @@ void blkg_conf_init(struct blkg_conf_ctx *ctx, char *i= nput) } EXPORT_SYMBOL_GPL(blkg_conf_init); =20 -/** - * blkg_conf_open_bdev - parse and open bdev for per-blkg config update - * @ctx: blkg_conf_ctx initialized with blkg_conf_init() - * - * Parse the device node prefix part, MAJ:MIN, of per-blkg config update f= rom - * @ctx->input and get and store the matching bdev in @ctx->bdev. @ctx->bo= dy is - * set to point past the device node prefix. - * - * This function may be called multiple times on @ctx and the extra calls = become - * NOOPs. blkg_conf_prep() implicitly calls this function. Use this functi= on - * explicitly if bdev access is needed without resolving the blkcg / polic= y part - * of @ctx->input. Returns -errno on error. - */ -int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx) -{ - char *input =3D ctx->input; - unsigned int major, minor; - struct block_device *bdev; - int key_len; - - if (ctx->bdev) - return 0; - - if (sscanf(input, "%u:%u%n", &major, &minor, &key_len) !=3D 2) - return -EINVAL; - - input +=3D key_len; - if (!isspace(*input)) - return -EINVAL; - input =3D skip_spaces(input); - - bdev =3D blkdev_get_no_open(MKDEV(major, minor), false); - if (!bdev) - return -ENODEV; - if (bdev_is_partition(bdev)) { - blkdev_put_no_open(bdev); - return -ENODEV; - } - - mutex_lock(&bdev->bd_queue->rq_qos_mutex); - if (!disk_live(bdev->bd_disk)) { - blkdev_put_no_open(bdev); - mutex_unlock(&bdev->bd_queue->rq_qos_mutex); - return -ENODEV; - } - - ctx->body =3D input; - ctx->bdev =3D bdev; - return 0; -} -/* - * Similar to blkg_conf_open_bdev, but additionally freezes the queue, - * acquires q->elevator_lock, and ensures the correct locking order - * between q->elevator_lock and q->rq_qos_mutex. - * - * This function returns negative error on failure. 
On success it returns - * memflags which must be saved and later passed to blkg_conf_exit_frozen - * for restoring the memalloc scope. - */ -unsigned long __must_check blkg_conf_open_bdev_frozen(struct blkg_conf_ctx= *ctx) -{ - int ret; - unsigned long memflags; - - if (ctx->bdev) - return -EINVAL; - - ret =3D blkg_conf_open_bdev(ctx); - if (ret < 0) - return ret; - /* - * At this point, we haven=E2=80=99t started protecting anything related = to QoS, - * so we release q->rq_qos_mutex here, which was first acquired in blkg_ - * conf_open_bdev. Later, we re-acquire q->rq_qos_mutex after freezing - * the queue and acquiring q->elevator_lock to maintain the correct - * locking order. - */ - mutex_unlock(&ctx->bdev->bd_queue->rq_qos_mutex); - - memflags =3D blk_mq_freeze_queue(ctx->bdev->bd_queue); - mutex_lock(&ctx->bdev->bd_queue->elevator_lock); - mutex_lock(&ctx->bdev->bd_queue->rq_qos_mutex); - - return memflags; -} - void blkg_conf_end(struct blkg_conf_ctx *ctx) { struct request_queue *q =3D bdev_get_queue(ctx->bdev); @@ -885,141 +798,6 @@ int blkg_conf_start(struct blkcg *blkcg, struct blkg_= conf_ctx *ctx) } EXPORT_SYMBOL_GPL(blkg_conf_start); =20 -/** - * blkg_conf_prep - parse and prepare for per-blkg config update - * @blkcg: target block cgroup - * @pol: target policy - * @ctx: blkg_conf_ctx initialized with blkg_conf_init() - * - * Parse per-blkg config update from @ctx->input and initialize @ctx - * accordingly. On success, @ctx->body points to the part of @ctx->input - * following MAJ:MIN, @ctx->bdev points to the target block device and - * @ctx->blkg to the blkg being configured. - * - * blkg_conf_open_bdev() may be called on @ctx beforehand. On success, this - * function returns with queue lock held and must be followed by - * blkg_conf_exit(). - */ -int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol, - struct blkg_conf_ctx *ctx) - __acquires(&bdev->bd_queue->blkcg_mutex) -{ - struct gendisk *disk; - struct request_queue *q; - struct blkcg_gq *blkg; - int ret; - - ret =3D blkg_conf_open_bdev(ctx); - if (ret) - return ret; - - disk =3D ctx->bdev->bd_disk; - q =3D disk->queue; - - /* Prevent concurrent with blkcg_deactivate_policy() */ - mutex_lock(&q->blkcg_mutex); - - if (!blkcg_policy_enabled(q, pol)) { - ret =3D -EOPNOTSUPP; - goto fail_unlock; - } - - blkg =3D blkg_lookup(blkcg, q); - if (blkg) - goto success; - - /* - * Create blkgs walking down from blkcg_root to @blkcg, so that all - * non-root blkgs have access to their parents. - */ - while (true) { - struct blkcg *pos =3D blkcg; - struct blkcg *parent; - - parent =3D blkcg_parent(blkcg); - while (parent && !blkg_lookup(parent, q)) { - pos =3D parent; - parent =3D blkcg_parent(parent); - } - - if (!blkcg_policy_enabled(q, pol)) { - ret =3D -EOPNOTSUPP; - goto fail_unlock; - } - - blkg =3D blkg_lookup(pos, q); - if (!blkg) { - blkg =3D blkg_create(pos, disk); - if (IS_ERR(blkg)) { - ret =3D PTR_ERR(blkg); - goto fail_unlock; - } - } - - if (pos =3D=3D blkcg) - goto success; - } -success: - ctx->blkg =3D blkg; - return 0; - -fail_unlock: - mutex_unlock(&q->blkcg_mutex); - /* - * If queue was bypassing, we should retry. Do so after a - * short msleep(). It isn't strictly necessary but queue - * can be bypassing for some time and it's always nice to - * avoid busy looping. 
- */ - if (ret =3D=3D -EBUSY) { - msleep(10); - ret =3D restart_syscall(); - } - return ret; -} -EXPORT_SYMBOL_GPL(blkg_conf_prep); - -/** - * blkg_conf_exit - clean up per-blkg config update - * @ctx: blkg_conf_ctx initialized with blkg_conf_init() - * - * Clean up after per-blkg config update. This function must be called on = all - * blkg_conf_ctx's initialized with blkg_conf_init(). - */ -void blkg_conf_exit(struct blkg_conf_ctx *ctx) - __releases(&ctx->bdev->bd_queue->blkcg_mutex) - __releases(&ctx->bdev->bd_queue->rq_qos_mutex) -{ - if (ctx->blkg) { - mutex_unlock(&bdev_get_queue(ctx->bdev)->blkcg_mutex); - ctx->blkg =3D NULL; - } - - if (ctx->bdev) { - mutex_unlock(&ctx->bdev->bd_queue->rq_qos_mutex); - blkdev_put_no_open(ctx->bdev); - ctx->body =3D NULL; - ctx->bdev =3D NULL; - } -} -EXPORT_SYMBOL_GPL(blkg_conf_exit); - -/* - * Similar to blkg_conf_exit, but also unfreezes the queue and releases - * q->elevator_lock. Should be used when blkg_conf_open_bdev_frozen - * is used to open the bdev. - */ -void blkg_conf_exit_frozen(struct blkg_conf_ctx *ctx, unsigned long memfla= gs) -{ - if (ctx->bdev) { - struct request_queue *q =3D ctx->bdev->bd_queue; - - blkg_conf_exit(ctx); - mutex_unlock(&q->elevator_lock); - blk_mq_unfreeze_queue(q, memflags); - } -} - static void blkg_iostat_add(struct blkg_iostat *dst, struct blkg_iostat *s= rc) { int i; diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h index c3d16d52c275..aec801255821 100644 --- a/block/blk-cgroup.h +++ b/block/blk-cgroup.h @@ -221,12 +221,6 @@ struct blkg_conf_ctx { }; =20 void blkg_conf_init(struct blkg_conf_ctx *ctx, char *input); -int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx); -unsigned long blkg_conf_open_bdev_frozen(struct blkg_conf_ctx *ctx); -int blkg_conf_prep(struct blkcg *blkcg, const struct blkcg_policy *pol, - struct blkg_conf_ctx *ctx); -void blkg_conf_exit(struct blkg_conf_ctx *ctx); -void blkg_conf_exit_frozen(struct blkg_conf_ctx *ctx, unsigned long memfla= gs); void blkg_conf_end(struct blkg_conf_ctx *ctx); int blkg_conf_start(struct blkcg *blkcg, struct blkg_conf_ctx *ctx); =20 --=20 2.51.0
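A closing note on the locking the series converges on: every configuration write now goes through blkg_conf_start()/blkg_conf_end(), whose nesting is sketched below. This is distilled from the patch 13 hunks rather than quoted literally; q and bdev stand for the config target's request queue and block device:

	memflags = blk_mq_freeze_queue(q);
	mutex_lock(&q->elevator_lock);
	mutex_lock(&q->rq_qos_mutex);
	mutex_lock(&q->blkcg_mutex);

	/* ... blkg lookup/creation and the per-policy update happen here ... */

	mutex_unlock(&q->blkcg_mutex);
	mutex_unlock(&q->rq_qos_mutex);
	mutex_unlock(&q->elevator_lock);
	blk_mq_unfreeze_queue(q, memflags);
	blkdev_put_no_open(bdev);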