From: Yu Kuai
To: tj@kernel.org, josef@toxicpanda.com, axboe@kernel.dk
Cc: cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai@fnnas.com, zhengqixing@huawei.com, mkoutny@suse.com, hch@infradead.org, ming.lei@redhat.com, nilay@linux.ibm.com
Subject: [PATCH v2 3/7] blk-cgroup: fix race between policy activation and blkg destruction
Date: Tue, 3 Feb 2026 16:05:58 +0800
Message-ID: <20260203080602.726505-4-yukuai@fnnas.com>
In-Reply-To: <20260203080602.726505-1-yukuai@fnnas.com>
References: <20260203080602.726505-1-yukuai@fnnas.com>
From: Zheng Qixing

When switching an IO scheduler on a block device, blkcg_activate_policy()
allocates blkg_policy_data (pd) for all blkgs attached to the queue.
However, blkcg_activate_policy() may race with concurrent blkcg deletion,
leading to use-after-free and memory leak issues.

The use-after-free occurs in the following race:

T1 (blkcg_activate_policy):
- Successfully allocates pd for blkg1 (loop0->queue, blkcgA)
- Fails to allocate pd for blkg2 (loop0->queue, blkcgB)
- Enters the enomem rollback path to release blkg1 resources

T2 (blkcg deletion):
- blkcgA is deleted concurrently
- blkg1 is freed via blkg_free_workfn()
- blkg1->pd is freed

T1 (continued):
- Rollback path accesses blkg1->pd->online after pd is freed
- Triggers use-after-free

In addition, blkg_free_workfn() frees pd before removing the blkg from
q->blkg_list. This allows blkcg_activate_policy() to allocate a new pd
for a blkg that is being destroyed, leaving the newly allocated pd
unreachable when the blkg is finally freed.

Fix these races by extending blkcg_mutex coverage to serialize
blkcg_activate_policy() rollback and blkg destruction, ensuring pd
lifecycle is synchronized with blkg list visibility.
Link: https://lore.kernel.org/all/20260108014416.3656493-3-zhengqixing@huaweicloud.com/
Fixes: f1c006f1c685 ("blk-cgroup: synchronize pd_free_fn() from blkg_free_workfn() and blkcg_deactivate_policy()")
Signed-off-by: Zheng Qixing
Signed-off-by: Yu Kuai
---
 block/blk-cgroup.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 0bc7b19399b6..a6ac6ba9430d 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1599,6 +1599,8 @@ int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol)
 
 	if (queue_is_mq(q))
 		memflags = blk_mq_freeze_queue(q);
+
+	mutex_lock(&q->blkcg_mutex);
 retry:
 	spin_lock_irq(&q->queue_lock);
 
@@ -1661,6 +1663,7 @@ int blkcg_activate_policy(struct gendisk *disk, const struct blkcg_policy *pol)
 
 	spin_unlock_irq(&q->queue_lock);
 out:
+	mutex_unlock(&q->blkcg_mutex);
 	if (queue_is_mq(q))
 		blk_mq_unfreeze_queue(q, memflags);
 	if (pinned_blkg)
-- 
2.51.0