From nobody Mon Jun 8 03:19:50 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0544237EFE2; Tue, 2 Jun 2026 10:10:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395026; cv=none; b=EnyKQm1C1NtlEdHz8gh+gPvJpesNzHhjmXemyURPf9itYCVVy34b4iDn0hb+B1Uy4jBMtL52PUUgiuO7Ov2Xk4YEOX3wd7mqdMzjqHljZ3AV0Bibd3JwnViLP6tz+cH79dUKiBuMzdQz/uIfzJEY28ZN5lcZYsTGvJKcdHY8c3E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395026; c=relaxed/simple; bh=LHr7gBg6PtJCZf6/6mkuc3vnbKC5yndXujDfy0HTjfM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=LhZdHQMKjLMG4U4eQaRr+g4Xj5/Ufv0oa3MKkd/LNLgTmujZ4yw5l09v5XvViJr4vwx3/dhKEWH1piN9f6o3Dt5aqv+6Ibc5bqBN5NIGt2GlRkKKb7xIQpB89Ufvhv3HiDpjWnspxlf7ZQmGGR65LMejGZZHmUvi6ARciHlHFmA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MyfVmOmz; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MyfVmOmz" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4FEB71F00898; Tue, 2 Jun 2026 10:10:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780395024; bh=QUhM5h4C1WnXSL2tEPYUniUgKT/hUDIJ/vXilp+tDFQ=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=MyfVmOmzX7hd/4gWE9I1HAP/tDKLvZhsRpaF0AF23wvS4sNb3Xx0sJuwzDIkbfDaY FsjPXGFT3LWBJ11nfCzL1IsgItbFt8v/ksAh9q4fYdWKCRAccQxAc4jxDSN2NkQn0K eY38uDZhStmD6xXuv8afkp2SHzSXcsbad+Ke/JhlJXTO7XIsGDNYOKscFTgMpf5sg8 
jIW2Bk3szsshYcD+t0Twms+Bqnsaau186Rw1A/ruY0DQDE8M20cf3PbYzb7WGuGxs+ w47n+mxCjsQSBAa0DZ11hagsIDAIEDnl7/4kcTCJ7YlAYbw9I1tyAav6MqFcLlx2aQ X9caJzyyrZdAg== From: Christian Brauner Date: Tue, 02 Jun 2026 12:10:07 +0200 Subject: [PATCH RFC 1/8] fs, block: move blk_mode_t and fop_flags_t into <linux/types.h> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260602-work-super-bdev_holder_global-v1-1-bb0fd82f3861@kernel.org> References: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> In-Reply-To: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> To: Christoph Hellwig , Jan Kara Cc: Jens Axboe , Alexander Viro , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Carlos Maiolino , linux-xfs@vger.kernel.org, Chris Mason , David Sterba , linux-btrfs@vger.kernel.org, Theodore Ts'o , linux-ext4@vger.kernel.org, Gao Xiang , linux-erofs@lists.ozlabs.org, "Christian Brauner (Amutable)" X-Mailer: b4 0.16-dev-fffa9 X-Developer-Signature: v=1; a=openpgp-sha256; l=1762; i=brauner@kernel.org; h=from:subject:message-id; bh=LHr7gBg6PtJCZf6/6mkuc3vnbKC5yndXujDfy0HTjfM=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMWTJreGQ0tbvuCvNPbdnY7ncU9PyXVtSrr4UOXPnm57kl sqwU8b/OkpZGMS4GGTFFFkc2k3C5ZbzVGw2ytSAmcPKBDKEgYtTACZS6cHI8IFJyzT/iD9n3TUF yYePNC5UyCp2aSR8/9TYf9xL5drecIb/cbfU15913s30r980eIpVTc3qw1+uhF5de2BJ2qn/hsy TWQE= X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 blk_mode_t and fop_flags_t are both plain 'unsigned int __bitwise' flag typedefs, exactly like the gfp_t, slab_flags_t and fmode_t that already live in <linux/types.h>. Move them there so they are available everywhere without having to drag in a subsystem header. 
Signed-off-by: Christian Brauner (Amutable) --- include/linux/blkdev.h | 2 -- include/linux/fs.h | 2 -- include/linux/types.h | 2 ++ 3 files changed, 2 insertions(+), 4 deletions(-) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 890128cdea1c..c8494d64a69d 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -126,8 +126,6 @@ struct blk_integrity { unsigned char pi_tuple_size; }; =20 -typedef unsigned int __bitwise blk_mode_t; - /* open for reading */ #define BLK_OPEN_READ ((__force blk_mode_t)(1 << 0)) /* open for writing */ diff --git a/include/linux/fs.h b/include/linux/fs.h index 11559c513dfb..e9346be8470f 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1921,8 +1921,6 @@ struct dir_context { struct io_uring_cmd; struct offset_ctx; =20 -typedef unsigned int __bitwise fop_flags_t; - struct file_operations { struct module *owner; fop_flags_t fop_flags; diff --git a/include/linux/types.h b/include/linux/types.h index 608050dbca6a..ef026585420b 100644 --- a/include/linux/types.h +++ b/include/linux/types.h @@ -163,6 +163,8 @@ typedef u32 dma_addr_t; typedef unsigned int __bitwise gfp_t; typedef unsigned int __bitwise slab_flags_t; typedef unsigned int __bitwise fmode_t; +typedef unsigned int __bitwise blk_mode_t; +typedef unsigned int __bitwise fop_flags_t; =20 #ifdef CONFIG_PHYS_ADDR_T_64BIT typedef u64 phys_addr_t; --=20 2.47.3 From nobody Mon Jun 8 03:19:50 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1C0BD3D6497; Tue, 2 Jun 2026 10:10:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395030; cv=none; 
b=RJ/PQbQcE8mjBRf8lQkh8i1OX5LXZkrqRd2KErOUhM22bHVwVbgDWw14gtrHzNOa1mJ58tWhiM567/cI4rCtiHxiEx4hfyEe93yO/WvYj2XmE9z7GSn97LXJwM7wx+kMYS+r4fOWbXzkP+KgSlunWGCrecIoUXOY5oZ/5TLRIMU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395030; c=relaxed/simple; bh=MYpZz0B6HGCp0PliU5rmIIJhd2BihonSRtJLqed8TeA=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=aQvRjDBhHmF7cGO3R1AxfeiE1/YLT4Rq9USfgdky7HGRk9S7f+DTeRegkz6l7gnNOPjXDz5cVV2AJw6L09tlUfHtQSS5RnPDa/2YjR8WmV2NYtBA2bIfmEOP1yGp6wXN7doSlEtkD8FlC8TzI98iSbHQytPLC+7ZVlS5yW5LVMM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=dVZt3MZL; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="dVZt3MZL" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 15F491F00893; Tue, 2 Jun 2026 10:10:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780395028; bh=G3AZrfjKA/1cygwXgkK9tiTHubrtfF5j0AYVS7Y8tws=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=dVZt3MZLgDNbSb846tpmPJili/RzE+FsyHAGY6dnXxpKsCmKIHKa0gBRaK9J6b5gD xroBMbMpom2v+WALmANddzahvK1IpsXSlt5RxM+UAEdjLgI4gFG3GRmXlmO+9rHXKK Kp7UAeZFypoz1UR1oDsi5Zh65E2qCXFuNe6yqjWtA/hjL+rM0zZouL71vwAKfDmNIR OAUTpdWSXg0rlGDeoa/Jz6taGfmOwu4opJERymG8aSJmMvPWy+IqrKwUGWiWypuZov wWO4znM86rH8JHAZzlTSH+ccMiIG9Z5/rM8Xdhm1hWf1ktxr+0NWsF6OcFuvCG253g f34VluspHag3Q== From: Christian Brauner Date: Tue, 02 Jun 2026 12:10:08 +0200 Subject: [PATCH RFC 2/8] fs: add a global device to super block hash table Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260602-work-super-bdev_holder_global-v1-2-bb0fd82f3861@kernel.org> 
References: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> In-Reply-To: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> To: Christoph Hellwig , Jan Kara Cc: Jens Axboe , Alexander Viro , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Carlos Maiolino , linux-xfs@vger.kernel.org, Chris Mason , David Sterba , linux-btrfs@vger.kernel.org, Theodore Ts'o , linux-ext4@vger.kernel.org, Gao Xiang , linux-erofs@lists.ozlabs.org, "Christian Brauner (Amutable)" X-Mailer: b4 0.16-dev-fffa9 X-Developer-Signature: v=1; a=openpgp-sha256; l=20256; i=brauner@kernel.org; h=from:subject:message-id; bh=MYpZz0B6HGCp0PliU5rmIIJhd2BihonSRtJLqed8TeA=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMWTJreGws0xq7DQ/7fpxEQ9jn9Pk8jtBPidK5DbYvXt4Q MjmHPuMjlIWBjEuBlkxRRaHdpNwueU8FZuNMjVg5rAygQxh4OIUgInYvWD4Hx/sFBad8m3JnD4W cZ2l6rwaDX4KV36dtWQ3M7g/d5reK0aGy8WNjAWHqv/byRo9DW98cUPJ+0B0+KSZ8TkTNyZlx4n wAwA= X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 fs_holder_ops recovers the owning superblock from bdev->bd_holder, which forces the holder to be exactly one superblock and prevents several superblocks from sharing one block device. That's what erofs is doing. Introduce a global dev_t-keyed rhltable mapping each block device to the superblock(s) using it. The holder argument becomes purely the block layer's exclusivity token (a superblock, or a file_system_type for shared devices) and is no longer needed by the fs specific callbacks. Registration keeps one entry per (device, superblock). When a filesystem claims a device it already uses (xfs with its log on the data device), no second entry is added, so each superblock is acted on once. Each table entry holds a passive reference (s_count) on its superblock, so the struct stays valid for as long as the entry is reachable. 
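The one-entry-per-(device, superblock) rule described above can be illustrated with a small userspace model. This is only a sketch under loud assumptions, not the kernel code: a singly linked list stands in for the rhltable, there is no RCU, no passive reference and no cursor pinning, and every model_* name is hypothetical. It shows only that a repeated claim by the same superblock bumps a claim count instead of adding a second entry, and that the entry goes away with the last release.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model only: a linked list stands in for the kernel's rhltable. */
struct model_holder {
	unsigned int dev;		/* stands in for dev_t */
	const void *sb;			/* stands in for struct super_block * */
	unsigned int active;		/* open claims for (dev, sb) */
	struct model_holder *next;
};

static struct model_holder *model_table;

/* Find the single entry for (dev, sb), or NULL if there is none. */
static struct model_holder *model_find(unsigned int dev, const void *sb)
{
	for (struct model_holder *h = model_table; h; h = h->next)
		if (h->dev == dev && h->sb == sb)
			return h;
	return NULL;
}

/* Record one claim; returns 0 on success, -1 if allocation fails. */
static int model_register(unsigned int dev, const void *sb)
{
	struct model_holder *h = model_find(dev, sb);

	if (h) {
		/* Same superblock claims the device again: count it only. */
		h->active++;
		return 0;
	}

	h = malloc(sizeof(*h));
	if (!h)
		return -1;
	h->dev = dev;
	h->sb = sb;
	h->active = 1;
	h->next = model_table;
	model_table = h;
	return 0;
}

/* Drop one claim; the last claim unlinks and frees the entry. */
static void model_release(unsigned int dev, const void *sb)
{
	for (struct model_holder **pp = &model_table; *pp; pp = &(*pp)->next) {
		struct model_holder *h = *pp;

		if (h->dev != dev || h->sb != sb)
			continue;
		assert(h->active > 0);
		if (--h->active == 0) {
			*pp = h->next;
			free(h);
		}
		return;
	}
}

/* Number of distinct superblocks currently registered for dev. */
static unsigned int model_entries(unsigned int dev)
{
	unsigned int n = 0;

	for (struct model_holder *h = model_table; h; h = h->next)
		if (h->dev == dev)
			n++;
	return n;
}
```

In this model, registering the same (device, superblock) pair twice leaves model_entries() at one, which mirrors the xfs log-on-data-device case: a per-device walk would visit that superblock once.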
The callbacks look the device up in the table and act on every superblock using it: Unlinking an entry is deferred to the last unpin, so a cursor never resumes from a removed node. After this it's possible to act on all superblocks that share a given device. Signed-off-by: Christian Brauner (Amutable) --- fs/super.c | 430 +++++++++++++++++++++++++++++++++----------= ---- include/linux/blkdev.h | 7 - include/linux/fs/super.h | 7 + 3 files changed, 309 insertions(+), 135 deletions(-) diff --git a/fs/super.c b/fs/super.c index 378e81efe643..e0174d5819a0 100644 --- a/fs/super.c +++ b/fs/super.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include /* for the emergency remount stuff */ @@ -1411,186 +1412,234 @@ EXPORT_SYMBOL(sget_dev); =20 #ifdef CONFIG_BLOCK /* - * Lock the superblock that is holder of the bdev. Returns the superblock - * pointer if we successfully locked the superblock and it is alive. Other= wise - * we return NULL and just unlock bdev->bd_holder_lock. - * - * The function must be called with bdev->bd_holder_lock and releases it. + * Filesystems claim block devices through fs_bdev_file_open_by_{dev,path}= (), + * which records a {dev_t -> super_block} entry in the global @fs_bdev_sup= ers + * table. The fs_holder_ops callbacks resolve a device event to the + * superblock(s) using that device by looking it up there rather than read= ing + * bdev->bd_holder, so several superblocks may share one block device -- t= he + * holder is then only the block layer's exclusivity token. 
*/ -static struct super_block *bdev_super_lock(struct block_device *bdev, bool= excl) - __releases(&bdev->bd_holder_lock) +struct fs_bdev_holder { + dev_t dev; /* @fs_bdev_supers key */ + struct super_block *sb; + refcount_t fs_bdev_passive; /* @fs_bdev_active>0 bias + cursor pins */ + refcount_t fs_bdev_active; /* open claims for (dev, sb) */ + struct rhlist_head node; + struct rcu_head rcu; +}; + +static struct rhltable fs_bdev_supers; +static const struct rhashtable_params fs_bdev_params =3D { + .key_len =3D sizeof(dev_t), + .key_offset =3D offsetof(struct fs_bdev_holder, dev), + .head_offset =3D offsetof(struct fs_bdev_holder, node), +}; + +static int __init fs_bdev_supers_init(void) { - struct super_block *sb =3D bdev->bd_holder; - bool locked; + if (rhltable_init(&fs_bdev_supers, &fs_bdev_params)) + panic("VFS: Cannot initialise fs_bdev_supers\n"); + return 0; +} +fs_initcall(fs_bdev_supers_init); =20 - lockdep_assert_held(&bdev->bd_holder_lock); - lockdep_assert_not_held(&sb->s_umount); - lockdep_assert_not_held(&bdev->bd_disk->open_mutex); +static void fs_bdev_holder_put(struct fs_bdev_holder *h) +{ + /* Unlink only once unpinned, so a cursor never resumes from a removed no= de. */ + if (refcount_dec_and_test(&h->fs_bdev_passive)) { + rhltable_remove(&fs_bdev_supers, &h->node, fs_bdev_params); + put_super(h->sb); + kfree_rcu(h, rcu); + } +} =20 - /* Make sure sb doesn't go away from under us */ - spin_lock(&sb_lock); - sb->s_count++; - spin_unlock(&sb_lock); +/* + * Walk the superblocks sharing a block device the way __iterate_supers() = walks + * super_blocks: fs_bdev_first()/fs_bdev_next() return each entry with its= node + * pinned (refcount) so the chain link survives the RCU drop and the sleep= ing + * work the callbacks do between iterations; fs_bdev_next() also unpins the + * previous entry. The entry's fs_bdev_passive ref keeps @h->sb valid; ca= llers + * take s_active and/or super_lock_shared() as needed and skip dying super= blocks. 
+ * A shared per-entry list node can't replace this because mark_dead and s= ync + * are not mutually serialised. + */ +static struct fs_bdev_holder *fs_bdev_pin(struct rhlist_head *pos) +{ + struct fs_bdev_holder *h; =20 - mutex_unlock(&bdev->bd_holder_lock); + /* Caller holds rcu_read_lock(). */ + for (; pos; pos =3D rcu_dereference_all(pos->next)) { + h =3D container_of(pos, struct fs_bdev_holder, node); + if (refcount_inc_not_zero(&h->fs_bdev_passive)) + return h; + } + return NULL; +} =20 - locked =3D super_lock(sb, excl); +static struct fs_bdev_holder *fs_bdev_first(dev_t dev) +{ + struct fs_bdev_holder *h; =20 - /* - * If the superblock wasn't already SB_DYING then we hold - * s_umount and can safely drop our temporary reference. - */ - put_super(sb); + rcu_read_lock(); + h =3D fs_bdev_pin(rhltable_lookup(&fs_bdev_supers, &dev, fs_bdev_params)); + rcu_read_unlock(); + return h; +} =20 - if (!locked) - return NULL; +static struct fs_bdev_holder *fs_bdev_next(struct fs_bdev_holder *prev) +{ + struct fs_bdev_holder *h; =20 - if (!sb->s_root || !(sb->s_flags & SB_ACTIVE)) { - super_unlock(sb, excl); - return NULL; - } + rcu_read_lock(); + h =3D fs_bdev_pin(rcu_dereference_all(prev->node.next)); + rcu_read_unlock(); + + fs_bdev_holder_put(prev); + return h; +} =20 - return sb; +static int fs_super_freeze(struct super_block *sb) +{ + if (sb->s_op->freeze_super) + return sb->s_op->freeze_super(sb, + FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); + return freeze_super(sb, FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); +} + +static int fs_super_thaw(struct super_block *sb) +{ + if (sb->s_op->thaw_super) + return sb->s_op->thaw_super(sb, + FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); + return thaw_super(sb, FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); } =20 static void fs_bdev_mark_dead(struct block_device *bdev, bool surprise) { - struct super_block *sb; + struct fs_bdev_holder *h; + dev_t dev =3D bdev->bd_dev; =20 - sb =3D bdev_super_lock(bdev, 
false); - if (!sb) - return; + mutex_unlock(&bdev->bd_holder_lock); =20 - if (sb->s_op->remove_bdev) { - int ret; + for (h =3D fs_bdev_first(dev); h; h =3D fs_bdev_next(h)) { + struct super_block *sb =3D h->sb; =20 - ret =3D sb->s_op->remove_bdev(sb, bdev); - if (!ret) { - super_unlock_shared(sb); - return; + if (!super_lock_shared(sb)) + continue; + if (sb->s_root && (sb->s_flags & SB_ACTIVE)) { + if (!sb->s_op->remove_bdev || + sb->s_op->remove_bdev(sb, bdev)) { + if (!surprise) + sync_filesystem(sb); + shrink_dcache_sb(sb); + evict_inodes(sb); + if (sb->s_op->shutdown) + sb->s_op->shutdown(sb); + } } - /* Fallback to shutdown. */ + super_unlock_shared(sb); } - - if (!surprise) - sync_filesystem(sb); - shrink_dcache_sb(sb); - evict_inodes(sb); - if (sb->s_op->shutdown) - sb->s_op->shutdown(sb); - - super_unlock_shared(sb); } =20 static void fs_bdev_sync(struct block_device *bdev) { - struct super_block *sb; + struct fs_bdev_holder *h; + dev_t dev =3D bdev->bd_dev; =20 - sb =3D bdev_super_lock(bdev, false); - if (!sb) - return; + mutex_unlock(&bdev->bd_holder_lock); =20 - sync_filesystem(sb); - super_unlock_shared(sb); -} + for (h =3D fs_bdev_first(dev); h; h =3D fs_bdev_next(h)) { + struct super_block *sb =3D h->sb; =20 -static struct super_block *get_bdev_super(struct block_device *bdev) -{ - bool active =3D false; - struct super_block *sb; - - sb =3D bdev_super_lock(bdev, true); - if (sb) { - active =3D atomic_inc_not_zero(&sb->s_active); - super_unlock_excl(sb); + if (!super_lock_shared(sb)) + continue; + if (sb->s_root && (sb->s_flags & SB_ACTIVE)) + sync_filesystem(sb); + super_unlock_shared(sb); } - if (!active) - return NULL; - return sb; } =20 /** - * fs_bdev_freeze - freeze owning filesystem of block device + * fs_bdev_freeze - freeze every superblock using a block device * @bdev: block device * - * Freeze the filesystem that owns this block device if it is still - * active. 
- * - * A filesystem that owns multiple block devices may be frozen from each - * block device and won't be unfrozen until all block devices are - * unfrozen. Each block device can only freeze the filesystem once as we - * nest freezes for block devices in the block layer. + * Freeze each live superblock using @bdev. A superblock owning several b= lock + * devices is frozen once per device and stays frozen until all are thawed= ; the + * block layer nests these freezes so the count stays balanced. * - * Return: If the freeze was successful zero is returned. If the freeze - * failed a negative error code is returned. + * Return: 0, or the error from the one superblock on a single-fs device. = When + * several superblocks share @bdev a per-superblock failure is swa= llowed + * (see below), but a sync_blockdev() failure is always reported. */ static int fs_bdev_freeze(struct block_device *bdev) { - struct super_block *sb; - int error =3D 0; + dev_t dev =3D bdev->bd_dev; + struct fs_bdev_holder *h; + unsigned int count =3D 0; + int error =3D 0, err; =20 lockdep_assert_held(&bdev->bd_fsfreeze_mutex); =20 - sb =3D get_bdev_super(bdev); - if (!sb) - return -EINVAL; + mutex_unlock(&bdev->bd_holder_lock); =20 - if (sb->s_op->freeze_super) - error =3D sb->s_op->freeze_super(sb, - FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); - else - error =3D freeze_super(sb, - FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); + for (h =3D fs_bdev_first(dev); h; h =3D fs_bdev_next(h)) { + if (!atomic_inc_not_zero(&h->sb->s_active)) + continue; + err =3D fs_super_freeze(h->sb); + if (err && !error) + error =3D err; + deactivate_super(h->sb); + count++; + } + + /* + * When several superblocks share the device, keep it frozen even if some + * of them failed to freeze and swallow the error: rolling the rest back + * via thaw_super() can fail too, so neither is a clear win. A single + * filesystem (count =3D=3D 1) still reports its error. 
+ */ + if (error && count > 1) + error =3D 0; if (!error) error =3D sync_blockdev(bdev); - deactivate_super(sb); return error; } =20 /** - * fs_bdev_thaw - thaw owning filesystem of block device + * fs_bdev_thaw - thaw every superblock using a block device * @bdev: block device * - * Thaw the filesystem that owns this block device. + * The counterpart to fs_bdev_freeze(): thaw each live superblock using @b= dev. + * A zero return does not imply a superblock is fully unfrozen; it may hav= e been + * frozen more than once (by the kernel or via another device). * - * A filesystem that owns multiple block devices may be frozen from each - * block device and won't be unfrozen until all block devices are - * unfrozen. Each block device can only freeze the filesystem once as we - * nest freezes for block devices in the block layer. - * - * Return: If the thaw was successful zero is returned. If the thaw - * failed a negative error code is returned. If this function - * returns zero it doesn't mean that the filesystem is unfrozen - * as it may have been frozen multiple times (kernel may hold a - * freeze or might be frozen from other block devices). + * Return: 0, or the first error on a single-fs device; a shared device sw= allows + * per-superblock errors, as fs_bdev_freeze() does. */ static int fs_bdev_thaw(struct block_device *bdev) { - struct super_block *sb; - int error; + dev_t dev =3D bdev->bd_dev; + struct fs_bdev_holder *h; + unsigned int count =3D 0; + int error =3D 0, err; =20 lockdep_assert_held(&bdev->bd_fsfreeze_mutex); =20 - /* - * The block device may have been frozen before it was claimed by a - * filesystem. Concurrently another process might try to mount that - * frozen block device and has temporarily claimed the block device for - * that purpose causing a concurrent fs_bdev_thaw() to end up here. 
The - * mounter is already about to abort mounting because they still saw an - * elevanted bdev->bd_fsfreeze_count so get_bdev_super() will return - * NULL in that case. - */ - sb =3D get_bdev_super(bdev); - if (!sb) - return -EINVAL; + mutex_unlock(&bdev->bd_holder_lock); =20 - if (sb->s_op->thaw_super) - error =3D sb->s_op->thaw_super(sb, - FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); - else - error =3D thaw_super(sb, - FREEZE_MAY_NEST | FREEZE_HOLDER_USERSPACE, NULL); - deactivate_super(sb); + for (h =3D fs_bdev_first(dev); h; h =3D fs_bdev_next(h)) { + if (!atomic_inc_not_zero(&h->sb->s_active)) + continue; + err =3D fs_super_thaw(h->sb); + if (err && !error) + error =3D err; + deactivate_super(h->sb); + count++; + } + + /* Shared device: swallow per-superblock errors, like fs_bdev_freeze(). */ + if (error && count > 1) + error =3D 0; return error; } =20 @@ -1602,6 +1651,131 @@ const struct blk_holder_ops fs_holder_ops =3D { }; EXPORT_SYMBOL_GPL(fs_holder_ops); =20 +static int fs_bdev_register(struct file *bdev_file, struct super_block *sb) +{ + dev_t dev =3D file_bdev(bdev_file)->bd_dev; + struct rhlist_head *list, *pos; + struct fs_bdev_holder *h; + int err; + + /* + * A superblock may claim one device more than once (xfs with its log on + * the data device). Keep a single entry per (device, superblock) and + * count the claims in @fs_bdev_active; the entry lives until the last one + * is released. 
+ */ + scoped_guard(rcu) { + list =3D rhltable_lookup(&fs_bdev_supers, &dev, fs_bdev_params); + rhl_for_each_entry_rcu(h, pos, list, node) + if (h->sb =3D=3D sb && refcount_inc_not_zero(&h->fs_bdev_active)) + return 0; + } + + h =3D kmalloc(sizeof(*h), GFP_KERNEL); + if (!h) + return -ENOMEM; + h->dev =3D dev; + h->sb =3D sb; + refcount_set(&h->fs_bdev_passive, 1); + refcount_set(&h->fs_bdev_active, 1); + + err =3D rhltable_insert(&fs_bdev_supers, &h->node, fs_bdev_params); + if (err) { + kfree(h); + return err; + } + + /* The sb->s_count ref keeps @h->sb valid for as long as the entry exists= . */ + spin_lock(&sb_lock); + sb->s_count++; + spin_unlock(&sb_lock); + + return 0; +} + +/** + * fs_bdev_file_open_by_dev - claim a block device on behalf of a superblo= ck + * @dev: block device number + * @mode: open mode + * @holder: block-layer exclusivity token (a superblock, or the file_syste= m_type + * when the device may be shared by several superblocks of that t= ype) + * @sb: superblock to drive fs_holder_ops events for + * + * Open @dev with &fs_holder_ops and register that @sb uses it, so device + * removal/sync/freeze/thaw are propagated to @sb (and any other superblock + * sharing @dev). Must be paired with fs_bdev_file_release(). + * + * Return: an opened block-device file or an ERR_PTR(). 
+ */ +struct file *fs_bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *ho= lder, + struct super_block *sb) +{ + struct file *bdev_file; + int err; + + bdev_file =3D bdev_file_open_by_dev(dev, mode, holder, &fs_holder_ops); + if (IS_ERR(bdev_file)) + return bdev_file; + + err =3D fs_bdev_register(bdev_file, sb); + if (err) { + bdev_fput(bdev_file); + return ERR_PTR(err); + } + return bdev_file; +} +EXPORT_SYMBOL_GPL(fs_bdev_file_open_by_dev); + +struct file *fs_bdev_file_open_by_path(const char *path, blk_mode_t mode, + void *holder, struct super_block *sb) +{ + struct file *bdev_file; + int err; + + bdev_file =3D bdev_file_open_by_path(path, mode, holder, &fs_holder_ops); + if (IS_ERR(bdev_file)) + return bdev_file; + + err =3D fs_bdev_register(bdev_file, sb); + if (err) { + bdev_fput(bdev_file); + return ERR_PTR(err); + } + return bdev_file; +} +EXPORT_SYMBOL_GPL(fs_bdev_file_open_by_path); + +/** + * fs_bdev_file_release - release a block device claimed for a superblock + * @bdev_file: file returned by fs_bdev_file_open_by_{dev,path}() + * @sb: superblock the device was claimed for + * + * Drop one claim on the {dev, @sb} entry; the last claim unregisters it (a + * pinning cursor defers the actual unlink). Then close the block device. + */ +void fs_bdev_file_release(struct file *bdev_file, struct super_block *sb) +{ + dev_t dev =3D file_bdev(bdev_file)->bd_dev; + struct fs_bdev_holder *h, *found =3D NULL; + struct rhlist_head *list, *pos; + + rcu_read_lock(); + list =3D rhltable_lookup(&fs_bdev_supers, &dev, fs_bdev_params); + rhl_for_each_entry_rcu(h, pos, list, node) { + if (h->sb !=3D sb) + continue; + /* At most one entry per (dev, sb); the last claim drops the bias. 
*/ + if (refcount_dec_and_test(&h->fs_bdev_active)) + found =3D h; + break; + } + rcu_read_unlock(); + if (found) + fs_bdev_holder_put(found); + bdev_fput(bdev_file); +} +EXPORT_SYMBOL_GPL(fs_bdev_file_release); + int setup_bdev_super(struct super_block *sb, int sb_flags, struct fs_context *fc) { @@ -1609,7 +1783,7 @@ int setup_bdev_super(struct super_block *sb, int sb_f= lags, struct file *bdev_file; struct block_device *bdev; =20 - bdev_file =3D bdev_file_open_by_dev(sb->s_dev, mode, sb, &fs_holder_ops); + bdev_file =3D fs_bdev_file_open_by_dev(sb->s_dev, mode, sb, sb); if (IS_ERR(bdev_file)) { if (fc) errorf(fc, "%s: Can't open blockdev", fc->source); @@ -1623,7 +1797,7 @@ int setup_bdev_super(struct super_block *sb, int sb_f= lags, * writable from userspace even for a read-only block device. */ if ((mode & BLK_OPEN_WRITE) && bdev_read_only(bdev)) { - bdev_fput(bdev_file); + fs_bdev_file_release(bdev_file, sb); return -EACCES; } =20 @@ -1634,7 +1808,7 @@ int setup_bdev_super(struct super_block *sb, int sb_f= lags, if (atomic_read(&bdev->bd_fsfreeze_count) > 0) { if (fc) warnf(fc, "%pg: Can't mount, blockdev is frozen", bdev); - bdev_fput(bdev_file); + fs_bdev_file_release(bdev_file, sb); return -EBUSY; } spin_lock(&sb_lock); @@ -1725,7 +1899,7 @@ void kill_block_super(struct super_block *sb) generic_shutdown_super(sb); if (bdev) { sync_blockdev(bdev); - bdev_fput(sb->s_bdev_file); + fs_bdev_file_release(sb->s_bdev_file, sb); } } =20 diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index c8494d64a69d..43d37c02febf 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1760,13 +1760,6 @@ struct blk_holder_ops { int (*thaw)(struct block_device *bdev); }; =20 -/* - * For filesystems using @fs_holder_ops, the @holder argument passed to - * helpers used to open and claim block devices via - * bd_prepare_to_claim() must point to a superblock. 
- */ -extern const struct blk_holder_ops fs_holder_ops; - /* * Return the correct open flags for blkdev_get_by_* for super block flags * as stored in sb->s_flags. diff --git a/include/linux/fs/super.h b/include/linux/fs/super.h index f21ffbb6dea5..721d842e3b24 100644 --- a/include/linux/fs/super.h +++ b/include/linux/fs/super.h @@ -235,4 +235,11 @@ int freeze_super(struct super_block *super, enum freez= e_holder who, int thaw_super(struct super_block *super, enum freeze_holder who, const void *freeze_owner); =20 +struct file; +struct file *fs_bdev_file_open_by_dev(dev_t dev, blk_mode_t mode, void *ho= lder, + struct super_block *sb); +struct file *fs_bdev_file_open_by_path(const char *path, blk_mode_t mode, + void *holder, struct super_block *sb); +void fs_bdev_file_release(struct file *bdev_file, struct super_block *sb); + #endif /* _LINUX_FS_SUPER_H */ --=20 2.47.3 From nobody Mon Jun 8 03:19:50 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AC9B73D812D; Tue, 2 Jun 2026 10:10:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395033; cv=none; b=fOjAhEgPecFqJ1nlKXNTaA1ujgJy++tYu7eYkVMxpGPV4jf2p/yecTe8OixwubcGQ39MhMT1eQbzEgxlt25UgHdxz0UK0qPxM8qjCN+O/VVQvRjNaxRdJA/JG6EJyqKl2sc2HMfZCx5jnW46iOqZ2Umusf+Vw9KWF22PswF2tho= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395033; c=relaxed/simple; bh=xIWw97lxG+sc/8l9SC3RHLuViltxzsWW1ctwJfp2JrM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
b=NjTd5H3hhcDpCpxObPiaK5pi7E786RrOPLqREPSY/DnWc5yxAQKAR3xZCxBH+FYqxTShk6l/LgGNGxQSMnA1UlQU4yOzYgAwiG8UHdGByw3RyAJrp5/JbtVMCeFoNieVtyOp5R2t4hTsgAZh+6ZsFuFPog2b2dkfSYgEL9ZZXwg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Kp2K8pYF; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Kp2K8pYF" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2EC3B1F00898; Tue, 2 Jun 2026 10:10:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780395032; bh=NK64pa3OIIduaAGDL2mfn4zWcijZV5LoX5Ft+Wnsf7E=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=Kp2K8pYF9IK72GeyOkPwzo7GUvAWWaOEr21bmKOrIJEzmi41bqRKfVIKJ/lnf5TDz vRtyo28Hguuoovis00BvGY3miPGttRBlQUlJgGPH5x+1jTAaO+dkVf7NC0xhqZbHqA xwCM96IcndMPK8Hcj7KqTAx89q2N4tMbEgPHDS+j93lrC1bPERJdIGAWTVDb0YFHVQ YTGkWCF6LjCsisbqOVqoNTZqBGTBcKV8c+Fsm2IzfjqcCnqlPI20ft85fimFKLRrIT n9jaS31lrRaDdhC57KzUl2Hl7mEOa3JJav2W9Cdq0gY4MLx80rad67M8cbYD9l0wnJ 4CKNEtwuw8krA== From: Christian Brauner Date: Tue, 02 Jun 2026 12:10:09 +0200 Subject: [PATCH RFC 3/8] fs: refuse to claim any frozen block device Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260602-work-super-bdev_holder_global-v1-3-bb0fd82f3861@kernel.org> References: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> In-Reply-To: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> To: Christoph Hellwig , Jan Kara Cc: Jens Axboe , Alexander Viro , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Carlos Maiolino , linux-xfs@vger.kernel.org, Chris Mason , David Sterba , linux-btrfs@vger.kernel.org, 
Theodore Ts'o , linux-ext4@vger.kernel.org, Gao Xiang , linux-erofs@lists.ozlabs.org, "Christian Brauner (Amutable)" X-Mailer: b4 0.16-dev-fffa9 X-Developer-Signature: v=1; a=openpgp-sha256; l=1748; i=brauner@kernel.org; h=from:subject:message-id; bh=xIWw97lxG+sc/8l9SC3RHLuViltxzsWW1ctwJfp2JrM=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMWTJreEwus30kUP9ttJBxaMH5KyWR7OvFH771srmQUU3l 7iBsYNnRykLgxgXg6yYIotDu0m43HKeis1GmRowc1iZQIYwcHEKwESUbzD8Dz/ef67iYaNnjD5T 8b0z1YoV83MnWEdLFr6Y0WK47uSCUIb/YemVN96ZyjHXTuG6afanfnv4q5A5c3ffNNC9qHnUwPY QFwA= X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624

setup_bdev_super() already refuses to bring a filesystem up on a frozen
block device but only for the primary device. Now that filesystems claim
every device through fs_bdev_file_open_by_{dev,path}(), do that check
once in the registration helper so it covers all of them. Drop the
now-redundant check from setup_bdev_super().

Signed-off-by: Christian Brauner (Amutable)
---
 fs/super.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index e0174d5819a0..cea743f699e4 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1690,6 +1690,17 @@ static int fs_bdev_register(struct file *bdev_file, struct super_block *sb)
 	sb->s_count++;
 	spin_unlock(&sb_lock);
 
+	/*
+	 * Don't bring a filesystem up on a frozen device. The entry is already
+	 * published, so a freeze either is seen here or finds it and waits in
+	 * super_lock() until this mount is born or (on -EBUSY) dies. The mount
+	 * aborts, so the entry is torn down without rebalancing @fs_bdev_active.
+	 */
+	if (atomic_read(&file_bdev(bdev_file)->bd_fsfreeze_count) > 0) {
+		fs_bdev_holder_put(h);
+		return -EBUSY;
+	}
+
 	return 0;
 }
 
@@ -1801,16 +1812,6 @@ int setup_bdev_super(struct super_block *sb, int sb_flags,
 		return -EACCES;
 	}
 
-	/*
-	 * It is enough to check bdev was not frozen before we set
-	 * s_bdev as freezing will wait until SB_BORN is set.
-	 */
-	if (atomic_read(&bdev->bd_fsfreeze_count) > 0) {
-		if (fc)
-			warnf(fc, "%pg: Can't mount, blockdev is frozen", bdev);
-		fs_bdev_file_release(bdev_file, sb);
-		return -EBUSY;
-	}
 	spin_lock(&sb_lock);
 	sb->s_bdev_file = bdev_file;
 	sb->s_bdev = bdev;
-- 
2.47.3

From nobody Mon Jun 8 03:19:50 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 76CE93D9695; Tue, 2 Jun 2026 10:10:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395037; cv=none; b=CR5Ef6ic+n7eDzr8CWE+Hamz82NiSQBO7eFemJoAchfQ+4vjGcWQzZQbFupwZStx2WDjw/L5rhbJp5O8X8bYPeYJEeyWqS3ZVLJ1dN7L/XwzOtqrsyzqPvIU2TNSt78vG/hbpYhl333zHoewFPukyyFxWP6VM7A1rSVg+NPO/LI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395037; c=relaxed/simple; bh=YYSwTQpvvqiN+ClwMHMKH84kTVWiiSstmufcfKS+DaQ=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=TPXlQc3cUYK9+8Bglb8pweLILe8iJemVFKSLw3svFzLocm8hgVWb6KU7dop+Np7QjeHLKzCnxItc6aukQbWxAXmLB2+lhjOlgGBmsV5Ss7AlfkLL6y0rgC3l2UOfA7KuiJZVNsS6cqJha0Ejyr1aVNZjmniWBJpfdx7q1wgA96g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=OnrvYZ8d; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass
(2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="OnrvYZ8d" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C67D81F00899; Tue, 2 Jun 2026 10:10:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780395036; bh=XXqlEVPtjTumz77I7tYirzGL781sufCUro38/y5OW94=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=OnrvYZ8d/uudw7IFCwY3+2vEoGJ6Xtsg+t6/CG6GZXH5Cu/a+UI4VZkLuLsPmmhZP 6o5H9c7TCjOLYfvJWyQ7hEbbRYWhw70BBhhbJBxs/ExyIWJJncxDblUbaMvr9CROs1 Q01KG/rktCCnGP4xLk9oyJmDMD/FzT6LnNGo4+3qUsfjVwdjmm5MoxFZWMFEp43MyO rlhpI4KBhN99dHrUw9gwyCUO1vVnm9WmD+Y6lVT2Ywt2rozkoHb0kksHIyBNZrYRJz jNM/tTxqaM7tEKooZLW4L4jKyDdvZTR2ZlGT9eWCl4b8VZrFS2uLTuPzv9ns5h4Bko VhkPjNUBZqM9g== From: Christian Brauner Date: Tue, 02 Jun 2026 12:10:10 +0200 Subject: [PATCH RFC 4/8] xfs: port to fs_bdev_file_open_by_path() Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260602-work-super-bdev_holder_global-v1-4-bb0fd82f3861@kernel.org> References: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> In-Reply-To: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> To: Christoph Hellwig , Jan Kara Cc: Jens Axboe , Alexander Viro , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Carlos Maiolino , linux-xfs@vger.kernel.org, Chris Mason , David Sterba , linux-btrfs@vger.kernel.org, Theodore Ts'o , linux-ext4@vger.kernel.org, Gao Xiang , linux-erofs@lists.ozlabs.org, "Christian Brauner (Amutable)" X-Mailer: b4 0.16-dev-fffa9 X-Developer-Signature: v=1; a=openpgp-sha256; l=1913; i=brauner@kernel.org; h=from:subject:message-id; bh=YYSwTQpvvqiN+ClwMHMKH84kTVWiiSstmufcfKS+DaQ=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMWTJreEwOvyid52jcmyz7OHVMQca/xYJfXfm1zK+KXT6R 
2bNEu2SjlIWBjEuBlkxRRaHdpNwueU8FZuNMjVg5rAygQxh4OIUgIkwmTAyLPDiey56bvmxsIsP RdS936y5+zyoeu3X1308nyyEdyZ9F2P4n5X2ceN6vpucR3JWXQmQ8lxn/toiWGMVx4I7C7qP8Cb P5wUA X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624

Route opens through fs_bdev_file_open_by_path() so each external device
is registered against mp->m_super, and convert the matching releases.

Signed-off-by: Christian Brauner (Amutable)
---
 fs/xfs/xfs_buf.c   |  2 +-
 fs/xfs/xfs_super.c | 10 +++++-----
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 580d40a5ee57..3d3b29edb156 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1601,7 +1601,7 @@ xfs_free_buftarg(
 	fs_put_dax(btp->bt_daxdev, btp->bt_mount);
 	/* the main block device is closed by kill_block_super */
 	if (btp->bt_bdev != btp->bt_mount->m_super->s_bdev)
-		bdev_fput(btp->bt_file);
+		fs_bdev_file_release(btp->bt_file, btp->bt_mount->m_super);
 	kfree(btp);
 }
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index f8de44443e81..304667210695 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -400,8 +400,8 @@ xfs_blkdev_get(
 	blk_mode_t		mode;
 
 	mode = sb_open_mode(mp->m_super->s_flags);
-	*bdev_filep = bdev_file_open_by_path(name, mode,
-			mp->m_super, &fs_holder_ops);
+	*bdev_filep = fs_bdev_file_open_by_path(name, mode,
+			mp->m_super, mp->m_super);
 	if (IS_ERR(*bdev_filep)) {
 		error = PTR_ERR(*bdev_filep);
 		*bdev_filep = NULL;
@@ -526,7 +526,7 @@ xfs_open_devices(
 		mp->m_logdev_targp = mp->m_ddev_targp;
 		/* Handle won't be used, drop it */
 		if (logdev_file)
-			bdev_fput(logdev_file);
+			fs_bdev_file_release(logdev_file, mp->m_super);
 	}
 
 	return 0;
@@ -538,10 +538,10 @@ xfs_open_devices(
 	xfs_free_buftarg(mp->m_ddev_targp);
  out_close_rtdev:
 	if (rtdev_file)
-		bdev_fput(rtdev_file);
+		fs_bdev_file_release(rtdev_file, mp->m_super);
  out_close_logdev:
 	if (logdev_file)
-		bdev_fput(logdev_file);
+		fs_bdev_file_release(logdev_file,
mp->m_super); return error; } =20 --=20 2.47.3 From nobody Mon Jun 8 03:19:50 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1A1073D649C; Tue, 2 Jun 2026 10:10:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395041; cv=none; b=OkkMnpwBjMVQUpjFrdVh0pjgi4DLNzURp6nqFVjLvpRJAi1mb9b1oR6wVu8qzaW9EA0xLqKFN0OVg/59d7dKF80Y7YVxGcrfJt5cL7zZRyk4gJEBdWHdYbAPtNJFb1xEJapHusdBx743kULnntTl7kZbe+93BH9mNc5H9c+Syt0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395041; c=relaxed/simple; bh=R6lYGCJCu4lkxrUCRLzgOgiNYkfZrRAWRY4AmVTHu6c=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=dpVaCpUwnsMVAhReBMvS6cc9EZU6vvE6wJj8KHVGGC5yfCqWq9dNECzcDww0QxnIN9eYjvQ2QqlV+rgmZjr8pw5Bt4WtN0B0idG+w6NmwUlKbwFgtjsq/4iONNGxHBTJHAShaor0by53hI2biwPC2WKl+VN3rDr/gGBj+mZ01Fo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=WQeLmDsh; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="WQeLmDsh" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8751F1F0089E; Tue, 2 Jun 2026 10:10:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780395039; bh=Y8/E0TuOj6HisojXfCRnxMPnMP6DVmDM0aSEXinDBTM=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=WQeLmDshDmSVqiwu4w+Ps30a/Px9tLvBr3ZxYV1T/pO05z0uG+q/nWeMSnY2qqwjR gw2ssSZPIXMboT49toM2ciEyuYRlRTB1l737ZWecKj2C3FkwtQ+ugiXYQzkNKhHEnG yHZiju/5+pCeXtkRIod6jSckUVYEm8wE/hB4uD43SZDn6PSQuYOMOZBncu3F5X+k3k 
dVTbvwUtrzvvHw/9vK45zsrglaTmh1dKmYzvuoQuhz5rHKa1n9jVKh/aDr4C7lRqq6 lX+Yz/BnuDvINboizgtmr66rq6wr/7UqLMEq1Wmh1M4p5OeSKU1/5DXMSAkHaArDux UN4P2IBbXUBAA== From: Christian Brauner Date: Tue, 02 Jun 2026 12:10:11 +0200 Subject: [PATCH RFC 5/8] btrfs: open via dedicated fs bdev helpers Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260602-work-super-bdev_holder_global-v1-5-bb0fd82f3861@kernel.org> References: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> In-Reply-To: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> To: Christoph Hellwig , Jan Kara Cc: Jens Axboe , Alexander Viro , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Carlos Maiolino , linux-xfs@vger.kernel.org, Chris Mason , David Sterba , linux-btrfs@vger.kernel.org, Theodore Ts'o , linux-ext4@vger.kernel.org, Gao Xiang , linux-erofs@lists.ozlabs.org, "Christian Brauner (Amutable)" X-Mailer: b4 0.16-dev-fffa9 X-Developer-Signature: v=1; a=openpgp-sha256; l=5068; i=brauner@kernel.org; h=from:subject:message-id; bh=R6lYGCJCu4lkxrUCRLzgOgiNYkfZrRAWRY4AmVTHu6c=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMWTJreGoVd72Q36ffFai5ax5Ytve3U7h2uDTrWvJwMy03 aVja11sRykLgxgXg6yYIotDu0m43HKeis1GmRowc1iZQIYwcHEKwESaMhgZfm/9Guscvav3GfPp 9XnP04vnly6w2+cnd8SsfdKL85KmYQy/WeP+PNzAtjHCs67Gm7379v4r8zY/2cj85fBRM4VYfoG VDAA= X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 Route opens through fs_bdev_file_open_by_path() so each external device is registered against the correct superblock, and convert the matching releases. The temporary identification opens that only read the superblock and close again pass a NULL holder and are left untouched. 
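For illustration only, and not part of the patch: the pairing rule described above (a non-NULL holder takes the registering open and the matching registered release, a NULL holder keeps the plain open and the plain fput) can be modelled in userspace C. Every model_* name below is hypothetical. The sketch also assumes that the registered release undoes the registration before dropping the file, which is inferred from the helper names rather than shown in this excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel helpers. */
struct model_file {
	void *registered_sb;	/* non-NULL when opened via the registering path */
	int open;
};

static int registered_count;	/* models the per-superblock registrations */

/* holder != NULL models fs_bdev_file_open_by_path(path, mode, holder, holder);
 * holder == NULL models bdev_file_open_by_path(path, mode, NULL, NULL). */
static void model_open(struct model_file *f, void *holder)
{
	f->open = 1;
	f->registered_sb = holder;
	if (holder)
		registered_count++;
}

/* holder != NULL models fs_bdev_file_release(file, holder) (assumed to drop
 * the registration); holder == NULL models a plain bdev_fput(file). */
static void model_release(struct model_file *f, void *holder)
{
	if (holder)
		registered_count--;
	f->open = 0;
	f->registered_sb = NULL;
}
```

The point of the model is that the release must take the same branch as the open: releasing a NULL-holder open through the registered path would unbalance the count.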
Signed-off-by: Christian Brauner (Amutable)
---
 fs/btrfs/dev-replace.c |  6 +++---
 fs/btrfs/ioctl.c       |  4 ++--
 fs/btrfs/volumes.c     | 26 +++++++++++++++++---------
 3 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 8f8fa14886de..463155b0b1ff 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -247,8 +247,8 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 		return -EINVAL;
 	}
 
-	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
-					   fs_info->sb, &fs_holder_ops);
+	bdev_file = fs_bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
+					      fs_info->sb, fs_info->sb);
 	if (IS_ERR(bdev_file)) {
 		btrfs_err(fs_info, "target device %s is invalid!", device_path);
 		return PTR_ERR(bdev_file);
@@ -325,7 +325,7 @@ static int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 	return 0;
 
 error:
-	bdev_fput(bdev_file);
+	fs_bdev_file_release(bdev_file, fs_info->sb);
 	return ret;
 }
 
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index b2e447f5005c..16afa71b98f2 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2579,7 +2579,7 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
 err_drop:
 	mnt_drop_write_file(file);
 	if (bdev_file)
-		bdev_fput(bdev_file);
+		fs_bdev_file_release(bdev_file, fs_info->sb);
 out:
 	btrfs_put_dev_args_from_path(&args);
 	kfree(vol_args);
@@ -2630,7 +2630,7 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
 
 	mnt_drop_write_file(file);
 	if (bdev_file)
-		bdev_fput(bdev_file);
+		fs_bdev_file_release(bdev_file, fs_info->sb);
 out:
 	btrfs_put_dev_args_from_path(&args);
 out_free:
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a88e68f90564..6f7d7afb4d66 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -480,7 +480,12 @@ btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
 	struct block_device *bdev;
 	int ret;
 
-	*bdev_file = bdev_file_open_by_path(device_path, flags, holder, &fs_holder_ops);
+	if (holder)
+		*bdev_file = fs_bdev_file_open_by_path(device_path, flags,
+						       holder, holder);
+	else
+		*bdev_file = bdev_file_open_by_path(device_path, flags, NULL,
+						    NULL);
 
 	if (IS_ERR(*bdev_file)) {
 		ret = PTR_ERR(*bdev_file);
@@ -495,7 +500,7 @@ btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
 	if (holder) {
 		ret = set_blocksize(*bdev_file, BTRFS_BDEV_BLOCKSIZE);
 		if (ret) {
-			bdev_fput(*bdev_file);
+			fs_bdev_file_release(*bdev_file, holder);
 			goto error;
 		}
 	}
@@ -503,7 +508,10 @@ btrfs_get_bdev_and_sb(const char *device_path, blk_mode_t flags, void *holder,
 	*disk_super = btrfs_read_disk_super(bdev, 0, false);
 	if (IS_ERR(*disk_super)) {
 		ret = PTR_ERR(*disk_super);
-		bdev_fput(*bdev_file);
+		if (holder)
+			fs_bdev_file_release(*bdev_file, holder);
+		else
+			bdev_fput(*bdev_file);
 		goto error;
 	}
 
@@ -727,7 +735,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
 
 error_free_page:
 	btrfs_release_disk_super(disk_super);
-	bdev_fput(bdev_file);
+	fs_bdev_file_release(bdev_file, holder);
 
 	return -EINVAL;
 }
@@ -1082,7 +1090,7 @@ static void __btrfs_free_extra_devids(struct btrfs_fs_devices *fs_devices,
 			continue;
 
 		if (device->bdev_file) {
-			bdev_fput(device->bdev_file);
+			fs_bdev_file_release(device->bdev_file, fs_devices->fs_info->sb);
 			device->bdev = NULL;
 			device->bdev_file = NULL;
 			fs_devices->open_devices--;
@@ -1129,7 +1137,7 @@ static void btrfs_close_bdev(struct btrfs_device *device)
 		invalidate_bdev(device->bdev);
 	}
 
-	bdev_fput(device->bdev_file);
+	fs_bdev_file_release(device->bdev_file, device->fs_info->sb);
 }
 
 static void btrfs_close_one_device(struct btrfs_device *device)
@@ -2820,8 +2828,8 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	if (sb_rdonly(sb) && !fs_devices->seeding)
 		return -EROFS;
 
-	bdev_file = bdev_file_open_by_path(device_path, BLK_OPEN_WRITE,
- fs_info->sb, &fs_holder_ops); + bdev_file =3D fs_bdev_file_open_by_path(device_path, BLK_OPEN_WRITE, + fs_info->sb, fs_info->sb); if (IS_ERR(bdev_file)) return PTR_ERR(bdev_file); =20 @@ -3045,7 +3053,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_in= fo, const char *device_path error_free_device: btrfs_free_device(device); error: - bdev_fput(bdev_file); + fs_bdev_file_release(bdev_file, fs_info->sb); if (locked) { mutex_unlock(&uuid_mutex); up_write(&sb->s_umount); --=20 2.47.3 From nobody Mon Jun 8 03:19:50 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C34273D6CAD; Tue, 2 Jun 2026 10:10:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395044; cv=none; b=bd8jESMQhGx/a83R5ZzTVeFx3jqCtrE4W2+v7xg1M+hQfeRQZ8N0pOk5ts090sf3P53qZW0l+EHwrpRRnI4IgcbhNKbfwMPeSnW2rf5qghl2s7WO64gszlk1Sm1SEhRYNdPsMDu0SBb3GvLxy5S33/7pY1AOvVJskq8U2G/0Zyk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395044; c=relaxed/simple; bh=tRD6D2ZvC10l0Swb1mE77uQnFpNR5auwAZCGdQLKj3I=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=MRWY+3runejx66UiUUmoUCYj7zLNRY3bBmxZx6ZIITAaxiO5jEyqZq64kIMHV1MWKmr1xnAoMtH51sopkiQgXWmYWu6hxVC/TjlD/bmXayVtlLbTHwptIMRsafvgF61FjreyCuJZt1fvRmE3ikKohrn0Sy6E7iQCk0sVFw812A8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=IcYRFZ7I; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="IcYRFZ7I" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 
416B61F00899; Tue, 2 Jun 2026 10:10:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780395043; bh=Rcl0Ij+Ta68yATaT5Q/U3eCvCYbsFeYyob8MpA9JEIY=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=IcYRFZ7IwF5i2OaH89y2up0aV6PvzsJqfvYMhN8Rb430whUhhGzKxCyO61VhDckN8 mdwK5pkZmT3BPZZ2F6YPiiuM867ZBfa6ef0HaJHGqB3X/ozsJJWexF5h8jHXFKML1k oqoumaaaw6e1Hel59xTSQu+iyE3jZQ1A9Q/8NejBaZSDfHP/ifSkh2au0Fjza5xpWa IrW4Vcsz7CnMp80iPim5gTBwbJSF+Z7YZbEOlJszNf7x+wdCisnmSxsc9W66E+6TYr Wa5VA/7TNUf7EzQoU7ZT1PRntyOyrkVstP0XLHkDaECwCp7k0kvPAGxNgmF/kA2UOe BMZza+NotpqGw== From: Christian Brauner Date: Tue, 02 Jun 2026 12:10:12 +0200 Subject: [PATCH RFC 6/8] ext4: open via dedicated fs bdev helpers Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260602-work-super-bdev_holder_global-v1-6-bb0fd82f3861@kernel.org> References: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> In-Reply-To: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> To: Christoph Hellwig , Jan Kara Cc: Jens Axboe , Alexander Viro , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Carlos Maiolino , linux-xfs@vger.kernel.org, Chris Mason , David Sterba , linux-btrfs@vger.kernel.org, Theodore Ts'o , linux-ext4@vger.kernel.org, Gao Xiang , linux-erofs@lists.ozlabs.org, "Christian Brauner (Amutable)" X-Mailer: b4 0.16-dev-fffa9 X-Developer-Signature: v=1; a=openpgp-sha256; l=1960; i=brauner@kernel.org; h=from:subject:message-id; bh=tRD6D2ZvC10l0Swb1mE77uQnFpNR5auwAZCGdQLKj3I=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMWTJreHQvVlsv1mBf/ON//MTCj44HOu5eIo74EO/d/ONo LLZxpF+HaUsDGJcDLJiiiwO7Sbhcst5KjYbZWrAzGFlAhnCwMUpABPpEGNkuLdt+8Ldt/37WQ7P +LFH8YH+sRW3n1wvXqvCWFi/48R+y0ZGhoVLNebPSzimfdyP4ci2z7+/z1hav17yfvESwfglBdm TYrgA X-Developer-Key: 
i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624

Route opens through fs_bdev_file_open_by_dev() so each external device
is registered against the correct superblock, and convert the matching
releases.

Signed-off-by: Christian Brauner (Amutable)
---
 fs/ext4/super.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 6a77db4d3124..8108d999008e 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5793,7 +5793,7 @@ failed_mount8: __maybe_unused
 	brelse(sbi->s_sbh);
 	if (sbi->s_journal_bdev_file) {
 		invalidate_bdev(file_bdev(sbi->s_journal_bdev_file));
-		bdev_fput(sbi->s_journal_bdev_file);
+		fs_bdev_file_release(sbi->s_journal_bdev_file, sb);
 	}
 out_fail:
 	invalidate_bdev(sb->s_bdev);
@@ -5972,9 +5972,9 @@ static struct file *ext4_get_journal_blkdev(struct super_block *sb,
 	struct ext4_super_block *es;
 	int errno;
 
-	bdev_file = bdev_file_open_by_dev(j_dev,
+	bdev_file = fs_bdev_file_open_by_dev(j_dev,
 		BLK_OPEN_READ | BLK_OPEN_WRITE | BLK_OPEN_RESTRICT_WRITES,
-		sb, &fs_holder_ops);
+		sb, sb);
 	if (IS_ERR(bdev_file)) {
 		ext4_msg(sb, KERN_ERR,
 			 "failed to open journal device unknown-block(%u,%u) %ld",
@@ -6034,7 +6034,7 @@ static struct file *ext4_get_journal_blkdev(struct super_block *sb,
 out_bh:
 	brelse(bh);
 out_bdev:
-	bdev_fput(bdev_file);
+	fs_bdev_file_release(bdev_file, sb);
 	return ERR_PTR(errno);
 }
 
@@ -6073,7 +6073,7 @@ static journal_t *ext4_open_dev_journal(struct super_block *sb,
 out_journal:
 	ext4_journal_destroy(EXT4_SB(sb), journal);
 out_bdev:
-	bdev_fput(bdev_file);
+	fs_bdev_file_release(bdev_file, sb);
 	return ERR_PTR(errno);
 }
 
@@ -7492,7 +7492,7 @@ static void ext4_kill_sb(struct super_block *sb)
 	kill_block_super(sb);
 
 	if (bdev_file)
-		bdev_fput(bdev_file);
+		fs_bdev_file_release(bdev_file, sb);
 }
 
 static struct file_system_type ext4_fs_type = {
-- 
2.47.3

From nobody Mon Jun 8 03:19:50 2026 Received: from smtp.kernel.org
(aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B20233D47B2; Tue, 2 Jun 2026 10:10:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395048; cv=none; b=XM6M5bEK8L/2pfJtDPJXe6QWSPWglQGNEBT+3nqzroFavgWzeCVhHZ6RV+A3MY91rEE5CQP/qS1QA9hRPqPPzlvnKsbYXszuIlQr5GrE+JhYTHbuk6JuWw+R1r2A2vMy027FjgVKYb11ovQ8Lh0DsW5Va2DmGnOa+pNNAaaj1Aw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780395048; c=relaxed/simple; bh=rcHD4HbFbUrUpExn6n0CNv7K4b9hWXj4N9RjO33LQqM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=qdp2AyL9uKbKKpTADv9M7QiGl1vtaPfEXLFa2ajdc9oSmWwQkYLM4qhuK/GSdr5gXmQrcu+1tXzbFXKXDFBw2YHcLqsTDfR3dwiIbQZnQ0TrB6mVbL888DwAswTKdQkyHYMfcVcg8ffofI9THlgWR2ebL7hha2I7OcniowDOBiI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=AeFRVuEy; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="AeFRVuEy" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 216A91F00893; Tue, 2 Jun 2026 10:10:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780395047; bh=5bBzVi4bBIK5tK2AYeV9dNHzpJuPm4TjbWGXo8xGZGo=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=AeFRVuEyR0A20ZtgU5dPVf0eDw0c7XFSyhj5dtede4QGcluYMTeYtifMC5iR5XnGX wHUXAPLk+87sonl7T37TwkoLvCUZhbd+gSThgOX309X9rE6TwvVadruCk76Aytlj3N XPa6OOtFtvZXSWMIsWJG0wswkVofTjU6Njtcw2QXDs/57twKmM7AF/YprE14G8CHsY YwKGXHThQducHnEKFUNrDWRB2wS3HC+ZvcQj7kT7ZhJAzd8+UO058UeOS2+IAcBODf 
pvevE4L1pq85maDAfTFD9Mu4vl2NugCGFtDYmOXe2FsTxnuO2ET/UgGfjRZ4kNnUmK Y3edpjJ2DBISA== From: Christian Brauner Date: Tue, 02 Jun 2026 12:10:13 +0200 Subject: [PATCH RFC 7/8] erofs: open via dedicated fs bdev helpers Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260602-work-super-bdev_holder_global-v1-7-bb0fd82f3861@kernel.org> References: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> In-Reply-To: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org> To: Christoph Hellwig , Jan Kara Cc: Jens Axboe , Alexander Viro , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Carlos Maiolino , linux-xfs@vger.kernel.org, Chris Mason , David Sterba , linux-btrfs@vger.kernel.org, Theodore Ts'o , linux-ext4@vger.kernel.org, Gao Xiang , linux-erofs@lists.ozlabs.org, "Christian Brauner (Amutable)" X-Mailer: b4 0.16-dev-fffa9 X-Developer-Signature: v=1; a=openpgp-sha256; l=7744; i=brauner@kernel.org; h=from:subject:message-id; bh=rcHD4HbFbUrUpExn6n0CNv7K4b9hWXj4N9RjO33LQqM=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMWTJreHY5X4o6u+VmAbplGN/S3j3PHN4Y3E6fV/YwhOP9 xzn0LM811HKwiDGxSArpsji0G4SLrecp2KzUaYGzBxWJpAhDFycAjCRj5MZ/pdG+t3afEJN7Iv4 7K8VLJ5qPTwuh6dfD5swLYGjfGXyVRNGhgVzy244cyedkOOIP3gw8fjEU3XCp3+/C0wpWreUz2X lHhYA X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 Route opens through fs_bdev_file_open_by_path() so each external device is registered against the correct superblock, and convert the matching releases. 
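For illustration only, and not part of the patch: besides converting the opens and releases, the diff in this mail adds a per-device "dead" flag and a superblock-wide shutdown bit that make erofs_map_dev() fail with -EIO. A minimal userspace sketch of that fencing logic follows. All model_* names are hypothetical, C11 atomics stand in for READ_ONCE()/WRITE_ONCE() and set_bit()/test_bit(), and devices are matched by pointer here where the patch compares bd_dev.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_dev {
	atomic_bool dead;		/* models erofs_device_info::dead */
};

struct model_sb {
	atomic_bool shutdown;		/* models the EROFS_SB_SHUTDOWN bit */
	struct model_dev *primary;	/* models sb->s_bdev */
	struct model_dev *extra;	/* models the devs->tree entries */
	size_t nr_extra;
};

/* Models erofs_remove_bdev(): 1 asks for generic teardown, 0 means fenced. */
static int model_remove_bdev(struct model_sb *sb, struct model_dev *gone)
{
	if (gone == sb->primary)
		return 1;
	for (size_t i = 0; i < sb->nr_extra; i++)
		if (&sb->extra[i] == gone)
			atomic_store(&sb->extra[i].dead, true);
	return 0;
}

/* Models erofs_shutdown(): after this, every mapping attempt fails. */
static void model_shutdown(struct model_sb *sb)
{
	atomic_store(&sb->shutdown, true);
}

/* Models the check added to erofs_map_dev(); dif may be NULL. */
static int model_map_dev(struct model_sb *sb, struct model_dev *dif)
{
	if (atomic_load(&sb->shutdown) || (dif && atomic_load(&dif->dead)))
		return -EIO;
	return 0;
}
```

In this model, losing an extra device fences only I/O mapped to that device, while losing the primary device is reported back to the caller and, once shutdown is set, fails every mapping.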
Signed-off-by: Christian Brauner (Amutable)
---
 fs/erofs/data.c     |  6 +++++
 fs/erofs/internal.h | 10 ++++++++
 fs/erofs/super.c    | 66 +++++++++++++++++++++++++++++++++++++++++++----------
 fs/erofs/zdata.c    | 10 +++++---
 4 files changed, 77 insertions(+), 15 deletions(-)

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 44da21c9d777..5220585293df 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -69,6 +69,9 @@ int erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb,
 {
 	struct erofs_sb_info *sbi = EROFS_SB(sb);
 
+	if (erofs_is_shutdown(sb))
+		return -EIO;
+
 	buf->file = NULL;
 	if (in_metabox) {
 		if (unlikely(!sbi->metabox_inode))
@@ -236,6 +239,9 @@ int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
 		}
 		up_read(&devs->rwsem);
 	}
+	if (erofs_is_shutdown(sb) ||
+	    (map->m_dif && READ_ONCE(map->m_dif->dead)))
+		return -EIO;
 	return 0;
 }
 
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 4792490161ec..ca1ed7ce3961 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -48,6 +48,7 @@ struct erofs_device_info {
 
 	erofs_blk_t blocks;
 	erofs_blk_t uniaddr;
+	bool dead;		/* backing device gone; fence I/O */
 };
 
 enum {
@@ -104,6 +105,7 @@ struct erofs_xattr_prefix_item {
 struct erofs_sb_info {
 	struct erofs_device_info dif0;
 	struct erofs_mount_opts opt;	/* options */
+	unsigned long flags;		/* see EROFS_SB_* */
 #ifdef CONFIG_EROFS_FS_ZIP
 	/* list for all registered superblocks, mainly for shrinker */
 	struct list_head list;
@@ -195,6 +197,14 @@ static inline bool erofs_is_fscache_mode(struct super_block *sb)
 		!erofs_is_fileio_mode(EROFS_SB(sb)) && !sb->s_bdev;
 }
 
+/* erofs_sb_info->flags */
+#define EROFS_SB_SHUTDOWN	0	/* primary device gone; fail all I/O */
+
+static inline bool erofs_is_shutdown(struct super_block *sb)
+{
+	return test_bit(EROFS_SB_SHUTDOWN, &EROFS_SB(sb)->flags);
+}
+
 enum {
 	EROFS_ZIP_CACHE_DISABLED,
 	EROFS_ZIP_CACHE_READAHEAD,
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 802add6652fd..e03cb95be96b 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -153,8 +153,8 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
 	} else if (!sbi->devs->flatdev) {
 		file = erofs_is_fileio_mode(sbi) ?
 				filp_open(dif->path, O_RDONLY | O_LARGEFILE, 0) :
-				bdev_file_open_by_path(dif->path,
-						BLK_OPEN_READ, sb->s_type, NULL);
+				fs_bdev_file_open_by_path(dif->path,
+						BLK_OPEN_READ, sb->s_type, sb);
 		if (IS_ERR(file)) {
 			if (file == ERR_PTR(-ENOTBLK))
 				return -EINVAL;
@@ -843,11 +843,16 @@ static int erofs_fc_reconfigure(struct fs_context *fc)
 
 static int erofs_release_device_info(int id, void *ptr, void *data)
 {
+	struct super_block *sb = data;
 	struct erofs_device_info *dif = ptr;
 
 	fs_put_dax(dif->dax_dev, NULL);
-	if (dif->file)
-		fput(dif->file);
+	if (dif->file) {
+		if (S_ISBLK(file_inode(dif->file)->i_mode))
+			fs_bdev_file_release(dif->file, sb);
+		else
+			fput(dif->file);
+	}
 	erofs_fscache_unregister_cookie(dif->fscache);
 	dif->fscache = NULL;
 	kfree(dif->path);
@@ -855,18 +860,19 @@ static int erofs_release_device_info(int id, void *ptr, void *data)
 	return 0;
 }
 
-static void erofs_free_dev_context(struct erofs_dev_context *devs)
+static void erofs_free_dev_context(struct erofs_dev_context *devs,
+				   struct super_block *sb)
 {
 	if (!devs)
 		return;
-	idr_for_each(&devs->tree, &erofs_release_device_info, NULL);
+	idr_for_each(&devs->tree, &erofs_release_device_info, sb);
 	idr_destroy(&devs->tree);
 	kfree(devs);
 }
 
-static void erofs_sb_free(struct erofs_sb_info *sbi)
+static void erofs_sb_free(struct erofs_sb_info *sbi, struct super_block *sb)
 {
-	erofs_free_dev_context(sbi->devs);
+	erofs_free_dev_context(sbi->devs, sb);
 	kfree(sbi->fsid);
 	kfree_sensitive(sbi->domain_id);
 	if (sbi->dif0.file)
@@ -879,8 +885,13 @@ static void erofs_fc_free(struct fs_context *fc)
 {
 	struct erofs_sb_info *sbi = fc->s_fs_info;
 
-	if (sbi) /* free here if an error occurs before transferring to sb */
-		erofs_sb_free(sbi);
+	/*
+	 * Freed here only if an error occurs before the sb is set up; at that
+	 * point no block-backed device has been claimed (that happens in
+	 * fill_super), so the NULL sb never reaches fs_bdev_file_release().
+	 */
+	if (sbi)
+		erofs_sb_free(sbi, NULL);
 }
 
 static const struct fs_context_operations erofs_context_ops = {
@@ -936,7 +947,7 @@ static void erofs_kill_sb(struct super_block *sb)
 	erofs_drop_internal_inodes(sbi);
 	fs_put_dax(sbi->dif0.dax_dev, NULL);
 	erofs_fscache_unregister_fs(sb);
-	erofs_sb_free(sbi);
+	erofs_sb_free(sbi, sb);
 	sb->s_fs_info = NULL;
 }
 
@@ -948,7 +959,7 @@ static void erofs_put_super(struct super_block *sb)
 	erofs_shrinker_unregister(sb);
 	erofs_xattr_prefixes_cleanup(sb);
 	erofs_drop_internal_inodes(sbi);
-	erofs_free_dev_context(sbi->devs);
+	erofs_free_dev_context(sbi->devs, sb);
 	sbi->devs = NULL;
 	erofs_fscache_unregister_fs(sb);
 }
@@ -1121,6 +1132,35 @@ static void erofs_evict_inode(struct inode *inode)
 	clear_inode(inode);
 }
 
+/*
+ * A blob device may back several erofs superblocks; fence only the affected
+ * one and keep the rest of the mount alive.  The primary device falls back to
+ * the generic teardown (return non-zero).
+ */
+static int erofs_remove_bdev(struct super_block *sb, struct block_device *bdev)
+{
+	struct erofs_dev_context *devs = EROFS_SB(sb)->devs;
+	struct erofs_device_info *dif;
+	int id;
+
+	if (bdev == sb->s_bdev)
+		return 1;
+
+	down_read(&devs->rwsem);
+	idr_for_each_entry(&devs->tree, dif, id) {
+		if (dif->file && S_ISBLK(file_inode(dif->file)->i_mode) &&
+		    file_bdev(dif->file)->bd_dev == bdev->bd_dev)
+			WRITE_ONCE(dif->dead, true);
+	}
+	up_read(&devs->rwsem);
+	return 0;
+}
+
+static void erofs_shutdown(struct super_block *sb)
+{
+	set_bit(EROFS_SB_SHUTDOWN, &EROFS_SB(sb)->flags);
+}
+
 const struct super_operations erofs_sops = {
 	.put_super = erofs_put_super,
 	.alloc_inode = erofs_alloc_inode,
@@ -1128,6 +1168,8 @@ const struct super_operations erofs_sops = {
 	.evict_inode = erofs_evict_inode,
 	.statfs = erofs_statfs,
 	.show_options = erofs_show_options,
+	.remove_bdev = erofs_remove_bdev,
+	.shutdown = erofs_shutdown,
 };
 
 module_init(erofs_module_init);
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 43bb5a6a9924..89ae91935364 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -1697,11 +1697,15 @@ static void z_erofs_submit_queue(struct z_erofs_frontend *f,
 			continue;
 		}
 
-		/* no device id here, thus it will always succeed */
 		mdev = (struct erofs_map_dev) {
 			.m_pa = round_down(pcl->pos, sb->s_blocksize),
 		};
-		(void)erofs_map_dev(sb, &mdev);
+		if (erofs_map_dev(sb, &mdev)) {
+			/* the backing device is gone; fail the batch */
+			q[JQ_SUBMIT]->eio = true;
+			qtail[JQ_SUBMIT] = &pcl->next;
+			continue;
+		}
 
 		cur = mdev.m_pa;
 		end = round_up(cur + pcl->pageofs_in + pcl->pclustersize,
@@ -1785,7 +1789,7 @@ static void z_erofs_submit_queue(struct z_erofs_frontend *f,
 	 * although background is preferred, no one is pending for submission.
 	 * don't issue decompression but drop it directly instead.
 	 */
-	if (!*force_fg && !nr_bios) {
+	if (!*force_fg && !nr_bios && !q[JQ_SUBMIT]->eio) {
 		kvfree(q[JQ_SUBMIT]);
 		return;
 	}
-- 
2.47.3

From nobody Mon Jun 8 03:19:50 2026
From: Christian Brauner
Date: Tue, 02 Jun 2026 12:10:14 +0200
Subject: [PATCH RFC 8/8] super: make fs_holder_ops private
Message-Id: <20260602-work-super-bdev_holder_global-v1-8-bb0fd82f3861@kernel.org>
References: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org>
In-Reply-To: <20260602-work-super-bdev_holder_global-v1-0-bb0fd82f3861@kernel.org>

There's no need to expose it anymore.
Signed-off-by: Christian Brauner (Amutable)
---
 fs/super.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index cea743f699e4..983c2fbf5202 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1643,13 +1643,12 @@ static int fs_bdev_thaw(struct block_device *bdev)
 	return error;
 }
 
-const struct blk_holder_ops fs_holder_ops = {
+static const struct blk_holder_ops fs_holder_ops = {
 	.mark_dead = fs_bdev_mark_dead,
 	.sync = fs_bdev_sync,
 	.freeze = fs_bdev_freeze,
 	.thaw = fs_bdev_thaw,
 };
-EXPORT_SYMBOL_GPL(fs_holder_ops);
 
 static int fs_bdev_register(struct file *bdev_file, struct super_block *sb)
 {
-- 
2.47.3