From nobody Mon Oct 6 21:02:01 2025
From: Gao Xiang
To: linux-erofs@lists.ozlabs.org
Cc: LKML, Gao Xiang
Subject: [PATCH v3 1/2] erofs: add on-disk definition for metadata compression
Date: Thu, 17 Jul 2025 01:33:13 +0800
Message-ID: <20250716173314.308744-2-hsiangkao@linux.alibaba.com>
In-Reply-To: <20250716173314.308744-1-hsiangkao@linux.alibaba.com>
References: <20250716173314.308744-1-hsiangkao@linux.alibaba.com>

Filesystem metadata has a high degree of redundancy, so it should
compress well in the general case.

Although metadata compression can increase overall I/O latency, many
users care more about minimized image sizes than extreme runtime
performance. Let's implement metadata compression in response to user
requests [1].

Actually, it's quite simple to implement metadata compression: since
EROFS already supports per-inode compression, we can simply treat a
special inode (called `the metabox inode`) as a container for compressed
inode metadata. Since EROFS supports multiple algorithms, users can even
specify LZ4 for metadata and LZMA for data.

To better support incremental builds, the MSB of NIDs indicates where
the inode metadata is located: if bit 63 is set, the inode itself should
be read from `the metabox inode`.

Optionally, shared xattrs can also be kept in `the metabox inode` if
COMPAT_SHARED_EA_IN_METABOX is set.

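The NID encoding described above can be sketched in plain user-space C.
This is only an illustration under stated assumptions, not the kernel
code: `sketch_nid_in_metabox()` and `sketch_iloc()` are hypothetical
names, and `meta_base` (byte offset of the ordinary metadata area) and
`islotbits` (log2 of the inode slot size) are assumed parameters standing
in for per-superblock state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of a NID selects the metabox inode, as the commit message states. */
#define SKETCH_NID_METABOX_BIT	63
#define SKETCH_NID_MASK		((UINT64_C(1) << SKETCH_NID_METABOX_BIT) - 1)

/* Hypothetical helper: does this NID point into the metabox inode? */
static bool sketch_nid_in_metabox(uint64_t nid)
{
	return (nid >> SKETCH_NID_METABOX_BIT) & 1;
}

/*
 * Hypothetical helper: byte offset of the on-disk inode.
 * For a metabox NID the offset is relative to the start of the metabox
 * inode's data; otherwise it is relative to meta_base, an assumed byte
 * offset of the regular metadata area.
 */
static uint64_t sketch_iloc(uint64_t nid, uint64_t meta_base,
			    unsigned int islotbits)
{
	uint64_t nid_lo = nid & SKETCH_NID_MASK;

	assert(islotbits < 64);	/* keep the shift well defined */
	if (sketch_nid_in_metabox(nid))
		return nid_lo << islotbits;
	return meta_base + (nid_lo << islotbits);
}
```

As an example with assumed values, 32-byte slots (`islotbits` of 5) and a
`meta_base` of 4096 place NID 3 at byte 4096 + 96 of the ordinary
metadata area, while the same slot number with bit 63 set lands at byte
96 inside the metabox inode.
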
[1] https://issues.redhat.com/browse/RHEL-75783
Signed-off-by: Gao Xiang
---
 fs/erofs/erofs_fs.h | 13 ++++++++++---
 fs/erofs/internal.h |  2 ++
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index 767fb4acdc93..0c9047e4a295 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -15,6 +15,7 @@
 #define EROFS_FEATURE_COMPAT_SB_CHKSUM	0x00000001
 #define EROFS_FEATURE_COMPAT_MTIME	0x00000002
 #define EROFS_FEATURE_COMPAT_XATTR_FILTER	0x00000004
+#define EROFS_FEATURE_COMPAT_SHARED_EA_IN_METABOX	0x00000008
 
 /*
  * Any bits that aren't in EROFS_ALL_FEATURE_INCOMPAT should
@@ -31,6 +32,7 @@
 #define EROFS_FEATURE_INCOMPAT_DEDUPE	0x00000020
 #define EROFS_FEATURE_INCOMPAT_XATTR_PREFIXES	0x00000040
 #define EROFS_FEATURE_INCOMPAT_48BIT	0x00000080
+#define EROFS_FEATURE_INCOMPAT_METABOX	0x00000100
 #define EROFS_ALL_FEATURE_INCOMPAT \
 	((EROFS_FEATURE_INCOMPAT_48BIT << 1) - 1)
 
@@ -46,7 +48,7 @@ struct erofs_deviceslot {
 };
 #define EROFS_DEVT_SLOT_SIZE	sizeof(struct erofs_deviceslot)
 
-/* erofs on-disk super block (currently 128 bytes) */
+/* erofs on-disk super block (currently 144 bytes at maximum) */
 struct erofs_super_block {
 	__le32 magic;	/* file system magic number */
 	__le32 checksum;	/* crc32c to avoid unexpected on-disk overlap */
@@ -82,7 +84,9 @@ struct erofs_super_block {
 	__u8 reserved[3];
 	__le32 build_time;	/* seconds added to epoch for mkfs time */
 	__le64 rootnid_8b;	/* (48BIT on) nid of root directory */
-	__u8 reserved2[8];
+	__le64 reserved2;
+	__le64 metabox_nid;	/* (METABOX on) nid of the metabox inode */
+	__le64 reserved3;	/* [align to extslot 1] */
 };
 
 /*
@@ -267,6 +271,9 @@ struct erofs_inode_chunk_index {
 	__le32 startblk_lo;	/* starting block number of this chunk */
 };
 
+#define EROFS_DIRENT_NID_METABOX_BIT	63
+#define EROFS_DIRENT_NID_MASK	(BIT(EROFS_DIRENT_NID_METABOX_BIT) - 1)
+
 /* dirent sorts in alphabet order, thus we can do binary search */
 struct erofs_dirent {
 	__le64 nid;	/* node number */
@@ -434,7 +441,7 @@ static inline void erofs_check_ondisk_layout_definitions(void)
 		.h_clusterbits = 1 << Z_EROFS_FRAGMENT_INODE_BIT
 	};
 
-	BUILD_BUG_ON(sizeof(struct erofs_super_block) != 128);
+	BUILD_BUG_ON(sizeof(struct erofs_super_block) != 144);
 	BUILD_BUG_ON(sizeof(struct erofs_inode_compact) != 32);
 	BUILD_BUG_ON(sizeof(struct erofs_inode_extended) != 64);
 	BUILD_BUG_ON(sizeof(struct erofs_xattr_ibody_header) != 12);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index a7699114f6fe..ad932f670bb6 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -227,8 +227,10 @@ EROFS_FEATURE_FUNCS(fragments, incompat, INCOMPAT_FRAGMENTS)
 EROFS_FEATURE_FUNCS(dedupe, incompat, INCOMPAT_DEDUPE)
 EROFS_FEATURE_FUNCS(xattr_prefixes, incompat, INCOMPAT_XATTR_PREFIXES)
 EROFS_FEATURE_FUNCS(48bit, incompat, INCOMPAT_48BIT)
+EROFS_FEATURE_FUNCS(metabox, incompat, INCOMPAT_METABOX)
 EROFS_FEATURE_FUNCS(sb_chksum, compat, COMPAT_SB_CHKSUM)
 EROFS_FEATURE_FUNCS(xattr_filter, compat, COMPAT_XATTR_FILTER)
+EROFS_FEATURE_FUNCS(shared_ea_in_metabox, compat, COMPAT_SHARED_EA_IN_METABOX)
 
 /* atomic flag definitions */
 #define EROFS_I_EA_INITED_BIT	0
-- 
2.43.5

From nobody Mon Oct 6 21:02:01 2025
From: Gao Xiang
To: linux-erofs@lists.ozlabs.org
Cc: LKML, Bo Liu, Gao Xiang
Subject: [PATCH v3 2/2] erofs: implement metadata compression
Date: Thu, 17 Jul 2025 01:33:14 +0800
Message-ID: <20250716173314.308744-3-hsiangkao@linux.alibaba.com>
In-Reply-To: <20250716173314.308744-1-hsiangkao@linux.alibaba.com>
References: <20250716173314.308744-1-hsiangkao@linux.alibaba.com>

From: Bo Liu

Thanks to the meta buffer infrastructure, metadata-compressed inodes are
just read from the metabox inode instead of the blockdevice (or backing
file) inode.

The same is true for shared extended attributes.

Co-developed-by: Gao Xiang
Signed-off-by: Bo Liu
Signed-off-by: Gao Xiang
---
 fs/erofs/data.c         | 59 +++++++++++++++++++++++++----------------
 fs/erofs/decompressor.c |  2 +-
 fs/erofs/erofs_fs.h     |  2 +-
 fs/erofs/fileio.c       |  2 +-
 fs/erofs/inode.c        |  5 ++--
 fs/erofs/internal.h     | 17 +++++++++---
 fs/erofs/super.c        | 22 +++++++++++++--
 fs/erofs/xattr.c        | 20 +++++++++-----
 fs/erofs/zdata.c        |  5 +++-
 fs/erofs/zmap.c         | 16 ++++++-----
 10 files changed, 103 insertions(+), 47 deletions(-)

diff --git a/fs/erofs/data.c b/fs/erofs/data.c
index 383c1337e157..f46c47335b9c 100644
--- a/fs/erofs/data.c
+++ b/fs/erofs/data.c
@@ -49,11 +49,18 @@ void *erofs_bread(struct erofs_buf *buf, erofs_off_t offset, bool need_kmap)
 	return buf->base + (offset & ~PAGE_MASK);
 }
 
-void erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb)
+int erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb,
+		       bool in_metabox)
 {
 	struct erofs_sb_info *sbi = EROFS_SB(sb);
 
 	buf->file = NULL;
+	if (in_metabox) {
+		if (unlikely(!sbi->metabox_inode))
+			return -EFSCORRUPTED;
+		buf->mapping = sbi->metabox_inode->i_mapping;
+		return 0;
+	}
 	buf->off = sbi->dif0.fsoff;
 	if (erofs_is_fileio_mode(sbi)) {
 		buf->file = sbi->dif0.file;	/* some fs like FUSE needs it */
@@ -62,12 +69,17 @@ void erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb)
 		buf->mapping = sbi->dif0.fscache->inode->i_mapping;
 	else
 		buf->mapping = sb->s_bdev->bd_mapping;
+	return 0;
 }
 
 void *erofs_read_metabuf(struct erofs_buf *buf, struct super_block *sb,
-			 erofs_off_t offset)
+			 erofs_off_t offset, bool in_metabox)
 {
-	erofs_init_metabuf(buf, sb);
+	int err;
+
+	err = erofs_init_metabuf(buf, sb, in_metabox);
+	if (err)
+		return ERR_PTR(err);
 	return erofs_bread(buf, offset, true);
 }
 
@@ -118,7 +130,7 @@ int erofs_map_blocks(struct inode *inode, struct erofs_map_blocks *map)
 	pos = ALIGN(erofs_iloc(inode) + vi->inode_isize +
 		    vi->xattr_isize, unit) + unit * chunknr;
 
-	idx = erofs_read_metabuf(&buf, sb, pos);
+	idx = erofs_read_metabuf(&buf, sb, pos, erofs_inode_in_metabox(inode));
 	if (IS_ERR(idx)) {
 		err = PTR_ERR(idx);
 		goto out;
@@ -264,7 +276,6 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 
 	map.m_la = offset;
 	map.m_llen = length;
-
 	ret = erofs_map_blocks(inode, &map);
 	if (ret < 0)
 		return ret;
@@ -273,35 +284,37 @@ static int erofs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
 	iomap->length = map.m_llen;
 	iomap->flags = 0;
 	iomap->private = NULL;
+	iomap->addr = IOMAP_NULL_ADDR;
 	if (!(map.m_flags & EROFS_MAP_MAPPED)) {
 		iomap->type = IOMAP_HOLE;
-		iomap->addr = IOMAP_NULL_ADDR;
 		return 0;
 	}
 
-	mdev = (struct erofs_map_dev) {
-		.m_deviceid = map.m_deviceid,
-		.m_pa = map.m_pa,
-	};
-	ret = erofs_map_dev(sb, &mdev);
-	if (ret)
-		return ret;
-
-	if (flags & IOMAP_DAX)
-		iomap->dax_dev = mdev.m_dif->dax_dev;
-	else
-		iomap->bdev = mdev.m_bdev;
-
-	iomap->addr = mdev.m_dif->fsoff + mdev.m_pa;
-	if (flags & IOMAP_DAX)
-		iomap->addr += mdev.m_dif->dax_part_off;
+	if (!(map.m_flags & EROFS_MAP_META) || !erofs_inode_in_metabox(inode)) {
+		mdev = (struct erofs_map_dev) {
+			.m_deviceid = map.m_deviceid,
+			.m_pa = map.m_pa,
+		};
+		ret = erofs_map_dev(sb, &mdev);
+		if (ret)
+			return ret;
+
+		if (flags & IOMAP_DAX)
+			iomap->dax_dev = mdev.m_dif->dax_dev;
+		else
+			iomap->bdev = mdev.m_bdev;
+		iomap->addr = mdev.m_dif->fsoff + mdev.m_pa;
+		if (flags & IOMAP_DAX)
+			iomap->addr += mdev.m_dif->dax_part_off;
+	}
 
 	if (map.m_flags & EROFS_MAP_META) {
 		void *ptr;
 		struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
 
 		iomap->type = IOMAP_INLINE;
-		ptr = erofs_read_metabuf(&buf, sb, mdev.m_pa);
+		ptr = erofs_read_metabuf(&buf, sb, map.m_pa,
+					 erofs_inode_in_metabox(inode));
 		if (IS_ERR(ptr))
 			return PTR_ERR(ptr);
 		iomap->inline_data = ptr;
diff --git a/fs/erofs/decompressor.c b/fs/erofs/decompressor.c
index 358061d7b660..354762c9723f 100644
--- a/fs/erofs/decompressor.c
+++ b/fs/erofs/decompressor.c
@@ -467,7 +467,7 @@ int z_erofs_parse_cfgs(struct super_block *sb, struct erofs_super_block *dsb)
 		return -EOPNOTSUPP;
 	}
 
-	erofs_init_metabuf(&buf, sb);
+	(void)erofs_init_metabuf(&buf, sb, false);
 	offset = EROFS_SUPER_OFFSET + sbi->sb_size;
 	alg = 0;
 	for (algs = sbi->available_compr_algs; algs; algs >>= 1, ++alg) {
diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index 0c9047e4a295..a61831a82a73 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -34,7 +34,7 @@
 #define EROFS_FEATURE_INCOMPAT_48BIT	0x00000080
 #define EROFS_FEATURE_INCOMPAT_METABOX	0x00000100
 #define EROFS_ALL_FEATURE_INCOMPAT \
-	((EROFS_FEATURE_INCOMPAT_48BIT << 1) - 1)
+	((EROFS_FEATURE_INCOMPAT_METABOX << 1) - 1)
 
 #define EROFS_SB_EXTSLOT_SIZE	16
 
diff --git a/fs/erofs/fileio.c b/fs/erofs/fileio.c
index 3ee082476c8c..b7b3432a9882 100644
--- a/fs/erofs/fileio.c
+++ b/fs/erofs/fileio.c
@@ -115,7 +115,7 @@ static int erofs_fileio_scan_folio(struct erofs_fileio *io, struct folio *folio)
 			void *src;
 
 			src = erofs_read_metabuf(&buf, inode->i_sb,
-						 map->m_pa + ofs);
+					map->m_pa + ofs, erofs_inode_in_metabox(inode));
 			if (IS_ERR(src)) {
 				err = PTR_ERR(src);
 				break;
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index 47215c5e3385..045ccca6ab30 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -29,6 +29,7 @@ static int erofs_read_inode(struct inode *inode)
 	struct super_block *sb = inode->i_sb;
 	erofs_blk_t blkaddr = erofs_blknr(sb, erofs_iloc(inode));
 	unsigned int ofs = erofs_blkoff(sb, erofs_iloc(inode));
+	bool in_mbox = erofs_inode_in_metabox(inode);
 	struct erofs_buf buf = __EROFS_BUF_INITIALIZER;
 	struct erofs_sb_info *sbi = EROFS_SB(sb);
 	erofs_blk_t addrmask = BIT_ULL(48) - 1;
@@ -39,7 +40,7 @@ static int erofs_read_inode(struct inode *inode)
 	void *ptr;
 	int err = 0;
 
-	ptr = erofs_read_metabuf(&buf, sb, erofs_pos(sb, blkaddr));
+	ptr = erofs_read_metabuf(&buf, sb, erofs_pos(sb, blkaddr), in_mbox);
 	if (IS_ERR(ptr)) {
 		err = PTR_ERR(ptr);
 		erofs_err(sb, "failed to read inode meta block (nid: %llu): %d",
@@ -78,7 +79,7 @@ static int erofs_read_inode(struct inode *inode)
 
 			memcpy(&copied, dic, gotten);
 			ptr = erofs_read_metabuf(&buf, sb,
-					erofs_pos(sb, blkaddr + 1));
+					erofs_pos(sb, blkaddr + 1), in_mbox);
 			if (IS_ERR(ptr)) {
 				err = PTR_ERR(ptr);
 				erofs_err(sb, "failed to read inode payload block (nid: %llu): %d",
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index ad932f670bb6..a0e1b0b06d33 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -125,6 +125,7 @@ struct erofs_sb_info {
 	struct erofs_sb_lz4_info lz4;
 #endif	/* CONFIG_EROFS_FS_ZIP */
 	struct inode *packed_inode;
+	struct inode *metabox_inode;
 	struct erofs_dev_context *devs;
 	u64 total_blocks;
 
@@ -148,6 +149,7 @@ struct erofs_sb_info {
 	/* what we really care is nid, rather than ino.. */
 	erofs_nid_t root_nid;
 	erofs_nid_t packed_nid;
+	erofs_nid_t metabox_nid;
 	/* used for statfs, f_files - f_favail */
 	u64 inos;
 
@@ -281,12 +283,20 @@ struct erofs_inode {
 
 #define EROFS_I(ptr)	container_of(ptr, struct erofs_inode, vfs_inode)
 
+static inline bool erofs_inode_in_metabox(struct inode *inode)
+{
+	return EROFS_I(inode)->nid & BIT(EROFS_DIRENT_NID_METABOX_BIT);
+}
+
 static inline erofs_off_t erofs_iloc(struct inode *inode)
 {
 	struct erofs_sb_info *sbi = EROFS_I_SB(inode);
+	erofs_nid_t nid_lo = EROFS_I(inode)->nid & EROFS_DIRENT_NID_MASK;
 
+	if (erofs_inode_in_metabox(inode))
+		return nid_lo << sbi->islotbits;
 	return erofs_pos(inode->i_sb, sbi->meta_blkaddr) +
-		(EROFS_I(inode)->nid << sbi->islotbits);
+		(nid_lo << sbi->islotbits);
 }
 
 static inline unsigned int erofs_inode_version(unsigned int ifmt)
@@ -385,9 +395,10 @@ void *erofs_read_metadata(struct super_block *sb, struct erofs_buf *buf,
 void erofs_unmap_metabuf(struct erofs_buf *buf);
 void erofs_put_metabuf(struct erofs_buf *buf);
 void *erofs_bread(struct erofs_buf *buf, erofs_off_t offset, bool need_kmap);
-void erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb);
+int erofs_init_metabuf(struct erofs_buf *buf, struct super_block *sb,
+		       bool in_metabox);
 void *erofs_read_metabuf(struct erofs_buf *buf, struct super_block *sb,
-			 erofs_off_t offset);
+			 erofs_off_t offset, bool in_metabox);
 int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *dev);
 int erofs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		 u64 start, u64 len);
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index bc27fa3bd678..539551cf59db 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -141,7 +141,7 @@ static int erofs_init_device(struct erofs_buf *buf, struct super_block *sb,
 	struct erofs_deviceslot *dis;
 	struct file *file;
 
-	dis = erofs_read_metabuf(buf, sb, *pos);
+	dis = erofs_read_metabuf(buf, sb, *pos, false);
 	if (IS_ERR(dis))
 		return PTR_ERR(dis);
 
@@ -258,7 +258,7 @@ static int erofs_read_superblock(struct super_block *sb)
 	void *data;
 	int ret;
 
-	data = erofs_read_metabuf(&buf, sb, 0);
+	data = erofs_read_metabuf(&buf, sb, 0, false);
 	if (IS_ERR(data)) {
 		erofs_err(sb, "cannot read erofs superblock");
 		return PTR_ERR(data);
@@ -319,6 +319,14 @@ static int erofs_read_superblock(struct super_block *sb)
 		sbi->root_nid = le16_to_cpu(dsb->rb.rootnid_2b);
 	}
 	sbi->packed_nid = le64_to_cpu(dsb->packed_nid);
+	if (erofs_sb_has_metabox(sbi)) {
+		if (sbi->sb_size <= offsetof(struct erofs_super_block,
+					     metabox_nid))
+			return -EFSCORRUPTED;
+		sbi->metabox_nid = le64_to_cpu(dsb->metabox_nid);
+		if (sbi->metabox_nid & BIT(EROFS_DIRENT_NID_METABOX_BIT))
+			return -EFSCORRUPTED;	/* self-loop detection */
+	}
 	sbi->inos = le64_to_cpu(dsb->inos);
 
 	sbi->epoch = (s64)le64_to_cpu(dsb->epoch);
@@ -335,6 +343,8 @@ static int erofs_read_superblock(struct super_block *sb)
 
 	if (erofs_sb_has_48bit(sbi))
 		erofs_info(sb, "EXPERIMENTAL 48-bit layout support in use. Use at your own risk!");
+	if (erofs_sb_has_metabox(sbi))
+		erofs_info(sb, "EXPERIMENTAL metadata compression support in use. Use at your own risk!");
 	if (erofs_is_fscache_mode(sb))
 		erofs_info(sb, "[deprecated] fscache-based on-demand read feature in use. Use at your own risk!");
 out:
@@ -690,6 +700,12 @@ static int erofs_fc_fill_super(struct super_block *sb, struct fs_context *fc)
 			return PTR_ERR(inode);
 		sbi->packed_inode = inode;
 	}
+	if (erofs_sb_has_metabox(sbi)) {
+		inode = erofs_iget(sb, sbi->metabox_nid);
+		if (IS_ERR(inode))
+			return PTR_ERR(inode);
+		sbi->metabox_inode = inode;
+	}
 
 	inode = erofs_iget(sb, sbi->root_nid);
 	if (IS_ERR(inode))
@@ -845,6 +861,8 @@ static void erofs_drop_internal_inodes(struct erofs_sb_info *sbi)
 {
 	iput(sbi->packed_inode);
 	sbi->packed_inode = NULL;
+	iput(sbi->metabox_inode);
+	sbi->metabox_inode = NULL;
 #ifdef CONFIG_EROFS_FS_ZIP
 	iput(sbi->managed_cache);
 	sbi->managed_cache = NULL;
diff --git a/fs/erofs/xattr.c b/fs/erofs/xattr.c
index 9cf84717a92e..6d2da6ad2be2 100644
--- a/fs/erofs/xattr.c
+++ b/fs/erofs/xattr.c
@@ -77,7 +77,9 @@ static int erofs_init_inode_xattrs(struct inode *inode)
 	}
 
 	it.buf = __EROFS_BUF_INITIALIZER;
-	erofs_init_metabuf(&it.buf, sb);
+	ret = erofs_init_metabuf(&it.buf, sb, erofs_inode_in_metabox(inode));
+	if (ret)
+		goto out_unlock;
 	it.pos = erofs_iloc(inode) + vi->inode_isize;
 
 	/* read in shared xattr array (non-atomic, see kmalloc below) */
@@ -326,6 +328,9 @@ static int erofs_xattr_iter_inline(struct erofs_xattr_iter *it,
 		return -ENOATTR;
 	}
 
+	ret = erofs_init_metabuf(&it->buf, it->sb, erofs_inode_in_metabox(inode));
+	if (ret)
+		return ret;
 	remaining = vi->xattr_isize - xattr_header_sz;
 	it->pos = erofs_iloc(inode) + vi->inode_isize + xattr_header_sz;
 
@@ -362,7 +367,12 @@ static int erofs_xattr_iter_shared(struct erofs_xattr_iter *it,
 	struct super_block *const sb = it->sb;
 	struct erofs_sb_info *sbi = EROFS_SB(sb);
 	unsigned int i;
-	int ret = -ENOATTR;
+	int ret;
+
+	ret = erofs_init_metabuf(&it->buf, sb,
+				 erofs_sb_has_shared_ea_in_metabox(sbi));
+	if (ret)
+		return ret;
 
 	for (i = 0; i < vi->xattr_shared_count; ++i) {
 		it->pos = erofs_pos(sb, sbi->xattr_blkaddr) +
@@ -378,7 +388,7 @@ static int erofs_xattr_iter_shared(struct erofs_xattr_iter *it,
 		if ((getxattr && ret != -ENOATTR) || (!getxattr && ret))
 			break;
 	}
-	return ret;
+	return i ? ret : -ENOATTR;
 }
 
 int erofs_getxattr(struct inode *inode, int index, const char *name,
@@ -413,7 +423,6 @@ int erofs_getxattr(struct inode *inode, int index, const char *name,
 
 	it.sb = inode->i_sb;
 	it.buf = __EROFS_BUF_INITIALIZER;
-	erofs_init_metabuf(&it.buf, it.sb);
 	it.buffer = buffer;
 	it.buffer_size = buffer_size;
 	it.buffer_ofs = 0;
@@ -439,7 +448,6 @@ ssize_t erofs_listxattr(struct dentry *dentry, char *buffer, size_t buffer_size)
 
 	it.sb = dentry->d_sb;
 	it.buf = __EROFS_BUF_INITIALIZER;
-	erofs_init_metabuf(&it.buf, it.sb);
 	it.dentry = dentry;
 	it.buffer = buffer;
 	it.buffer_size = buffer_size;
@@ -485,7 +493,7 @@ int erofs_xattr_prefixes_init(struct super_block *sb)
 	if (sbi->packed_inode)
 		buf.mapping = sbi->packed_inode->i_mapping;
 	else
-		erofs_init_metabuf(&buf, sb);
+		(void)erofs_init_metabuf(&buf, sb, false);
 
 	for (i = 0; i < sbi->xattr_prefix_count; i++) {
 		void *ptr = erofs_read_metadata(sb, &buf, &pos, &len);
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 0d1ddd9b15de..792f20888a8f 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -855,7 +855,10 @@ static int z_erofs_pcluster_begin(struct z_erofs_frontend *fe)
 		/* bind cache first when cached decompression is preferred */
 		z_erofs_bind_cache(fe);
 	} else {
-		erofs_init_metabuf(&map->buf, sb);
+		ret = erofs_init_metabuf(&map->buf, sb,
+					 erofs_inode_in_metabox(fe->inode));
+		if (ret)
+			return ret;
 		ptr = erofs_bread(&map->buf, map->m_pa, false);
 		if (IS_ERR(ptr)) {
 			ret = PTR_ERR(ptr);
diff --git a/fs/erofs/zmap.c b/fs/erofs/zmap.c
index b72a0e3f9362..a93efd95c555 100644
--- a/fs/erofs/zmap.c
+++ b/fs/erofs/zmap.c
@@ -17,7 +17,7 @@ struct z_erofs_maprecorder {
 	u16 delta[2];
 	erofs_blk_t pblk, compressedblks;
 	erofs_off_t nextpackoff;
-	bool partialref;
+	bool partialref, in_mbox;
 };
 
 static int z_erofs_load_full_lcluster(struct z_erofs_maprecorder *m,
@@ -31,7 +31,7 @@ static int z_erofs_load_full_lcluster(struct z_erofs_maprecorder *m,
 	struct z_erofs_lcluster_index *di;
 	unsigned int advise;
 
-	di = erofs_read_metabuf(&m->map->buf, inode->i_sb, pos);
+	di = erofs_read_metabuf(&m->map->buf, inode->i_sb, pos, m->in_mbox);
 	if (IS_ERR(di))
 		return PTR_ERR(di);
 	m->lcn = lcn;
@@ -146,7 +146,7 @@ static int z_erofs_load_compact_lcluster(struct z_erofs_maprecorder *m,
 	else
 		return -EOPNOTSUPP;
 
-	in = erofs_read_metabuf(&m->map->buf, m->inode->i_sb, pos);
+	in = erofs_read_metabuf(&m->map->buf, inode->i_sb, pos, m->in_mbox);
 	if (IS_ERR(in))
 		return PTR_ERR(in);
 
@@ -392,6 +392,7 @@ static int z_erofs_map_blocks_fo(struct inode *inode,
 	struct z_erofs_maprecorder m = {
 		.inode = inode,
 		.map = map,
+		.in_mbox = erofs_inode_in_metabox(inode),
 	};
 	int err = 0;
 	unsigned int endoff, afmt;
@@ -521,6 +522,7 @@ static int z_erofs_map_blocks_ext(struct inode *inode,
 	unsigned int recsz = z_erofs_extent_recsize(vi->z_advise);
 	erofs_off_t pos = round_up(Z_EROFS_MAP_HEADER_END(erofs_iloc(inode) +
 				   vi->inode_isize + vi->xattr_isize), recsz);
+	bool in_mbox = erofs_inode_in_metabox(inode);
 	erofs_off_t lend = inode->i_size;
 	erofs_off_t l, r, mid, pa, la, lstart;
 	struct z_erofs_extent *ext;
@@ -530,7 +532,7 @@ static int z_erofs_map_blocks_ext(struct inode *inode,
 	map->m_flags = 0;
 	if (recsz <= offsetof(struct z_erofs_extent, pstart_hi)) {
 		if (recsz <= offsetof(struct z_erofs_extent, pstart_lo)) {
-			ext = erofs_read_metabuf(&map->buf, sb, pos);
+			ext = erofs_read_metabuf(&map->buf, sb, pos, in_mbox);
 			if (IS_ERR(ext))
 				return PTR_ERR(ext);
 			pa = le64_to_cpu(*(__le64 *)ext);
@@ -543,7 +545,7 @@ static int z_erofs_map_blocks_ext(struct inode *inode,
 		}
 
 		for (; lstart <= map->m_la; lstart += 1 << vi->z_lclusterbits) {
-			ext = erofs_read_metabuf(&map->buf, sb, pos);
+			ext = erofs_read_metabuf(&map->buf, sb, pos, in_mbox);
 			if (IS_ERR(ext))
 				return PTR_ERR(ext);
 			map->m_plen = le32_to_cpu(ext->plen);
@@ -563,7 +565,7 @@ static int z_erofs_map_blocks_ext(struct inode *inode,
 		for (l = 0, r = vi->z_extents; l < r; ) {
 			mid = l + (r - l) / 2;
 			ext = erofs_read_metabuf(&map->buf, sb,
-						 pos + mid * recsz);
+						 pos + mid * recsz, in_mbox);
 			if (IS_ERR(ext))
 				return PTR_ERR(ext);
 
@@ -645,7 +647,7 @@ static int z_erofs_fill_inode(struct inode *inode, struct erofs_map_blocks *map)
 		goto out_unlock;
 
 	pos = ALIGN(erofs_iloc(inode) + vi->inode_isize + vi->xattr_isize, 8);
-	h = erofs_read_metabuf(&map->buf, sb, pos);
+	h = erofs_read_metabuf(&map->buf, sb, pos, erofs_inode_in_metabox(inode));
 	if (IS_ERR(h)) {
 		err = PTR_ERR(h);
 		goto out_unlock;
-- 
2.43.5