There is a case where s_first_data_block is not 0 and the block number is
smaller than s_first_data_block when calculating the group bitmap during
allocation. This underflow makes the index exceed es->s_groups_count in
ext4_get_group_info() and triggers the BUG_ON.

Fix it by protecting against the underflow.

Fixes: 72b64b594081ef ("ext4 uninline ext4_get_group_no_and_offset()")
Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002
Reported-by: syzbot+6be2b977c89f79b6b153@syzkaller.appspotmail.com
Signed-off-by: Jun Nie <jun.nie@linaro.org>
---
fs/ext4/balloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index 8ff4b9192a9f..177ef6bd635a 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -56,7 +56,8 @@ void ext4_get_group_no_and_offset(struct super_block *sb, ext4_fsblk_t blocknr,
 	struct ext4_super_block *es = EXT4_SB(sb)->s_es;
 	ext4_grpblk_t offset;
 
-	blocknr = blocknr - le32_to_cpu(es->s_first_data_block);
+	blocknr = blocknr > le32_to_cpu(es->s_first_data_block) ?
+		blocknr - le32_to_cpu(es->s_first_data_block) : 0;
 	offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >>
 		EXT4_SB(sb)->s_cluster_bits;
 	if (offsetp)
--
2.34.1
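
To make the arithmetic in the hunk above concrete, here is a minimal userspace sketch of the wrap-around, not kernel code. The names fsblk_t, group_unclamped() and group_clamped() are hypothetical stand-ins, and a plain 64-bit division replaces do_div().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for ext4_fsblk_t (an unsigned 64-bit type). */
typedef uint64_t fsblk_t;

/* Group number as computed before the patch: the subtraction can wrap. */
static uint64_t group_unclamped(fsblk_t blocknr, uint32_t first_data_block,
				uint32_t blocks_per_group)
{
	assert(blocks_per_group > 0);
	/* With blocknr < first_data_block this wraps to a huge value. */
	return (blocknr - first_data_block) / blocks_per_group;
}

/* Group number with the clamp from the patch applied. */
static uint64_t group_clamped(fsblk_t blocknr, uint32_t first_data_block,
			      uint32_t blocks_per_group)
{
	assert(blocks_per_group > 0);
	fsblk_t rel = blocknr > first_data_block ?
		blocknr - first_data_block : 0;
	return rel / blocks_per_group;
}
```

With blocknr 0, s_first_data_block 1 and 8192 blocks per group, the unclamped version yields a group number far beyond any real group count, while the clamped one yields 0. Note that the clamp also makes block 0 indistinguishable from block 1, so it hides the out-of-range input rather than rejecting it.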
On Thu, Dec 22, 2022 at 10:02:44AM +0800, Jun Nie wrote:
> There is a case where s_first_data_block is not 0 and the block number is
> smaller than s_first_data_block when calculating the group bitmap during
> allocation. This underflow makes the index exceed es->s_groups_count in
> ext4_get_group_info() and triggers the BUG_ON.
>
> Fix it by protecting against the underflow.

When was this happening, and why? If blocknr is less than
s_first_data_block, this is either insufficient input validation,
insufficient validation to detect file system corruption, or some
other kernel bug.

Looking quickly at the code and the repro, it appears that the issue is
that FS_IOC_GETFSMAP is getting passed a starting physical block of 0
in fmh_keys[0] on a file system with a blocksize of 1k (in which
case s_first_data_block is 1). It's unclear to me what
FS_IOC_GETFSMAP should *do* when passed a value which requests that it
provide a mapping for a block which is out of bounds (either too big
or too small). Should it return an error? Should it simply not
return a mapping? The man page for ioctl_getfsmap() doesn't shed any
light on this question.

Darrick, you designed the interface and wrote most of fs/ext4/fsmap.c.
Can you let us know what is supposed to happen in this case? Many
thanks!!

> Fixes: 72b64b594081ef ("ext4 uninline ext4_get_group_no_and_offset()")

This makes ***no*** sense; the commit in question is from 2006, which
means that in some jurisdictions it's old enough to drive a car. :-)
Furthermore, all it does is move the function from an inline function
to a C file (in this case, balloc.c). It also long predates the
introduction of FS_IOC_GETFSMAP support, which was in 2017.

I'm guessing you just did a "git blame" and blindly assumed that
whatever commit last touched the C code in question was what
introduced the problem?

Anyway, please try to understand what is going on instead of doing the
moral equivalent of taking a sledgehammer to the code until the
reproducer stops triggering a BUG. It's not enough to shut up the
reproducer; you should understand what is happening, and why, and then
strive to find the best fix to the problem. Papering over problems in
the end will result in more fragile code, and the goal of syzkaller is
to improve kernel quality. But syzkaller is just a tool, and used
wrongly, it can have the opposite effect.

Regards,

					- Ted
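
As an illustration of the "out of bounds (either too big or too small)" question, the following is a rough sketch of how a requested key could be classified before any group arithmetic is attempted. It is not the code in fs/ext4/fsmap.c; the enum, classify_block() and byte_to_block() are hypothetical names, and it assumes the key's physical offset is given in bytes, as in the man page example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result codes; not taken from fs/ext4/fsmap.c. */
enum blk_class { BLK_BELOW_FIRST, BLK_IN_RANGE, BLK_PAST_END };

/* Convert a byte offset (as carried in a GETFSMAP key) to a block number. */
static uint64_t byte_to_block(uint64_t physical, uint32_t block_size)
{
	assert(block_size > 0);
	return physical / block_size;
}

/*
 * Classify a block number against the range covered by block groups,
 * [first_data_block, blocks_count).
 */
static enum blk_class classify_block(uint64_t blocknr,
				     uint32_t first_data_block,
				     uint64_t blocks_count)
{
	assert(first_data_block < blocks_count);
	if (blocknr < first_data_block)
		return BLK_BELOW_FIRST;	/* e.g. block 0 on a 1k-block fs */
	if (blocknr >= blocks_count)
		return BLK_PAST_END;
	return BLK_IN_RANGE;
}
```

A caller could then decide per class whether to return an error, return nothing, or adjust the start of the query, which is exactly the policy question left open above.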
On Thu, Dec 22, 2022 at 12:41:58PM -0500, Theodore Ts'o wrote:
> On Thu, Dec 22, 2022 at 10:02:44AM +0800, Jun Nie wrote:
> > There is a case where s_first_data_block is not 0 and the block number is
> > smaller than s_first_data_block when calculating the group bitmap during
> > allocation. This underflow makes the index exceed es->s_groups_count in
> > ext4_get_group_info() and triggers the BUG_ON.
> >
> > Fix it by protecting against the underflow.
>
> When was this happening, and why? If blocknr is less than
> s_first_data_block, this is either insufficient input validation,
> insufficient validation to detect file system corruption, or some
> other kernel bug.
>
> Looking quickly at the code and the repro, it appears that the issue is
> that FS_IOC_GETFSMAP is getting passed a starting physical block of 0
> in fmh_keys[0] on a file system with a blocksize of 1k (in which
> case s_first_data_block is 1). It's unclear to me what

Question -- on a 1k-block filesystem, are the first 1024 bytes of the
device *reserved* by ext4 for whatever bootloader crud goes in there?
Or is that space undefined in the filesystem specification?

I never did figure that out when I was writing the ondisk specification
that's in the kernel, but maybe you remember?

> FS_IOC_GETFSMAP should *do* when passed a value which requests that it
> provide a mapping for a block which is out of bounds (either too big
> or too small). Should it return an error? Should it simply not
> return a mapping? The man page for ioctl_getfsmap() doesn't shed any
> light on this question.
>
> Darrick, you designed the interface and wrote most of fs/ext4/fsmap.c.
> Can you let us know what is supposed to happen in this case? Many
> thanks!!

If those first 1024 bytes are defined to be reserved in the ondisk
format, then you could return a mapping for those bytes with the owner
code set to EXT4_FMR_OWN_UNKNOWN.

If, however, the space is undefined, then going off this statement in
the manpage:

"For example, if the low key (fsmap_head.fmh_keys[0]) is set to (8:0,
36864, 0, 0, 0), the filesystem will only return records for extents
starting at or above 36 KiB on disk."

I think the 'at or above' clause means that ext4 should not pass back
any mapping for the byte range 0-1023 on a 1k-block filesystem.

If the low key is set to (8:0, 0, 0, 0, 0) and the high key is set to
(8:0, 1023, 0, 0, 0), then ext4 shouldn't return any mapping at all,
because there's no space usage defined for that region of the disk.

If the low key is set to (8:0, 0, 0, 0, 0) and the high key is set to
all ones, then ext4 can return mappings for the primary superblock at
offset 1024.

--D

> > Fixes: 72b64b594081ef ("ext4 uninline ext4_get_group_no_and_offset()")
>
> This makes ***no*** sense; the commit in question is from 2006, which
> means that in some jurisdictions it's old enough to drive a car. :-)
> Furthermore, all it does is move the function from an inline function
> to a C file (in this case, balloc.c). It also long predates the
> introduction of FS_IOC_GETFSMAP support, which was in 2017.
>
> I'm guessing you just did a "git blame" and blindly assumed that
> whatever commit last touched the C code in question was what
> introduced the problem?
>
> Anyway, please try to understand what is going on instead of doing the
> moral equivalent of taking a sledgehammer to the code until the
> reproducer stops triggering a BUG. It's not enough to shut up the
> reproducer; you should understand what is happening, and why, and then
> strive to find the best fix to the problem. Papering over problems in
> the end will result in more fragile code, and the goal of syzkaller is
> to improve kernel quality. But syzkaller is just a tool, and used
> wrongly, it can have the opposite effect.
>
> Regards,
>
> 					- Ted
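
The "at or above" reading described in the reply above can be sketched as a range clamp on the physical byte offsets of the two keys. This is only an illustration of the proposed semantics under the assumption that the first defined byte is known (1024 on a 1k-block filesystem); clamp_query() is a hypothetical helper, not an existing kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Clamp a [low, high] byte-range query to the space the filesystem
 * defines, which starts at defined_start. Returns false when the query
 * covers no defined space (so no mapping should be returned); otherwise
 * stores the effective starting offset in *eff_low and returns true.
 */
static bool clamp_query(uint64_t low, uint64_t high, uint64_t defined_start,
			uint64_t *eff_low)
{
	assert(eff_low != NULL);
	if (high < low || high < defined_start)
		return false;
	*eff_low = low > defined_start ? low : defined_start;
	return true;
}
```

Under this sketch, a query of 0 through 1023 returns nothing, a query of 0 through all ones starts at offset 1024 (the primary superblock), and the man page's low key of 36864 is left unchanged.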
On Thu, Dec 22, 2022 at 10:08:59AM -0800, Darrick J. Wong wrote:
>
> Question -- on a 1k-block filesystem, are the first 1024 bytes of the
> device *reserved* by ext4 for whatever bootloader crud goes in there?
> Or is that space undefined in the filesystem specification?
>
> I never did figure that out when I was writing the ondisk specification
> that's in the kernel, but maybe you remember?

That's an interesting (and philosophical) question. The ext2 file
system never had a formal specification, and this part of the file
system format was devised by Remy Card before I had gotten involved
with ext2. (I first got started writing e2fsprogs, which replaced the
previous file system utilities, which were forked from minix's tools,
and which were quite inefficient.)

In favor of it being undefined, the first 1024 bytes are not part of
any block group in an ext2 file system with a 1k block size. (The
first block group is composed of physical blocks 1 through 8192
inclusive when the block size is 1k, whereas if the blocksize is 4k,
the first block group is composed of physical blocks 0 through 32767.)
In addition, the status of the first 1024 bytes is not controlled by
an ext2 block allocation bitmap.

One could also argue the other way: ext2 was derived from the ext file
system, which in turn was derived from Minix, and the Minix file system
(which does have a specification) explicitly states that "block 0" is
reserved for the bootloader, with "block 1" being the location of the
superblock. But Minix only supports a 1k blocksize, and doesn't have
the concept of FFS-style block (cylinder) groups.

So I'd come down on the side which states that the first 1024 bytes
are "undefined" on a 1k block file system. (One could also argue that
they are "undefined" on a 2k and 4k block file system, but the first
1024 bytes are part of "block 0", and on 2k and 4k block file systems,
"block 0" is part of a block group.)

> If those first 1024 bytes are defined to be reserved in the ondisk
> format, then you could return a mapping for those bytes with the owner
> code set to EXT4_FMR_OWN_UNKNOWN.
>
> If, however, the space is undefined, then going off this statement in
> the manpage:
>
> "For example, if the low key (fsmap_head.fmh_keys[0]) is set to (8:0,
> 36864, 0, 0, 0), the filesystem will only return records for extents
> starting at or above 36 KiB on disk."
>
> I think the 'at or above' clause means that ext4 should not pass back
> any mapping for the byte range 0-1023 on a 1k-block filesystem.

Sure, sounds good to me.

					- Ted
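
The block-group ranges quoted in the message above (blocks 1 through 8192 at 1k, 0 through 32767 at 4k) can be reproduced with a small sketch. It assumes the common default geometry, where one bitmap block covers a group (8 * block size blocks per group) and the first data block is 1 only for 1k blocks; real filesystems read both values from the superblock, and all function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: first data block is 1 for 1k blocks, 0 for larger blocks. */
static uint32_t first_data_block(uint32_t block_size)
{
	assert(block_size >= 1024);
	return block_size == 1024 ? 1 : 0;
}

/* Assumption: one bitmap block per group, so 8 bits per byte of block. */
static uint64_t blocks_per_group(uint32_t block_size)
{
	return 8ULL * block_size;
}

/* First physical block belonging to the given group. */
static uint64_t group_first_block(uint64_t group, uint32_t block_size)
{
	return first_data_block(block_size) +
		group * blocks_per_group(block_size);
}

/* Last physical block belonging to the given group (inclusive). */
static uint64_t group_last_block(uint64_t group, uint32_t block_size)
{
	return group_first_block(group, block_size) +
		blocks_per_group(block_size) - 1;
}
```

Under these assumptions block 0 falls outside every group on a 1k-block filesystem, which is why a request for it has no group number or bitmap offset to map to.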