From nobody Sun Sep 7 13:39:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E79BAC001DE for ; Mon, 24 Jul 2023 12:14:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229836AbjGXMOI (ORCPT ); Mon, 24 Jul 2023 08:14:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43754 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230237AbjGXMN7 (ORCPT ); Mon, 24 Jul 2023 08:13:59 -0400 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 894ADE66; Mon, 24 Jul 2023 05:13:58 -0700 (PDT) Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.57]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4R8fFF1C19zrRfn; Mon, 24 Jul 2023 20:13:05 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Mon, 24 Jul 2023 20:13:55 +0800 From: Baokun Li To: CC: , , , , , , , , , Subject: [PATCH v2 1/3] ext4: add two helper functions extent_logical_end() and pa_logical_end() Date: Mon, 24 Jul 2023 20:10:57 +0800 Message-ID: <20230724121059.11834-2-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230724121059.11834-1-libaokun1@huawei.com> References: <20230724121059.11834-1-libaokun1@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" When we use lstart + len to calculate the end of free extent or prealloc space, it may exceed the maximum value of 4294967295(0xffffffff) supported by ext4_lblk_t and cause overflow, which may lead to various problems. Therefore, we add two helper functions, extent_logical_end() and pa_logical_end(), to limit the type of end to loff_t, and also convert lstart to loff_t for calculation to avoid overflow. Signed-off-by: Baokun Li Reviewed-by: Jan Kara Reviewed-by: Ritesh Harjani (IBM) --- fs/ext4/mballoc.c | 9 +++------ fs/ext4/mballoc.h | 14 ++++++++++++++ 2 files changed, 17 insertions(+), 6 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 21b903fe546e..4cb13b3e41b3 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -4432,7 +4432,7 @@ ext4_mb_normalize_request(struct ext4_allocation_cont= ext *ac, =20 /* first, let's learn actual file size * given current request is allocated */ - size =3D ac->ac_o_ex.fe_logical + EXT4_C2B(sbi, ac->ac_o_ex.fe_len); + size =3D extent_logical_end(sbi, &ac->ac_o_ex); size =3D size << bsbits; if (size < i_size_read(ac->ac_inode)) size =3D i_size_read(ac->ac_inode); @@ -4766,7 +4766,6 @@ ext4_mb_use_preallocated(struct ext4_allocation_conte= xt *ac) struct ext4_inode_info *ei =3D EXT4_I(ac->ac_inode); struct ext4_locality_group *lg; struct ext4_prealloc_space *tmp_pa =3D NULL, *cpa =3D NULL; - loff_t tmp_pa_end; struct rb_node *iter; ext4_fsblk_t goal_block; =20 @@ -4862,9 +4861,7 @@ ext4_mb_use_preallocated(struct ext4_allocation_conte= xt *ac) * pa can possibly satisfy the request hence check if it overlaps * original logical start and stop searching if it doesn't. */ - tmp_pa_end =3D (loff_t)tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len); - - if (ac->ac_o_ex.fe_logical >=3D tmp_pa_end) { + if (ac->ac_o_ex.fe_logical >=3D pa_logical_end(sbi, tmp_pa)) { spin_unlock(&tmp_pa->pa_lock); goto try_group_pa; } @@ -5769,7 +5766,7 @@ static void ext4_mb_group_or_file(struct ext4_allocat= ion_context *ac) =20 group_pa_eligible =3D sbi->s_mb_group_prealloc > 0; inode_pa_eligible =3D true; - size =3D ac->ac_o_ex.fe_logical + EXT4_C2B(sbi, ac->ac_o_ex.fe_len); + size =3D extent_logical_end(sbi, &ac->ac_o_ex); isize =3D (i_size_read(ac->ac_inode) + ac->ac_sb->s_blocksize - 1) >> bsbits; =20 diff --git a/fs/ext4/mballoc.h b/fs/ext4/mballoc.h index df6b5e7c2274..d7aeb5da7d86 100644 --- a/fs/ext4/mballoc.h +++ b/fs/ext4/mballoc.h @@ -233,6 +233,20 @@ static inline ext4_fsblk_t ext4_grp_offs_to_block(stru= ct super_block *sb, (fex->fe_start << EXT4_SB(sb)->s_cluster_bits); } =20 +static inline loff_t extent_logical_end(struct ext4_sb_info *sbi, + struct ext4_free_extent *fex) +{ + /* Use loff_t to avoid end exceeding ext4_lblk_t max. */ + return (loff_t)fex->fe_logical + EXT4_C2B(sbi, fex->fe_len); +} + +static inline loff_t pa_logical_end(struct ext4_sb_info *sbi, + struct ext4_prealloc_space *pa) +{ + /* Use loff_t to avoid end exceeding ext4_lblk_t max. */ + return (loff_t)pa->pa_lstart + EXT4_C2B(sbi, pa->pa_len); +} + typedef int (*ext4_mballoc_query_range_fn)( struct super_block *sb, ext4_group_t agno, --=20 2.31.1 From nobody Sun Sep 7 13:39:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5D16C001DE for ; Mon, 24 Jul 2023 12:14:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230302AbjGXMOM (ORCPT ); Mon, 24 Jul 2023 08:14:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43760 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230242AbjGXMOA (ORCPT ); Mon, 24 Jul 2023 08:14:00 -0400 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AF7E6E67; Mon, 24 Jul 2023 05:13:58 -0700 (PDT) Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.54]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4R8fBX4NMSztRBs; Mon, 24 Jul 2023 20:10:44 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Mon, 24 Jul 2023 20:13:56 +0800 From: Baokun Li To: CC: , , , , , , , , , , Subject: [PATCH v2 2/3] ext4: fix BUG in ext4_mb_new_inode_pa() due to overflow Date: Mon, 24 Jul 2023 20:10:58 +0800 Message-ID: <20230724121059.11834-3-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230724121059.11834-1-libaokun1@huawei.com> References: <20230724121059.11834-1-libaokun1@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" When we calculate the end position of ext4_free_extent, this position may be exactly where ext4_lblk_t (i.e. uint) overflows. For example, if ac_g_ex.fe_logical is 4294965248 and ac_orig_goal_len is 2048, then the computed end is 0x100000000, which is 0. If ac->ac_o_ex.fe_logical is not the first case of adjusting the best extent, that is, new_bex_end > 0, the following BUG_ON will be triggered: =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D kernel BUG at fs/ext4/mballoc.c:5116! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 3 PID: 673 Comm: xfs_io Tainted: G E 6.5.0-rc1+ #279 RIP: 0010:ext4_mb_new_inode_pa+0xc5/0x430 Call Trace: ext4_mb_use_best_found+0x203/0x2f0 ext4_mb_try_best_found+0x163/0x240 ext4_mb_regular_allocator+0x158/0x1550 ext4_mb_new_blocks+0x86a/0xe10 ext4_ext_map_blocks+0xb0c/0x13a0 ext4_map_blocks+0x2cd/0x8f0 ext4_iomap_begin+0x27b/0x400 iomap_iter+0x222/0x3d0 __iomap_dio_rw+0x243/0xcb0 iomap_dio_rw+0x16/0x80 =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D A simple reproducer demonstrating the problem: mkfs.ext4 -F /dev/sda -b 4096 100M mount /dev/sda /tmp/test fallocate -l1M /tmp/test/tmp fallocate -l10M /tmp/test/file fallocate -i -o 1M -l16777203M /tmp/test/file fsstress -d /tmp/test -l 0 -n 100000 -p 8 & sleep 10 && killall -9 fsstress rm -f /tmp/test/tmp xfs_io -c "open -ad /tmp/test/file" -c "pwrite -S 0xff 0 8192" We simply refactor the logic for adjusting the best extent by adding a temporary ext4_free_extent ex and use extent_logical_end() to avoid overflow, which also simplifies the code. Cc: stable@kernel.org # 6.4 Fixes: 93cdf49f6eca ("ext4: Fix best extent lstart adjustment logic in ext4= _mb_new_inode_pa()") Signed-off-by: Baokun Li Reviewed-by: Jan Kara Reviewed-by: Ritesh Harjani (IBM) --- fs/ext4/mballoc.c | 31 ++++++++++++++----------------- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 4cb13b3e41b3..86bce870dc5a 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -5177,8 +5177,11 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context = *ac) pa =3D ac->ac_pa; =20 if (ac->ac_b_ex.fe_len < ac->ac_orig_goal_len) { - int new_bex_start; - int new_bex_end; + struct ext4_free_extent ex =3D { + .fe_logical =3D ac->ac_g_ex.fe_logical, + .fe_len =3D ac->ac_orig_goal_len, + }; + loff_t orig_goal_end =3D extent_logical_end(sbi, &ex); =20 /* we can't allocate as much as normalizer wants. * so, found space must get proper lstart @@ -5197,29 +5200,23 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context= *ac) * still cover original start * 3. Else, keep the best ex at start of original request. */ - new_bex_end =3D ac->ac_g_ex.fe_logical + - EXT4_C2B(sbi, ac->ac_orig_goal_len); - new_bex_start =3D new_bex_end - EXT4_C2B(sbi, ac->ac_b_ex.fe_len); - if (ac->ac_o_ex.fe_logical >=3D new_bex_start) - goto adjust_bex; + ex.fe_len =3D ac->ac_b_ex.fe_len; =20 - new_bex_start =3D ac->ac_g_ex.fe_logical; - new_bex_end =3D - new_bex_start + EXT4_C2B(sbi, ac->ac_b_ex.fe_len); - if (ac->ac_o_ex.fe_logical < new_bex_end) + ex.fe_logical =3D orig_goal_end - EXT4_C2B(sbi, ex.fe_len); + if (ac->ac_o_ex.fe_logical >=3D ex.fe_logical) goto adjust_bex; =20 - new_bex_start =3D ac->ac_o_ex.fe_logical; - new_bex_end =3D - new_bex_start + EXT4_C2B(sbi, ac->ac_b_ex.fe_len); + ex.fe_logical =3D ac->ac_g_ex.fe_logical; + if (ac->ac_o_ex.fe_logical < extent_logical_end(sbi, &ex)) + goto adjust_bex; =20 + ex.fe_logical =3D ac->ac_o_ex.fe_logical; adjust_bex: - ac->ac_b_ex.fe_logical =3D new_bex_start; + ac->ac_b_ex.fe_logical =3D ex.fe_logical; =20 BUG_ON(ac->ac_o_ex.fe_logical < ac->ac_b_ex.fe_logical); BUG_ON(ac->ac_o_ex.fe_len > ac->ac_b_ex.fe_len); - BUG_ON(new_bex_end > (ac->ac_g_ex.fe_logical + - EXT4_C2B(sbi, ac->ac_orig_goal_len))); + BUG_ON(extent_logical_end(sbi, &ex) > orig_goal_end); } =20 pa->pa_lstart =3D ac->ac_b_ex.fe_logical; --=20 2.31.1 From nobody Sun Sep 7 13:39:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC9D6C001E0 for ; Mon, 24 Jul 2023 12:14:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230312AbjGXMOP (ORCPT ); Mon, 24 Jul 2023 08:14:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43754 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230248AbjGXMOA (ORCPT ); Mon, 24 Jul 2023 08:14:00 -0400 Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F229E61; Mon, 24 Jul 2023 05:13:59 -0700 (PDT) Received: from dggpeml500021.china.huawei.com (unknown [172.30.72.54]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4R8fCJ2fnqzHqdQ; Mon, 24 Jul 2023 20:11:24 +0800 (CST) Received: from huawei.com (10.175.127.227) by dggpeml500021.china.huawei.com (7.185.36.21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Mon, 24 Jul 2023 20:13:56 +0800 From: Baokun Li To: CC: , , , , , , , , , Subject: [PATCH v2 3/3] ext4: avoid overlapping preallocations due to overflow Date: Mon, 24 Jul 2023 20:10:59 +0800 Message-ID: <20230724121059.11834-4-libaokun1@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230724121059.11834-1-libaokun1@huawei.com> References: <20230724121059.11834-1-libaokun1@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggpeml500021.china.huawei.com (7.185.36.21) X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Let's say we want to allocate 2 blocks starting from 4294966386, after predicting the file size, start is aligned to 4294965248, len is changed to 2048, then end =3D start + size =3D 0x100000000. Since end is of type ext4_lblk_t, i.e. uint, end is truncated to 0. This causes (pa->pa_lstart >=3D end) to always hold when checking if the current extent to be allocated crosses already preallocated blocks, so the resulting ac_g_ex may cross already preallocated blocks. Hence we convert the end type to loff_t and use pa_logical_end() to avoid overflow. Signed-off-by: Baokun Li Reviewed-by: Jan Kara Reviewed-by: Ritesh Harjani (IBM) --- fs/ext4/mballoc.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index 86bce870dc5a..78a4a24e2f57 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -4222,12 +4222,13 @@ ext4_mb_pa_rb_next_iter(ext4_lblk_t new_start, ext4= _lblk_t cur_start, struct rb_ =20 static inline void ext4_mb_pa_assert_overlap(struct ext4_allocation_context *ac, - ext4_lblk_t start, ext4_lblk_t end) + ext4_lblk_t start, loff_t end) { struct ext4_sb_info *sbi =3D EXT4_SB(ac->ac_sb); struct ext4_inode_info *ei =3D EXT4_I(ac->ac_inode); struct ext4_prealloc_space *tmp_pa; - ext4_lblk_t tmp_pa_start, tmp_pa_end; + ext4_lblk_t tmp_pa_start; + loff_t tmp_pa_end; struct rb_node *iter; =20 read_lock(&ei->i_prealloc_lock); @@ -4236,7 +4237,7 @@ ext4_mb_pa_assert_overlap(struct ext4_allocation_cont= ext *ac, tmp_pa =3D rb_entry(iter, struct ext4_prealloc_space, pa_node.inode_node); tmp_pa_start =3D tmp_pa->pa_lstart; - tmp_pa_end =3D tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len); + tmp_pa_end =3D pa_logical_end(sbi, tmp_pa); =20 spin_lock(&tmp_pa->pa_lock); if (tmp_pa->pa_deleted =3D=3D 0) @@ -4258,14 +4259,14 @@ ext4_mb_pa_assert_overlap(struct ext4_allocation_co= ntext *ac, */ static inline void ext4_mb_pa_adjust_overlap(struct ext4_allocation_context *ac, - ext4_lblk_t *start, ext4_lblk_t *end) + ext4_lblk_t *start, loff_t *end) { struct ext4_inode_info *ei =3D EXT4_I(ac->ac_inode); struct ext4_sb_info *sbi =3D EXT4_SB(ac->ac_sb); struct ext4_prealloc_space *tmp_pa =3D NULL, *left_pa =3D NULL, *right_pa= =3D NULL; struct rb_node *iter; - ext4_lblk_t new_start, new_end; - ext4_lblk_t tmp_pa_start, tmp_pa_end, left_pa_end =3D -1, right_pa_start = =3D -1; + ext4_lblk_t new_start, tmp_pa_start, right_pa_start =3D -1; + loff_t new_end, tmp_pa_end, left_pa_end =3D -1; =20 new_start =3D *start; new_end =3D *end; @@ -4284,7 +4285,7 @@ ext4_mb_pa_adjust_overlap(struct ext4_allocation_cont= ext *ac, tmp_pa =3D rb_entry(iter, struct ext4_prealloc_space, pa_node.inode_node); tmp_pa_start =3D tmp_pa->pa_lstart; - tmp_pa_end =3D tmp_pa->pa_lstart + EXT4_C2B(sbi, tmp_pa->pa_len); + tmp_pa_end =3D pa_logical_end(sbi, tmp_pa); =20 /* PA must not overlap original request */ spin_lock(&tmp_pa->pa_lock); @@ -4364,8 +4365,7 @@ ext4_mb_pa_adjust_overlap(struct ext4_allocation_cont= ext *ac, } =20 if (left_pa) { - left_pa_end =3D - left_pa->pa_lstart + EXT4_C2B(sbi, left_pa->pa_len); + left_pa_end =3D pa_logical_end(sbi, left_pa); BUG_ON(left_pa_end > ac->ac_o_ex.fe_logical); } =20 @@ -4404,8 +4404,7 @@ ext4_mb_normalize_request(struct ext4_allocation_cont= ext *ac, struct ext4_sb_info *sbi =3D EXT4_SB(ac->ac_sb); struct ext4_super_block *es =3D sbi->s_es; int bsbits, max; - ext4_lblk_t end; - loff_t size, start_off; + loff_t size, start_off, end; loff_t orig_size __maybe_unused; ext4_lblk_t start; =20 --=20 2.31.1