[PATCH v2] ocfs2: validate bg_list extent bounds in discontig groups

Posted by ZhengYuan Huang 4 days, 10 hours ago
[BUG]
Running ocfs2 on a corrupted image with a discontiguous block
group whose bg_list.l_next_free_rec is set to an excessively
large value triggers a KASAN use-after-free crash:

BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
Read of size 4 at addr ffff88801a85f000 by task syz.0.115/552

Call Trace:
 ...
 __asan_report_load4_noabort+0x14/0x30 mm/kasan/report_generic.c:380
 ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
 ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
 ocfs2_search_one_group fs/ocfs2/suballoc.c:1752 [inline]
 ocfs2_claim_suballoc_bits+0x13c3/0x1cd0 fs/ocfs2/suballoc.c:1984
 ocfs2_claim_new_inode+0x2e7/0x8a0 fs/ocfs2/suballoc.c:2292
 ocfs2_mknod_locked.constprop.0+0x121/0x2a0 fs/ocfs2/namei.c:637
 ocfs2_mknod+0xc71/0x2400 fs/ocfs2/namei.c:384
 ocfs2_create+0x158/0x390 fs/ocfs2/namei.c:676
 lookup_open.isra.0+0x10a1/0x1460 fs/namei.c:3796
 open_last_lookups fs/namei.c:3895 [inline]
 path_openat+0x11fe/0x2ce0 fs/namei.c:4131
 do_filp_open+0x1f6/0x430 fs/namei.c:4161
 do_sys_openat2+0x117/0x1c0 fs/open.c:1437
 do_sys_open fs/open.c:1452 [inline]
 __do_sys_openat fs/open.c:1468 [inline]
 ...

[CAUSE]
ocfs2_bg_discontig_fix_result() iterates over bg->bg_list.l_recs[]
using l_next_free_rec as the upper bound without any sanity check:

  for (i = 0; i < le16_to_cpu(bg->bg_list.l_next_free_rec); i++) {
          rec = &bg->bg_list.l_recs[i];

l_next_free_rec is read directly from the on-disk group descriptor and
is trusted blindly. On a 4 KiB block device, bg_list.l_recs[] can hold
at most 235 entries (ocfs2_extent_recs_per_gd(sb)). A corrupted or
crafted filesystem image can set l_next_free_rec to an arbitrarily
large value, causing the loop to index past the end of the group
descriptor buffer_head data page and into an adjacent freed page.
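
The 235-entry figure can be reproduced outside the kernel. The sketch below is a user-space illustration, not the kernel code: the `*_sketch` names are hypothetical, and the layout (64 bytes of fixed descriptor fields, a 256-byte bitmap filler, a 16-byte list header, 16-byte records) is an assumption that should be checked against fs/ocfs2/ocfs2_fs.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one on-disk extent record (assumed 16 bytes). */
struct extent_rec_sketch {
	uint32_t e_cpos;
	uint32_t e_clusters;
	uint64_t e_blkno;
};

/* Hypothetical stand-in for the extent list header preceding l_recs[]. */
struct extent_list_hdr_sketch {
	uint16_t l_tree_depth;
	uint16_t l_count;
	uint16_t l_next_free_rec;
	uint16_t l_reserved1;
	uint64_t l_reserved2;
};

/* Everything in a discontig group descriptor that precedes l_recs[]. */
struct gd_prefix_sketch {
	uint8_t fixed_fields[64];       /* assumed size of the fixed fields */
	uint8_t bg_bitmap_filler[256];  /* assumed bitmap filler size */
	struct extent_list_hdr_sketch bg_list;
};

/* Number of whole extent records that fit in the rest of one block. */
unsigned int recs_per_gd_sketch(unsigned int blocksize)
{
	size_t prefix = sizeof(struct gd_prefix_sketch);

	assert(blocksize >= prefix);
	return (unsigned int)((blocksize - prefix) /
			      sizeof(struct extent_rec_sketch));
}
```

Under these assumptions, recs_per_gd_sketch(4096) gives (4096 - 336) / 16 = 235, so any loop index of 235 or more reads beyond the 4 KiB block.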

[FIX]
For discontiguous block groups, require bg_list.l_count to equal
ocfs2_extent_recs_per_gd(sb), then reject l_next_free_rec values that
exceed l_count. This keeps the on-disk extent list self-consistent and
matches how the rest of ocfs2 uses l_count as the extent-list bound.

Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
Changes in v2:
- Validate bg_list.l_count against ocfs2_extent_recs_per_gd(sb).
- Bound bg_list.l_next_free_rec by bg_list.l_count.
- Update the changelog text to explain the l_count validation.
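
For readers who want to exercise the two checks in isolation, here is a minimal user-space sketch of the same logic. The function and enum names are hypothetical and exist only for illustration; the kernel change reports failures through do_error() rather than returning a code like this.

```c
#include <stdint.h>

/* Hypothetical result codes for this illustration only. */
enum gd_list_check {
	GD_LIST_OK = 0,
	GD_LIST_BAD_COUNT,      /* l_count differs from the per-block maximum */
	GD_LIST_BAD_NEXT_FREE   /* l_next_free_rec exceeds l_count */
};

/*
 * Mirror of the two checks added for discontiguous groups, taking
 * already byte-swapped (CPU-order) values as input.
 */
enum gd_list_check check_discontig_list(uint16_t l_count,
					 uint16_t l_next_free_rec,
					 uint16_t max_recs)
{
	/* The list capacity must match what physically fits in one block. */
	if (l_count != max_recs)
		return GD_LIST_BAD_COUNT;

	/* The number of used records may not exceed the capacity. */
	if (l_next_free_rec > l_count)
		return GD_LIST_BAD_NEXT_FREE;

	return GD_LIST_OK;
}
```

With max_recs = 235, a descriptor carrying l_next_free_rec = 0xffff is rejected by the second check, while l_next_free_rec == l_count (a full list) is still accepted.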
---
 fs/ocfs2/suballoc.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 6ac4dcd54588..29a8a878c2df 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -196,6 +196,31 @@ static int ocfs2_validate_gd_self(struct super_block *sb,
 			 8 * le16_to_cpu(gd->bg_size));
 	}
 
+	/*
+	 * For discontiguous block groups, validate the on-disk extent list
+	 * against the maximum number of extent records that can physically
+	 * fit in a single block.
+	 */
+	if (ocfs2_gd_is_discontig(gd)) {
+		u16 max_recs = ocfs2_extent_recs_per_gd(sb);
+		u16 l_count = le16_to_cpu(gd->bg_list.l_count);
+		u16 l_next_free_rec = le16_to_cpu(gd->bg_list.l_next_free_rec);
+
+		if (l_count != max_recs) {
+			do_error("Group descriptor #%llu bad discontig l_count %u expected %u\n",
+				 (unsigned long long)bh->b_blocknr,
+				 l_count,
+				 max_recs);
+		}
+
+		if (l_next_free_rec > l_count) {
+			do_error("Group descriptor #%llu bad discontig l_next_free_rec %u max %u\n",
+				 (unsigned long long)bh->b_blocknr,
+				 l_next_free_rec,
+				 l_count);
+		}
+	}
+
 	return 0;
 }
 
-- 
2.43.0
Re: [PATCH v2] ocfs2: validate bg_list extent bounds in discontig groups
Posted by Joseph Qi 4 days, 9 hours ago

On 4/1/26 10:16 AM, ZhengYuan Huang wrote:
> [BUG]
> Running ocfs2 on a corrupted image with a discontiguous block
> group whose bg_list.l_next_free_rec is set to an excessively
> large value triggers a KASAN use-after-free crash:
> 
> BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
> BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
> Read of size 4 at addr ffff88801a85f000 by task syz.0.115/552
> 
> Call Trace:
>  ...
>  __asan_report_load4_noabort+0x14/0x30 mm/kasan/report_generic.c:380
>  ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
>  ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
>  ocfs2_search_one_group fs/ocfs2/suballoc.c:1752 [inline]
>  ocfs2_claim_suballoc_bits+0x13c3/0x1cd0 fs/ocfs2/suballoc.c:1984
>  ocfs2_claim_new_inode+0x2e7/0x8a0 fs/ocfs2/suballoc.c:2292
>  ocfs2_mknod_locked.constprop.0+0x121/0x2a0 fs/ocfs2/namei.c:637
>  ocfs2_mknod+0xc71/0x2400 fs/ocfs2/namei.c:384
>  ocfs2_create+0x158/0x390 fs/ocfs2/namei.c:676
>  lookup_open.isra.0+0x10a1/0x1460 fs/namei.c:3796
>  open_last_lookups fs/namei.c:3895 [inline]
>  path_openat+0x11fe/0x2ce0 fs/namei.c:4131
>  do_filp_open+0x1f6/0x430 fs/namei.c:4161
>  do_sys_openat2+0x117/0x1c0 fs/open.c:1437
>  do_sys_open fs/open.c:1452 [inline]
>  __do_sys_openat fs/open.c:1468 [inline]
>  ...
> 
> [CAUSE]
> ocfs2_bg_discontig_fix_result() iterates over bg->bg_list.l_recs[]
> using l_next_free_rec as the upper bound without any sanity check:
> 
>   for (i = 0; i < le16_to_cpu(bg->bg_list.l_next_free_rec); i++) {
>           rec = &bg->bg_list.l_recs[i];
> 
> l_next_free_rec is read directly from the on-disk group descriptor and
> is trusted blindly. On a 4 KiB block device, bg_list.l_recs[] can hold
> at most 235 entries (ocfs2_extent_recs_per_gd(sb)). A corrupted or
> crafted filesystem image can set l_next_free_rec to an arbitrarily
> large value, causing the loop to index past the end of the group
> descriptor buffer_head data page and into an adjacent freed page.
> 
> [FIX]
> For discontiguous block groups, require bg_list.l_count to equal
> ocfs2_extent_recs_per_gd(sb), then reject l_next_free_rec values that
> exceed l_count. This keeps the on-disk extent list self-consistent and
> matches how the rest of ocfs2 uses l_count as the extent-list bound.
> 
> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>

Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>

> ---
> Changes in v2:
> - Validate bg_list.l_count against ocfs2_extent_recs_per_gd(sb).
> - Bound bg_list.l_next_free_rec by bg_list.l_count.
> - Update the changelog text to explain the l_count validation.
> ---
>  fs/ocfs2/suballoc.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
> index 6ac4dcd54588..29a8a878c2df 100644
> --- a/fs/ocfs2/suballoc.c
> +++ b/fs/ocfs2/suballoc.c
> @@ -196,6 +196,31 @@ static int ocfs2_validate_gd_self(struct super_block *sb,
>  			 8 * le16_to_cpu(gd->bg_size));
>  	}
>  
> +	/*
> +	 * For discontiguous block groups, validate the on-disk extent list
> +	 * against the maximum number of extent records that can physically
> +	 * fit in a single block.
> +	 */
> +	if (ocfs2_gd_is_discontig(gd)) {
> +		u16 max_recs = ocfs2_extent_recs_per_gd(sb);
> +		u16 l_count = le16_to_cpu(gd->bg_list.l_count);
> +		u16 l_next_free_rec = le16_to_cpu(gd->bg_list.l_next_free_rec);
> +
> +		if (l_count != max_recs) {
> +			do_error("Group descriptor #%llu bad discontig l_count %u expected %u\n",
> +				 (unsigned long long)bh->b_blocknr,
> +				 l_count,
> +				 max_recs);
> +		}
> +
> +		if (l_next_free_rec > l_count) {
> +			do_error("Group descriptor #%llu bad discontig l_next_free_rec %u max %u\n",
> +				 (unsigned long long)bh->b_blocknr,
> +				 l_next_free_rec,
> +				 l_count);
> +		}
> +	}
> +
>  	return 0;
>  }
>