[PATCH] ocfs2: validate bg_list.l_next_free_rec in discontig group descriptor

ZhengYuan Huang posted 1 patch 18 hours ago
Posted by ZhengYuan Huang 18 hours ago
[BUG]
Running ocfs2 on a corrupted image with a discontiguous block
group whose bg_list.l_next_free_rec is set to an excessively
large value triggers a KASAN use-after-free crash:

BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
Read of size 4 at addr ffff88801a85f000 by task syz.0.115/552

Call Trace:
 <TASK>
 ...
 __asan_report_load4_noabort+0x14/0x30 mm/kasan/report_generic.c:380
 ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
 ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
 ocfs2_search_one_group fs/ocfs2/suballoc.c:1752 [inline]
 ocfs2_claim_suballoc_bits+0x13c3/0x1cd0 fs/ocfs2/suballoc.c:1984
 ocfs2_claim_new_inode+0x2e7/0x8a0 fs/ocfs2/suballoc.c:2292
 ocfs2_mknod_locked.constprop.0+0x121/0x2a0 fs/ocfs2/namei.c:637
 ocfs2_mknod+0xc71/0x2400 fs/ocfs2/namei.c:384
 ocfs2_create+0x158/0x390 fs/ocfs2/namei.c:676
 lookup_open.isra.0+0x10a1/0x1460 fs/namei.c:3796
 open_last_lookups fs/namei.c:3895 [inline]
 path_openat+0x11fe/0x2ce0 fs/namei.c:4131
 do_filp_open+0x1f6/0x430 fs/namei.c:4161
 do_sys_openat2+0x117/0x1c0 fs/open.c:1437
 do_sys_open fs/open.c:1452 [inline]
 __do_sys_openat fs/open.c:1468 [inline]
 __se_sys_openat fs/open.c:1463 [inline]
 __x64_sys_openat+0x15b/0x220 fs/open.c:1463
 ...

[CAUSE]
ocfs2_bg_discontig_fix_result() iterates over bg->bg_list.l_recs[]
using l_next_free_rec as the upper bound without any sanity check:

  for (i = 0; i < le16_to_cpu(bg->bg_list.l_next_free_rec); i++) {
          rec = &bg->bg_list.l_recs[i];

l_next_free_rec is read directly from the on-disk group descriptor and
is trusted blindly. On a 4 KiB block device, bg_list.l_recs[] can hold
at most 235 entries (ocfs2_extent_recs_per_gd(sb)). A corrupted or
crafted filesystem image can set l_next_free_rec to an arbitrarily
large value, causing the loop to index past the end of the group
descriptor buffer_head data page and into an adjacent freed page.

[FIX]
Fix this by adding a bounds check in ocfs2_validate_gd_self(), which
is called for every group descriptor read via ocfs2_read_group_descriptor().
Use ocfs2_gd_is_discontig() to restrict the check to discontiguous
block groups, and ocfs2_extent_recs_per_gd(sb) as the physical upper
bound (rather than trusting the on-disk l_count, which could also be
corrupted). This follows the same do_error() pattern used by the
existing field sanity checks in ocfs2_validate_gd_self().

Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
 fs/ocfs2/suballoc.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 6ac4dcd54588..6dcf45fda457 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -196,6 +196,22 @@ static int ocfs2_validate_gd_self(struct super_block *sb,
 			 8 * le16_to_cpu(gd->bg_size));
 	}
 
+	/*
+	 * For discontiguous block groups, validate that bg_list.l_next_free_rec
+	 * does not exceed the maximum number of extent records that can physically
+	 * fit in a single block.
+	 */
+	if (ocfs2_gd_is_discontig(gd)) {
+		u16 max_recs = ocfs2_extent_recs_per_gd(sb);
+
+		if (le16_to_cpu(gd->bg_list.l_next_free_rec) > max_recs) {
+			do_error("Group descriptor #%llu bad discontig l_next_free_rec %u max %u\n",
+				 (unsigned long long)bh->b_blocknr,
+				 le16_to_cpu(gd->bg_list.l_next_free_rec),
+				 max_recs);
+		}
+	}
+
 	return 0;
 }
 
-- 
2.43.0
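
The 235-entry limit quoted in the [CAUSE] section can be reproduced with a
small userspace calculation. This is a sketch only: GD_RECS_OFFSET (336 bytes
before bg_list.l_recs[]) and EXTENT_REC_SIZE (16 bytes) are assumed values
chosen to match the 235 figure for 4 KiB blocks and should be checked against
fs/ocfs2/ocfs2_fs.h. The function names are hypothetical and are not kernel
APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout constants (hypothetical names, not from the kernel tree). */
#define GD_RECS_OFFSET  336u  /* bytes in the block before bg_list.l_recs[] */
#define EXTENT_REC_SIZE 16u   /* size of one extent record in bytes */

/* Number of extent records that fit in one group descriptor block. */
unsigned int recs_per_gd(unsigned int block_size)
{
	assert(block_size > GD_RECS_OFFSET);
	return (block_size - GD_RECS_OFFSET) / EXTENT_REC_SIZE;
}

/* Byte offset of l_recs[i], measured from the start of the block. */
size_t rec_offset(unsigned int i)
{
	return (size_t)GD_RECS_OFFSET + (size_t)i * EXTENT_REC_SIZE;
}

/* Returns 1 if all bytes of l_recs[i] lie inside the block, 0 otherwise. */
int rec_in_block(unsigned int i, unsigned int block_size)
{
	return rec_offset(i) + EXTENT_REC_SIZE <= block_size;
}
```

Under these assumptions recs_per_gd(4096) evaluates to 235, index 234 is the
last record that fits, and index 235 starts exactly at byte 4096. If the block
fills a whole page, the first out-of-bounds read would land at the start of the
next page, which would be consistent with the page-aligned address in the
KASAN report.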
Re: [PATCH] ocfs2: validate bg_list.l_next_free_rec in discontig group descriptor
Posted by Joseph Qi 6 hours ago

On 3/31/26 9:31 PM, ZhengYuan Huang wrote:
> [BUG]
> Running ocfs2 on a corrupted image with a discontiguous block
> group whose bg_list.l_next_free_rec is set to an excessively
> large value triggers a KASAN use-after-free crash:
> 
> BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
> BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
> Read of size 4 at addr ffff88801a85f000 by task syz.0.115/552
> 
> Call Trace:
>  <TASK>
>  ...
>  __asan_report_load4_noabort+0x14/0x30 mm/kasan/report_generic.c:380
>  ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
>  ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
>  ocfs2_search_one_group fs/ocfs2/suballoc.c:1752 [inline]
>  ocfs2_claim_suballoc_bits+0x13c3/0x1cd0 fs/ocfs2/suballoc.c:1984
>  ocfs2_claim_new_inode+0x2e7/0x8a0 fs/ocfs2/suballoc.c:2292
>  ocfs2_mknod_locked.constprop.0+0x121/0x2a0 fs/ocfs2/namei.c:637
>  ocfs2_mknod+0xc71/0x2400 fs/ocfs2/namei.c:384
>  ocfs2_create+0x158/0x390 fs/ocfs2/namei.c:676
>  lookup_open.isra.0+0x10a1/0x1460 fs/namei.c:3796
>  open_last_lookups fs/namei.c:3895 [inline]
>  path_openat+0x11fe/0x2ce0 fs/namei.c:4131
>  do_filp_open+0x1f6/0x430 fs/namei.c:4161
>  do_sys_openat2+0x117/0x1c0 fs/open.c:1437
>  do_sys_open fs/open.c:1452 [inline]
>  __do_sys_openat fs/open.c:1468 [inline]
>  __se_sys_openat fs/open.c:1463 [inline]
>  __x64_sys_openat+0x15b/0x220 fs/open.c:1463
>  ...
> 
> [CAUSE]
> ocfs2_bg_discontig_fix_result() iterates over bg->bg_list.l_recs[]
> using l_next_free_rec as the upper bound without any sanity check:
> 
>   for (i = 0; i < le16_to_cpu(bg->bg_list.l_next_free_rec); i++) {
>           rec = &bg->bg_list.l_recs[i];
> 
> l_next_free_rec is read directly from the on-disk group descriptor and
> is trusted blindly. On a 4 KiB block device, bg_list.l_recs[] can hold
> at most 235 entries (ocfs2_extent_recs_per_gd(sb)). A corrupted or
> crafted filesystem image can set l_next_free_rec to an arbitrarily
> large value, causing the loop to index past the end of the group
> descriptor buffer_head data page and into an adjacent freed page.
> 
> [FIX]
> Fix this by adding a bounds check in ocfs2_validate_gd_self(), which
> is called for every group descriptor read via ocfs2_read_group_descriptor().
> Use ocfs2_gd_is_discontig() to restrict the check to discontiguous
> block groups, and ocfs2_extent_recs_per_gd(sb) as the physical upper
> bound (rather than trusting the on-disk l_count, which could also be
> corrupted). This follows the same do_error() pattern used by the
> existing field sanity checks in ocfs2_validate_gd_self().
> 
> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> ---
>  fs/ocfs2/suballoc.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
> index 6ac4dcd54588..6dcf45fda457 100644
> --- a/fs/ocfs2/suballoc.c
> +++ b/fs/ocfs2/suballoc.c
> @@ -196,6 +196,22 @@ static int ocfs2_validate_gd_self(struct super_block *sb,
>  			 8 * le16_to_cpu(gd->bg_size));
>  	}
>  
> +	/*
> +	 * For discontiguous block groups, validate that bg_list.l_next_free_rec
> +	 * does not exceed the maximum number of extent records that can physically
> +	 * fit in a single block.
> +	 */
> +	if (ocfs2_gd_is_discontig(gd)) {
> +		u16 max_recs = ocfs2_extent_recs_per_gd(sb);
> +
> +		if (le16_to_cpu(gd->bg_list.l_next_free_rec) > max_recs) {

Most places only check l_count, so why not validate l_count here as
well?
Something like:

if (le16_to_cpu(el->l_count) != max_recs ||
    le16_to_cpu(el->l_next_free_rec) > le16_to_cpu(el->l_count))
	......

Thanks,
Joseph

> +			do_error("Group descriptor #%llu bad discontig l_next_free_rec %u max %u\n",
> +				 (unsigned long long)bh->b_blocknr,
> +				 le16_to_cpu(gd->bg_list.l_next_free_rec),
> +				 max_recs);
> +		}
> +	}
> +
>  	return 0;
>  }
>
Re: [PATCH] ocfs2: validate bg_list.l_next_free_rec in discontig group descriptor
Posted by ZhengYuan Huang 6 hours ago
On Wed, Apr 1, 2026 at 9:26 AM Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
> >  fs/ocfs2/suballoc.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
> > index 6ac4dcd54588..6dcf45fda457 100644
> > --- a/fs/ocfs2/suballoc.c
> > +++ b/fs/ocfs2/suballoc.c
> > @@ -196,6 +196,22 @@ static int ocfs2_validate_gd_self(struct super_block *sb,
> >                        8 * le16_to_cpu(gd->bg_size));
> >       }
> >
> > +     /*
> > +      * For discontiguous block groups, validate that bg_list.l_next_free_rec
> > +      * does not exceed the maximum number of extent records that can physically
> > +      * fit in a single block.
> > +      */
> > +     if (ocfs2_gd_is_discontig(gd)) {
> > +             u16 max_recs = ocfs2_extent_recs_per_gd(sb);
> > +
> > +             if (le16_to_cpu(gd->bg_list.l_next_free_rec) > max_recs) {
>
> Most places only check l_count, so why not validate l_count here as
> well?
> Something like:
>
> if (le16_to_cpu(el->l_count) != max_recs ||
>     le16_to_cpu(el->l_next_free_rec) > le16_to_cpu(el->l_count))
>         ......
>
> Thanks,
> Joseph

Thanks for the suggestion!

I’ll incorporate the additional validation for l_count as well and
send a v2 patch shortly.

Thanks,
ZhengYuan Huang
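
The combined check suggested in the review can be modelled outside the kernel
as follows. This is a minimal sketch, not the v2 patch: struct gd_list_hdr and
validate_discontig_list() are hypothetical names, the fields are host-endian
(kernel code would go through le16_to_cpu()), and the rule that l_count must
equal the computed maximum is taken from the suggestion above rather than
verified against every on-disk format variant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-endian stand-in for two bg_list header fields. */
struct gd_list_hdr {
	uint16_t l_count;         /* capacity recorded on disk */
	uint16_t l_next_free_rec; /* number of records in use */
};

/*
 * Returns 0 if the header is plausible for a discontiguous group,
 * -1 otherwise. max_recs is the capacity computed from the block size,
 * so it does not depend on any on-disk field.
 */
int validate_discontig_list(const struct gd_list_hdr *el, uint16_t max_recs)
{
	/* The recorded capacity must match the computed capacity. */
	if (el->l_count != max_recs)
		return -1;

	/* The in-use count must not exceed the (now trusted) capacity. */
	if (el->l_next_free_rec > el->l_count)
		return -1;

	return 0;
}
```

Checking l_count first means the second comparison is made against a value
that has already been tied to the block size, so a header with both fields
corrupted to the same large value is still rejected.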