[BUG]
Running ocfs2 on a corrupted image with a discontiguous block
group whose bg_list.l_next_free_rec is set to an excessively
large value triggers a KASAN use-after-free crash:
BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
Read of size 4 at addr ffff88801a85f000 by task syz.0.115/552
Call Trace:
<TASK>
...
__asan_report_load4_noabort+0x14/0x30 mm/kasan/report_generic.c:380
ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
ocfs2_search_one_group fs/ocfs2/suballoc.c:1752 [inline]
ocfs2_claim_suballoc_bits+0x13c3/0x1cd0 fs/ocfs2/suballoc.c:1984
ocfs2_claim_new_inode+0x2e7/0x8a0 fs/ocfs2/suballoc.c:2292
ocfs2_mknod_locked.constprop.0+0x121/0x2a0 fs/ocfs2/namei.c:637
ocfs2_mknod+0xc71/0x2400 fs/ocfs2/namei.c:384
ocfs2_create+0x158/0x390 fs/ocfs2/namei.c:676
lookup_open.isra.0+0x10a1/0x1460 fs/namei.c:3796
open_last_lookups fs/namei.c:3895 [inline]
path_openat+0x11fe/0x2ce0 fs/namei.c:4131
do_filp_open+0x1f6/0x430 fs/namei.c:4161
do_sys_openat2+0x117/0x1c0 fs/open.c:1437
do_sys_open fs/open.c:1452 [inline]
__do_sys_openat fs/open.c:1468 [inline]
__se_sys_openat fs/open.c:1463 [inline]
__x64_sys_openat+0x15b/0x220 fs/open.c:1463
...
[CAUSE]
ocfs2_bg_discontig_fix_result() iterates over bg->bg_list.l_recs[]
using l_next_free_rec as the upper bound without any sanity check:

	for (i = 0; i < le16_to_cpu(bg->bg_list.l_next_free_rec); i++) {
		rec = &bg->bg_list.l_recs[i];

l_next_free_rec is read directly from the on-disk group descriptor and
is trusted blindly. On a 4 KiB block device, bg_list.l_recs[] can hold
at most 235 entries (ocfs2_extent_recs_per_gd(sb)). A corrupted or
crafted filesystem image can set l_next_free_rec to an arbitrarily
large value, causing the loop to index past the end of the group
descriptor buffer_head data page and into an adjacent freed page.
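
As a rough illustration of where the 235 figure comes from, the arithmetic can be sketched in userspace C. The four size constants below are assumptions recalled from the on-disk layout in fs/ocfs2/ocfs2_fs.h, and the helper names are hypothetical; this is not the kernel's ocfs2_extent_recs_per_gd() itself.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ASSUMED layout constants, recalled from fs/ocfs2/ocfs2_fs.h.
 * They are illustrative only; check them against your kernel tree.
 */
#define GD_FIXED_FIELDS_BYTES   64u   /* descriptor fields before the bitmap */
#define GD_BITMAP_FILLER_BYTES  256u  /* bitmap filler preceding bg_list */
#define EL_HEADER_BYTES         16u   /* extent list fields before l_recs[] */
#define EXTENT_REC_BYTES        16u   /* one extent record */

/* Byte offset of bg_list.l_recs[0] within the descriptor block. */
static uint32_t recs_offset(void)
{
	return GD_FIXED_FIELDS_BYTES + GD_BITMAP_FILLER_BYTES + EL_HEADER_BYTES;
}

/* Whole extent records that fit after the header (hypothetical helper). */
static uint32_t recs_per_gd(uint32_t blocksize)
{
	assert(blocksize > recs_offset());
	return (blocksize - recs_offset()) / EXTENT_REC_BYTES;
}

/* Byte offset just past record i: how far an access to l_recs[i] reaches. */
static uint32_t rec_end_offset(uint32_t i)
{
	return recs_offset() + (i + 1u) * EXTENT_REC_BYTES;
}
```

Under these assumed sizes, recs_per_gd(4096) evaluates to 235, and rec_end_offset(235) is 4112, already past the end of a 4096-byte block. Any l_next_free_rec above 235 therefore lets the loop read outside the descriptor.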
[FIX]
Fix this by adding a bounds check in ocfs2_validate_gd_self(), which
is called for every group descriptor read via ocfs2_read_group_descriptor().
Use ocfs2_gd_is_discontig() to restrict the check to discontiguous
block groups, and ocfs2_extent_recs_per_gd(sb) as the physical upper
bound (rather than trusting the on-disk l_count, which could also be
corrupted). This follows the same do_error() pattern used by the
existing field sanity checks in ocfs2_validate_gd_self().
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
fs/ocfs2/suballoc.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 6ac4dcd54588..6dcf45fda457 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -196,6 +196,22 @@ static int ocfs2_validate_gd_self(struct super_block *sb,
 			 8 * le16_to_cpu(gd->bg_size));
 	}
 
+	/*
+	 * For discontiguous block groups, validate that bg_list.l_next_free_rec
+	 * does not exceed the maximum number of extent records that can physically
+	 * fit in a single block.
+	 */
+	if (ocfs2_gd_is_discontig(gd)) {
+		u16 max_recs = ocfs2_extent_recs_per_gd(sb);
+
+		if (le16_to_cpu(gd->bg_list.l_next_free_rec) > max_recs) {
+			do_error("Group descriptor #%llu bad discontig l_next_free_rec %u max %u\n",
+				 (unsigned long long)bh->b_blocknr,
+				 le16_to_cpu(gd->bg_list.l_next_free_rec),
+				 max_recs);
+		}
+	}
+
 	return 0;
 }
--
2.43.0
On 3/31/26 9:31 PM, ZhengYuan Huang wrote:
> [BUG]
> Running ocfs2 on a corrupted image with a discontiguous block
> group whose bg_list.l_next_free_rec is set to an excessively
> large value triggers a KASAN use-after-free crash:
>
> BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
> BUG: KASAN: use-after-free in ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
> Read of size 4 at addr ffff88801a85f000 by task syz.0.115/552
>
> Call Trace:
> <TASK>
> ...
> __asan_report_load4_noabort+0x14/0x30 mm/kasan/report_generic.c:380
> ocfs2_bg_discontig_fix_by_rec fs/ocfs2/suballoc.c:1678 [inline]
> ocfs2_bg_discontig_fix_result+0x4a4/0x560 fs/ocfs2/suballoc.c:1715
> ocfs2_search_one_group fs/ocfs2/suballoc.c:1752 [inline]
> ocfs2_claim_suballoc_bits+0x13c3/0x1cd0 fs/ocfs2/suballoc.c:1984
> ocfs2_claim_new_inode+0x2e7/0x8a0 fs/ocfs2/suballoc.c:2292
> ocfs2_mknod_locked.constprop.0+0x121/0x2a0 fs/ocfs2/namei.c:637
> ocfs2_mknod+0xc71/0x2400 fs/ocfs2/namei.c:384
> ocfs2_create+0x158/0x390 fs/ocfs2/namei.c:676
> lookup_open.isra.0+0x10a1/0x1460 fs/namei.c:3796
> open_last_lookups fs/namei.c:3895 [inline]
> path_openat+0x11fe/0x2ce0 fs/namei.c:4131
> do_filp_open+0x1f6/0x430 fs/namei.c:4161
> do_sys_openat2+0x117/0x1c0 fs/open.c:1437
> do_sys_open fs/open.c:1452 [inline]
> __do_sys_openat fs/open.c:1468 [inline]
> __se_sys_openat fs/open.c:1463 [inline]
> __x64_sys_openat+0x15b/0x220 fs/open.c:1463
> ...
>
> [CAUSE]
> ocfs2_bg_discontig_fix_result() iterates over bg->bg_list.l_recs[]
> using l_next_free_rec as the upper bound without any sanity check:
>
> for (i = 0; i < le16_to_cpu(bg->bg_list.l_next_free_rec); i++) {
> rec = &bg->bg_list.l_recs[i];
>
> l_next_free_rec is read directly from the on-disk group descriptor and
> is trusted blindly. On a 4 KiB block device, bg_list.l_recs[] can hold
> at most 235 entries (ocfs2_extent_recs_per_gd(sb)). A corrupted or
> crafted filesystem image can set l_next_free_rec to an arbitrarily
> large value, causing the loop to index past the end of the group
> descriptor buffer_head data page and into an adjacent freed page.
>
> [FIX]
> Fix this by adding a bounds check in ocfs2_validate_gd_self(), which
> is called for every group descriptor read via ocfs2_read_group_descriptor().
> Use ocfs2_gd_is_discontig() to restrict the check to discontiguous
> block groups, and ocfs2_extent_recs_per_gd(sb) as the physical upper
> bound (rather than trusting the on-disk l_count, which could also be
> corrupted). This follows the same do_error() pattern used by the
> existing field sanity checks in ocfs2_validate_gd_self().
>
> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> ---
> fs/ocfs2/suballoc.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
> index 6ac4dcd54588..6dcf45fda457 100644
> --- a/fs/ocfs2/suballoc.c
> +++ b/fs/ocfs2/suballoc.c
> @@ -196,6 +196,22 @@ static int ocfs2_validate_gd_self(struct super_block *sb,
> 8 * le16_to_cpu(gd->bg_size));
> }
>
> + /*
> + * For discontiguous block groups, validate that bg_list.l_next_free_rec
> + * does not exceed the maximum number of extent records that can physically
> + * fit in a single block.
> + */
> + if (ocfs2_gd_is_discontig(gd)) {
> + u16 max_recs = ocfs2_extent_recs_per_gd(sb);
> +
> + if (le16_to_cpu(gd->bg_list.l_next_free_rec) > max_recs) {
Most places only check l_count, so why not validate l_count here
as well?
Something like:

	if (le16_to_cpu(el->l_count) != max_recs ||
	    le16_to_cpu(el->l_next_free_rec) > le16_to_cpu(el->l_count))
		......
Thanks,
Joseph
> + do_error("Group descriptor #%llu bad discontig l_next_free_rec %u max %u\n",
> + (unsigned long long)bh->b_blocknr,
> + le16_to_cpu(gd->bg_list.l_next_free_rec),
> + max_recs);
> + }
> + }
> +
> return 0;
> }
>
On Wed, Apr 1, 2026 at 9:26 AM Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
> > fs/ocfs2/suballoc.c | 16 ++++++++++++++++
> > 1 file changed, 16 insertions(+)
> >
> > diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
> > index 6ac4dcd54588..6dcf45fda457 100644
> > --- a/fs/ocfs2/suballoc.c
> > +++ b/fs/ocfs2/suballoc.c
> > @@ -196,6 +196,22 @@ static int ocfs2_validate_gd_self(struct super_block *sb,
> > 8 * le16_to_cpu(gd->bg_size));
> > }
> >
> > + /*
> > + * For discontiguous block groups, validate that bg_list.l_next_free_rec
> > + * does not exceed the maximum number of extent records that can physically
> > + * fit in a single block.
> > + */
> > + if (ocfs2_gd_is_discontig(gd)) {
> > + u16 max_recs = ocfs2_extent_recs_per_gd(sb);
> > +
> > + if (le16_to_cpu(gd->bg_list.l_next_free_rec) > max_recs) {
>
> Most places only check l_count, so why not validate l_count here
> as well?
> Something like:
>
>	if (le16_to_cpu(el->l_count) != max_recs ||
>	    le16_to_cpu(el->l_next_free_rec) > le16_to_cpu(el->l_count))
>		......
>
> Thanks,
> Joseph
Thanks for the suggestion!
I’ll incorporate the additional validation for l_count as well and
send a v2 patch shortly.
Thanks,
ZhengYuan Huang
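
For readers following the thread, the combined check discussed above might look roughly like the userspace sketch below. It is not the v2 patch: struct el_header and el_header_is_sane() are hypothetical names, the fields are assumed to be already converted to host byte order, and max_recs stands in for the value of ocfs2_extent_recs_per_gd(sb).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-endian copy of the two extent list fields being checked. */
struct el_header {
	uint16_t l_count;          /* capacity the descriptor claims */
	uint16_t l_next_free_rec;  /* number of records in use */
};

/*
 * Returns true only if the claimed capacity equals the physical capacity
 * and the in-use count does not exceed it. max_recs is assumed to come
 * from the block size, not from anything stored on disk.
 */
static bool el_header_is_sane(const struct el_header *el, uint16_t max_recs)
{
	if (el->l_count != max_recs)
		return false;
	if (el->l_next_free_rec > el->l_count)
		return false;
	return true;
}
```

Checking l_count against the computed maximum first means the second comparison can never be satisfied by a corrupted l_count that is itself too large.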