[PATCH] ext4: skip split extent recovery on corruption

hongao posted 1 patch 1 week, 4 days ago
There is a newer version of this series
[PATCH] ext4: skip split extent recovery on corruption
Posted by hongao 1 week, 4 days ago
ext4_split_extent_at() retries after ext4_ext_insert_extent() fails by
refinding the original extent and restoring its length. That recovery is
only safe for transient resource failures such as -ENOSPC, -EDQUOT, and
-ENOMEM.

When ext4_ext_insert_extent() fails because the extent tree is already
corrupted, ext4_find_extent() can return a leaf path without p_ext.
ext4_split_extent_at() then dereferences path[depth].p_ext while trying to
fix up the original extent length, causing a NULL pointer dereference while
handling a pre-existing filesystem corruption.

Do not enter the recovery path for corruption errors, and validate p_ext
after refinding the extent before touching it. This keeps the recovery path
limited to cases it can actually repair and turns the syzbot-triggered crash
into a proper corruption report.

Fixes: 716b9c23b862 ("ext4: refactor split and convert extents")
Reported-by: syzbot+1ffa5d865557e51cb604@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1ffa5d865557e51cb604
Signed-off-by: hongao <hongao@uniontech.com>

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index ae3804f36535..aba9947fd151 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3239,6 +3239,9 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
 
 	insert_err = PTR_ERR(path);
 	err = 0;
+	if (insert_err != -ENOSPC && insert_err != -EDQUOT &&
+	    insert_err != -ENOMEM)
+		goto out_path;
 
 	/*
 	 * Get a new path to try to zeroout or fix the extent length.
@@ -3261,6 +3264,13 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
 
 	depth = ext_depth(inode);
 	ex = path[depth].p_ext;
+	if (!ex) {
+		EXT4_ERROR_INODE(inode,
+				 "bad extent address lblock: %lu, depth: %d pblock %lld",
+				 (unsigned long)ee_block, depth, path[depth].p_block);
+		err = -EFSCORRUPTED;
+		goto out;
+	}
 
 fix_extent_len:
 	ex->ee_len = orig_ex.ee_len;
-- 
2.51.0
Re: [PATCH] ext4: skip split extent recovery on corruption
Posted by hongao 1 week, 3 days ago
Hi Jan, Yi, Ojaswin,

Thank you for reviewing the patch.

I will send a v2 that moves the p_ext validation before
ext4_ext_get_access(), as Yi suggested.

Also, thanks for the Reviewed-by tags:
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>

Regarding Ojaswin's questions:

I do not have a local reproducer at the moment. My analysis was based on
the syzbot report, the crash trace, and code inspection.

For the p_ext == NULL case, a successful ext4_find_extent() only means
that we were able to walk the tree down to a leaf. It does not guarantee
that the leaf still contains a valid extent entry for the target logical
block.

path[depth].p_ext is derived from the extent entries stored in that leaf.
If the leaf metadata is already corrupted, ext4_find_extent() may still
return a non-error path, but p_ext can be NULL because there is no usable
extent entry there anymore.

So in the corruption case, a valid path is not enough to continue the
recovery path safely. Returning -EFSCORRUPTED is safer than
dereferencing p_ext and crashing while trying to repair already-broken
metadata.

Thanks,
Hongao
Re: [PATCH] ext4: skip split extent recovery on corruption
Posted by Ojaswin Mujoo 1 week, 3 days ago
On Tue, Mar 24, 2026 at 09:40:25AM +0800, hongao wrote:
> Hi Jan, Yi, Ojaswin,
> 
> Thank you for reviewing the patch.
> 
> I will send a v2 that moves the p_ext validation before
> ext4_ext_get_access(), as Yi suggested.
> 
> Also, thanks for the Reviewed-by tags:
> Reviewed-by: Jan Kara <jack@suse.cz>
> Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
> 
> Regarding Ojaswin's questions:
> 
> I do not have a local reproducer at the moment. My analysis was based on
> the syzbot report, the crash trace, and code inspection.
> 
> For the p_ext == NULL case, a successful ext4_find_extent() only means
> that we were able to walk the tree down to a leaf. It does not guarantee
> that the leaf still contains a valid extent entry for the target logical
> block.
> 
> path[depth].p_ext is derived from the extent entries stored in that leaf.
> If the leaf metadata is already corrupted, ext4_find_extent() may still
> return a non-error path, but p_ext can be NULL because there is no usable
> extent entry there anymore.
> 
> So in the corruption case, a valid path is not enough to continue the
> recovery path safely. Returning -EFSCORRUPTED is safer than
> dereferencing p_ext and crashing while trying to repair already-broken
> metadata.

Hmm so I looked a bit more and this path can lead to an empty extent:

ext4_ext_insert_extent():
  ...
	// Creates an empty leaf with p_ext == 0
	path = ext4_ext_create_new_leaf(handle, inode, mb_flags, gb_flags,
					path, newext);

  // Errors out
	err = ext4_ext_get_access(handle, inode, path + depth);
	if (err)
		goto errout;

Checking ext4_ext_get_access()

Seems like it can return EROFS when journal is aborted or EIO if journal
detected some IO errors on backing device. I guess we have already
handled these by exiting early in your patch, but sure having an extra
check wont hurt. 

Feel free to add:
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>

Regards,
Ojaswin

> 
> Thanks,
> Hongao
Re: [PATCH] ext4: skip split extent recovery on corruption
Posted by Zhang Yi 1 week, 4 days ago
On 3/23/2026 9:57 AM, hongao wrote:
> ext4_split_extent_at() retries after ext4_ext_insert_extent() fails by
> refinding the original extent and restoring its length. That recovery is
> only safe for transient resource failures such as -ENOSPC, -EDQUOT, and
> -ENOMEM.
> 
> When ext4_ext_insert_extent() fails because the extent tree is already
> corrupted, ext4_find_extent() can return a leaf path without p_ext.
> ext4_split_extent_at() then dereferences path[depth].p_ext while trying to
> fix up the original extent length, causing a NULL pointer dereference while
> handling a pre-existing filesystem corruption.
> 
> Do not enter the recovery path for corruption errors, and validate p_ext
> after refinding the extent before touching it. This keeps the recovery path
> limited to cases it can actually repair and turns the syzbot-triggered crash
> into a proper corruption report.
> 
> Fixes: 716b9c23b862 ("ext4: refactor split and convert extents")
> Reported-by: syzbot+1ffa5d865557e51cb604@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=1ffa5d865557e51cb604
> Signed-off-by: hongao <hongao@uniontech.com>

Thank you for the patch, this is a nice catch. This overall looks good
to me besides one minor suggestion below.

Reviewed-by: Zhang Yi <yi.zhang@huawei.com>

> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index ae3804f36535..aba9947fd151 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3239,6 +3239,9 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
>  
>  	insert_err = PTR_ERR(path);
>  	err = 0;
> +	if (insert_err != -ENOSPC && insert_err != -EDQUOT &&
> +	    insert_err != -ENOMEM)
> +		goto out_path;
>  
>  	/*
>  	 * Get a new path to try to zeroout or fix the extent length.
> @@ -3261,6 +3264,13 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
>  
>  	depth = ext_depth(inode);
>  	ex = path[depth].p_ext;
> +	if (!ex) {
> +		EXT4_ERROR_INODE(inode,
> +				 "bad extent address lblock: %lu, depth: %d pblock %lld",
> +				 (unsigned long)ee_block, depth, path[depth].p_block);
> +		err = -EFSCORRUPTED;
> +		goto out;
> +	}

Should we move the entire code snippet before the previous
ext4_ext_get_access() call? Although this is not closely related to the
current patch.

>  
>  fix_extent_len:
>  	ex->ee_len = orig_ex.ee_len;
Re: [PATCH] ext4: skip split extent recovery on corruption
Posted by Jan Kara 1 week, 4 days ago
On Mon 23-03-26 09:57:52, hongao wrote:
> ext4_split_extent_at() retries after ext4_ext_insert_extent() fails by
> refinding the original extent and restoring its length. That recovery is
> only safe for transient resource failures such as -ENOSPC, -EDQUOT, and
> -ENOMEM.
> 
> When ext4_ext_insert_extent() fails because the extent tree is already
> corrupted, ext4_find_extent() can return a leaf path without p_ext.
> ext4_split_extent_at() then dereferences path[depth].p_ext while trying to
> fix up the original extent length, causing a NULL pointer dereference while
> handling a pre-existing filesystem corruption.
> 
> Do not enter the recovery path for corruption errors, and validate p_ext
> after refinding the extent before touching it. This keeps the recovery path
> limited to cases it can actually repair and turns the syzbot-triggered crash
> into a proper corruption report.
> 
> Fixes: 716b9c23b862 ("ext4: refactor split and convert extents")
> Reported-by: syzbot+1ffa5d865557e51cb604@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=1ffa5d865557e51cb604
> Signed-off-by: hongao <hongao@uniontech.com>

Thanks! The fix looks good to me! Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index ae3804f36535..aba9947fd151 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3239,6 +3239,9 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
>  
>  	insert_err = PTR_ERR(path);
>  	err = 0;
> +	if (insert_err != -ENOSPC && insert_err != -EDQUOT &&
> +	    insert_err != -ENOMEM)
> +		goto out_path;
>  
>  	/*
>  	 * Get a new path to try to zeroout or fix the extent length.
> @@ -3261,6 +3264,13 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
>  
>  	depth = ext_depth(inode);
>  	ex = path[depth].p_ext;
> +	if (!ex) {
> +		EXT4_ERROR_INODE(inode,
> +				 "bad extent address lblock: %lu, depth: %d pblock %lld",
> +				 (unsigned long)ee_block, depth, path[depth].p_block);
> +		err = -EFSCORRUPTED;
> +		goto out;
> +	}
>  
>  fix_extent_len:
>  	ex->ee_len = orig_ex.ee_len;
> -- 
> 2.51.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
Re: [PATCH] ext4: skip split extent recovery on corruption
Posted by Ojaswin Mujoo 1 week, 4 days ago
On Mon, Mar 23, 2026 at 09:57:52AM +0800, hongao wrote:
> ext4_split_extent_at() retries after ext4_ext_insert_extent() fails by
> refinding the original extent and restoring its length. That recovery is
> only safe for transient resource failures such as -ENOSPC, -EDQUOT, and
> -ENOMEM.
> 
> When ext4_ext_insert_extent() fails because the extent tree is already
> corrupted, ext4_find_extent() can return a leaf path without p_ext.
> ext4_split_extent_at() then dereferences path[depth].p_ext while trying to
> fix up the original extent length, causing a NULL pointer dereference while
> handling a pre-existing filesystem corruption.
> 
> Do not enter the recovery path for corruption errors, and validate p_ext
> after refinding the extent before touching it. This keeps the recovery path
> limited to cases it can actually repair and turns the syzbot-triggered crash
> into a proper corruption report.
> 
> Fixes: 716b9c23b862 ("ext4: refactor split and convert extents")
> Reported-by: syzbot+1ffa5d865557e51cb604@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=1ffa5d865557e51cb604
> Signed-off-by: hongao <hongao@uniontech.com>
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index ae3804f36535..aba9947fd151 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3239,6 +3239,9 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
>  
>  	insert_err = PTR_ERR(path);
>  	err = 0;
> +	if (insert_err != -ENOSPC && insert_err != -EDQUOT &&
> +	    insert_err != -ENOMEM)
> +		goto out_path;

Hi Hongao,

Thanks for the fix. This makes sense to me since we anyways don't
zeroout for these cases and just return the error, the callers
should expect no guarantee that extent is in a stable state.

Are you able to replicate this issue btw as I don't see a syzbot
replicator for this.

>  
>  	/*
>  	 * Get a new path to try to zeroout or fix the extent length.
> @@ -3261,6 +3264,13 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
>  
>  	depth = ext_depth(inode);
>  	ex = path[depth].p_ext;
> +	if (!ex) {
> +		EXT4_ERROR_INODE(inode,
> +				 "bad extent address lblock: %lu, depth: %d pblock %lld",
> +				 (unsigned long)ee_block, depth, path[depth].p_block);
> +		err = -EFSCORRUPTED;
> +		goto out;
> +	}

Looking at the syzbot report, it does seem this is possible, but looking
at ext4_ext_insert_extent() it is not immediately obvious. Do you know
what codepath can lead us to path being valid but p_ext being NULL.

Regards,
ojaswin
>  
>  fix_extent_len:
>  	ex->ee_len = orig_ex.ee_len;
> -- 
> 2.51.0
>
[PATCH v2] ext4: skip split extent recovery on corruption
Posted by hongao 1 week, 3 days ago
ext4_split_extent_at() retries after ext4_ext_insert_extent() fails by
refinding the original extent and restoring its length. That recovery is
only safe for transient resource failures such as -ENOSPC, -EDQUOT, and
-ENOMEM.

When ext4_ext_insert_extent() fails because the extent tree is already
corrupted, ext4_find_extent() can return a leaf path without p_ext.
ext4_split_extent_at() then dereferences path[depth].p_ext while trying to
fix up the original extent length, causing a NULL pointer dereference while
handling a pre-existing filesystem corruption.

Do not enter the recovery path for corruption errors, and validate p_ext
after refinding the extent before touching it. This keeps the recovery path
limited to cases it can actually repair and turns the syzbot-triggered crash
into a proper corruption report.

Fixes: 716b9c23b862 ("ext4: refactor split and convert extents")
Reported-by: syzbot+1ffa5d865557e51cb604@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1ffa5d865557e51cb604
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: hongao <hongao@uniontech.com>
---
v2:
- move the p_ext validation before ext4_ext_get_access()
- add Reviewed-by tags from Jan Kara and Zhang Yi

 fs/ext4/extents.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index ae3804f36535..10af8dee0f1a 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3239,6 +3239,9 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
 
 	insert_err = PTR_ERR(path);
 	err = 0;
+	if (insert_err != -ENOSPC && insert_err != -EDQUOT &&
+	    insert_err != -ENOMEM)
+		goto out_path;
 
 	/*
 	 * Get a new path to try to zeroout or fix the extent length.
@@ -3255,13 +3258,20 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
 		goto out_path;
 	}
 
+	depth = ext_depth(inode);
+	ex = path[depth].p_ext;
+	if (!ex) {
+		EXT4_ERROR_INODE(inode,
+				 "bad extent address lblock: %lu, depth: %d pblock %lld",
+				 (unsigned long)ee_block, depth, path[depth].p_block);
+		err = -EFSCORRUPTED;
+		goto out;
+	}
+
 	err = ext4_ext_get_access(handle, inode, path + depth);
 	if (err)
 		goto out;
 
-	depth = ext_depth(inode);
-	ex = path[depth].p_ext;
-
 fix_extent_len:
 	ex->ee_len = orig_ex.ee_len;
 	err = ext4_ext_dirty(handle, inode, path + path->p_depth);
-- 
2.51.0
Re: [PATCH v2] ext4: skip split extent recovery on corruption
Posted by Theodore Ts'o 6 days, 15 hours ago
On Tue, 24 Mar 2026 09:58:15 +0800, hongao wrote:
> ext4_split_extent_at() retries after ext4_ext_insert_extent() fails by
> refinding the original extent and restoring its length. That recovery is
> only safe for transient resource failures such as -ENOSPC, -EDQUOT, and
> -ENOMEM.
> 
> When ext4_ext_insert_extent() fails because the extent tree is already
> corrupted, ext4_find_extent() can return a leaf path without p_ext.
> ext4_split_extent_at() then dereferences path[depth].p_ext while trying to
> fix up the original extent length, causing a NULL pointer dereference while
> handling a pre-existing filesystem corruption.
> 
> [...]

Applied, thanks!

[1/1] ext4: skip split extent recovery on corruption
      commit: 3ceda17325fc2600f66fd85b526592bc8a9dfb9d

Best regards,
-- 
Theodore Ts'o <tytso@mit.edu>