[PATCH v2 2/2] btrfs: revalidate cached tree blocks on the uptodate path

ZhengYuan Huang posted 2 patches 3 weeks, 4 days ago
Posted by ZhengYuan Huang 3 weeks, 4 days ago
read_extent_buffer_pages_nowait() returns immediately when an extent
buffer is already marked EXTENT_BUFFER_UPTODATE. On that cache-hit path,
the caller-supplied btrfs_tree_parent_check is not re-run.

This can let read_tree_root_path() accept a cached tree block whose
actual header level does not match the expected level derived from the
root item. In particular, if root_item.level is corrupted while the
actual root block was already cached and validated earlier with a
different expected level, the later read hits the cached uptodate path,
skips re-validation, and builds an inconsistent btrfs_root.

That inconsistent root can later lead to a null-ptr-deref in
handle_indirect_tree_backref(), because backref walking uses
root->root_item.level while btrfs_search_slot() fills path->nodes[]
according to the cached commit_root's actual level.

Fix this by re-validating cached extent buffers against the supplied
btrfs_tree_parent_check on the EXTENT_BUFFER_UPTODATE path, and make
read_tree_root_path() pass its check to btrfs_buffer_uptodate().

This makes cache hits and fresh reads follow the same tree-parent
verification rules, and turns the corruption into a read failure instead
of constructing an inconsistent root object.

Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
---
 fs/btrfs/disk-io.c   |  6 ++++--
 fs/btrfs/extent_io.c | 12 +++++++++++-
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8773f1f7ea46..9a8c06c0adc2 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1054,8 +1054,10 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
 		root->node = NULL;
 		goto fail;
 	}
-	if (unlikely(!btrfs_buffer_uptodate(root->node, generation, false, NULL))) {
-		ret = -EIO;
+	ret = btrfs_buffer_uptodate(root->node, generation, false, &check);
+	if (unlikely(ret <= 0)) {
+		if (ret == 0)
+			ret = -EIO;
 		goto fail;
 	}
 
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 93eed1d3716c..1324449e892d 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3828,8 +3828,13 @@ int read_extent_buffer_pages_nowait(struct extent_buffer *eb, int mirror_num,
 {
 	struct btrfs_bio *bbio;
 
-	if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))
+	if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) {
+		int ret = btrfs_buffer_uptodate(eb, 0, true, check);
+
+		if (unlikely(ret < 0))
+			return ret;
 		return 0;
+	}
 
 	/*
 	 * We could have had EXTENT_BUFFER_UPTODATE cleared by the write
@@ -3850,7 +3855,12 @@ int read_extent_buffer_pages_nowait(struct extent_buffer *eb, int mirror_num,
 	 * will now be set, and we shouldn't read it in again.
 	 */
 	if (unlikely(test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))) {
+		int ret;
+
 		clear_extent_buffer_reading(eb);
+		ret = btrfs_buffer_uptodate(eb, 0, true, check);
+		if (unlikely(ret < 0))
+			return ret;
 		return 0;
 	}
 
-- 
2.43.0
Re: [PATCH v2 2/2] btrfs: revalidate cached tree blocks on the uptodate path
Posted by Qu Wenruo 3 weeks, 1 day ago

On 2026/3/13 19:49, ZhengYuan Huang wrote:
> read_extent_buffer_pages_nowait() returns immediately when an extent
> buffer is already marked EXTENT_BUFFER_UPTODATE. On that cache-hit path,
> the caller-supplied btrfs_tree_parent_check is not re-run.
> 
> This can let read_tree_root_path() accept a cached tree block whose
> actual header level does not match the expected level derived from the
> root item. In particular, if root_item.level is corrupted while the
> actual root block was already cached and validated earlier with a
> different expected level, the later read hits the cached uptodate path,
> skips re-validation, and builds an inconsistent btrfs_root.
> 
> That inconsistent root can later lead to a null-ptr-deref in
> handle_indirect_tree_backref(), because backref walking uses
> root->root_item.level while btrfs_search_slot() fills path->nodes[]
> according to the cached commit_root's actual level.
> 
> Fix this by re-validating cached extent buffers against the supplied
> btrfs_tree_parent_check on the EXTENT_BUFFER_UPTODATE path, and make
> read_tree_root_path() pass its check to btrfs_buffer_uptodate().
> 
> This makes cache hits and fresh reads follow the same tree-parent
> verification rules, and turns the corruption into a read failure instead
> of constructing an inconsistent root object.
> 
> Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
> ---
>   fs/btrfs/disk-io.c   |  6 ++++--
>   fs/btrfs/extent_io.c | 12 +++++++++++-
>   2 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 8773f1f7ea46..9a8c06c0adc2 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1054,8 +1054,10 @@ static struct btrfs_root *read_tree_root_path(struct btrfs_root *tree_root,
>   		root->node = NULL;
>   		goto fail;
>   	}
> -	if (unlikely(!btrfs_buffer_uptodate(root->node, generation, false, NULL))) {
> -		ret = -EIO;
> +	ret = btrfs_buffer_uptodate(root->node, generation, false, &check);
> +	if (unlikely(ret <= 0)) {
> +		if (ret == 0)
> +			ret = -EIO;
>   		goto fail;
>   	}
>   
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 93eed1d3716c..1324449e892d 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -3828,8 +3828,13 @@ int read_extent_buffer_pages_nowait(struct extent_buffer *eb, int mirror_num,
>   {
>   	struct btrfs_bio *bbio;
>   
> -	if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))
> +	if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) {

This has a conflict with the latest for-next branch.

That open-coded test_bit() check has already been replaced with the
extent_buffer_uptodate() helper by commit "btrfs: use the helper
extent_buffer_uptodate() everywhere", which was introduced over a
month ago.

I have resolved the conflicts this time, but please always base your
patches on the latest for-next branch:

  https://github.com/btrfs/linux/tree/for-next

> +		int ret = btrfs_buffer_uptodate(eb, 0, true, check);
> +
> +		if (unlikely(ret < 0))
> +			return ret;

You didn't check the (ret == 0) case, which indicates a transid mismatch.
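
For illustration, one way to handle all three outcomes is to mirror what
the disk-io.c hunk of this patch already does: propagate negative errors
and turn 0 into -EIO. The sketch below is a userspace stand-in, not the
merged kernel code; normalize_uptodate_result() is a hypothetical helper
name, and the choice of -EIO for the ret == 0 case is an assumption taken
from that hunk.

```c
#include <errno.h>

/*
 * Hypothetical helper (not a btrfs function). It models the tri-state
 * result discussed above:
 *   ret  > 0  buffer is uptodate and passed the checks
 *   ret == 0  buffer is not usable (e.g. transid mismatch)
 *   ret  < 0  negative errno reported by the verification
 * and converts it to the usual "0 on success, negative errno on failure".
 */
static int normalize_uptodate_result(int ret)
{
	if (ret < 0)
		return ret;	/* propagate the verification error as-is */
	if (ret == 0)
		return -EIO;	/* assumption: same mapping as the disk-io.c hunk */
	return 0;		/* uptodate and verified */
}
```

With such a mapping, a caller that only tests (ret < 0) can no longer
mistake the ret == 0 case for success.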

>   		return 0;
> +	}
>   
>   	/*
>   	 * We could have had EXTENT_BUFFER_UPTODATE cleared by the write
> @@ -3850,7 +3855,12 @@ int read_extent_buffer_pages_nowait(struct extent_buffer *eb, int mirror_num,
>   	 * will now be set, and we shouldn't read it in again.
>   	 */
>   	if (unlikely(test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags))) {
> +		int ret;
> +
>   		clear_extent_buffer_reading(eb);
> +		ret = btrfs_buffer_uptodate(eb, 0, true, check);
> +		if (unlikely(ret < 0))
> +			return ret;

Same here. I have fixed both call sites during the merge.

Thanks,
Qu

>   		return 0;
>   	}
>
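
The failure mode described in the commit message can be modeled in
userspace. The sketch below is only an illustration under stated
assumptions: struct model_eb, struct model_check, model_verify(),
model_read_old() and model_read_new() are hypothetical names and not
btrfs APIs, only the level field of the parent check is modeled, and
-EIO is an arbitrary error code chosen for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an extent buffer: cached flag + header level. */
struct model_eb {
	bool uptodate;
	int level;
};

/* Hypothetical stand-in for btrfs_tree_parent_check: expected level only. */
struct model_check {
	int level;
};

/* Compare the block's actual level with the level the caller expects. */
static int model_verify(const struct model_eb *eb,
			const struct model_check *check)
{
	if (check != NULL && eb->level != check->level)
		return -EIO;	/* illustrative error code */
	return 0;
}

/* Old behavior: a cache hit returns success without consulting check. */
static int model_read_old(struct model_eb *eb,
			  const struct model_check *check)
{
	int ret;

	if (eb->uptodate)
		return 0;	/* check is silently skipped here */

	ret = model_verify(eb, check);	/* pretend the I/O completed */
	if (ret == 0)
		eb->uptodate = true;
	return ret;
}

/* New behavior: a cache hit is re-validated against the supplied check. */
static int model_read_new(struct model_eb *eb,
			  const struct model_check *check)
{
	int ret;

	if (eb->uptodate)
		return model_verify(eb, check);

	ret = model_verify(eb, check);	/* pretend the I/O completed */
	if (ret == 0)
		eb->uptodate = true;
	return ret;
}
```

For a block cached at level 1 and a check that expects level 0,
model_read_old() reports success while model_read_new() fails, which is
the difference between building an inconsistent root and returning a read
error.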