[PATCH v2 0/5] btrfs: fix relocation on RAID stripe-tree filesystems

Johannes Thumshirn posted 5 patches 1 year, 6 months ago
There is a newer version of this series
fs/btrfs/bio.c              |  3 ++-
fs/btrfs/raid-stripe-tree.c |  8 +++-----
fs/btrfs/relocation.c       | 14 ++++++++++----
fs/btrfs/scrub.c            |  2 +-
fs/btrfs/volumes.h          |  2 +-
5 files changed, 17 insertions(+), 12 deletions(-)
[PATCH v2 0/5] btrfs: fix relocation on RAID stripe-tree filesystems
Posted by Johannes Thumshirn 1 year, 6 months ago
When doing relocation on RST backed filesystems, there is a possibility of
a scatter-gather list corruption.

See patch 4 for details.

CI Link: https://github.com/btrfs/linux/actions/runs/10143804038

---
Changes in v2:
- Change RST lookup error message to debug
- Link to v1: https://lore.kernel.org/r/20240729-debug-v1-0-f0b3d78d9438@kernel.org

---
Johannes Thumshirn (5):
      btrfs: don't dump stripe-tree on lookup error
      btrfs: rename btrfs_io_stripe::is_scrub to rst_search_commit_root
      btrfs: set rst_search_commit_root in case of relocation
      btrfs: don't readahead the relocation inode on RST
      btrfs: change RST lookup error message to debug

 fs/btrfs/bio.c              |  3 ++-
 fs/btrfs/raid-stripe-tree.c |  8 +++-----
 fs/btrfs/relocation.c       | 14 ++++++++++----
 fs/btrfs/scrub.c            |  2 +-
 fs/btrfs/volumes.h          |  2 +-
 5 files changed, 17 insertions(+), 12 deletions(-)
---
base-commit: 543cb1b052748dc53ff06b23273fcb78f11b8254
change-id: 20240726-debug-f1fe805ea37b

Best regards,
-- 
Johannes Thumshirn <jth@kernel.org>
Re: [PATCH v2 0/5] btrfs: fix relocation on RAID stripe-tree filesystems
Posted by Qu Wenruo 1 year, 6 months ago

在 2024/7/30 20:03, Johannes Thumshirn 写道:
> When doing relocation on RST backed filesystems, there is a possibility of
> a scatter-gather list corruption.
>
> See patch 4 for details.
>
> CI Link: https://github.com/btrfs/linux/actions/runs/10143804038
>
> ---
> Changes in v2:
> - Change RST lookup error message to debug
> - Link to v1: https://lore.kernel.org/r/20240729-debug-v1-0-f0b3d78d9438@kernel.org
>
> ---
> Johannes Thumshirn (5):
>        btrfs: don't dump stripe-tree on lookup error
>        btrfs: rename btrfs_io_stripe::is_scrub to rst_search_commit_root
>        btrfs: set rst_search_commit_root in case of relocation
>        btrfs: don't readahead the relocation inode on RST
>        btrfs: change RST lookup error message to debug

Reviewed-by: Qu Wenruo <wqu@suse.com>

The solution looks fine to me, but I have one extra question related to
the readahead.

   Does the readahead fail because it's reading some range not covered by
   any extent?

If so, you may want to add an example to patch 4 to explain the problem
better.

Thanks,
Qu

>
>   fs/btrfs/bio.c              |  3 ++-
>   fs/btrfs/raid-stripe-tree.c |  8 +++-----
>   fs/btrfs/relocation.c       | 14 ++++++++++----
>   fs/btrfs/scrub.c            |  2 +-
>   fs/btrfs/volumes.h          |  2 +-
>   5 files changed, 17 insertions(+), 12 deletions(-)
> ---
> base-commit: 543cb1b052748dc53ff06b23273fcb78f11b8254
> change-id: 20240726-debug-f1fe805ea37b
>
> Best regards,
Re: [PATCH v2 0/5] btrfs: fix relocation on RAID stripe-tree filesystems
Posted by Johannes Thumshirn 1 year, 6 months ago
On 30.07.24 23:34, Qu Wenruo wrote:
> 
> 
> 在 2024/7/30 20:03, Johannes Thumshirn 写道:
>> When doing relocation on RST backed filesystems, there is a possibility of
>> a scatter-gather list corruption.
>>
>> See patch 4 for details.
>>
>> CI Link: https://github.com/btrfs/linux/actions/runs/10143804038
>>
>> ---
>> Changes in v2:
>> - Change RST lookup error message to debug
>> - Link to v1: https://lore.kernel.org/r/20240729-debug-v1-0-f0b3d78d9438@kernel.org
>>
>> ---
>> Johannes Thumshirn (5):
>>         btrfs: don't dump stripe-tree on lookup error
>>         btrfs: rename btrfs_io_stripe::is_scrub to rst_search_commit_root
>>         btrfs: set rst_search_commit_root in case of relocation
>>         btrfs: don't readahead the relocation inode on RST
>>         btrfs: change RST lookup error message to debug
> 
> Reviewed-by: Qu Wenruo <wqu@suse.com>
> 
> The solution looks fine to me, but I have one extra question related to
> the readahead.
> 
>     Does the readahead fail because it's reading some range not covered by
>     any extent?

TBH I'm not 100% certain how it happens. The readahead fails because we 
have a RST lookup error. This could be because of preallocated extents 
(Josef's assumption) or something else.

I could not 100% verify that it is only preallocated extents, but my 
debug code could've been incomplete as well.

I would really really love to have a better explanation, but I don't 
have one yet.

I'm sorry to disappoint you here.

Byte,
	Johannes

Re: [PATCH v2 0/5] btrfs: fix relocation on RAID stripe-tree filesystems
Posted by Josef Bacik 1 year, 6 months ago
On Tue, Jul 30, 2024 at 12:33:13PM +0200, Johannes Thumshirn wrote:
> When doing relocation on RST backed filesystems, there is a possibility of
> a scatter-gather list corruption.
> 
> See patch 4 for details.
> 
> CI Link: https://github.com/btrfs/linux/actions/runs/10143804038
> 
> ---
> Changes in v2:
> - Change RST lookup error message to debug
> - Link to v1: https://lore.kernel.org/r/20240729-debug-v1-0-f0b3d78d9438@kernel.org
> 
> ---
> Johannes Thumshirn (5):

You'll have to rebase this because there's some fuzzy application with the folio
stuff merged, you can add

Reviewed-by: Josef Bacik <josef@toxicpanda.com>

Thanks,

Josef