[PATCH] md/raid1: fix len reuse across rdevs in choose_first_rdev()

Abd-Alrhman Masalkhi posted 1 patch 1 month, 3 weeks ago
Posted by Abd-Alrhman Masalkhi 1 month, 3 weeks ago
choose_first_rdev() initializes the variable len before iterating over
all rdevs, but passes it by reference to raid1_check_read_range(), which
may update *len and return 0 depending on the layout of the bad
block region. As a result, 'len' can be modified during the first
iteration and reused for subsequent rdevs, causing later devices to be
evaluated with an incorrect length value.

Fixes: 31a73331752d3 ("md/raid1: factor out read_first_rdev() from read_balance()")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
---
 drivers/md/raid1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index b549be9174bb..5f5dbf79c903 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -591,12 +591,12 @@ static int choose_first_rdev(struct r1conf *conf, struct r1bio *r1_bio,
 			     int *max_sectors)
 {
 	sector_t this_sector = r1_bio->sector;
-	int len = r1_bio->sectors;
 	int disk;
 
 	for (disk = 0 ; disk < conf->raid_disks * 2 ; disk++) {
 		struct md_rdev *rdev;
 		int read_len;
+		int len = r1_bio->sectors;
 
 		if (r1_bio->bios[disk] == IO_BLOCKED)
 			continue;
-- 
2.43.0
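
The behavior described in the commit message can be made concrete with a minimal userspace sketch. This is not the kernel code: toy_rdev, toy_check_read_range() and toy_choose_first() are hypothetical names, and a single bad range per device is a simplifying assumption. The shared_len flag switches between declaring len once before the loop (the existing code) and resetting it for each device (the posted patch).

```c
#include <assert.h>

/* Hypothetical toy model: at most one bad range per device, in sectors. */
struct toy_rdev {
	long bad_start;	/* first bad sector, or -1 if the device has none */
	long bad_len;	/* number of bad sectors */
};

/*
 * Assumed model of the check described in the commit message: returns the
 * number of readable sectors starting at 'start', and may shrink *len
 * while returning 0 when only a prefix of the range is bad.
 */
static int toy_check_read_range(const struct toy_rdev *rdev, long start,
				int *len)
{
	long bad_end = rdev->bad_start + rdev->bad_len;

	assert(*len > 0);

	/* No overlap with the bad range: the whole range is readable. */
	if (rdev->bad_start < 0 || bad_end <= start ||
	    rdev->bad_start >= start + *len)
		return *len;

	/* Bad range starts inside the range: return the good prefix. */
	if (rdev->bad_start > start)
		return (int)(rdev->bad_start - start);

	/* The range is fully covered by bad sectors. */
	if (start + *len <= bad_end)
		return 0;

	/* Only a prefix is bad: shrink *len to that prefix and return 0. */
	*len = (int)(bad_end - start);
	return 0;
}

/*
 * shared_len != 0: len is set once before the loop (existing code).
 * shared_len == 0: len is reset for every device (posted patch).
 * Returns the chosen disk index, or -1 if no device can serve the read.
 */
static int toy_choose_first(const struct toy_rdev *rdevs, int n, long start,
			    int sectors, int shared_len, int *max_sectors)
{
	int len = sectors;

	for (int disk = 0; disk < n; disk++) {
		int read_len;

		if (!shared_len)
			len = sectors;

		read_len = toy_check_read_range(&rdevs[disk], start, &len);
		if (read_len > 0) {
			*max_sectors = read_len;
			return disk;
		}
	}

	return -1;
}
```

In this model, with disk 0 bad on sectors 0-3 and disk 1 clean, an 8-sector read at sector 0 selects disk 1 in both modes, but max_sectors is 4 with the shared len and 8 with the per-device reset.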
Re: [PATCH] md/raid1: fix len reuse across rdevs in choose_first_rdev()
Posted by Yu Kuai 1 month, 2 weeks ago
Hi,

On 2026/4/26 17:35, Abd-Alrhman Masalkhi wrote:
> choose_first_rdev() initializes the variable len before iterating over
> all rdevs, but passes it by reference to raid1_check_read_range(), which
> may update *len and return 0 depending on the layout of the bad
> block region. As a result, 'len' can be modified during the first
> iteration and reused for subsequent rdevs, causing later devices to be
> evaluated with an incorrect length value.
>
> Fixes: 31a73331752d3 ("md/raid1: factor out read_first_rdev() from read_balance()")
> Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
> ---
>   drivers/md/raid1.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index b549be9174bb..5f5dbf79c903 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -591,12 +591,12 @@ static int choose_first_rdev(struct r1conf *conf, struct r1bio *r1_bio,
>   			     int *max_sectors)
>   {
>   	sector_t this_sector = r1_bio->sector;
> -	int len = r1_bio->sectors;
>   	int disk;
>   
>   	for (disk = 0 ; disk < conf->raid_disks * 2 ; disk++) {
>   		struct md_rdev *rdev;
>   		int read_len;
> +		int len = r1_bio->sectors;
>   
>   		if (r1_bio->bios[disk] == IO_BLOCKED)
>   			continue;

This patch is wrong. choose_first_rdev() is used when raid1_should_read_first() is true,
meaning the read overlaps an unsynced or resyncing area. Resetting len for each rdev means
that reading the same area can return different data.

-- 
Thanks,
Kuai
Re: [PATCH] md/raid1: fix len reuse across rdevs in choose_first_rdev()
Posted by Abd-Alrhman Masalkhi 1 month, 2 weeks ago
Hi Kuai,

On Tue, Apr 28, 2026 at 16:23 +0800, Yu Kuai wrote:
> Hi,
>
> On 2026/4/26 17:35, Abd-Alrhman Masalkhi wrote:
>> choose_first_rdev() initializes the variable len before iterating over
>> all rdevs, but passes it by reference to raid1_check_read_range(), which
>> may update *len and return 0 depending on the layout of the bad
>> block region. As a result, 'len' can be modified during the first
>> iteration and reused for subsequent rdevs, causing later devices to be
>> evaluated with an incorrect length value.
>>
>> Fixes: 31a73331752d3 ("md/raid1: factor out read_first_rdev() from read_balance()")
>> Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
>> ---
>>   drivers/md/raid1.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
>> index b549be9174bb..5f5dbf79c903 100644
>> --- a/drivers/md/raid1.c
>> +++ b/drivers/md/raid1.c
>> @@ -591,12 +591,12 @@ static int choose_first_rdev(struct r1conf *conf, struct r1bio *r1_bio,
>>   			     int *max_sectors)
>>   {
>>   	sector_t this_sector = r1_bio->sector;
>> -	int len = r1_bio->sectors;
>>   	int disk;
>>   
>>   	for (disk = 0 ; disk < conf->raid_disks * 2 ; disk++) {
>>   		struct md_rdev *rdev;
>>   		int read_len;
>> +		int len = r1_bio->sectors;
>>   
>>   		if (r1_bio->bios[disk] == IO_BLOCKED)
>>   			continue;
>
> This patch is wrong. choose_first_rdev() is used when raid1_should_read_first() is true,
> meaning the read overlaps an unsynced or resyncing area. Resetting len for each rdev means
> that reading the same area can return different data.
>

Thank you for the detailed explanation. After carefully re-reading the
code and your feedback, I understand why the patch is wrong.

> -- 
> Thanks,
> Kuai

-- 
Best Regards,
Abd-Alrhman