Don't set MD_BROKEN on failfast bio failure

[PATCH v4 7/9] md/raid10: fix failfast read error not rescheduled

Posted by Kenta Akagi 2 weeks, 3 days ago

raid10_end_read_request lacks a path to retry when a FailFast IO fails.
As a result, when Failfast Read IOs fail on all rdevs, the upper layer
receives EIO, without read rescheduled.

Looking at the two commits below, it seems only raid10_end_read_request
lacks the failfast read retry handling, while raid1_end_read_request has
it. In RAID1, the retry works as expected.
* commit 8d3ca83dcf9c ("md/raid10: add failfast handling for reads.")
* commit 2e52d449bcec ("md/raid1: add failfast handling for reads.")

I don't know why raid10_end_read_request lacks this, but it is probably
just a simple oversight.

This commit will make the failfast read bio for the last rdev in raid10
retry if it fails.

Fixes: 8d3ca83dcf9c ("md/raid10: add failfast handling for reads.")
Signed-off-by: Kenta Akagi <k@mgml.me>
---
 drivers/md/raid10.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 92cf3047dce6..86c0eacd37cb 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -399,6 +399,11 @@ static void raid10_end_read_request(struct bio *bio)
 		 * wait for the 'master' bio.
 		 */
 		set_bit(R10BIO_Uptodate, &r10_bio->state);
+	} else if (test_bit(FailFast, &rdev->flags) &&
+		 test_bit(R10BIO_FailFast, &r10_bio->state)) {
+		/* This was a fail-fast read so we definitely
+		 * want to retry */
+		;
 	} else if (!raid1_should_handle_error(bio)) {
 		uptodate = 1;
 	} else {
-- 
2.50.1

Re: [PATCH v4 7/9] md/raid10: fix failfast read error not rescheduled

Posted by Li Nan 2 weeks ago


在 2025/9/15 11:42, Kenta Akagi 写道:
> raid10_end_read_request lacks a path to retry when a FailFast IO fails.
> As a result, when Failfast Read IOs fail on all rdevs, the upper layer
> receives EIO, without read rescheduled.
> 
> Looking at the two commits below, it seems only raid10_end_read_request
> lacks the failfast read retry handling, while raid1_end_read_request has
> it. In RAID1, the retry works as expected.
> * commit 8d3ca83dcf9c ("md/raid10: add failfast handling for reads.")
> * commit 2e52d449bcec ("md/raid1: add failfast handling for reads.")
> 
> I don't know why raid10_end_read_request lacks this, but it is probably
> just a simple oversight.

Agreed, these two lines can be removed.

Other than that, LGTM.

Reviewed-by: Li Nan <linan122@huawei.com>

> 
> This commit will make the failfast read bio for the last rdev in raid10
> retry if it fails.
> 
> Fixes: 8d3ca83dcf9c ("md/raid10: add failfast handling for reads.")
> Signed-off-by: Kenta Akagi <k@mgml.me>
> ---
>   drivers/md/raid10.c | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 92cf3047dce6..86c0eacd37cb 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -399,6 +399,11 @@ static void raid10_end_read_request(struct bio *bio)
>   		 * wait for the 'master' bio.
>   		 */
>   		set_bit(R10BIO_Uptodate, &r10_bio->state);
> +	} else if (test_bit(FailFast, &rdev->flags) &&
> +		 test_bit(R10BIO_FailFast, &r10_bio->state)) {
> +		/* This was a fail-fast read so we definitely
> +		 * want to retry */
> +		;
>   	} else if (!raid1_should_handle_error(bio)) {
>   		uptodate = 1;
>   	} else {

-- 
Thanks,
Nan

Re: [PATCH v4 7/9] md/raid10: fix failfast read error not rescheduled

Posted by Kenta Akagi 2 weeks ago


On 2025/09/18 16:38, Li Nan wrote:
> 
> 
> 在 2025/9/15 11:42, Kenta Akagi 写道:
>> raid10_end_read_request lacks a path to retry when a FailFast IO fails.
>> As a result, when Failfast Read IOs fail on all rdevs, the upper layer
>> receives EIO, without read rescheduled.
>>
>> Looking at the two commits below, it seems only raid10_end_read_request
>> lacks the failfast read retry handling, while raid1_end_read_request has
>> it. In RAID1, the retry works as expected.
>> * commit 8d3ca83dcf9c ("md/raid10: add failfast handling for reads.")
>> * commit 2e52d449bcec ("md/raid1: add failfast handling for reads.")
>>
>> I don't know why raid10_end_read_request lacks this, but it is probably
>> just a simple oversight.
> 
> Agreed, these two lines can be removed.

I will revise the commit message.

> 
> Other than that, LGTM.
> 
> Reviewed-by: Li Nan <linan122@huawei.com>

Thank you. However, there is a WARNING due to the comment format that needs to be fixed.
I also received a failure email from the RAID CI system.

------------------------------------------------------------------------
patch-v4/v4-0007-md-raid10-fix-failfast-read-error-not-rescheduled.patch
------------------------------------------------------------------------
WARNING: Block comments use a trailing */ on a separate line
#39: FILE: drivers/md/raid10.c:405:
+                * want to retry */

total: 0 errors, 1 warnings, 11 lines checked


I will apply the corrections below and resubmit as v5.
Is it okay to add a Reviewed-by tag in this case?
Sorry to bother you.

+       } else if (test_bit(FailFast, &rdev->flags) &&
+                test_bit(R10BIO_FailFast, &r10_bio->state)) {
+               /* This was a fail-fast read so we definitely
+                * want to retry
+                */
+               ;

Thanks,
Akagi

> 
>>
>> This commit will make the failfast read bio for the last rdev in raid10
>> retry if it fails.
>>
>> Fixes: 8d3ca83dcf9c ("md/raid10: add failfast handling for reads.")
>> Signed-off-by: Kenta Akagi <k@mgml.me>
>> ---
>>   drivers/md/raid10.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index 92cf3047dce6..86c0eacd37cb 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -399,6 +399,11 @@ static void raid10_end_read_request(struct bio *bio)
>>            * wait for the 'master' bio.
>>            */
>>           set_bit(R10BIO_Uptodate, &r10_bio->state);
>> +    } else if (test_bit(FailFast, &rdev->flags) &&
>> +         test_bit(R10BIO_FailFast, &r10_bio->state)) {
>> +        /* This was a fail-fast read so we definitely
>> +         * want to retry */
>> +        ;
>>       } else if (!raid1_should_handle_error(bio)) {
>>           uptodate = 1;
>>       } else {
> 
> -- 
> Thanks,
> Nan
> 
>

Re: [PATCH v4 7/9] md/raid10: fix failfast read error not rescheduled

Posted by Li Nan 1 week, 6 days ago


在 2025/9/19 0:12, Kenta Akagi 写道:
> 
> 
> On 2025/09/18 16:38, Li Nan wrote:
>>
>>
>> 在 2025/9/15 11:42, Kenta Akagi 写道:
>>> raid10_end_read_request lacks a path to retry when a FailFast IO fails.
>>> As a result, when Failfast Read IOs fail on all rdevs, the upper layer
>>> receives EIO, without read rescheduled.
>>>
>>> Looking at the two commits below, it seems only raid10_end_read_request
>>> lacks the failfast read retry handling, while raid1_end_read_request has
>>> it. In RAID1, the retry works as expected.
>>> * commit 8d3ca83dcf9c ("md/raid10: add failfast handling for reads.")
>>> * commit 2e52d449bcec ("md/raid1: add failfast handling for reads.")
>>>
>>> I don't know why raid10_end_read_request lacks this, but it is probably
>>> just a simple oversight.
>>
>> Agreed, these two lines can be removed.
> 
> I will revise the commit message.
> 
>>
>> Other than that, LGTM.
>>
>> Reviewed-by: Li Nan <linan122@huawei.com>
> 
> Thank you. However, there is a WARNING due to the comment format that needs to be fixed.
> I also received a failure email from the RAID CI system.
> 
> ------------------------------------------------------------------------
> patch-v4/v4-0007-md-raid10-fix-failfast-read-error-not-rescheduled.patch
> ------------------------------------------------------------------------
> WARNING: Block comments use a trailing */ on a separate line
> #39: FILE: drivers/md/raid10.c:405:
> +                * want to retry */
> 
> total: 0 errors, 1 warnings, 11 lines checked
> 
> 
> I will apply the corrections below and resubmit as v5.
> Is it okay to add a Reviewed-by tag in this case?
> Sorry to bother you.

Yes, please feel free to add it.

> 
> +       } else if (test_bit(FailFast, &rdev->flags) &&
> +                test_bit(R10BIO_FailFast, &r10_bio->state)) {
> +               /* This was a fail-fast read so we definitely


/*
  * This was ...
  */

This way is better.

> +                * want to retry
> +                */
> +               ;
> 
> Thanks,
> Akagi
> 
>>
>>>
>>> This commit will make the failfast read bio for the last rdev in raid10
>>> retry if it fails.
>>>
>>> Fixes: 8d3ca83dcf9c ("md/raid10: add failfast handling for reads.")
>>> Signed-off-by: Kenta Akagi <k@mgml.me>
>>> ---
>>>    drivers/md/raid10.c | 5 +++++
>>>    1 file changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>>> index 92cf3047dce6..86c0eacd37cb 100644
>>> --- a/drivers/md/raid10.c
>>> +++ b/drivers/md/raid10.c
>>> @@ -399,6 +399,11 @@ static void raid10_end_read_request(struct bio *bio)
>>>             * wait for the 'master' bio.
>>>             */
>>>            set_bit(R10BIO_Uptodate, &r10_bio->state);
>>> +    } else if (test_bit(FailFast, &rdev->flags) &&
>>> +         test_bit(R10BIO_FailFast, &r10_bio->state)) {
>>> +        /* This was a fail-fast read so we definitely
>>> +         * want to retry */
>>> +        ;
>>>        } else if (!raid1_should_handle_error(bio)) {
>>>            uptodate = 1;
>>>        } else {
>>
>> -- 
>> Thanks,
>> Nan
>>
>>
> 
> 
> .

-- 
Thanks,
Nan