[PATCH] md: prevent incoreect update of resync/recovery offset

linan666@huaweicloud.com posted 1 patch 1 month ago
drivers/md/md.c | 3 +++
1 file changed, 3 insertions(+)
[PATCH] md: prevent incoreect update of resync/recovery offset
Posted by linan666@huaweicloud.com 1 month ago
From: Li Nan <linan122@huawei.com>

In md_do_sync(), when md_sync_action returns ACTION_FROZEN, subsequent
call to md_sync_position() will return MaxSector. This causes
'curr_resync' (and later 'recovery_offset') to be set to MaxSector too,
which incorrectly signals that recovery/resync has completed even though
disk data has not actually been updated.

To fix this issue, skip updating any offset values when the sync acion
is either FROZEN or IDLE.

Fixes: 7d9f107a4e94 ("md: use new helpers in md_do_sync()")
Signed-off-by: Li Nan <linan122@huawei.com>
---
 drivers/md/md.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index e78f80d39271..6828a569e819 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9397,6 +9397,9 @@ void md_do_sync(struct md_thread *thread)
 	}
 
 	action = md_sync_action(mddev);
+	if (action == ACTION_FROZEN || action == ACTION_IDLE)
+		goto skip;
+
 	desc = md_sync_action_name(action);
 	mddev->last_sync_action = action;
 
-- 
2.39.2
Re: [PATCH] md: prevent incoreect update of resync/recovery offset
Posted by Paul Menzel 1 month ago
Dear Nan,


Thank you for your patch. I have some formal comments. In the 
summary/title: incor*r*ect.

Am 30.08.25 um 11:02 schrieb linan666@huaweicloud.com:
> From: Li Nan <linan122@huawei.com>
> 
> In md_do_sync(), when md_sync_action returns ACTION_FROZEN, subsequent
> call to md_sync_position() will return MaxSector. This causes
> 'curr_resync' (and later 'recovery_offset') to be set to MaxSector too,
> which incorrectly signals that recovery/resync has completed even though
> disk data has not actually been updated.
> 
> To fix this issue, skip updating any offset values when the sync acion

ac*t*ion

> is either FROZEN or IDLE.

Maybe state that the same holds true for IDLE?

Should these two cases be handled differently in `md_sync_position()`. 
Does it semantically make sense, that the default of MaxSector is returned?

> Fixes: 7d9f107a4e94 ("md: use new helpers in md_do_sync()")
> Signed-off-by: Li Nan <linan122@huawei.com>
> ---
>   drivers/md/md.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index e78f80d39271..6828a569e819 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -9397,6 +9397,9 @@ void md_do_sync(struct md_thread *thread)
>   	}
>   
>   	action = md_sync_action(mddev);
> +	if (action == ACTION_FROZEN || action == ACTION_IDLE)
> +		goto skip;
> +
>   	desc = md_sync_action_name(action);
>   	mddev->last_sync_action = action;
>   


Kind regards,

Paul
Re: [PATCH] md: prevent incoreect update of resync/recovery offset
Posted by Li Nan 1 month ago

在 2025/8/30 17:41, Paul Menzel 写道:
> Dear Nan,
> 
> 
> Thank you for your patch. I have some formal comments. In the 
> summary/title: incor*r*ect.
> 
Thanks for your review, I will fix typo in v2.

> Am 30.08.25 um 11:02 schrieb linan666@huaweicloud.com:
>> From: Li Nan <linan122@huawei.com>
>>
>> In md_do_sync(), when md_sync_action returns ACTION_FROZEN, subsequent
>> call to md_sync_position() will return MaxSector. This causes
>> 'curr_resync' (and later 'recovery_offset') to be set to MaxSector too,
>> which incorrectly signals that recovery/resync has completed even though
>> disk data has not actually been updated.
>>
>> To fix this issue, skip updating any offset values when the sync acion
> 
> ac*t*ion
> 
>> is either FROZEN or IDLE.
> 
> Maybe state that the same holds true for IDLE?
> 
> Should these two cases be handled differently in `md_sync_position()`. Does 
> it semantically make sense, that the default of MaxSector is returned?
> 

md_sync_position() return value decides whether sync_request is needed. The
loop uses 'while (j < max_sectors)', when J is MaxSector there’s nothing
to do. This seems reasonable.

>> Fixes: 7d9f107a4e94 ("md: use new helpers in md_do_sync()")
>> Signed-off-by: Li Nan <linan122@huawei.com>
>> ---
>>   drivers/md/md.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index e78f80d39271..6828a569e819 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -9397,6 +9397,9 @@ void md_do_sync(struct md_thread *thread)
>>       }
>>       action = md_sync_action(mddev);
>> +    if (action == ACTION_FROZEN || action == ACTION_IDLE)
>> +        goto skip;
>> +
>>       desc = md_sync_action_name(action);
>>       mddev->last_sync_action = action;
> 
> 
> Kind regards,
> 
> Paul
> 
> .

-- 
Thanks,
Nan

Re: [PATCH] md: prevent incoreect update of resync/recovery offset
Posted by Yu Kuai 1 month ago
Hi,

在 2025/8/30 17:41, Paul Menzel 写道:
> Dear Nan,
>
>
> Thank you for your patch. I have some formal comments. In the 
> summary/title: incor*r*ect.
>
> Am 30.08.25 um 11:02 schrieb linan666@huaweicloud.com:
>> From: Li Nan <linan122@huawei.com>
>>
>> In md_do_sync(), when md_sync_action returns ACTION_FROZEN, subsequent
>> call to md_sync_position() will return MaxSector. This causes
>> 'curr_resync' (and later 'recovery_offset') to be set to MaxSector too,
>> which incorrectly signals that recovery/resync has completed even though
>> disk data has not actually been updated.
>>
>> To fix this issue, skip updating any offset values when the sync acion
>
> ac*t*ion
>
>> is either FROZEN or IDLE.
>
> Maybe state that the same holds true for IDLE?
>
> Should these two cases be handled differently in `md_sync_position()`. 
> Does it semantically make sense, that the default of MaxSector is 
> returned?
>
Max sectors will return, however, following procedures will update
resync_offset to MaxSector, while recovery can be interuppted by the
concurrent frozen, can this will cause data lost.

>> Fixes: 7d9f107a4e94 ("md: use new helpers in md_do_sync()")
>> Signed-off-by: Li Nan <linan122@huawei.com>
>> ---
>>   drivers/md/md.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index e78f80d39271..6828a569e819 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -9397,6 +9397,9 @@ void md_do_sync(struct md_thread *thread)
>>       }
>>         action = md_sync_action(mddev);
>> +    if (action == ACTION_FROZEN || action == ACTION_IDLE)
>> +        goto skip;
>> +
>>       desc = md_sync_action_name(action);
>>       mddev->last_sync_action = action;
>
Please send a new verison with the typo fixed.

Thanks,
Kuai

>
> Kind regards,
>
> Paul
>
Re: [PATCH] md: prevent incoreect update of resync/recovery offset
Posted by Li Nan 1 month ago

在 2025/8/30 19:04, Yu Kuai 写道:
> Hi,
> 
> 在 2025/8/30 17:41, Paul Menzel 写道:
>> Dear Nan,
>>
>>
>> Thank you for your patch. I have some formal comments. In the 
>> summary/title: incor*r*ect.
>>
>> Am 30.08.25 um 11:02 schrieb linan666@huaweicloud.com:
>>> From: Li Nan <linan122@huawei.com>
>>>
>>> In md_do_sync(), when md_sync_action returns ACTION_FROZEN, subsequent
>>> call to md_sync_position() will return MaxSector. This causes
>>> 'curr_resync' (and later 'recovery_offset') to be set to MaxSector too,
>>> which incorrectly signals that recovery/resync has completed even though
>>> disk data has not actually been updated.
>>>
>>> To fix this issue, skip updating any offset values when the sync acion
>>
>> ac*t*ion
>>
>>> is either FROZEN or IDLE.
>>
>> Maybe state that the same holds true for IDLE?
>>
>> Should these two cases be handled differently in `md_sync_position()`. 
>> Does it semantically make sense, that the default of MaxSector is returned?
>>
> Max sectors will return, however, following procedures will update
> resync_offset to MaxSector, while recovery can be interuppted by the
> concurrent frozen, can this will cause data lost.
> 
>>> Fixes: 7d9f107a4e94 ("md: use new helpers in md_do_sync()")
>>> Signed-off-by: Li Nan <linan122@huawei.com>
>>> ---
>>>   drivers/md/md.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>> index e78f80d39271..6828a569e819 100644
>>> --- a/drivers/md/md.c
>>> +++ b/drivers/md/md.c
>>> @@ -9397,6 +9397,9 @@ void md_do_sync(struct md_thread *thread)
>>>       }
>>>         action = md_sync_action(mddev);
>>> +    if (action == ACTION_FROZEN || action == ACTION_IDLE)
>>> +        goto skip;
>>> +
>>>       desc = md_sync_action_name(action);
>>>       mddev->last_sync_action = action;
>>
> Please send a new verison with the typo fixed.
> 
> Thanks,
> Kuai
> 

Sorry for my carelessness. I will send v2 later.

-- 
Thanks,
Nan

>>
>> Kind regards,
>>
>> Paul
>>
> 
> .