[v3] cleanup and bugfix of sync

[PATCH v3 07/13] md: update curr_resync_completed even when MD_RECOVERY_INTR is set

Posted by linan666@huaweicloud.com 1 month, 3 weeks ago

From: Li Nan <linan122@huawei.com>

A failed sync IO may complete and decrement 'recovery_active' while its
error handling work is still pending. That work sets 'recovery_disabled'
and MD_RECOVERY_INTR, then later removes the bad disk without setting
the Faulty flag. If 'curr_resync_completed' is updated before the disk
is removed, reads could be served from sync-failed regions.

With the previous patch, a failed sync IO either sets badblocks or marks
the rdev as Faulty, so sync-failed regions are no longer readable. After
waiting for 'recovery_active' to reach 0 (the wait_event() just above
the check), all sync IO has *completed*, regardless of whether
MD_RECOVERY_INTR is set. Thus, the MD_RECOVERY_INTR check can be removed.

Signed-off-by: Li Nan <linan122@huawei.com>
---
 drivers/md/md.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 39cb8430a7b1..cda434d8fe8c 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9695,8 +9695,8 @@ void md_do_sync(struct md_thread *thread)
 	wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
 
 	if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
-	    !test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
 	    mddev->curr_resync >= MD_RESYNC_ACTIVE) {
+		/* All sync IO completes after recovery_active becomes 0 */
 		mddev->curr_resync_completed = mddev->curr_resync;
 		sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	}
-- 
2.39.2
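
To make the effect of the one-line change easier to see, the condition
before and after the patch can be modelled in plain userspace C. This is
only a sketch under stated assumptions, not kernel code: struct
sync_state, MODEL_RESYNC_ACTIVE and both helper functions are
hypothetical names introduced for illustration, and the flags are
reduced to booleans.

```c
#include <stdbool.h>

/* Hypothetical stand-in for MD_RESYNC_ACTIVE; the value is an assumption. */
#define MODEL_RESYNC_ACTIVE 3ULL

/* Hypothetical, simplified model of the mddev fields the hunk touches. */
struct sync_state {
	bool reshape;	/* models MD_RECOVERY_RESHAPE */
	bool intr;	/* models MD_RECOVERY_INTR */
	unsigned long long curr_resync;
	unsigned long long curr_resync_completed;
};

/* Condition before the patch: an interrupted sync skips the update. */
bool update_completed_old(struct sync_state *s)
{
	if (!s->reshape && !s->intr &&
	    s->curr_resync >= MODEL_RESYNC_ACTIVE) {
		s->curr_resync_completed = s->curr_resync;
		return true;
	}
	return false;
}

/*
 * Condition after the patch: the caller has already waited for all
 * in-flight sync IO, so the update happens even when interrupted.
 */
bool update_completed_new(struct sync_state *s)
{
	if (!s->reshape && s->curr_resync >= MODEL_RESYNC_ACTIVE) {
		s->curr_resync_completed = s->curr_resync;
		return true;
	}
	return false;
}
```

In this model the two helpers differ only when intr is true: the old
condition leaves curr_resync_completed untouched, while the new one
advances it to curr_resync. The reshape case is skipped by both.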

Re: [PATCH v3 07/13] md: update curr_resync_completed even when MD_RECOVERY_INTR is set

Posted by Yu Kuai 1 month, 1 week ago

On 2025/12/15 11:04, linan666@huaweicloud.com wrote:

> From: Li Nan <linan122@huawei.com>
>
> A failed sync IO may complete and decrement 'recovery_active' while its
> error handling work is still pending. That work sets 'recovery_disabled'
> and MD_RECOVERY_INTR, then later removes the bad disk without setting
> the Faulty flag. If 'curr_resync_completed' is updated before the disk
> is removed, reads could be served from sync-failed regions.
>
> With the previous patch, a failed sync IO either sets badblocks or marks
> the rdev as Faulty, so sync-failed regions are no longer readable. After
> waiting for 'recovery_active' to reach 0 (the wait_event() just above
> the check), all sync IO has *completed*, regardless of whether
> MD_RECOVERY_INTR is set. Thus, the MD_RECOVERY_INTR check can be removed.
>
> Signed-off-by: Li Nan <linan122@huawei.com>
> ---
>   drivers/md/md.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Yu Kuai <yukuai@fnnas.com>

-- 
Thanks,
Kuai