From: Li Nan <linan122@huawei.com>
Commit 868bba54a3bc ("md/raid5: fix a deadlock in the case that reshape is
interrupted") fixed a raid deadlock of reshape, but a similar issue is hit
by mdadm test 25raid456-reshape-deadlock.
INFO: task (udev-worker):63822 blocked for more than 122 seconds.
Not tainted 6.18.0-rc2-g0555b5424915-dirty #153
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
__schedule
schedule
schedule_timeout
wait_woken
raid5_make_request
md_handle_request
md_submit_bio
[...]
blkdev_read_iter
vfs_read
ksys_read
__x64_sys_read
It is triggered by:
1) normal IO waits for reshape to progress
2) user sets ACTION_FROZEN via ioctl
3) reshape is interrupted and cannot restart
4) users try to suspend array while active IO waits reshape
Following Kuai's previous fix, such IOs should fail in
make_stripe_request(). Thus, set a timeout for wait_woken() to fix
the deadlock, and blocked IO will fail in the next cycle.
Signed-off-by: Li Nan <linan122@huawei.com>
---
drivers/md/raid5.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index cdbc7eba5c54..957e712d2be9 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6185,7 +6185,7 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
 			}
 
 			wait_woken(&wait, TASK_UNINTERRUPTIBLE,
-				   MAX_SCHEDULE_TIMEOUT);
+				   msecs_to_jiffies(10000));
 			continue;
 		}
--
2.39.2
Hi,
On 2025/11/24 16:45, linan666@huaweicloud.com wrote:
> From: Li Nan <linan122@huawei.com>
>
> Commit 868bba54a3bc ("md/raid5: fix a deadlock in the case that reshape is
> interrupted") fixed a raid deadlock of reshape, but a similar issue is hit
> by mdadm test 25raid456-reshape-deadlock.
>
> INFO: task (udev-worker):63822 blocked for more than 122 seconds.
> Not tainted 6.18.0-rc2-g0555b5424915-dirty #153
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> __schedule
> schedule
> schedule_timeout
> wait_woken
> raid5_make_request
> md_handle_request
> md_submit_bio
> [...]
> blkdev_read_iter
> vfs_read
> ksys_read
> __x64_sys_read
>
> It is triggered by:
> 1) normal IO waits for reshape to progress
> 2) user sets ACTION_FROZEN via ioctl
> 3) reshape is interrupted and cannot restart
> 4) users try to suspend array while active IO waits reshape
>
> Following Kuai's previous fix, such IOs should fail in
> make_stripe_request(). Thus, set a timeout for wait_woken() to fix
> the deadlock, and blocked IO will fail in the next cycle.
>
> Signed-off-by: Li Nan <linan122@huawei.com>
> ---
> drivers/md/raid5.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index cdbc7eba5c54..957e712d2be9 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -6185,7 +6185,7 @@ static bool raid5_make_request(struct mddev *mddev, struct bio * bi)
> }
>
> wait_woken(&wait, TASK_UNINTERRUPTIBLE,
> - MAX_SCHEDULE_TIMEOUT);
> + msecs_to_jiffies(10000));
Instead of waking up unconditionally every 10s, can you fix this by waking the
waiters up synchronously when the array is frozen or suspended and reshape can't continue?
> continue;
> }
>
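
To make the suggestion concrete, here is a rough user-space sketch of a synchronous wake-up, again using pthreads as an analogy. All names (`struct demo_array`, `demo_freeze()`, `demo_wait()`) are hypothetical and do not correspond to md functions; how this would map onto the raid5 wait queue is left open. The point is that whoever changes the state also wakes the waiters, so no periodic timeout is needed.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the array state; not md code. */
struct demo_array {
	pthread_mutex_t lock;
	pthread_cond_t  cond;
	bool reshape_progressed; /* the event the I/O is waiting for */
	bool frozen;             /* reshape cannot continue once set */
};

/* The state change and the wake-up happen together, under one lock. */
static void demo_freeze(struct demo_array *a)
{
	pthread_mutex_lock(&a->lock);
	a->frozen = true;
	pthread_cond_broadcast(&a->cond); /* wake every blocked waiter now */
	pthread_mutex_unlock(&a->lock);
}

/* Returns 0 once reshape progressed, -1 if the request should fail. */
static int demo_wait(struct demo_array *a)
{
	int ret = 0;

	pthread_mutex_lock(&a->lock);
	while (!a->reshape_progressed) {
		if (a->frozen) {
			ret = -1;
			break;
		}
		/* Unbounded sleep is safe because demo_freeze() broadcasts. */
		pthread_cond_wait(&a->cond, &a->lock);
	}
	pthread_mutex_unlock(&a->lock);
	return ret;
}
```

Compared with a fixed timeout, a waiter here reacts as soon as the state changes and does not wake up at all while nothing has changed.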
--
Thanks,
Kuai