[PATCH] md: do not delete safemode_timer in mddev_suspend

linan666@huaweicloud.com posted 1 patch 1 year, 7 months ago
drivers/md/md.c | 1 -
1 file changed, 1 deletion(-)
[PATCH] md: do not delete safemode_timer in mddev_suspend
Posted by linan666@huaweicloud.com 1 year, 7 months ago
From: Li Nan <linan122@huawei.com>

The deletion of safemode_timer in mddev_suspend() is redundant and
potentially harmful now. If the timer is about to fire but gets
deleted, 'in_sync' will remain 0 until the next write, causing the array
to stay in the 'active' state instead of transitioning to 'clean'.

Commit 0d9f4f135eb6 ("MD: Add del_timer_sync to mddev_suspend (fix
nasty panic)") introduced this deletion for dm, because if the timer
fired after the dm device was destroyed, the resources the timer
depends on might have been freed.

However, commit 0dd84b319352 ("md: call __md_stop_writes in md_stop")
added __md_stop_writes() to md_stop(), which is called before the
resources are freed. The timer is deleted in __md_stop_writes(), so the
original issue is resolved. Therefore, the deletion of safemode_timer
can be removed safely now.

Signed-off-by: Li Nan <linan122@huawei.com>
---
 drivers/md/md.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index aff9118ff697..09c55d9a2c54 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -479,7 +479,6 @@ int mddev_suspend(struct mddev *mddev, bool interruptible)
 	 */
 	WRITE_ONCE(mddev->suspended, mddev->suspended + 1);
 
-	del_timer_sync(&mddev->safemode_timer);
 	/* restrict memory reclaim I/O during raid array is suspend */
 	mddev->noio_flag = memalloc_noio_save();
 
-- 
2.39.2
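
The race described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `struct model_mddev` and the `model_*` functions are hypothetical names, and the timer expiry is collapsed into a direct `in_sync` update (in the kernel the timer callback wakes the md thread, which performs the transition later).

```c
#include <stdbool.h>

/* Hypothetical user-space model; field names only loosely mirror struct mddev. */
struct model_mddev {
	bool in_sync;       /* false while the array reports 'active' */
	bool timer_pending; /* models safemode_timer being armed */
	int suspended;      /* models the mddev->suspended counter */
};

/* A completed write leaves the array active and arms the safemode timer. */
void model_write_done(struct model_mddev *m)
{
	m->in_sync = false;
	m->timer_pending = true;
}

/* Timer expiry: simplified to flip in_sync directly (assumption). */
void model_timer_fire(struct model_mddev *m)
{
	if (!m->timer_pending)
		return; /* timer was deleted, nothing happens */
	m->timer_pending = false;
	m->in_sync = true;
}

/* delete_timer == true models the del_timer_sync() call this patch removes. */
void model_suspend(struct model_mddev *m, bool delete_timer)
{
	m->suspended++;
	if (delete_timer)
		m->timer_pending = false;
}

/* Maps the model onto the two array states discussed in the thread. */
const char *model_state(const struct model_mddev *m)
{
	return m->in_sync ? "clean" : "active";
}
```

Calling `model_write_done()`, then `model_suspend(&m, true)`, then `model_timer_fire()` leaves the model in the 'active' state, because the pending timer was dropped. With `model_suspend(&m, false)` the same sequence ends in 'clean'.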
Re: [PATCH] md: do not delete safemode_timer in mddev_suspend
Posted by Song Liu 1 year, 6 months ago
On Wed, May 8, 2024 at 2:31 AM <linan666@huaweicloud.com> wrote:
>
> From: Li Nan <linan122@huawei.com>
>
> The deletion of safemode_timer in mddev_suspend() is redundant and
> potentially harmful now. If the timer is about to fire but gets
> deleted, 'in_sync' will remain 0 until the next write, causing the array
> to stay in the 'active' state instead of transitioning to 'clean'.
>
> Commit 0d9f4f135eb6 ("MD: Add del_timer_sync to mddev_suspend (fix
> nasty panic)") introduced this deletion for dm, because if the timer
> fired after the dm device was destroyed, the resources the timer
> depends on might have been freed.
>
> However, commit 0dd84b319352 ("md: call __md_stop_writes in md_stop")
> added __md_stop_writes() to md_stop(), which is called before the
> resources are freed. The timer is deleted in __md_stop_writes(), so the
> original issue is resolved. Therefore, the deletion of safemode_timer
> can be removed safely now.
>
> Signed-off-by: Li Nan <linan122@huawei.com>

Applied to md-6.11. Thanks!

Song
Re: [PATCH] md: do not delete safemode_timer in mddev_suspend
Posted by Yu Kuai 1 year, 7 months ago
On 2024/05/08 17:20, linan666@huaweicloud.com wrote:
> From: Li Nan <linan122@huawei.com>
> 
> The deletion of safemode_timer in mddev_suspend() is redundant and
> potentially harmful now. If the timer is about to fire but gets
> deleted, 'in_sync' will remain 0 until the next write, causing the array
> to stay in the 'active' state instead of transitioning to 'clean'.
> 
> Commit 0d9f4f135eb6 ("MD: Add del_timer_sync to mddev_suspend (fix
> nasty panic)") introduced this deletion for dm, because if the timer
> fired after the dm device was destroyed, the resources the timer
> depends on might have been freed.
> 
> However, commit 0dd84b319352 ("md: call __md_stop_writes in md_stop")
> added __md_stop_writes() to md_stop(), which is called before the
> resources are freed. The timer is deleted in __md_stop_writes(), so the
> original issue is resolved. Therefore, the deletion of safemode_timer
> can be removed safely now.
> 
> Signed-off-by: Li Nan <linan122@huawei.com>
> ---
>   drivers/md/md.c | 1 -
>   1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index aff9118ff697..09c55d9a2c54 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -479,7 +479,6 @@ int mddev_suspend(struct mddev *mddev, bool interruptible)
>   	 */
>   	WRITE_ONCE(mddev->suspended, mddev->suspended + 1);
>   
> -	del_timer_sync(&mddev->safemode_timer);

I did not understand why the timer was deleted here before. Based on
the git log you are right: commit 0d9f4f135eb6 added this to fix a panic
for dm-raid, and it is not necessary now.

LGTM, feel free to add:

Reviewed-by: Yu Kuai <yukuai3@huawei.com>

However, since this behaviour has been there since 2012, does anybody
really care that the array status is 'active' instead of 'clean' when
there is no IO after suspend?

Thanks,
Kuai

>   	/* restrict memory reclaim I/O during raid array is suspend */
>   	mddev->noio_flag = memalloc_noio_save();
>   
> 

Re: [PATCH] md: do not delete safemode_timer in mddev_suspend
Posted by Mariusz Tkaczyk 1 year, 7 months ago
On Thu, 9 May 2024 09:34:15 +0800
Yu Kuai <yukuai1@huaweicloud.com> wrote:

> However, since this behaviour has been there since 2012, does anybody
> really care that the array status is 'active' instead of 'clean' when
> there is no IO after suspend?

It may cause a rebuild after reboot (bad, but we can live with it) or a
platform hang (this is bad). If I remember correctly, mdadm waits for the
transition to clean on shutdown.
Probably nobody has tried that, as we all know that Linux doesn't like
suspending and it is rare to reboot a platform just after suspend. Any write
will probably fix it.

This is all based on my knowledge, not tested or proven, but I believe it
sheds some light on where to look for problems if you want to.

Mariusz
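
For anyone who wants to check whether an array is stuck in 'active' after a suspend, a minimal sketch of reading the md `array_state` sysfs attribute follows. It assumes the attribute lives at a path such as `/sys/block/md0/md/array_state` (the device name is only an example), and `read_array_state()` and `array_is_clean()` are hypothetical helper names, not part of mdadm or the kernel.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of an array_state attribute into buf.
 * Returns 0 on success, -1 on any error.
 */
int read_array_state(const char *path, char *buf, size_t len)
{
	FILE *f;

	if (len == 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0'; /* strip the trailing newline */
	return 0;
}

/* Returns 1 if the state is "clean", 0 if it is anything else, -1 on error. */
int array_is_clean(const char *path)
{
	char state[32];

	if (read_array_state(path, state, sizeof(state)) != 0)
		return -1;
	return strcmp(state, "clean") == 0;
}
```

A caller could poll `array_is_clean("/sys/block/md0/md/array_state")` after suspending an idle array to see whether the transition to 'clean' happens without a further write.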