[v1] nilfs2: Avoid having an active sc_timer before freeing sci

[PATCH] nilfs2: Avoid having an active sc_timer before freeing sci

Posted by Edward Adam Davis 3 months, 1 week ago

Because kthread_stop did not stop sc_task properly and returned -EINTR,
the sc_timer was not properly closed, ultimately causing the problem [1]
reported by syzbot when freeing sci due to the sc_timer not being closed.

Because the thread sc_task main function nilfs_segctor_thread() returns 0
when it succeeds, when the return value of kthread_stop() is not 0 in
nilfs_segctor_destroy(), we believe that it has not properly closed sc_timer.
We use timer_shutdown_sync() to sync wait for sc_timer to shutdown, and set
the value of sc_task to NULL under the protection of lock sc_state_lock,
so as to avoid the issue caused by sc_timer not being properly shutdowned.

[1]
ODEBUG: free active (active state 0) object: 00000000dacb411a object type: timer_list hint: nilfs_construction_timeout
Call trace:
 nilfs_segctor_destroy fs/nilfs2/segment.c:2811 [inline]
 nilfs_detach_log_writer+0x668/0x8cc fs/nilfs2/segment.c:2877
 nilfs_put_super+0x4c/0x12c fs/nilfs2/super.c:509

Reported-by: syzbot+24d8b70f039151f65590@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=24d8b70f039151f65590
Tested-by: syzbot+24d8b70f039151f65590@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
 fs/nilfs2/segment.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index f15ca6fc400d..deee16bc9d4e 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2768,7 +2768,12 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)
 
 	if (sci->sc_task) {
 		wake_up(&sci->sc_wait_daemon);
-		kthread_stop(sci->sc_task);
+		if (kthread_stop(sci->sc_task)) {
+			spin_lock(&sci->sc_state_lock);
+			sci->sc_task = NULL;
+			timer_shutdown_sync(&sci->sc_timer);
+			spin_unlock(&sci->sc_state_lock);
+		}
 	}
 
 	spin_lock(&sci->sc_state_lock);
-- 
2.43.0

Re: [PATCH] nilfs2: Avoid having an active sc_timer before freeing sci

Posted by Ryusuke Konishi 3 months, 1 week ago

On Wed, Oct 29, 2025 at 2:23 PM Edward Adam Davis wrote:
>
> Because kthread_stop did not stop sc_task properly and returned -EINTR,
> the sc_timer was not properly closed, ultimately causing the problem [1]
> reported by syzbot when freeing sci due to the sc_timer not being closed.
>
> Because the thread sc_task main function nilfs_segctor_thread() returns 0
> when it succeeds, when the return value of kthread_stop() is not 0 in
> nilfs_segctor_destroy(), we believe that it has not properly closed sc_timer.
> We use timer_shutdown_sync() to sync wait for sc_timer to shutdown, and set
> the value of sc_task to NULL under the protection of lock sc_state_lock,
> so as to avoid the issue caused by sc_timer not being properly shutdowned.
>
> [1]
> ODEBUG: free active (active state 0) object: 00000000dacb411a object type: timer_list hint: nilfs_construction_timeout
> Call trace:
>  nilfs_segctor_destroy fs/nilfs2/segment.c:2811 [inline]
>  nilfs_detach_log_writer+0x668/0x8cc fs/nilfs2/segment.c:2877
>  nilfs_put_super+0x4c/0x12c fs/nilfs2/super.c:509
>
> Reported-by: syzbot+24d8b70f039151f65590@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=24d8b70f039151f65590
> Tested-by: syzbot+24d8b70f039151f65590@syzkaller.appspotmail.com
> Signed-off-by: Edward Adam Davis <eadavis@qq.com>
> ---
>  fs/nilfs2/segment.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
> index f15ca6fc400d..deee16bc9d4e 100644
> --- a/fs/nilfs2/segment.c
> +++ b/fs/nilfs2/segment.c
> @@ -2768,7 +2768,12 @@ static void nilfs_segctor_destroy(struct nilfs_sc_info *sci)
>
>         if (sci->sc_task) {
>                 wake_up(&sci->sc_wait_daemon);
> -               kthread_stop(sci->sc_task);
> +               if (kthread_stop(sci->sc_task)) {
> +                       spin_lock(&sci->sc_state_lock);
> +                       sci->sc_task = NULL;
> +                       timer_shutdown_sync(&sci->sc_timer);
> +                       spin_unlock(&sci->sc_state_lock);
> +               }
>         }
>
>         spin_lock(&sci->sc_state_lock);
> --
> 2.43.0

Thanks, Edward!

I spent a little while wondering if kthread_stop() could actually
return a non-zero value (such as -EINTR), but then I realized you'd
actually tested it with syzbot and confirmed that it could happen and
that this was causing the problem.

I'll send this fix upstream.

Thanks,
Ryusuke Konishi