[PATCH] kthread: fix task state in kthread worker if being frozen

Chen Yu posted 1 patch 1 year, 5 months ago
There is a newer version of this series
[PATCH] kthread: fix task state in kthread worker if being frozen
Posted by Chen Yu 1 year, 5 months ago
It was reported that during a CPU hotplug test, the following
error was triggered:

 do not call blocking ops when !TASK_RUNNING; state=1 set at kthread_worker_fn (kernel/kthread.c:?)
 WARNING: CPU: 1 PID: 674 at kernel/sched/core.c:8469 __might_sleep

 handle_bug
 exc_invalid_op
 asm_exc_invalid_op
 __might_sleep
 __might_sleep
 kthread_worker_fn
 kthread_worker_fn
 kthread
 __cfi_kthread_worker_fn
 ret_from_fork
 __cfi_kthread
 ret_from_fork_asm

Peter pointed out that there is a race condition: when the kworker is being
frozen, it skips schedule() and falls into try_to_freeze() while still in
TASK_INTERRUPTIBLE, which triggers the warning.

Fix this by explicitly setting the task state to TASK_RUNNING before entering
try_to_freeze().

Fixes: b56c0d8937e6 ("kthread: implement kthread_worker")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202408161619.9ed8b83e-lkp@intel.com
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 kernel/kthread.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index f7be976ff88a..06ab3ada9cf1 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -848,6 +848,12 @@ int kthread_worker_fn(void *worker_ptr)
 	} else if (!freezing(current))
 		schedule();
 
+	/*
+	 * Explicitly set the running state in case we are being
+	 * frozen and skip the schedule() above. try_to_freeze()
+	 * expects the current task to be in the running state.
+	 */
+	__set_current_state(TASK_RUNNING);
 	try_to_freeze();
 	cond_resched();
 	goto repeat;
-- 
2.25.1
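
The posting names the race without walking through it. As a rough illustration only, the user-space sketch below models one simplified pass of the worker loop and records whether the "blocking ops when !TASK_RUNNING" condition would be hit. Every identifier prefixed with `model_` or `MODEL_` is a hypothetical stand-in, not a kernel API, and the model assumes that the check is reached through try_to_freeze() (as the backtrace suggests) and that a task returning from schedule() is running again. Locking, the stop check, and the real freezer are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel task states; 1 matches "state=1" in the log. */
enum model_state { MODEL_TASK_RUNNING = 0, MODEL_TASK_INTERRUPTIBLE = 1 };

struct model_task {
	enum model_state state;
	bool freezing;	/* stands in for freezing(current) */
	bool warned;	/* set when a blocking op runs while !TASK_RUNNING */
};

/* Model of schedule(): the task is running again once it returns. */
static void model_schedule(struct model_task *t)
{
	t->state = MODEL_TASK_RUNNING;
}

/* Model of the might-sleep check assumed to be reached via try_to_freeze(). */
static void model_try_to_freeze(struct model_task *t)
{
	if (t->state != MODEL_TASK_RUNNING)
		t->warned = true;
}

/*
 * One simplified pass of the worker loop. Returns true if the
 * "blocking ops when !TASK_RUNNING" condition would be hit.
 */
bool model_pass_warns(bool have_work, bool freezing, bool patched)
{
	struct model_task t = { MODEL_TASK_INTERRUPTIBLE, freezing, false };

	if (have_work)
		t.state = MODEL_TASK_RUNNING;	/* work->func(work) would run here */
	else if (!t.freezing)
		model_schedule(&t);

	if (patched)
		t.state = MODEL_TASK_RUNNING;	/* the line added by this patch */

	model_try_to_freeze(&t);
	return t.warned;
}

/* Self-check: only the idle-and-freezing pass warns, and only when unpatched. */
void model_selfcheck(void)
{
	assert(model_pass_warns(false, true, false));
	assert(!model_pass_warns(false, true, true));
	assert(!model_pass_warns(true, true, false));
	assert(!model_pass_warns(false, false, false));
}
```

In this model the only failing input is "no work queued, task being frozen": neither branch resets the state, so the freeze check sees TASK_INTERRUPTIBLE. Calling model_selfcheck() exercises all four cases.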
Re: [PATCH] kthread: fix task state in kthread worker if being frozen
Posted by Andrew Morton 1 year, 5 months ago
On Mon, 19 Aug 2024 22:15:51 +0800 Chen Yu <yu.c.chen@intel.com> wrote:

> It was reported that during a CPU hotplug test, the following
> error was triggered:
> 
>  do not call blocking ops when !TASK_RUNNING; state=1 set at kthread_worker_fn (kernel/kthread.c:?)
>  WARNING: CPU: 1 PID: 674 at kernel/sched/core.c:8469 __might_sleep
> 
>  handle_bug
>  exc_invalid_op
>  asm_exc_invalid_op
>  __might_sleep
>  __might_sleep
>  kthread_worker_fn
>  kthread_worker_fn
>  kthread
>  __cfi_kthread_worker_fn
>  ret_from_fork
>  __cfi_kthread
>  ret_from_fork_asm
> 
> Peter pointed out that there is a race condition: when the kworker is being
> frozen, it skips schedule() and falls into try_to_freeze() while still in
> TASK_INTERRUPTIBLE, which triggers the warning.

OK.  A full description of this race would be better than simply
asserting that it exists, please.

> Fix this by explicitly setting the task state to TASK_RUNNING before entering
> try_to_freeze().

OK.

> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -848,6 +848,12 @@ int kthread_worker_fn(void *worker_ptr)
>  	} else if (!freezing(current))
>  		schedule();
>  
> +	/*
> +	 * Explicitly set the running state in case we are being
> +	 * frozen and skip the schedule() above. try_to_freeze()
> +	 * expects the current task to be in the running state.
> +	 */
> +	__set_current_state(TASK_RUNNING);
>  	try_to_freeze();
>  	cond_resched();
>  	goto repeat;

Comment is helpful, but why express in a comment that which can be
expressed in code?

--- a/kernel/kthread.c~kthread-fix-task-state-in-kthread-worker-if-being-frozen
+++ a/kernel/kthread.c
@@ -847,6 +847,12 @@ repeat:
 		trace_sched_kthread_work_execute_end(work, func);
-	} else if (!freezing(current))
+	} else if (!freezing(current)) {
 		schedule();
+	} else {
+		/*
+		 * Handle the case where X happens
+		 */
+		__set_current_state(TASK_RUNNING);
+	}
 
 	try_to_freeze();
 	cond_resched();
_
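
To compare the two placements discussed in this message, here is a small user-space sketch, offered as an assumption-laden illustration and not as kernel code. It computes the task state reached just before try_to_freeze() for each combination of "work queued" and "being frozen", once with the unconditional reset from the posted patch and once with the reset moved into a final else branch. The `sketch_` and `SKETCH_` names are hypothetical, and schedule() is assumed to return with the task running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for TASK_RUNNING and TASK_INTERRUPTIBLE. */
enum { SKETCH_RUNNING = 0, SKETCH_INTERRUPTIBLE = 1 };

/* Shape of the posted patch: reset the state after the if/else-if chain. */
int sketch_state_unconditional(bool have_work, bool freezing)
{
	int state = SKETCH_INTERRUPTIBLE;

	if (have_work)
		state = SKETCH_RUNNING;	/* set before running the work item */
	else if (!freezing)
		state = SKETCH_RUNNING;	/* schedule() assumed to return running */

	state = SKETCH_RUNNING;		/* unconditional reset */
	return state;
}

/* Shape of the suggestion: reset only in the otherwise-unhandled branch. */
int sketch_state_else_branch(bool have_work, bool freezing)
{
	int state = SKETCH_INTERRUPTIBLE;

	if (have_work) {
		state = SKETCH_RUNNING;
	} else if (!freezing) {
		state = SKETCH_RUNNING;
	} else {
		state = SKETCH_RUNNING;	/* idle while being frozen */
	}
	return state;
}

/* True if both shapes yield the same state for every input combination. */
bool sketch_variants_agree(void)
{
	for (int w = 0; w <= 1; w++)
		for (int f = 0; f <= 1; f++)
			if (sketch_state_unconditional(w, f) !=
			    sketch_state_else_branch(w, f))
				return false;
	return true;
}

/* Self-check: the two shapes agree and the frozen-idle case ends up running. */
void sketch_selfcheck(void)
{
	assert(sketch_variants_agree());
	assert(sketch_state_else_branch(false, true) == SKETCH_RUNNING);
}
```

Under these assumptions both shapes leave the task running in all four cases. The else-branch form differs only in making the previously unhandled case explicit in the control flow instead of describing it in a comment.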
Re: [PATCH] kthread: fix task state in kthread worker if being frozen
Posted by Chen Yu 1 year, 5 months ago
Hi Andrew,

On 2024-08-19 at 18:58:07 -0700, Andrew Morton wrote:
> On Mon, 19 Aug 2024 22:15:51 +0800 Chen Yu <yu.c.chen@intel.com> wrote:
> 
> > It was reported that during a CPU hotplug test, the following
> > error was triggered:
> > 
> >  do not call blocking ops when !TASK_RUNNING; state=1 set at kthread_worker_fn (kernel/kthread.c:?)
> >  WARNING: CPU: 1 PID: 674 at kernel/sched/core.c:8469 __might_sleep
> > 
> >  handle_bug
> >  exc_invalid_op
> >  asm_exc_invalid_op
> >  __might_sleep
> >  __might_sleep
> >  kthread_worker_fn
> >  kthread_worker_fn
> >  kthread
> >  __cfi_kthread_worker_fn
> >  ret_from_fork
> >  __cfi_kthread
> >  ret_from_fork_asm
> > 
> > Peter pointed out that there is a race condition: when the kworker is being
> > frozen, it skips schedule() and falls into try_to_freeze() while still in
> > TASK_INTERRUPTIBLE, which triggers the warning.
> 
> OK.  A full description of this race would be better than simply
> asserting that it exists, please.
>

OK, will write a description for this.
 
> > Fix this by explicitly setting the task state to TASK_RUNNING before entering
> > try_to_freeze().
> 
> OK.
> 
> > --- a/kernel/kthread.c
> > +++ b/kernel/kthread.c
> > @@ -848,6 +848,12 @@ int kthread_worker_fn(void *worker_ptr)
> >  	} else if (!freezing(current))
> >  		schedule();
> >  
> > +	/*
> > +	 * Explicitly set the running state in case we are being
> > +	 * frozen and skip the schedule() above. try_to_freeze()
> > +	 * expects the current task to be in the running state.
> > +	 */
> > +	__set_current_state(TASK_RUNNING);
> >  	try_to_freeze();
> >  	cond_resched();
> >  	goto repeat;
> 
> Comment is helpful, but why express in a comment that which can be
> expressed in code?
>

OK, will do in next version.

thanks,
Chenyu
 
> --- a/kernel/kthread.c~kthread-fix-task-state-in-kthread-worker-if-being-frozen
> +++ a/kernel/kthread.c
> @@ -847,6 +847,12 @@ repeat:
>  		trace_sched_kthread_work_execute_end(work, func);
> -	} else if (!freezing(current))
> +	} else if (!freezing(current)) {
>  		schedule();
> +	} else {
> +		/*
> +		 * Handle the case where X happens
> +		 */
> +		__set_current_state(TASK_RUNNING);
> +	}
>  
>  	try_to_freeze();
>  	cond_resched();
> _
>