needed for thread_with_file; also rare but not unheard of to need this
in module code, when blocking on user input.
one workaround used by some code is wait_event_interruptible() - but
that can be buggy if the outer context isn't expecting unwinding.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: fuyuanli <fuyuanli@didiglobal.com>
---
kernel/hung_task.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 9a24574988d2..b2fc2727d654 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -43,6 +43,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
  * Zero means infinite timeout - no checking done:
  */
 unsigned long __read_mostly sysctl_hung_task_timeout_secs = CONFIG_DEFAULT_HUNG_TASK_TIMEOUT;
+EXPORT_SYMBOL_GPL(sysctl_hung_task_timeout_secs);
 
 /*
  * Zero (default value) means use sysctl_hung_task_timeout_secs:
--
2.43.0
On Fri, Feb 09, 2024 at 02:09:35AM -0500, Kent Overstreet wrote:
> needed for thread_with_file; also rare but not unheard of to need this
> in module code, when blocking on user input.
>
> one workaround used by some code is wait_event_interruptible() - but
> that can be buggy if the outer context isn't expecting unwinding.

I don't think just exporting the variable and thus allowing write access is a good idea. If we want to keep going down the route of this hack we should add an accessor function that returns the value.

The cleaner solution would be a new task state that explicitly marks code that can sleep forever without triggering the hang check. Although this might be a bit invasive and take a while.
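For readers following the accessor suggestion above, here is a minimal userspace sketch of the idea, not kernel code. The variable is kept file-local and only a read-only getter is visible to callers. The function name `hung_task_timeout_secs()` and the default of 120 are assumptions made for illustration; nothing in this thread settles on a name or an implementation.

```c
/*
 * Userspace model of "export an accessor, not the variable".
 * In the kernel the variable lives in kernel/hung_task.c and is written
 * through the sysctl interface; here it is simply file-local.
 */

/* Assumed stand-in for CONFIG_DEFAULT_HUNG_TASK_TIMEOUT. */
static unsigned long sysctl_hung_task_timeout_secs = 120;

/*
 * Hypothetical accessor (the name is illustrative only). Callers outside
 * this file can read the timeout but have no symbol through which to
 * write it. In a kernel version, this function rather than the variable
 * would be the exported symbol.
 */
unsigned long hung_task_timeout_secs(void)
{
	return sysctl_hung_task_timeout_secs;
}
```

A caller would use `hung_task_timeout_secs()` wherever the patch would otherwise read the variable directly; a return value of zero still means that checking is disabled.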
On Wed, 14 Feb 2024 21:26:34 -0800 Christoph Hellwig <hch@infradead.org> wrote:
> On Fri, Feb 09, 2024 at 02:09:35AM -0500, Kent Overstreet wrote:
> > needed for thread_with_file; also rare but not unheard of to need this
> > in module code, when blocking on user input.
> >
> > one workaround used by some code is wait_event_interruptible() - but
> > that can be buggy if the outer context isn't expecting unwinding.
>
> I don't think just exporting the variable and thus allowing write
> access is a good idea. If we want to keep going down the route of
> this hack we should add an accessor function that returns the value.
>
> The cleaner solution would be a new task state that explicitly
> marks code that can sleep forever without triggering the hang
> check. Although this might be a bit invasive and take a while.

A new PF_whatever flag would solve that simply?

Which are the potential use sites for such a thing?
On Thu, Feb 15, 2024 at 10:55:09AM -0800, Andrew Morton wrote:
> On Wed, 14 Feb 2024 21:26:34 -0800 Christoph Hellwig <hch@infradead.org> wrote:
>
> > On Fri, Feb 09, 2024 at 02:09:35AM -0500, Kent Overstreet wrote:
> > > needed for thread_with_file; also rare but not unheard of to need this
> > > in module code, when blocking on user input.
> > >
> > > one workaround used by some code is wait_event_interruptible() - but
> > > that can be buggy if the outer context isn't expecting unwinding.
> >
> > I don't think just exporting the variable and thus allowing write
> > access is a good idea. If we want to keep going down the route of
> > this hack we should add an accessor function that returns the value.
> >
> > The cleaner solution would be a new task state that explicitly
> > marks code that can sleep forever without triggering the hang
> > check. Although this might be a bit invasive and take a while.

I had the same thought.

> A new PF_whatever flag would solve that simply?

TASK_* flags are separate from PF_* flags, fortunately, and it doesn't look like anything but TASK_* flags go in task_struct->__state, so this shouldn't be a difficult change.

> Which are the potential use sites for such a thing?

There's a few places in the block layer that are using the sysctl value; those will be easy to fix. There's definitely more places abusing TASK_INTERRUPTIBLE, but aside from the ones in my code I can't think of a way to search for them.

But the block layer ones look a little suspect to me: the commit message indicates they were added because discards on some devices can take > 100 seconds - which is true, but this is a more general problem, there's other places we block on IO. Might want to give this some more thought.
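To make the "new task state" idea concrete, here is a small userspace sketch of how an extra bit in the task state could exempt a sleeper from the hang check. The bit values and the name `TASK_NOHANGCHECK` are hypothetical and are not taken from include/linux/sched.h; the only fact relied on is that the hung task detector looks at tasks sleeping in TASK_UNINTERRUPTIBLE.

```c
#include <stdbool.h>

/* Illustrative bit values for this model only. */
#define TASK_INTERRUPTIBLE	0x01u
#define TASK_UNINTERRUPTIBLE	0x02u
#define TASK_NOHANGCHECK	0x04u	/* hypothetical new state bit */

/*
 * Model of the detector's filter: only uninterruptible sleepers are
 * candidates, and the hypothetical bit opts a sleeper out explicitly.
 */
bool hang_check_applies(unsigned int state)
{
	return (state & TASK_UNINTERRUPTIBLE) && !(state & TASK_NOHANGCHECK);
}
```

Under this sketch, code that legitimately sleeps forever would sleep in `TASK_UNINTERRUPTIBLE | TASK_NOHANGCHECK` instead of switching to TASK_INTERRUPTIBLE, so the outer context never has to deal with signal-driven unwinding.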
On Thu, Feb 15, 2024 at 06:26:59PM -0500, Kent Overstreet wrote:
> There's a few places in the block layer that are using the sysctl value;
> those will be easy to fix. There's definitely more places abusing
> TASK_INTERRUPTIBLE, but aside from the ones in my code I can't think of
> a way to search for them.

I think any kthread that is woken using wake_up_process or a wait queue is a good candidate.
On Fri, 9 Feb 2024 02:09:35 -0500 Kent Overstreet <kent.overstreet@linux.dev> wrote:
> needed for thread_with_file; also rare but not unheard of to need this
> in module code, when blocking on user input.

I see no bcachefs code in linux-next which uses this. All I have to go with is the above explanation-free assertion. IOW this patch is unreviewable.

> one workaround used by some code is wait_event_interruptible()

examples?

> - but that can be buggy if the outer context isn't expecting unwinding.

More explanation of this?

> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -43,6 +43,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
>   * Zero means infinite timeout - no checking done:
>   */
>  unsigned long __read_mostly sysctl_hung_task_timeout_secs = CONFIG_DEFAULT_HUNG_TASK_TIMEOUT;
> +EXPORT_SYMBOL_GPL(sysctl_hung_task_timeout_secs);

It seems strange that a module would want this. Makes one wonder what the heck is going on in there.
On Fri, Feb 09, 2024 at 02:13:24PM -0800, Andrew Morton wrote:
> On Fri, 9 Feb 2024 02:09:35 -0500 Kent Overstreet <kent.overstreet@linux.dev> wrote:
>
> > needed for thread_with_file; also rare but not unheard of to need this
> > in module code, when blocking on user input.
>
> I see no bcachefs code in linux-next which uses this. All I have to go
> with is the above explanation-free assertion. IOW this patch is
> unreviewable.
>
> > one workaround used by some code is wait_event_interruptible()
>
> examples?

fs/bcachefs/util.h kthread_wait_event(); we use that - among other things - when the kthread is parked waiting for userspace to flip it on.

TASK_INTERRUPTIBLE was the suggestion I got years ago, but I want to get away from it because -

> > - but that can be buggy if the outer context isn't expecting unwinding.
>
> More explanation of this?

We're starting to think about this a bit more because of David Howells' proposal; the idea is that perhaps TASK_UNINTERRUPTIBLE vs. TASK_INTERRUPTIBLE vs. TASK_KILLABLE should probably not be set at the waiting context, it should be set at the outer context where we would handle (or not handle) -ERESTARTSYS.

think mutex_lock() vs. mutex_lock_killable(); that is bubbling up the context specification in an ad hoc way. This would regularize that.

I've also seen bugs where code was doing a fixed TASK_INTERRUPTIBLE and the outer context wasn't expecting that - kthread creation does this.

> > --- a/kernel/hung_task.c
> > +++ b/kernel/hung_task.c
> > @@ -43,6 +43,7 @@ static int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
> >   * Zero means infinite timeout - no checking done:
> >   */
> >  unsigned long __read_mostly sysctl_hung_task_timeout_secs = CONFIG_DEFAULT_HUNG_TASK_TIMEOUT;
> > +EXPORT_SYMBOL_GPL(sysctl_hung_task_timeout_secs);
>
> It seems strange that a module would want this. Makes one wonder what
> the heck is going on in there.

specifically, this is for thread_with_file, where we've got a kthread hooked up to a file descriptor, effectively using it as both stdin and stdout. When the kthread reads from the fd, that can block for an unbounded amount of time - we're waiting on userspace input and it's totally fine.
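The thread does not show how a caller would consume the timeout value. One plausible pattern, assumed here rather than taken from the patch, is to break an unbounded uninterruptible wait into chunks of half the timeout, so no single sleep is long enough to trip the detector. The following is a userspace model with simulated ticks; `HZ`, `hang_safe_chunk()` and `sleeps_until_input()` are hypothetical names used only for this sketch.

```c
#define HZ 100UL	/* assumed tick rate for this model */

/*
 * Longest single sleep, in ticks, that stays under the hang check.
 * Returns 0 when timeout_secs is 0, i.e. checking is disabled and the
 * sleep does not need to be bounded.
 */
unsigned long hang_safe_chunk(unsigned long timeout_secs)
{
	return timeout_secs * HZ / 2;
}

/*
 * Simulate waiting for input that arrives at tick `arrives_at` and
 * return how many separate sleeps were needed.
 */
unsigned long sleeps_until_input(unsigned long arrives_at,
				 unsigned long timeout_secs)
{
	unsigned long chunk = hang_safe_chunk(timeout_secs);
	unsigned long now = 0, sleeps = 0;

	if (!chunk)
		return 1;	/* checking disabled: one unbounded sleep */

	while (now < arrives_at) {
		now += chunk;	/* wake up, re-check the condition, sleep again */
		sleeps++;
	}
	return sleeps;
}
```

The cost of this workaround is the periodic wakeups and the dependency on the sysctl value, which is what the accessor and task-state alternatives discussed in the replies aim to avoid.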