Hi Lance, Greg, Petr, Joel, Andrew,

This series introduces the ability to reset
/proc/sys/kernel/hung_task_detect_count.

Writing a "0" value to this file atomically resets the counter of detected
hung tasks. This functionality provides system administrators with the
means to clear the cumulative diagnostic history following incident
resolution, thereby simplifying subsequent monitoring without necessitating
a system restart.

The updated logic ensures that the long-running scan (which is inherently
preemptible and subject to rcu_lock_break()) does not become desynchronised
from the global state. By treating the initial read as a "version snapshot"
the kernel can guarantee that the cumulative count only updates if the
underlying state remained stable throughout the duration of the
scan.

Please let me know your thoughts.

Changes since v5 [1]:
- Introduced a preparatory patch (Joel Granados)
- Extended custom proc_handler to handle SYSCTL_USER_TO_KERN writes,
  strictly validating that only a value of "0" is permitted for resets
  (Joel Granados)
- Transitioned from atomic_long_inc_return_relaxed() to a more robust
  read_acquire/cmpxchg_release pattern to ensure "All-or-Nothing" scan
  updates (Petr Mladek)
- Re-introduce hung_task_diagnostics(). For better readability and
  consistent metadata publication

Changes since v4 [2]:
- Added missing underflow check (Lance Yang)

Changes since v3 [3]:
- Use atomic operations to ensure cross-CPU visibility and prevent an
  integer underflow
- Use acquire/release semantics for memory ordering (Petr Mladek)
- Move quoted string to a single line (Petr Mladek)
- Remove variables coredump_msg and disable_msg to simplify code
  (Petr Mladek)
- Add trailing "\n" to all strings to ensure immediate console flushing
  (Petr Mladek)
- Improve the hung task counter documentation (Joel Granados)
- Reject non-zero writes with -EINVAL (Joel Granados)
- Translate to the new sysctl API (Petr Mladek)

Changes since v2 [4]:
- Avoided a needless double update to hung_task_detect_count (Lance Yang)
- Restored previous use of pr_err() for each message (Greg KH)
- Provided a complete descriptive comment for the helper

Changes since v1 [5]:
- Removed write-only sysfs attribute (Lance Yang)
- Modified procfs hung_task_detect_count instead (Lance Yang)
- Introduced a custom proc_handler
- Updated documentation (Lance Yang)
- Added 'static inline' as a hint to eliminate any function call overhead
- Removed clutter through encapsulation

[1]: https://lore.kernel.org/lkml/20251231004125.2380105-1-atomlin@atomlin.com/
[2]: https://lore.kernel.org/lkml/20251222014210.2032214-1-atomlin@atomlin.com/
[3]: https://lore.kernel.org/all/20251216030036.1822217-1-atomlin@atomlin.com/
[4]: https://lore.kernel.org/lkml/20251211033004.1628875-1-atomlin@atomlin.com/
[5]: https://lore.kernel.org/lkml/20251209041218.1583600-1-atomlin@atomlin.com/

Aaron Tomlin (2):
  hung_task: Convert detection count to atomic_long_t
  hung_task: Enable runtime reset of hung_task_detect_count

 Documentation/admin-guide/sysctl/kernel.rst |   3 +-
 kernel/hung_task.c                          | 130 +++++++++++++++-----
 2 files changed, 99 insertions(+), 34 deletions(-)

--
2.51.0
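
For readers following the thread, the "version snapshot" idea from the
cover letter can be sketched in user-space C11 as below. This is only an
illustration, not the code from the patches: detect_count, scan_begin(),
scan_commit() and reset_from_user() are hypothetical names, C11 atomics
stand in for the kernel's atomic_long_t helpers, and dropping a round's
count when the compare-exchange fails is an assumption based on the
"All-or-Nothing" wording above.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the cumulative hung task counter. */
static atomic_long detect_count;

/* Take the "version snapshot" before a scan starts (acquire load). */
static long scan_begin(void)
{
	return atomic_load_explicit(&detect_count, memory_order_acquire);
}

/*
 * Publish the number of hung tasks found by one scan. The update only
 * succeeds if the counter still holds the snapshot value, i.e. it was
 * not reset or otherwise changed while the scan was running.
 * Assumption: on failure the round's count is simply dropped.
 */
static bool scan_commit(long snapshot, long found)
{
	long expected = snapshot;

	assert(found >= 0);
	if (!found)
		return true;

	return atomic_compare_exchange_strong_explicit(&detect_count,
						       &expected,
						       snapshot + found,
						       memory_order_release,
						       memory_order_relaxed);
}

/*
 * Stand-in for the write path: only the value 0 is accepted.
 * Returns 0 on success and -EINVAL for anything else.
 */
static int reset_from_user(const char *buf)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf || val != 0)
		return -EINVAL;

	/* Allow one trailing newline, as "echo 0" would send. */
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	atomic_store_explicit(&detect_count, 0, memory_order_release);
	return 0;
}
```

In this sketch a scan calls scan_begin(), walks the tasks, and then calls
scan_commit() with the snapshot and the number of hung tasks it found. If
reset_from_user("0\n") ran in between and changed the counter, the commit
fails and the counter keeps the freshly reset value.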
On 2026/1/15 10:32, Aaron Tomlin wrote:
> Hi Lance, Greg, Petr, Joel, Andrew,
>
> This series introduces the ability to reset
> /proc/sys/kernel/hung_task_detect_count.
>
> Writing a "0" value to this file atomically resets the counter of detected
> hung tasks. This functionality provides system administrators with the
> means to clear the cumulative diagnostic history following incident
> resolution, thereby simplifying subsequent monitoring without necessitating
> a system restart.
>
> The updated logic ensures that the long-running scan (which is inherently
> preemptible and subject to rcu_lock_break()) does not become desynchronised
> from the global state. By treating the initial read as a "version snapshot"
> the kernel can guarantee that the cumulative count only updates if the
> underlying state remained stable throughout the duration of the
> scan.
>
> Please let me know your thoughts.

There is a mismatch here with what Joel and Petr suggested ...

IIUC, we should just do:
- Patch 1: Full cmpxchg-based counting (Petr's POC), sysctl read-only
- Patch 2: Add write handler for userspace reset

That way Patch 1 is the real logic change, and Patch 2 is just adding
the userspace interface.

Thanks,
Lance
On Thu, Jan 15, 2026 at 11:24:13AM +0800, Lance Yang wrote:
> IIUC, we should just do:
> - Patch 1: Full cmpxchg-based counting (Petr's POC), sysctl read-only
> - Patch 2: Add write handler for userspace reset
>
> That way Patch 1 is the real logic change, and Patch 2 is just adding
> the userspace interface.
Hi Lance,
Thank you for your feedback.
If I am not mistaken, Joel suggested the following structure [1]:
1. Create a preparatory patch to change the data type to atomic_long_t
2. Introduce the required functionality to support a reset to "0"
[1]: https://lore.kernel.org/lkml/d4vx6k7d4tlagesq55yrbma26i2nyt4tpijkme6ckioeyfqfec@txrs27niaj2m/
Kind regards,
--
Aaron Tomlin
On 2026/1/16 02:18, Aaron Tomlin wrote:
> On Thu, Jan 15, 2026 at 11:24:13AM +0800, Lance Yang wrote:
>> IIUC, we should just do:
>> - Patch 1: Full cmpxchg-based counting (Petr's POC), sysctl read-only
>> - Patch 2: Add write handler for userspace reset
>>
>> That way Patch 1 is the real logic change, and Patch 2 is just adding
>> the userspace interface.
>
> Hi Lance,
>
> Thank you for your feedback.
> If I am not mistaken, Joel suggested the following structure [1]:
>
> 1. Create a preparatory patch to change the data type to atomic_long_t
> 2. Introduce the required functionality to support a reset to "0"
>
> [1]: https://lore.kernel.org/lkml/d4vx6k7d4tlagesq55yrbma26i2nyt4tpijkme6ckioeyfqfec@txrs27niaj2m/
>

Yeah, either way works :)

But that way (changing to atomic with the old logic first, then
rewriting to the new logic) seems like it creates more churn
and makes review harder.

Or just all in one? I'd hope Petr and Joel can comment.
On Fri 2026-01-16 10:22:34, Lance Yang wrote:
>
>
> On 2026/1/16 02:18, Aaron Tomlin wrote:
> > On Thu, Jan 15, 2026 at 11:24:13AM +0800, Lance Yang wrote:
> > > IIUC, we should just do:
> > > - Patch 1: Full cmpxchg-based counting (Petr's POC), sysctl read-only
> > > - Patch 2: Add write handler for userspace reset
> > >
> > > That way Patch 1 is the real logic change, and Patch 2 is just adding
> > > the userspace interface.
> >
> > Hi Lance,
> >
> > Thank you for your feedback.
> > If I am not mistaken, Joel suggested the following structure [1]:
> >
> > 1. Create a preparatory patch to change the data type to atomic_long_t
> > 2. Introduce the required functionality to support a reset to "0"
> >
> > [1]: https://lore.kernel.org/lkml/d4vx6k7d4tlagesq55yrbma26i2nyt4tpijkme6ckioeyfqfec@txrs27niaj2m/
> >
>
> Yeah, either way works :)
>
> But that way (changing to atomic with the old logic first, then
> rewriting to the new logic) seems like it creates more churn
> and makes review harder.
I agree that adding the atomic and keeping the old logic is not
good. I would prefer to split it into two patches the following
way:
1. Reshuffle the code so that "sysctl_hung_task_detect_count"
gets incremented in check_hung_uninterruptible_tasks()
and hung_task_info() will just get "this_round_count".
Plus convert "sysctl_hung_task_detect_count" to atomic.
It is the change that I suggested at
https://lore.kernel.org/lkml/aWTzhLSWQRIGt8Xu@pathway.suse.cz/
This way, it would be clear why the reshuffling was done.
And the atomic operations will get the right acquire/release
semantics right away.
2. Add support to reset the counter to "0".
It should be quite a simple patch, easy to review.
I think that this is how Joel meant it. We could even have 3 patches:
1. Move "sysctl_hung_task_detect_count" increment to
check_hung_uninterruptible_tasks().
2. Convert the counter to atomic operations.
3. Add reset to "0" support.
But I think that two patches might be good enough.
Best Regards,
Petr
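
The first step of the split described above (the scan function owns the
global counter, the reporting helper only receives a per-round count) can
be sketched in user-space C as follows. This is a rough illustration
under stated assumptions, not Petr's proof of concept: struct fake_task,
check_tasks() and report_hung_task() are hypothetical stand-ins for the
kernel's task walk, check_hung_uninterruptible_tasks() and
hung_task_info(), and a plain atomic add replaces the cmpxchg-based
update discussed in this thread to keep the example short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a task; the kernel walks real tasks. */
struct fake_task {
	const char *comm;
	bool hung;
};

/* Hypothetical stand-in for sysctl_hung_task_detect_count. */
static atomic_long detect_count;

/* Reporting helper: it only sees the per-round count, never the global. */
static void report_hung_task(const struct fake_task *t, long this_round_count)
{
	fprintf(stderr, "task %s is blocked (hung task %ld in this scan)\n",
		t->comm, this_round_count);
}

/*
 * Scan function: counts hung tasks locally, reports each one, and then
 * folds the round total into the global counter in one place.
 * Simplification: a release-ordered add is used here instead of the
 * snapshot plus compare-exchange update proposed in the thread.
 */
static long check_tasks(const struct fake_task *tasks, size_t n)
{
	long this_round_count = 0;
	size_t i;

	assert(tasks || !n);

	for (i = 0; i < n; i++) {
		if (!tasks[i].hung)
			continue;
		this_round_count++;
		report_hung_task(&tasks[i], this_round_count);
	}

	if (this_round_count)
		atomic_fetch_add_explicit(&detect_count, this_round_count,
					  memory_order_release);

	return this_round_count;
}
```

Keeping the global update in a single function is what makes the later
reset patch small in this sketch: only one writer path besides the reset
has to be reasoned about.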
On Tue, Jan 20, 2026 at 10:46:14AM +0100, Petr Mladek wrote:
> I agree that adding the atomic and keeping the old logic is not
> good. I would prefer to split it into two patches the following
> way:
>
> 1. Reshuffle the code so that "sysctl_hung_task_detect_count"
> gets incremented in check_hung_uninterruptible_tasks()
> and hung_task_info() will just get "this_round_count".
>
> Plus convert "sysctl_hung_task_detect_count" to atomic.
>
> It is the change that I suggested at
> https://lore.kernel.org/lkml/aWTzhLSWQRIGt8Xu@pathway.suse.cz/
>
> This way, it would be clear why the reshuffling was done.
> And the atomic operations will get the right acquire/release
> semantics right away.
>
>
> 2. Add support to reset the counter to "0".
>
> It should be quite a simple patch, easy to review.

Acknowledged.

> I think that this is how Joel meant it. We could even have 3 patches:
>
> 1. Move "sysctl_hung_task_detect_count" increment to
> check_hung_uninterruptible_tasks().
>
> 2. Convert the counter to atomic operations.
>
> 3. Add reset to "0" support.
>
> But I think that two patches might be good enough.

Understood. I'll sort it out.

Kind regards,
--
Aaron Tomlin
On 2026/1/20 17:46, Petr Mladek wrote:
> On Fri 2026-01-16 10:22:34, Lance Yang wrote:
>>
>>
>> On 2026/1/16 02:18, Aaron Tomlin wrote:
>>> On Thu, Jan 15, 2026 at 11:24:13AM +0800, Lance Yang wrote:
>>>> IIUC, we should just do:
>>>> - Patch 1: Full cmpxchg-based counting (Petr's POC), sysctl read-only
>>>> - Patch 2: Add write handler for userspace reset
>>>>
>>>> That way Patch 1 is the real logic change, and Patch 2 is just adding
>>>> the userspace interface.
>>>
>>> Hi Lance,
>>>
>>> Thank you for your feedback.
>>> If I am not mistaken, Joel suggested the following structure [1]:
>>>
>>> 1. Create a preparatory patch to change the data type to atomic_long_t
>>> 2. Introduce the required functionality to support a reset to "0"
>>>
>>> [1]: https://lore.kernel.org/lkml/d4vx6k7d4tlagesq55yrbma26i2nyt4tpijkme6ckioeyfqfec@txrs27niaj2m/
>>>
>>
>> Yeah, either way works :)
>>
>> But that way (changing to atomic with the old logic first, then
>> rewriting to the new logic) seems like it creates more churn
>> and makes review harder.
>
> I agree that adding the atomic and keeping the old logic is not
> good. I would prefer to split it into two patches the following
> way:
>
> 1. Reshuffle the code so that "sysctl_hung_task_detect_count"
> gets incremented in check_hung_uninterruptible_tasks()
> and hung_task_info() will just get "this_round_count".
>
> Plus convert "sysctl_hung_task_detect_count" to atomic.
>
> It is the change that I suggested at
> https://lore.kernel.org/lkml/aWTzhLSWQRIGt8Xu@pathway.suse.cz/
>
> This way, it would be clear why the reshuffling was done.
> And the atomic operations will get the right acquire/release
> semantics right away.
>
>
> 2. Add support to reset the counter to "0".
>
> It should be quite a simple patch, easy to review.

+1

Thanks,
Lance

>
>
> I think that this is how Joel meant it. We could even have 3 patches:
>
> 1. Move "sysctl_hung_task_detect_count" increment to
> check_hung_uninterruptible_tasks().
>
> 2. Convert the counter to atomic operations.
>
> 3. Add reset to "0" support.
>
> But I think that two patches might be good enough.
>
> Best Regards,
> Petr