[PATCH 0/2] hung_task: Provide runtime reset interface for hung task detector

Aaron Tomlin posted 2 patches 1 week, 3 days ago
There is a newer version of this series
Introduce a write-only sysfs attribute,
/sys/kernel/hung_task_detect_count_reset, to reset the total count of
detected hung tasks at runtime.

Writing the value "1" to the attribute triggers the reset; any other input
is rejected with -EINVAL.
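
The accept-only-"1" behaviour can be illustrated with a small userspace
model. This is a sketch and not the code from the patch: the names
model_reset_store() and detect_count are hypothetical, and accepting a
trailing newline (as produced by "echo 1 > file") is an assumption about
how the input is parsed.

```c
#include <errno.h>
#include <stdatomic.h>
#include <sys/types.h>

/* Hypothetical stand-in for the detector's running total. */
static atomic_ulong detect_count;

/*
 * Userspace model of the store semantics: accept only "1", optionally
 * followed by a single newline (an assumption), reset the counter and
 * report the number of bytes consumed. Any other input yields -EINVAL
 * and leaves the counter untouched.
 */
static ssize_t model_reset_store(const char *buf, size_t count)
{
	if (count == 0 || count > 2 || buf[0] != '1')
		return -EINVAL;
	if (count == 2 && buf[1] != '\n')
		return -EINVAL;

	atomic_store(&detect_count, 0);
	return (ssize_t)count;
}
```

Rejecting everything except "1" keeps the interface unambiguous: a stray
write of "0" or an empty string cannot clear the count by accident.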

The motivation is better administrative control and a simpler diagnostic
workflow on running production systems. The existing mechanism has a key
limitation: the count cannot be cleared without a full system reboot. The
sysfs interface provides a non-disruptive way to clear that state at
runtime.

After a system administrator investigates a potential hang and corrects
the issue, the count can now be reset to zero. This provides a clean slate
for subsequent monitoring: any recurrence of a hung task is immediately
visible as a counter value greater than zero, which streamlines
post-mortem diagnosis.


Aaron Tomlin (2):
  hung_task: Consolidate hung task warning into an atomic log block
  hung_task: Provide runtime reset interface for hung task detector

 .../sysfs-kernel-hung_task_detect_count_reset |  8 +++
 kernel/hung_task.c                            | 68 ++++++++++++++++---
 2 files changed, 66 insertions(+), 10 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-hung_task_detect_count_reset

-- 
2.51.0