[PATCH v3] hung_task: Explicitly report I/O wait state in log output

Aaron Tomlin posted 1 patch 18 hours ago
Currently, the hung task detector labels every TASK_UNINTERRUPTIBLE (D)
task as "blocked", regardless of whether it is waiting for I/O completion
or for a kernel locking primitive. To tell the two cases apart, a system
administrator or support engineer must inspect the stack trace and work
out whether the delay stems from an I/O wait (typically indicative of
hardware or filesystem problems) or from software contention. That level
of analysis is not always readily available to them.

To address this, use the existing in_iowait field of struct task_struct
to augment the hung task report. If in_iowait is set, i.e. the task is
blocked on I/O (e.g., via io_schedule_prepare()), the log message
explicitly states "blocked in I/O wait".
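
For readers unfamiliar with how in_iowait gets set, the following is a
minimal userspace model, not kernel code. It assumes the usual pattern in
which the prepare step saves the old flag, sets in_iowait, and the finish
step restores the saved value. The names struct model_task,
model_io_schedule_prepare(), model_io_schedule_finish() and
blocked_suffix() are hypothetical and exist only for this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the single task_struct bit used here. */
struct model_task {
	unsigned int in_iowait : 1;
};

/* Model of the prepare step: remember the old flag, mark I/O wait. */
static int model_io_schedule_prepare(struct model_task *cur)
{
	int old_iowait = cur->in_iowait;

	cur->in_iowait = 1;
	return old_iowait;
}

/* Model of the finish step: restore the flag saved by prepare. */
static void model_io_schedule_finish(struct model_task *cur, int token)
{
	cur->in_iowait = token;
}

/* Suffix the report would append after "blocked" for this task. */
static const char *blocked_suffix(const struct model_task *t)
{
	return t->in_iowait ? " in I/O wait" : "";
}
```

In this model only the task itself flips the flag, around its own sleep,
which is why a detector that reads the flag while the task is still
asleep would see a stable value.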

Examples:
        - Standard block: "INFO: task bash:123 blocked for more than 120
          seconds."

        - I/O block: "INFO: task dd:456 blocked in I/O wait for more than
          120 seconds."
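
The two messages above can be reproduced in userspace with the same
format string the patch uses. This is only a sketch: struct fake_task and
format_hung_task_line() are hypothetical names, snprintf() stands in for
pr_err(), and the elapsed time is passed in directly rather than derived
from jiffies.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the task_struct fields the message uses. */
struct fake_task {
	char comm[16];			/* command name */
	int pid;
	unsigned int in_iowait : 1;	/* set while waiting on I/O */
};

/*
 * Build the first line of the report into buf using the format string
 * from the patch. Returns whatever snprintf() returns.
 */
static int format_hung_task_line(char *buf, size_t len,
				 const struct fake_task *t, long secs)
{
	return snprintf(buf, len,
			"INFO: task %s:%d blocked%s for more than %ld seconds.\n",
			t->comm, t->pid,
			t->in_iowait ? " in I/O wait" : "",
			secs);
}
```

The conditional contributes either " in I/O wait" or an empty string, so
the text for tasks that are not in I/O wait is unchanged and existing log
parsers that match "blocked for more than" keep working for that case.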

Accessing in_iowait is safe in this context. The detector holds
rcu_read_lock() within check_hung_uninterruptible_tasks(), ensuring the
task structure remains valid in memory. Furthermore, as the task is
confirmed to be in a persistent TASK_UNINTERRUPTIBLE state, it cannot
modify its own in_iowait flag, rendering the read operation stable and
free from data races.

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
 kernel/hung_task.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 8bc043fbe89c..6fcc94ce4ca9 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -250,8 +250,9 @@ static void hung_task_info(struct task_struct *t, unsigned long timeout,
 	if (sysctl_hung_task_warnings || hung_task_call_panic) {
 		if (sysctl_hung_task_warnings > 0)
 			sysctl_hung_task_warnings--;
-		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
-		       t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
+		pr_err("INFO: task %s:%d blocked%s for more than %ld seconds.\n",
+		       t->comm, t->pid, t->in_iowait ? " in I/O wait" : "",
+		       (jiffies - t->last_switch_time) / HZ);
 		pr_err("      %s %s %.*s\n",
 			print_tainted(), init_utsname()->release,
 			(int)strcspn(init_utsname()->version, " "),
-- 
2.51.0