[PATCH v2 1/2] hung_task: Show the blocker task if the task is hung on mutex

Masami Hiramatsu (Google) posted 2 patches 11 months, 3 weeks ago
There is a newer version of this series
[PATCH v2 1/2] hung_task: Show the blocker task if the task is hung on mutex
Posted by Masami Hiramatsu (Google) 11 months, 3 weeks ago
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

The "hung_task" shows a long-time uninterruptible slept task, but most
often, it's blocked on a mutex acquired by another task. Without
dumping such a task, investigating the root cause of the hung task
problem is very difficult.

This introduce task_struct::blocker_mutex to point the mutex lock
which this task is waiting for. Since the mutex has "owner"
information, we can find the owner task and dump it with hung tasks.

With this change, the hung task shows blocker task's info like below;

 INFO: task cat:113 blocked for more than 122 seconds.
       Not tainted 6.14.0-rc3-00002-g6afe972e1b9b #152
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:cat             state:D stack:13432 pid:113   tgid:113   ppid:103    task_flags:0x400100 flags:0x00000002
 Call Trace:
  <TASK>
  __schedule+0x731/0x960
  ? schedule_preempt_disabled+0x54/0xa0
  schedule+0xb7/0x140
  ? __mutex_lock+0x51d/0xa50
  ? __mutex_lock+0x51d/0xa50
  schedule_preempt_disabled+0x54/0xa0
  __mutex_lock+0x51d/0xa50
  ? current_time+0x3a/0x120
  read_dummy+0x23/0x70
  full_proxy_read+0x6a/0xc0
  vfs_read+0xc2/0x340
  ? __pfx_direct_file_splice_eof+0x10/0x10
  ? do_sendfile+0x1bd/0x2e0
  ksys_read+0x76/0xe0
  do_syscall_64+0xe3/0x1c0
  ? exc_page_fault+0xa9/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 RIP: 0033:0x4840cd
 RSP: 002b:00007ffe632b76c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
 RDX: 0000000000001000 RSI: 00007ffe632b7710 RDI: 0000000000000003
 RBP: 00007ffe632b7710 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
 R13: 000000003a8b63a0 R14: 0000000000000001 R15: ffffffffffffffff
  </TASK>
 INFO: task cat:113 is blocked on a mutex owned by task cat:112.
 task:cat             state:S stack:13432 pid:112   tgid:112   ppid:103    task_flags:0x400100 flags:0x00000002
 Call Trace:
  <TASK>
  __schedule+0x731/0x960
  ? schedule_timeout+0xa8/0x120
  schedule+0xb7/0x140
  schedule_timeout+0xa8/0x120
  ? __pfx_process_timeout+0x10/0x10
  msleep_interruptible+0x3e/0x60
  read_dummy+0x2d/0x70
  full_proxy_read+0x6a/0xc0
  vfs_read+0xc2/0x340
  ? __pfx_direct_file_splice_eof+0x10/0x10
  ? do_sendfile+0x1bd/0x2e0
  ksys_read+0x76/0xe0
  do_syscall_64+0xe3/0x1c0
  ? exc_page_fault+0xa9/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 RIP: 0033:0x4840cd
 RSP: 002b:00007ffd69513748 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
 RDX: 0000000000001000 RSI: 00007ffd69513790 RDI: 0000000000000003
 RBP: 00007ffd69513790 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
 R13: 0000000029d8d3a0 R14: 0000000000000001 R15: ffffffffffffffff
  </TASK>


Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v2:
  - Introduce CONFIG_DETECT_HUNG_TASK_BLOCKER for this feature.
  - Introduce task_struct::blocker_mutex to point the mutex.
  - Rename debug_mutex_get_owner() to mutex_get_owner().
  - Remove unneeded mutex_waiter::mutex.
---
 include/linux/mutex.h  |    2 ++
 include/linux/sched.h  |    4 ++++
 kernel/hung_task.c     |   31 +++++++++++++++++++++++++++++++
 kernel/locking/mutex.c |   14 ++++++++++++++
 lib/Kconfig.debug      |   10 ++++++++++
 5 files changed, 61 insertions(+)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2bf91b57591b..2143d05116be 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -202,4 +202,6 @@ DEFINE_GUARD(mutex, struct mutex *, mutex_lock(_T), mutex_unlock(_T))
 DEFINE_GUARD_COND(mutex, _try, mutex_trylock(_T))
 DEFINE_GUARD_COND(mutex, _intr, mutex_lock_interruptible(_T) == 0)
 
+extern unsigned long mutex_get_owner(struct mutex *lock);
+
 #endif /* __LINUX_MUTEX_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9632e3318e0d..0cebdd736d44 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1217,6 +1217,10 @@ struct task_struct {
 	struct mutex_waiter		*blocked_on;
 #endif
 
+#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
+	struct mutex			*blocker_mutex;
+#endif
+
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
 	int				non_block_count;
 #endif
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 04efa7a6e69b..aa5bc7fe09c2 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -93,6 +93,36 @@ static struct notifier_block panic_block = {
 	.notifier_call = hung_task_panic,
 };
 
+
+#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
+static void debug_show_blocker(struct task_struct *task)
+{
+	struct task_struct *g, *t;
+	unsigned long owner;
+	struct mutex *lock;
+
+	lock = READ_ONCE(task->blocker_mutex);
+	if (!lock)
+		return;
+
+	owner = mutex_get_owner(lock);
+	if (likely(owner)) {
+		/* Ensure the owner information is correct. */
+		for_each_process_thread(g, t)
+			if ((unsigned long)t == owner) {
+				pr_err("INFO: task %s:%d is blocked on a mutex owned by task %s:%d.\n",
+					task->comm, task->pid, t->comm, t->pid);
+				sched_show_task(t);
+				return;
+			}
+	}
+	pr_err("INFO: task %s:%d is blocked on a mutex, but the owner is not found.\n",
+		task->comm, task->pid);
+}
+#else
+#define debug_show_blocker(t)	do {} while (0)
+#endif
+
 static void check_hung_task(struct task_struct *t, unsigned long timeout)
 {
 	unsigned long switch_count = t->nvcsw + t->nivcsw;
@@ -152,6 +182,7 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
 			" disables this message.\n");
 		sched_show_task(t);
+		debug_show_blocker(t);
 		hung_task_show_lock = true;
 
 		if (sysctl_hung_task_all_cpu_backtrace)
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index b36f23de48f1..6a543c204a14 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -72,6 +72,14 @@ static inline unsigned long __owner_flags(unsigned long owner)
 	return owner & MUTEX_FLAGS;
 }
 
+/* Do not use the return value as a pointer directly. */
+unsigned long mutex_get_owner(struct mutex *lock)
+{
+	unsigned long owner = atomic_long_read(&lock->owner);
+
+	return (unsigned long)__owner_task(owner);
+}
+
 /*
  * Returns: __mutex_owner(lock) on failure or NULL on success.
  */
@@ -180,6 +188,9 @@ static void
 __mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
 		   struct list_head *list)
 {
+#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
+	WRITE_ONCE(current->blocker_mutex, lock);
+#endif
 	debug_mutex_add_waiter(lock, waiter, current);
 
 	list_add_tail(&waiter->list, list);
@@ -195,6 +206,9 @@ __mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter)
 		__mutex_clear_flag(lock, MUTEX_FLAGS);
 
 	debug_mutex_remove_waiter(lock, waiter, current);
+#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
+	WRITE_ONCE(current->blocker_mutex, NULL);
+#endif
 }
 
 /*
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 1af972a92d06..91dfd8c10fbb 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1260,6 +1260,16 @@ config BOOTPARAM_HUNG_TASK_PANIC
 
 	  Say N if unsure.
 
+config DETECT_HUNG_TASK_BLOCKER
+	bool "Dump Hung Tasks Blocker"
+	depends on DETECT_HUNG_TASK
+	default y
+	help
+	  Say Y here to show the blocker task's stacktrace who acquires
+	  the mutex lock which "hung tasks" are waiting.
+	  This will add overhead a bit but shows suspicious tasks and
+	  call trace if it comes from waiting a mutex.
+
 config WQ_WATCHDOG
 	bool "Detect Workqueue Stalls"
 	depends on DEBUG_KERNEL
Re: [PATCH v2 1/2] hung_task: Show the blocker task if the task is hung on mutex
Posted by Peter Zijlstra 11 months, 3 weeks ago
On Fri, Feb 21, 2025 at 11:30:01PM +0900, Masami Hiramatsu (Google) wrote:
> +static void debug_show_blocker(struct task_struct *task)
> +{
> +	struct task_struct *g, *t;
> +	unsigned long owner;
> +	struct mutex *lock;
> +
> +	lock = READ_ONCE(task->blocker_mutex);
> +	if (!lock)
> +		return;
> +
> +	owner = mutex_get_owner(lock);
> +	if (likely(owner)) {
> +		/* Ensure the owner information is correct. */
> +		for_each_process_thread(g, t)
> +			if ((unsigned long)t == owner) {
> +				pr_err("INFO: task %s:%d is blocked on a mutex owned by task %s:%d.\n",
> +					task->comm, task->pid, t->comm, t->pid);
> +				sched_show_task(t);
> +				return;
> +			}

 - that for_each_process_thread() scope needs { }

 - that for_each_process_thread() loop needs RCU or tasklist_lock

 - there is no saying that the owner you read with mutex_get_owner() is
   still the owner by the time you do the compare, there can have been
   owner changes.

> +	}
> +	pr_err("INFO: task %s:%d is blocked on a mutex, but the owner is not found.\n",
> +		task->comm, task->pid);
> +}
Re: [PATCH v2 1/2] hung_task: Show the blocker task if the task is hung on mutex
Posted by Waiman Long 11 months, 3 weeks ago
On 2/21/25 1:59 PM, Peter Zijlstra wrote:
> On Fri, Feb 21, 2025 at 11:30:01PM +0900, Masami Hiramatsu (Google) wrote:
>> +static void debug_show_blocker(struct task_struct *task)
>> +{
>> +	struct task_struct *g, *t;
>> +	unsigned long owner;
>> +	struct mutex *lock;
>> +
>> +	lock = READ_ONCE(task->blocker_mutex);
>> +	if (!lock)
>> +		return;
>> +
>> +	owner = mutex_get_owner(lock);
>> +	if (likely(owner)) {
>> +		/* Ensure the owner information is correct. */
>> +		for_each_process_thread(g, t)
>> +			if ((unsigned long)t == owner) {
>> +				pr_err("INFO: task %s:%d is blocked on a mutex owned by task %s:%d.\n",
>> +					task->comm, task->pid, t->comm, t->pid);
>> +				sched_show_task(t);
>> +				return;
>> +			}
>   - that for_each_process_thread() scope needs { }
>
>   - that for_each_process_thread() loop needs RCU or tasklist_lock

The call chain should be

check_hung_uninterruptible_tasks()
   -> check_hung_task()
     -> debug_show_blocker()

check_hung_uninterruptible_tasks() takes rcu_read_lock() before calling 
check_hung_task(). Perhaps add a statement like

     RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "no rcu lock held");

>
>   - there is no saying that the owner you read with mutex_get_owner() is
>     still the owner by the time you do the compare, there can have been
>     owner changes.

Maybe change "owned by" to "likely owned by" :-)

Cheers,
Longman

Re: [PATCH v2 1/2] hung_task: Show the blocker task if the task is hung on mutex
Posted by Masami Hiramatsu (Google) 11 months, 3 weeks ago
On Fri, 21 Feb 2025 15:29:57 -0500
Waiman Long <llong@redhat.com> wrote:

> 
> On 2/21/25 1:59 PM, Peter Zijlstra wrote:
> > On Fri, Feb 21, 2025 at 11:30:01PM +0900, Masami Hiramatsu (Google) wrote:
> >> +static void debug_show_blocker(struct task_struct *task)
> >> +{
> >> +	struct task_struct *g, *t;
> >> +	unsigned long owner;
> >> +	struct mutex *lock;
> >> +
> >> +	lock = READ_ONCE(task->blocker_mutex);
> >> +	if (!lock)
> >> +		return;
> >> +
> >> +	owner = mutex_get_owner(lock);
> >> +	if (likely(owner)) {
> >> +		/* Ensure the owner information is correct. */
> >> +		for_each_process_thread(g, t)
> >> +			if ((unsigned long)t == owner) {
> >> +				pr_err("INFO: task %s:%d is blocked on a mutex owned by task %s:%d.\n",
> >> +					task->comm, task->pid, t->comm, t->pid);
> >> +				sched_show_task(t);
> >> +				return;
> >> +			}
> >   - that for_each_process_thread() scope needs { }

OK, let me add it.

> >
> >   - that for_each_process_thread() loop needs RCU or tasklist_lock
> 
> The call chain should be
> 
> check_hung_uninterruptible_tasks()
>    -> check_hung_task()
>      -> debug_show_blocker()
> 
> check_hung_uninterruptible_tasks() takes rcu_read_lock() before calling 
> check_hung_task(). Perhaps add a statement like
> 
>      RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "no rcu lock held");

Agreed. It needs to add  assertion.

> 
> >
> >   - there is no saying that the owner you read with mutex_get_owner() is
> >     still the owner by the time you do the compare, there can have been
> >     owner changes.
> 
> Maybe change "owned by" to "likely owned by" :-)

Indeed :-)

Thank you!

> 
> Cheers,
> Longman
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>