[v1] hung_task: Dump the blocking task stacktrace

[PATCH 0/2] hung_task: Dump the blocking task stacktrace

Posted by Masami Hiramatsu (Google) 11 months, 3 weeks ago

Hi,

The hung_task detector is very useful for detecting lockups.
However, since it only dumps the blocked (uninterruptible sleep)
processes, it is not enough to identify the root cause of the
lockup.

For example, if a process holds a mutex and sleeps in interruptible
state for a long time while waiting for an event, the other processes
will wait on the mutex in uninterruptible state. In this case, the
waiter processes are dumped, but the blocker process is not shown
because it is sleeping in interruptible state.

This series adds a feature to dump the blocker task that holds the
mutex when a hung task is detected, e.g.:

 INFO: task cat:113 blocked for more than 122 seconds.
       Not tainted 6.14.0-rc3-00002-g6afe972e1b9b #152
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:cat             state:D stack:13432 pid:113   tgid:113   ppid:103    task_flags:0x400100 flags:0x00000002
 Call Trace:
  <TASK>
  __schedule+0x731/0x960
  ? schedule_preempt_disabled+0x54/0xa0
  schedule+0xb7/0x140
  ? __mutex_lock+0x51d/0xa50
  ? __mutex_lock+0x51d/0xa50
  schedule_preempt_disabled+0x54/0xa0
  __mutex_lock+0x51d/0xa50
  ? current_time+0x3a/0x120
  read_dummy+0x23/0x70
  full_proxy_read+0x6a/0xc0
  vfs_read+0xc2/0x340
  ? __pfx_direct_file_splice_eof+0x10/0x10
  ? do_sendfile+0x1bd/0x2e0
  ksys_read+0x76/0xe0
  do_syscall_64+0xe3/0x1c0
  ? exc_page_fault+0xa9/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 RIP: 0033:0x4840cd
 RSP: 002b:00007ffe632b76c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
 RDX: 0000000000001000 RSI: 00007ffe632b7710 RDI: 0000000000000003
 RBP: 00007ffe632b7710 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
 R13: 000000003a8b63a0 R14: 0000000000000001 R15: ffffffffffffffff
  </TASK>
 INFO: task cat:113 is blocked on a mutex owned by task cat:112.
 task:cat             state:S stack:13432 pid:112   tgid:112   ppid:103    task_flags:0x400100 flags:0x00000002
 Call Trace:
  <TASK>
  __schedule+0x731/0x960
  ? schedule_timeout+0xa8/0x120
  schedule+0xb7/0x140
  schedule_timeout+0xa8/0x120
  ? __pfx_process_timeout+0x10/0x10
  msleep_interruptible+0x3e/0x60
  read_dummy+0x2d/0x70
  full_proxy_read+0x6a/0xc0
  vfs_read+0xc2/0x340
  ? __pfx_direct_file_splice_eof+0x10/0x10
  ? do_sendfile+0x1bd/0x2e0
  ksys_read+0x76/0xe0
  do_syscall_64+0xe3/0x1c0
  ? exc_page_fault+0xa9/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 RIP: 0033:0x4840cd
 RSP: 002b:00007ffd69513748 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004840cd
 RDX: 0000000000001000 RSI: 00007ffd69513790 RDI: 0000000000000003
 RBP: 00007ffd69513790 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000001000000 R11: 0000000000000246 R12: 0000000000001000
 R13: 0000000029d8d3a0 R14: 0000000000000001 R15: ffffffffffffffff
  </TASK>
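
The bookkeeping behind the second report can be modelled in plain
userspace C. This is only a sketch under assumptions, not code from
the patches: the waiter remembers which mutex it went to sleep on, and
a checker follows waiter -> mutex -> owner. Every name below
(toy_task, toy_mutex, toy_mutex_lock, toy_find_blocker,
toy_report_blocker) is hypothetical, and real kernel locking needs
atomics and lifetime rules that this model ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for kernel objects; all names here are hypothetical. */
struct toy_task;

struct toy_mutex {
	struct toy_task *owner;		/* NULL when unlocked */
};

struct toy_task {
	int pid;
	const char *comm;
	char state;	/* 'R' running, 'S' interruptible, 'D' uninterruptible */
	struct toy_mutex *blocked_on;	/* mutex this task sleeps on, if any */
};

/* Take the lock if it is free; otherwise record the mutex and "sleep". */
int toy_mutex_lock(struct toy_mutex *m, struct toy_task *t)
{
	if (!m->owner) {
		m->owner = t;
		return 1;
	}
	t->blocked_on = m;
	t->state = 'D';
	return 0;
}

/* What a hung-task check could do: follow waiter -> mutex -> owner. */
struct toy_task *toy_find_blocker(const struct toy_task *waiter)
{
	if (waiter->state != 'D' || !waiter->blocked_on)
		return NULL;
	return waiter->blocked_on->owner;
}

/* Print a report line; returns the blocker's pid, or -1 if none is known. */
int toy_report_blocker(const struct toy_task *waiter)
{
	struct toy_task *b = toy_find_blocker(waiter);

	if (!b)
		return -1;
	/* In this model a task never waits on a mutex it already owns. */
	assert(b != waiter);
	printf("INFO: task %s:%d is blocked on a mutex owned by task %s:%d.\n",
	       waiter->comm, waiter->pid, b->comm, b->pid);
	return b->pid;
}
```

With a holder (pid 112) that takes the lock and then sleeps in state
'S', and a waiter (pid 113) that fails to take it, toy_report_blocker()
on the waiter returns 112. The holder's state is never consulted, so
an interruptible sleeper is reported just like any other owner.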

Thank you,

---

Masami Hiramatsu (Google) (2):
      hung_task: Show the blocker task if the task is hung on mutex
      samples: Add hung_task detector mutex blocking sample


 kernel/hung_task.c                  |   38 ++++++++++++++++++++
 kernel/locking/mutex-debug.c        |    1 +
 kernel/locking/mutex.c              |    9 +++++
 kernel/locking/mutex.h              |    6 +++
 samples/Kconfig                     |    9 +++++
 samples/Makefile                    |    1 +
 samples/hung_task/Makefile          |    2 +
 samples/hung_task/hung_task_mutex.c |   66 +++++++++++++++++++++++++++++++++++
 8 files changed, 132 insertions(+)
 create mode 100644 samples/hung_task/Makefile
 create mode 100644 samples/hung_task/hung_task_mutex.c

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>

Re: [PATCH 0/2] hung_task: Dump the blocking task stacktrace

Posted by Lance Yang 11 months, 3 weeks ago

CC linux-mm


Re: [PATCH 0/2] hung_task: Dump the blocking task stacktrace

Posted by Lance Yang 11 months, 3 weeks ago

On Wed, Feb 19, 2025 at 9:33 PM Lance Yang <ioworker0@gmail.com> wrote:
>
> CC linux-mm
>
> On Wed, Feb 19, 2025 at 9:00 PM Masami Hiramatsu (Google)
> <mhiramat@kernel.org> wrote:
> >
> > Hi,
> >
> > The hung_task detector is very useful for detecting lockups.
> > However, since it only dumps the blocked (uninterruptible sleep)
> > processes, it is not enough to identify the root cause of the
> > lockup.
> >
> > For example, if a process holds a mutex and sleeps in interruptible
> > state for a long time while waiting for an event, the other processes
> > will wait on the mutex in uninterruptible state. In this case, the
> > waiter processes are dumped, but the blocker process is not shown
> > because it is sleeping in interruptible state.

Cool! I just ran into something similar today, but with rwsem. In that
case, the blocked process was locked up, and we could not identify
the root cause either ;(

Thanks,
Lance


Re: [PATCH 0/2] hung_task: Dump the blocking task stacktrace

Posted by Waiman Long 11 months, 3 weeks ago

On 2/19/25 10:02 AM, Lance Yang wrote:
> On Wed, Feb 19, 2025 at 9:33 PM Lance Yang <ioworker0@gmail.com> wrote:
>> CC linux-mm
>>
>> On Wed, Feb 19, 2025 at 9:00 PM Masami Hiramatsu (Google)
>> <mhiramat@kernel.org> wrote:
>>> Hi,
>>>
>>> The hung_task detector is very useful for detecting lockups.
>>> However, since it only dumps the blocked (uninterruptible sleep)
>>> processes, it is not enough to identify the root cause of the
>>> lockup.
>>>
>>> For example, if a process holds a mutex and sleeps in interruptible
>>> state for a long time while waiting for an event, the other processes
>>> will wait on the mutex in uninterruptible state. In this case, the
>>> waiter processes are dumped, but the blocker process is not shown
>>> because it is sleeping in interruptible state.
> Cool! I just ran into something similar today, but with rwsem. In that
> case, the blocked process was locked up, and we could not identify
> the root cause either ;(

Once this patch series has settled down, we can extend rwsem to provide
a similar feature.

Cheers,
Longman

Re: [PATCH 0/2] hung_task: Dump the blocking task stacktrace

Posted by Masami Hiramatsu (Google) 11 months, 3 weeks ago

On Wed, 19 Feb 2025 15:20:39 -0500
Waiman Long <llong@redhat.com> wrote:

> 
> On 2/19/25 10:02 AM, Lance Yang wrote:
> > On Wed, Feb 19, 2025 at 9:33 PM Lance Yang <ioworker0@gmail.com> wrote:
> >> CC linux-mm
> >>
> >> On Wed, Feb 19, 2025 at 9:00 PM Masami Hiramatsu (Google)
> >> <mhiramat@kernel.org> wrote:
> >>> Hi,
> >>>
> >>> The hung_task detector is very useful for detecting lockups.
> >>> However, since it only dumps the blocked (uninterruptible sleep)
> >>> processes, it is not enough to identify the root cause of the
> >>> lockup.
> >>>
> >>> For example, if a process holds a mutex and sleeps in interruptible
> >>> state for a long time while waiting for an event, the other processes
> >>> will wait on the mutex in uninterruptible state. In this case, the
> >>> waiter processes are dumped, but the blocker process is not shown
> >>> because it is sleeping in interruptible state.
> > Cool! I just ran into something similar today, but with rwsem. In that
> > case, the blocked process was locked up, and we could not identify
> > the root cause either ;(
>
> Once this patch series has settled down, we can extend rwsem to provide
> a similar feature.

When I discussed rwsem with Sergey, he pointed out that we cannot
identify a single blocker on an rwsem, because several readers can
block several writers. In that case we would need to dump all of
them, but we don't have that information.

So anyway, I would like to start with mutex, which is the simplest one.
We can discuss the other locks later (or start with limited support,
like showing only rwsem::owner).

Thanks,

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

Re: [PATCH 0/2] hung_task: Dump the blocking task stacktrace

Posted by Waiman Long 11 months, 3 weeks ago

On 2/20/25 9:18 AM, Masami Hiramatsu (Google) wrote:
> On Wed, 19 Feb 2025 15:20:39 -0500
> Waiman Long <llong@redhat.com> wrote:
>
>> On 2/19/25 10:02 AM, Lance Yang wrote:
>>> On Wed, Feb 19, 2025 at 9:33 PM Lance Yang <ioworker0@gmail.com> wrote:
>>>> CC linux-mm
>>>>
>>>> On Wed, Feb 19, 2025 at 9:00 PM Masami Hiramatsu (Google)
>>>> <mhiramat@kernel.org> wrote:
>>>>> Hi,
>>>>>
>>>>> The hung_task detector is very useful for detecting lockups.
>>>>> However, since it only dumps the blocked (uninterruptible sleep)
>>>>> processes, it is not enough to identify the root cause of the
>>>>> lockup.
>>>>>
>>>>> For example, if a process holds a mutex and sleeps in interruptible
>>>>> state for a long time while waiting for an event, the other processes
>>>>> will wait on the mutex in uninterruptible state. In this case, the
>>>>> waiter processes are dumped, but the blocker process is not shown
>>>>> because it is sleeping in interruptible state.
>>> Cool! I just ran into something similar today, but with rwsem. In that
>>> case, the blocked process was locked up, and we could not identify
>>> the root cause either ;(
>> Once this patch series has settled down, we can extend rwsem to provide
>> a similar feature.
> When I discussed rwsem with Sergey, he pointed out that we cannot
> identify a single blocker on an rwsem, because several readers can
> block several writers. In that case we would need to dump all of
> them, but we don't have that information.
>
> So anyway, I would like to start with mutex, which is the simplest one.
> We can discuss the other locks later (or start with limited support,
> like showing only rwsem::owner).

Yes, reader tracking is a problem: the rw_semaphore structure doesn't
store information about its reader-owners, since their number can vary.
That is a limitation that we have to live with.
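
To make the limitation concrete, here is a toy userspace model. It is
a sketch under assumptions and deliberately not the kernel's
rw_semaphore layout: the lock keeps one writer pid but only a count of
readers, so a checker can name the blocker only when a writer holds
the lock. All names (toy_rwsem, toy_down_read, toy_down_write,
toy_rwsem_blocker) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a reader-writer semaphore; names are hypothetical. */
struct toy_rwsem {
	int writer_pid;	/* pid of the writer holding the lock, or 0 */
	int nr_readers;	/* number of readers holding it; pids are not kept */
};

enum toy_blocker_kind {
	TOY_UNLOCKED,
	TOY_WRITER_KNOWN,
	TOY_READERS_UNKNOWN,
};

struct toy_blocker {
	enum toy_blocker_kind kind;
	int pid;	/* meaningful only for TOY_WRITER_KNOWN */
};

/* A reader gets in unless a writer holds the lock; only the count grows. */
int toy_down_read(struct toy_rwsem *sem)
{
	if (sem->writer_pid)
		return 0;
	sem->nr_readers++;
	return 1;
}

/* A writer needs the lock completely free; its pid is recorded. */
int toy_down_write(struct toy_rwsem *sem, int pid)
{
	if (sem->writer_pid || sem->nr_readers)
		return 0;
	sem->writer_pid = pid;
	return 1;
}

/* Report who blocks a waiter on this lock, as far as the lock can tell. */
struct toy_blocker toy_rwsem_blocker(const struct toy_rwsem *sem)
{
	struct toy_blocker b = { TOY_UNLOCKED, 0 };

	/* The lock is never writer-held and reader-held at the same time. */
	assert(!(sem->writer_pid && sem->nr_readers));

	if (sem->writer_pid) {
		b.kind = TOY_WRITER_KNOWN;
		b.pid = sem->writer_pid;
	} else if (sem->nr_readers > 0) {
		b.kind = TOY_READERS_UNKNOWN;
	}
	return b;
}
```

After two toy_down_read() calls, a failed toy_down_write() leaves the
checker with TOY_READERS_UNKNOWN: it knows that readers hold the lock
but has no task to dump. Only the writer-held case yields a pid.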

Cheers,
Longman

Re: [PATCH 0/2] hung_task: Dump the blocking task stacktrace

Posted by Lance Yang 11 months, 3 weeks ago

On Thu, Feb 20, 2025 at 4:20 AM Waiman Long <llong@redhat.com> wrote:
>
>
> On 2/19/25 10:02 AM, Lance Yang wrote:
> > On Wed, Feb 19, 2025 at 9:33 PM Lance Yang <ioworker0@gmail.com> wrote:
> >> CC linux-mm
> >>
> >> On Wed, Feb 19, 2025 at 9:00 PM Masami Hiramatsu (Google)
> >> <mhiramat@kernel.org> wrote:
> >>> Hi,
> >>>
> >>> The hung_task detector is very useful for detecting lockups.
> >>> However, since it only dumps the blocked (uninterruptible sleep)
> >>> processes, it is not enough to identify the root cause of the
> >>> lockup.
> >>>
> >>> For example, if a process holds a mutex and sleeps in interruptible
> >>> state for a long time while waiting for an event, the other processes
> >>> will wait on the mutex in uninterruptible state. In this case, the
> >>> waiter processes are dumped, but the blocker process is not shown
> >>> because it is sleeping in interruptible state.
> > Cool! I just ran into something similar today, but with rwsem. In that
> > case, the blocked process was locked up, and we could not identify
> > the root cause either ;(
>
> Once this patch series has settled down, we can extend rwsem to provide
> a similar feature.

Sounds good! Really looking forward to it ;p

Thanks,
Lance
