From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Add a -c option to decode_stacktrace that searches for the call address.
This tries to decode line info backwards, starting from 1 byte before
the return address, and displays the first line info it finds as
the caller address.
If it tries up to 10 bytes before (or reaches the symbol address) and still
cannot find it, it gives up and decodes the return address.
With -c option:
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
lockdep_rcu_suspicious (kernel/locking/lockdep.c:6876)
event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:1057)
kernel_clone (include/trace/events/sched.h:396 include/trace/events/sched.h:396 kernel/fork.c:2664)
__x64_sys_clone (kernel/fork.c:2795 kernel/fork.c:2779 kernel/fork.c:2779)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
? trace_irq_disable (include/trace/events/preemptirq.h:36)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
Without -c option:
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:122)
lockdep_rcu_suspicious (kernel/locking/lockdep.c:6877)
event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:?)
kernel_clone (include/trace/events/sched.h:? include/trace/events/sched.h:396 kernel/fork.c:2664)
__x64_sys_clone (kernel/fork.c:2779)
do_syscall_64 (arch/x86/entry/syscall_64.c:?)
? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
? trace_irq_disable (include/trace/events/preemptirq.h:36)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
scripts/decode_stacktrace.sh | 51 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 46 insertions(+), 5 deletions(-)
diff --git a/scripts/decode_stacktrace.sh b/scripts/decode_stacktrace.sh
index 8d01b741de62..78e0810af476 100755
--- a/scripts/decode_stacktrace.sh
+++ b/scripts/decode_stacktrace.sh
@@ -5,9 +5,11 @@
usage() {
echo "Usage:"
- echo " $0 -r <release>"
- echo " $0 [<vmlinux> [<base_path>|auto [<modules_path>]]]"
+ echo " $0 [-c] -r <release>"
+ echo " $0 [-c] [<vmlinux> [<base_path>|auto [<modules_path>]]]"
echo " $0 -h"
+ echo "Options:"
+ echo " -c: Decode heuristically searched call address."
}
# Try to find a Rust demangler
@@ -33,11 +35,17 @@ fi
READELF=${UTIL_PREFIX}readelf${UTIL_SUFFIX}
ADDR2LINE=${UTIL_PREFIX}addr2line${UTIL_SUFFIX}
NM=${UTIL_PREFIX}nm${UTIL_SUFFIX}
+call_search=false
if [[ $1 == "-h" ]] ; then
usage
exit 0
-elif [[ $1 == "-r" ]] ; then
+elif [[ $1 == "-c" ]] ; then
+ call_search=true
+ shift 1
+fi
+
+if [[ $1 == "-r" ]] ; then
vmlinux=""
basepath="auto"
modpath=""
@@ -123,6 +131,28 @@ find_module() {
return 1
}
+UNKNOWN_LINE="??:0"
+
+search_call_site() {
+ # Instead of using the return address, use the nearest line info
+ # address before given address.
+ local return_addr=${2}
+ local max=${3}
+ local i
+
+ for i in $(seq 1 ${max}); do
+ local expr=$((0x$return_addr-$i))
+ local address=$(printf "%x\n" "$expr")
+
+ local code=$(${ADDR2LINE} -i -e "${1}" "$address" 2>/dev/null)
+ local first=${code% *}
+ if [[ "$code" != "" && "$code" != ${UNKNOWN_LINE} && "${first#*:}" != "?" ]]; then
+ echo "$code"
+ break
+ fi
+ done
+}
+
parse_symbol() {
# The structure of symbol at this point is:
# ([name]+[offset]/[total length])
@@ -176,6 +206,9 @@ parse_symbol() {
# Let's start doing the math to get the exact address into the
# symbol. First, strip out the symbol total length.
local expr=${symbol%/*}
+ # Also parse the offset from symbol.
+ local offset=${expr#*+}
+ offset=$((offset))
# Now, replace the symbol name with the base address we found
# before.
@@ -190,7 +223,15 @@ parse_symbol() {
if [[ $aarray_support == true && "${cache[$module,$address]+isset}" == "isset" ]]; then
local code=${cache[$module,$address]}
else
- local code=$(${ADDR2LINE} -i -e "$objfile" "$address" 2>/dev/null)
+ local code
+ if [[ $call_search == true && $offset != 0 ]]; then
+ code=$(search_call_site "$objfile" "$address" "$offset")
+ fi
+
+ if [[ "$code" == "" ]]; then
+ code=$(${ADDR2LINE} -i -e "$objfile" "$address" 2>/dev/null)
+ fi
+
if [[ $aarray_support == true ]]; then
cache[$module,$address]=$code
fi
@@ -199,7 +240,7 @@ parse_symbol() {
# addr2line doesn't return a proper error code if it fails, so
# we detect it using the value it prints so that we could preserve
# the offset/size into the function and bail out
- if [[ $code == "??:0" ]]; then
+ if [[ $code == ${UNKNOWN_LINE} ]]; then
return
fi
On Thu, 5 Mar 2026 14:12:19 +0900, Masami Hiramatsu (Google) wrote:
> Add -c option to search call address search to decode_stacktrace.
> This tries to decode line info backwards, starting from 1byte before
> the return address, and displays the first line info it founds as
> the caller address.
> If it tries up to 10bytes before (or the symbol address) and still
> can not find it, it gives up and decodes the return address.
The commit message says "up to 10bytes" but the code passes $offset
(the function offset from the symbol) as the max iteration count to
search_call_site(). There's no 10-byte cap anywhere in the code?
$offset can easily be hundreds or thousands of bytes into a function.
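A minimal sketch of the cap the commit message describes, assuming the
search limit should be the smaller of 10 bytes and the offset into the
symbol. MAX_SEARCH_BYTES and search_limit are hypothetical names and are
not part of the posted patch:

```shell
#!/bin/bash
# Hypothetical cap, taken from the "up to 10 bytes" wording in the
# commit message; the posted patch defines no such variable.
MAX_SEARCH_BYTES=10

# Print how many bytes to search backwards: never more than the cap,
# and never past the start of the symbol.
# $1: offset into the symbol, decimal or hex (e.g. 0x5d).
search_limit() {
    local offset=$(( $1 ))
    if (( offset < MAX_SEARCH_BYTES )); then
        echo "$offset"
    else
        echo "$MAX_SEARCH_BYTES"
    fi
}
```

Under that assumption, parse_symbol would pass "$(search_limit "$offset")"
rather than "$offset" as the third argument to search_call_site.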
> +search_call_site() {
> + # Instead of using the return address, use the nearest line info
> + # address before given address.
> + local return_addr=${2}
> + local max=${3}
> + local i
> +
> + for i in $(seq 1 ${max}); do
> + local expr=$((0x$return_addr-$i))
> + local address=$(printf "%x\n" "$expr")
> +
> + local code=$(${ADDR2LINE} -i -e "${1}" "$address" 2>/dev/null)
> + local first=${code% *}
> + if [[ "$code" != "" && "$code" != ${UNKNOWN_LINE} && "${first#*:}" != "?" ]]; then
To also address Matthieu's question about performance: I think this
whole iterative search could be replaced by simply subtracting 1 from
the return address before passing it to addr2line.
DWARF line tables map address *ranges* to source lines, so any address
within the CALL instruction resolves to the correct source line.
return_addr-1 is guaranteed to land inside the CALL instruction (it's
the last byte of it), so a single addr2line call is sufficient.
This is exactly what the kernel itself does in sprint_backtrace()
(kernel/kallsyms.c:570): it passes symbol_offset=-1 to
__sprint_symbol(), which does `address += symbol_offset` before
lookup. GDB, perf, and libunwind all use the same addr-1 trick for
the same reason.
That would make this both correct and free.
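The single-lookup approach can be sketched as below, assuming the return
address is a bare hex string as in parse_symbol. call_site_addr and
decode_call_site are hypothetical helper names; addr2line is invoked with
the same -i and -e options the script already uses:

```shell
#!/bin/bash
ADDR2LINE=${ADDR2LINE:-addr2line}

# Print the address one byte before the given hex return address.
# On architectures without delay slots this is the last byte of the
# CALL instruction.
# $1: hex return address without the 0x prefix.
call_site_addr() {
    printf "%x\n" "$(( 0x$1 - 1 ))"
}

# Decode the call site with a single addr2line invocation
# (hypothetical wrapper, not part of the posted patch).
# $1: object file, $2: hex return address without the 0x prefix.
decode_call_site() {
    "${ADDR2LINE}" -i -e "$1" "$(call_site_addr "$2")" 2>/dev/null
}
```

For example, call_site_addr 1005 prints 1004, and that value is what
decode_call_site hands to addr2line.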
> + if [[ "$code" != "" && "$code" != ${UNKNOWN_LINE} && "${first#*:}" != "?" ]]; then
Minor: ${UNKNOWN_LINE} is "??:0" -- when unquoted on the RHS of != inside
[[ ]], the ? characters are interpreted as glob wildcards (each matching
any single character). It happens to work here because ? also matches '?'
itself, but it should be quoted as "${UNKNOWN_LINE}" for correctness.
Same issue on the other != ${UNKNOWN_LINE} below.
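The difference is easy to reproduce in isolation. In this small
demonstration, "ab:0" is a made-up string, not real addr2line output. The
unquoted comparison treats it as equal to "??:0" because each ? matches
any single character, while the quoted comparison does not:

```shell
#!/bin/bash
UNKNOWN_LINE="??:0"

# Unquoted right-hand side: bash treats the expansion as a glob pattern.
matches_unquoted() {
    [[ "$1" == ${UNKNOWN_LINE} ]]
}

# Quoted right-hand side: bash compares the strings literally.
matches_quoted() {
    [[ "$1" == "${UNKNOWN_LINE}" ]]
}

matches_unquoted "ab:0" && echo "unquoted: ab:0 matches"
matches_quoted "ab:0" || echo "quoted: ab:0 does not match"
matches_quoted "??:0" && echo "quoted: ??:0 matches"
```

All three messages are printed, so only the quoted form restricts the
match to the literal "??:0" string.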
--
Thanks,
Sasha
On Thu, 5 Mar 2026 10:51:47 -0500
Sasha Levin <sashal@kernel.org> wrote:
> On Thu, 5 Mar 2026 14:12:19 +0900, Masami Hiramatsu (Google) wrote:
> > Add -c option to search call address search to decode_stacktrace.
> > This tries to decode line info backwards, starting from 1byte before
> > the return address, and displays the first line info it founds as
> > the caller address.
> > If it tries up to 10bytes before (or the symbol address) and still
> > can not find it, it gives up and decodes the return address.
>
> The commit message says "up to 10bytes" but the code passes $offset
> (the function offset from the symbol) as the max iteration count to
> search_call_site(). There's no 10-byte cap anywhere in the code?
> $offset can easily be hundreds or thousands of bytes into a function.
Ah, sorry. I forgot to set the maximum :(
>
> > +search_call_site() {
> > + # Instead of using the return address, use the nearest line info
> > + # address before given address.
> > + local return_addr=${2}
> > + local max=${3}
> > + local i
> > +
> > + for i in $(seq 1 ${max}); do
> > + local expr=$((0x$return_addr-$i))
> > + local address=$(printf "%x\n" "$expr")
> > +
> > + local code=$(${ADDR2LINE} -i -e "${1}" "$address" 2>/dev/null)
> > + local first=${code% *}
> > + if [[ "$code" != "" && "$code" != ${UNKNOWN_LINE} && "${first#*:}" != "?" ]]; then
>
> To also address Matthieu's question about performance: I think this
> whole iterative search could be replaced by simply subtracting 1 from
> the return address before passing it to addr2line.
>
> DWARF line tables map address *ranges* to source lines, so any address
> within the CALL instruction resolves to the correct source line.
> return_addr-1 is guaranteed to land inside the CALL instruction (it's
> the last byte of it), so a single addr2line call is sufficient.
Ah, got it, OK. I also confirmed "addr-1" works. But if there is no lineinfo
entry for the call instruction, shouldn't we check more instructions before
the call?
>
> This is exactly what the kernel itself does in sprint_backtrace()
> (kernel/kallsyms.c:570): it passes symbol_offset=-1 to
> __sprint_symbol(), which does `address += symbol_offset` before
> lookup. GDB, perf, and libunwind all use the same addr-1 trick for
> the same reason.
OK.
>
> That would make this both correct and free.
>
> > + if [[ "$code" != "" && "$code" != ${UNKNOWN_LINE} && "${first#*:}" != "?" ]]; then
>
> Minor: ${UNKNOWN_LINE} is "??:0" -- when unquoted on the RHS of != inside
> [[ ]], the ? characters are interpreted as glob wildcards (each matching
> any single character). It happens to work here because ? also matches '?'
> itself, but it should be quoted as "${UNKNOWN_LINE}" for correctness.
> Same issue on the other != ${UNKNOWN_LINE} below.
Ah, OK. Let me fix it.
Thanks,
>
> --
> Thanks,
> Sasha
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Fri, Mar 06, 2026 at 01:32:41AM +0900, Masami Hiramatsu wrote:
>On Thu, 5 Mar 2026 10:51:47 -0500
>Sasha Levin <sashal@kernel.org> wrote:
>> DWARF line tables map address *ranges* to source lines, so any address
>> within the CALL instruction resolves to the correct source line.
>> return_addr-1 is guaranteed to land inside the CALL instruction (it's
>> the last byte of it), so a single addr2line call is sufficient.
>
>Ah, got it, OK. I also confirmed "addr-1" works. But if there is no lineinfo
>entry for the call instruction, shouldn't we check more instructions before
>the call?
There's no such thing as "no lineinfo entry for the call instruction" -
DWARF line tables are range-based, not discrete points. Each row covers
all addresses up to the next row, so every address within a function
resolves to some source line. addr-1 lands inside the CALL instruction
and will always resolve to same line as the CALL itself.
We show "??:0" because the address we passed falls outside of any DWARF
compilation unit altogether.
--
Thanks,
Sasha
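To illustrate the range-based lookup described above, here is a toy model
rather than real DWARF parsing. LINE_TABLE and lookup_line are
hypothetical, and the addresses and line numbers are invented: a 5-byte
CALL occupies 0x1008 through 0x100c, and the return address 0x100d starts
the next row.

```shell
#!/bin/bash
# Hypothetical line table, "start_address:source_line", sorted by address.
# Real DWARF line tables carry more state (files, columns, end markers).
LINE_TABLE=("0x1000:10" "0x1008:11" "0x100d:12")

# Print the source line of the row covering the given address, i.e. the
# last row whose start address is <= the address. Prints "?" when the
# address lies before the first row.
# $1: hex address without the 0x prefix.
lookup_line() {
    local addr=$(( 0x$1 )) row result="?"
    for row in "${LINE_TABLE[@]}"; do
        local start=$(( ${row%%:*} ))
        if (( start <= addr )); then
            result=${row##*:}
        else
            break
        fi
    done
    echo "$result"
}

lookup_line 100d    # return address: resolves to the row after the CALL
lookup_line 100c    # return address - 1: resolves to the CALL's own row
```

In this model the return address resolves to line 12 while the address
one byte earlier resolves to line 11, the line of the CALL itself.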
Hi Masami,
On 05/03/2026 06:12, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Add -c option to search call address search to decode_stacktrace.
> This tries to decode line info backwards, starting from 1byte before
> the return address, and displays the first line info it founds as
> the caller address.
> If it tries up to 10bytes before (or the symbol address) and still
> can not find it, it gives up and decodes the return address.
Thank you for this new option!
> With -c option:
> Call Trace:
> <TASK>
> dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
> lockdep_rcu_suspicious (kernel/locking/lockdep.c:6876)
> event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:1057)
> kernel_clone (include/trace/events/sched.h:396 include/trace/events/sched.h:396 kernel/fork.c:2664)
> __x64_sys_clone (kernel/fork.c:2795 kernel/fork.c:2779 kernel/fork.c:2779)
> do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
> ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
> ? trace_irq_disable (include/trace/events/preemptirq.h:36)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
>
>
> Without -c option:
> Call Trace:
> <TASK>
> dump_stack_lvl (lib/dump_stack.c:122)
> lockdep_rcu_suspicious (kernel/locking/lockdep.c:6877)
> event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:?)
> kernel_clone (include/trace/events/sched.h:? include/trace/events/sched.h:396 kernel/fork.c:2664)
> __x64_sys_clone (kernel/fork.c:2779)
> do_syscall_64 (arch/x86/entry/syscall_64.c:?)
> ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> ? trace_irq_disable (include/trace/events/preemptirq.h:36)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
That's better indeed!
Do we need a new option for that? Could it not be the new default
behaviour? Or are there any downsides with it?
"addr2line" will be called more, but if it is worth it, it is probably
not an issue, or is it?
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
On Thu, 5 Mar 2026 15:56:13 +0100
Matthieu Baerts <matttbe@kernel.org> wrote:
> Hi Masami,
>
> On 05/03/2026 06:12, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > Add -c option to search call address search to decode_stacktrace.
> > This tries to decode line info backwards, starting from 1byte before
> > the return address, and displays the first line info it founds as
> > the caller address.
> > If it tries up to 10bytes before (or the symbol address) and still
> > can not find it, it gives up and decodes the return address.
>
> Thank you for this new option!
>
> > With -c option:
> > Call Trace:
> > <TASK>
> > dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
> > lockdep_rcu_suspicious (kernel/locking/lockdep.c:6876)
> > event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:1057)
> > kernel_clone (include/trace/events/sched.h:396 include/trace/events/sched.h:396 kernel/fork.c:2664)
> > __x64_sys_clone (kernel/fork.c:2795 kernel/fork.c:2779 kernel/fork.c:2779)
> > do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
> > ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
> > ? trace_irq_disable (include/trace/events/preemptirq.h:36)
> > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
> >
> >
> > Without -c option:
> > Call Trace:
> > <TASK>
> > dump_stack_lvl (lib/dump_stack.c:122)
> > lockdep_rcu_suspicious (kernel/locking/lockdep.c:6877)
> > event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:?)
> > kernel_clone (include/trace/events/sched.h:? include/trace/events/sched.h:396 kernel/fork.c:2664)
> > __x64_sys_clone (kernel/fork.c:2779)
> > do_syscall_64 (arch/x86/entry/syscall_64.c:?)
> > ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> > ? trace_irq_disable (include/trace/events/preemptirq.h:36)
> > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> That's better indeed!
>
> Do we need a new option for that? Could it not be the new default
> behaviour? Or are there any downsides with it?
AFAIK, this may not work well on the architectures which have delay slot
(I have not tested) which will execute one more instruction after branch
before branching. In that case, the return address will not be the next
instruction of the delay slot. But I think that is not popular anymore,
so we can switch the default behavior and maybe we can switch it based on
architecture.
Thank you,
>
> "addr2line" will be called more, but if it is worth it, it is probably
> not an issue, or is it?
>
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>