arch/riscv/include/asm/stacktrace.h | 2 +- arch/riscv/kernel/perf_callchain.c | 7 +++- arch/riscv/kernel/stacktrace.c | 56 +++++++++++++++++++++++------ 3 files changed, 53 insertions(+), 12 deletions(-)
This 2-patch series fixes how the non-frame-pointer version of
walk_stackframe handles return addresses, addressing both the
unwinder layer and the display layer.
Background
==========
In the original RISC-V stacktrace code (commit 5d8544e2d007), both
the FP and non-FP versions of walk_stackframe subtracted 0x4 from
return addresses read from the stack:
FP version: pc = frame->ra - 0x4;
Non-FP version: pc = (*ksp++) - 0x4;
The intent was to convert the return address (the instruction after
a call) back to the call site for better symbol resolution. However,
this is incorrect on RISC-V because the RVC (compressed instruction)
extension allows 2-byte instructions (e.g. c.jal). Subtracting a
fixed 0x4 assumes all call instructions are 4 bytes, which produces
a wrong address for compressed calls.
Commit b785ec129bd9 ("riscv/ftrace: Add HAVE_FUNCTION_GRAPH_RET_ADDR_PTR
support") fixed the FP version by replacing "pc = frame->ra - 0x4"
with ftrace_graph_ret_addr(). The commit message explicitly noted:
"the original calculation, pc = frame->ra - 4, is buggy when the
instruction at the return address happened to be a compressed inst."
Later, commit 877425424d6c removed the fallback "#else" branch
entirely since HAVE_FUNCTION_GRAPH_RET_ADDR_PTR is always defined
for RISC-V. However, the non-FP version was simply overlooked and
still carries the bogus -0x4 offset.
Patch 1: Remove -0x4 from the unwinder
=======================================
Remove the -0x4 offset from the non-FP walk_stackframe so it reports
raw return addresses, consistent with the FP version and with other
architectures (ARM64, x86, ARM all report return addresses as-is).
Patch 2: Use %pB for return addresses and %pS for exact IPs
=============================================================
With the -0x4 removed, a pre-existing issue becomes more visible:
when a noreturn function (e.g. panic, do_exit) is the last call in
a function, the saved return address points to the start of the next
function, and kallsyms resolves it to the wrong symbol.
The kernel provides sprint_backtrace() (invoked via %pB) specifically
for this purpose: it subtracts 1 from the address before symbol
lookup, which is sufficient to fall back into the calling function's
range. x86 uses %pB in its printk_stack_address() for this reason.
However, %pB must only be used for return addresses. walk_stackframe
passes two kinds of addresses to its callback: exact instruction
pointers (instruction_pointer(regs) and epc from exception frames)
and return addresses (frame->ra, regs->ra, task->thread.ra). For
exact IPs, subtracting 1 can cause kallsyms to resolve to the
preceding function if the IP is at offset 0 of a function.
Add an 'is_ra' parameter to the walk_stackframe callback to indicate
whether the address is a return address. Then use %pB for return
addresses and %pS for exact instruction pointers, matching the x86
approach which uses %pS for regs->ip via show_ip() and %pB for
stack return addresses via printk_stack_address().
Known limitation: the non-FP version of walk_stackframe() scans the
stack blindly and cannot distinguish pt_regs->epc values from return
addresses, so all stack-scanned values are treated as return addresses
(is_ra = true). This means epc values from pt_regs encountered during
the scan may still use %pB incorrectly. The FP version handles this
case correctly by detecting exception frames explicitly.
Changes since v1
================
- Patch 2: Split %pB usage to only apply to return addresses (is_ra =
true), using %pS for exact instruction pointers (is_ra = false).
This addresses the review feedback from Paul Walmsley and the
Sashiko review noting that x86 uses %pS for exact IPs and %pB only
for return addresses.
- Patch 2: Add 'is_ra' parameter to walk_stackframe callback, update
the declaration in asm/stacktrace.h, and adapt all callers
(print_trace_address, save_wchan, fill_callchain_walk wrapper for
perf, stack_walk_wrapper for arch_stack_walk).
Design principle
================
The two patches follow a clear separation of concerns:
- The unwinder reports raw return addresses (no offset adjustment)
- The display layer handles symbol resolution adjustments (%pB for
return addresses, %pS for exact IPs)
This matches the convention used by other architectures and ensures
consistent behavior between the FP and non-FP unwinders.
Jiakai Xu [1] and Wang Han [2] have separate patch series in flight
for other stacktrace issues; these patches are orthogonal to their
work.
[1] https://lore.kernel.org/all/20260517143704.659416-1-xujiakai2025@iscas.ac.cn/
[2] https://lore.kernel.org/all/20260528082310.1994388-1-wanghan@linux.alibaba.com/
Rui Qi (2):
riscv: stacktrace: Remove bogus -0x4 offset in non-FP walk_stackframe
riscv: stacktrace: Use %pB for return addresses and %pS for exact IPs
arch/riscv/include/asm/stacktrace.h | 2 +-
arch/riscv/kernel/perf_callchain.c | 7 +++-
arch/riscv/kernel/stacktrace.c | 56 +++++++++++++++++++++++------
3 files changed, 53 insertions(+), 12 deletions(-)
--
2.20.1
© 2016 - 2026 Red Hat, Inc.