Refactor the register restoration sequence in the ftrace_common_return
function to clearly distinguish between the logic of normal returns and
direct call returns in function tracing scenarios. The logic is as
follows:
1. In the case of a normal return, the execution flow returns to the
traced function, and ftrace must ensure that the register data is
consistent with the state when the function was entered.
ra = parent return address; t0 = traced function return address.
2. In the case of a direct call return, the execution flow jumps to the
custom trampoline function, and ftrace must ensure that the register
data is consistent with the state when ftrace was entered.
ra = traced function return address; t0 = parent return address.
Fixes: 9cdc3b6a299c ("LoongArch: ftrace: Add direct call support")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
arch/loongarch/kernel/mcount_dyn.S | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/arch/loongarch/kernel/mcount_dyn.S b/arch/loongarch/kernel/mcount_dyn.S
index d6b474ad1d5e..5729c20e5b8b 100644
--- a/arch/loongarch/kernel/mcount_dyn.S
+++ b/arch/loongarch/kernel/mcount_dyn.S
@@ -94,7 +94,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
* at the callsite, so there is no need to restore the T series regs.
*/
ftrace_common_return:
- PTR_L ra, sp, PT_R1
PTR_L a0, sp, PT_R4
PTR_L a1, sp, PT_R5
PTR_L a2, sp, PT_R6
@@ -104,12 +103,17 @@ ftrace_common_return:
PTR_L a6, sp, PT_R10
PTR_L a7, sp, PT_R11
PTR_L fp, sp, PT_R22
- PTR_L t0, sp, PT_ERA
PTR_L t1, sp, PT_R13
- PTR_ADDI sp, sp, PT_SIZE
bnez t1, .Ldirect
+
+ PTR_L ra, sp, PT_R1
+ PTR_L t0, sp, PT_ERA
+ PTR_ADDI sp, sp, PT_SIZE
jr t0
.Ldirect:
+ PTR_L t0, sp, PT_R1
+ PTR_L ra, sp, PT_ERA
+ PTR_ADDI sp, sp, PT_SIZE
jr t1
SYM_CODE_END(ftrace_common)
@@ -161,6 +165,8 @@ SYM_CODE_END(return_to_handler)
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
SYM_CODE_START(ftrace_stub_direct_tramp)
UNWIND_HINT_UNDEFINED
- jr t0
+ move t1, ra
+ move ra, t0
+ jr t1
SYM_CODE_END(ftrace_stub_direct_tramp)
#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
--
2.25.1
> diff --git a/arch/loongarch/kernel/mcount_dyn.S b/arch/loongarch/kernel/mcount_dyn.S
> index d6b474ad1d5e..5729c20e5b8b 100644
> --- a/arch/loongarch/kernel/mcount_dyn.S
> +++ b/arch/loongarch/kernel/mcount_dyn.S
> @@ -94,7 +94,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
> * at the callsite, so there is no need to restore the T series regs.
> */
> ftrace_common_return:
> - PTR_L ra, sp, PT_R1
> PTR_L a0, sp, PT_R4
> PTR_L a1, sp, PT_R5
> PTR_L a2, sp, PT_R6
> @@ -104,12 +103,17 @@ ftrace_common_return:
> PTR_L a6, sp, PT_R10
> PTR_L a7, sp, PT_R11
> PTR_L fp, sp, PT_R22
> - PTR_L t0, sp, PT_ERA
> PTR_L t1, sp, PT_R13
> - PTR_ADDI sp, sp, PT_SIZE
> bnez t1, .Ldirect
> +
> + PTR_L ra, sp, PT_R1
> + PTR_L t0, sp, PT_ERA
> + PTR_ADDI sp, sp, PT_SIZE
> jr t0
> .Ldirect:
> + PTR_L t0, sp, PT_R1
> + PTR_L ra, sp, PT_ERA
^^^^^^^^^^^^^^^^^^^
Does this break existing ftrace direct trampolines?
Before this change, direct trampolines received:
ra = parent return address (from PT_R1)
t0 = traced function address (from PT_ERA)
After this change, direct trampolines receive:
t0 = parent return address (from PT_R1)
ra = traced function address (from PT_ERA)
The sample trampolines in samples/ftrace/ were written for the old
convention. For example, samples/ftrace/ftrace-direct.c does:
my_tramp:
st.d $t0, $sp, 8 # Save what it thinks is traced func addr
st.d $ra, $sp, 16 # Save what it thinks is parent addr
bl my_direct_func
ld.d $t0, $sp, 8 # Restore
ld.d $ra, $sp, 16 # Restore
jr $t0 # Jump to what it thinks is traced function
With the new convention, the sample would save parent address in the
t0 slot, restore it to t0, and then jump to the parent instead of the
traced function, skipping the traced function entirely.
This was fixed later in commit 055e392f3be7 "LoongArch: ftrace: Adjust
register stack restore order in direct call trampolines" which updated
all the samples to match the new convention. However, any out-of-tree
direct trampolines written before this commit would be broken by this
change.
Was this ABI break intentional? The commit message mentions matching
"the state when ftrace was entered" but doesn't explain why breaking
compatibility was necessary. RISC-V uses the opposite convention
(t0 = traced func, ra = parent), which the original LoongArch code
matched.
> + PTR_ADDI sp, sp, PT_SIZE
> jr t1
> SYM_CODE_END(ftrace_common)
[ ... ]
> @@ -161,6 +165,8 @@ SYM_CODE_END(return_to_handler)
> #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> SYM_CODE_START(ftrace_stub_direct_tramp)
> UNWIND_HINT_UNDEFINED
> - jr t0
> + move t1, ra
> + move ra, t0
> + jr t1
> SYM_CODE_END(ftrace_stub_direct_tramp)
> #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20293821855
On Wed, Dec 17, 2025 at 06:55:00AM +0000, bot+bpf-ci@kernel.org wrote: > > diff --git a/arch/loongarch/kernel/mcount_dyn.S b/arch/loongarch/kernel/mcount_dyn.S > > index d6b474ad1d5e..5729c20e5b8b 100644 > > --- a/arch/loongarch/kernel/mcount_dyn.S > > +++ b/arch/loongarch/kernel/mcount_dyn.S > > @@ -94,7 +94,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) > > * at the callsite, so there is no need to restore the T series regs. > > */ > > ftrace_common_return: > > - PTR_L ra, sp, PT_R1 > > PTR_L a0, sp, PT_R4 > > PTR_L a1, sp, PT_R5 > > PTR_L a2, sp, PT_R6 > > @@ -104,12 +103,17 @@ ftrace_common_return: > > PTR_L a6, sp, PT_R10 > > PTR_L a7, sp, PT_R11 > > PTR_L fp, sp, PT_R22 > > - PTR_L t0, sp, PT_ERA > > PTR_L t1, sp, PT_R13 > > - PTR_ADDI sp, sp, PT_SIZE > > bnez t1, .Ldirect > > + > > + PTR_L ra, sp, PT_R1 > > + PTR_L t0, sp, PT_ERA > > + PTR_ADDI sp, sp, PT_SIZE > > jr t0 > > .Ldirect: > > + PTR_L t0, sp, PT_R1 > > + PTR_L ra, sp, PT_ERA > ^^^^^^^^^^^^^^^^^^^ > Does this break existing ftrace direct trampolines? > > Before this change, direct trampolines received: > ra = parent return address (from PT_R1) > t0 = traced function address (from PT_ERA) > > After this change, direct trampolines receive: > t0 = parent return address (from PT_R1) > ra = traced function address (from PT_ERA) > > The sample trampolines in samples/ftrace/ were written for the old > convention. For example, samples/ftrace/ftrace-direct.c does: > > my_tramp: > st.d $t0, $sp, 8 # Save what it thinks is traced func addr > st.d $ra, $sp, 16 # Save what it thinks is parent addr > bl my_direct_func > ld.d $t0, $sp, 8 # Restore > ld.d $ra, $sp, 16 # Restore > jr $t0 # Jump to what it thinks is traced function > > With the new convention, the sample would save parent address in the > t0 slot, restore it to t0, and then jump to the parent instead of the > traced function, skipping the traced function entirely. > > This was fixed later in commit 055e392f3be7 "LoongArch: ftrace: Adjust > register stack restore order in direct call trampolines" which updated > all the samples to match the new convention. However, any out-of-tree > direct trampolines written before this commit would be broken by this > change. > > Was this ABI break intentional? The commit message mentions matching > "the state when ftrace was entered" but doesn't explain why breaking > compatibility was necessary. RISC-V uses the opposite convention > (t0 = traced func, ra = parent), which the original LoongArch code > matched. The patch "LoongArch: ftrace: Refactor register restoration in ftrace_common_return" changed the conventions for t0 and ra after entering the direct call function, which is exactly the reason forthe changes made in this patch. Chenghao > > > + PTR_ADDI sp, sp, PT_SIZE > > jr t1 > > SYM_CODE_END(ftrace_common) > > [ ... ] > > > @@ -161,6 +165,8 @@ SYM_CODE_END(return_to_handler) > > #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS > > SYM_CODE_START(ftrace_stub_direct_tramp) > > UNWIND_HINT_UNDEFINED > > - jr t0 > > + move t1, ra > > + move ra, t0 > > + jr t1 > > SYM_CODE_END(ftrace_stub_direct_tramp) > > #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */ > > > --- > AI reviewed your patch. Please fix the bug or email reply why it's not a bug. > See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md > > CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20293821855
On 12/17/25 8:26 PM, Chenghao Duan wrote: > On Wed, Dec 17, 2025 at 06:55:00AM +0000, bot+bpf-ci@kernel.org wrote: >>> diff --git a/arch/loongarch/kernel/mcount_dyn.S b/arch/loongarch/kernel/mcount_dyn.S >>> index d6b474ad1d5e..5729c20e5b8b 100644 >>> --- a/arch/loongarch/kernel/mcount_dyn.S >>> +++ b/arch/loongarch/kernel/mcount_dyn.S >>> @@ -94,7 +94,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) >>> * at the callsite, so there is no need to restore the T series regs. >>> */ >>> ftrace_common_return: >>> - PTR_L ra, sp, PT_R1 >>> PTR_L a0, sp, PT_R4 >>> PTR_L a1, sp, PT_R5 >>> PTR_L a2, sp, PT_R6 >>> @@ -104,12 +103,17 @@ ftrace_common_return: >>> PTR_L a6, sp, PT_R10 >>> PTR_L a7, sp, PT_R11 >>> PTR_L fp, sp, PT_R22 >>> - PTR_L t0, sp, PT_ERA >>> PTR_L t1, sp, PT_R13 >>> - PTR_ADDI sp, sp, PT_SIZE >>> bnez t1, .Ldirect >>> + >>> + PTR_L ra, sp, PT_R1 >>> + PTR_L t0, sp, PT_ERA >>> + PTR_ADDI sp, sp, PT_SIZE >>> jr t0 >>> .Ldirect: >>> + PTR_L t0, sp, PT_R1 >>> + PTR_L ra, sp, PT_ERA >> ^^^^^^^^^^^^^^^^^^^ >> Does this break existing ftrace direct trampolines? >> >> Before this change, direct trampolines received: >> ra = parent return address (from PT_R1) >> t0 = traced function address (from PT_ERA) >> >> After this change, direct trampolines receive: >> t0 = parent return address (from PT_R1) >> ra = traced function address (from PT_ERA) >> >> The sample trampolines in samples/ftrace/ were written for the old >> convention. For example, samples/ftrace/ftrace-direct.c does: >> >> my_tramp: >> st.d $t0, $sp, 8 # Save what it thinks is traced func addr >> st.d $ra, $sp, 16 # Save what it thinks is parent addr >> bl my_direct_func >> ld.d $t0, $sp, 8 # Restore >> ld.d $ra, $sp, 16 # Restore >> jr $t0 # Jump to what it thinks is traced function >> >> With the new convention, the sample would save parent address in the >> t0 slot, restore it to t0, and then jump to the parent instead of the >> traced function, skipping the traced function entirely. >> >> This was fixed later in commit 055e392f3be7 "LoongArch: ftrace: Adjust >> register stack restore order in direct call trampolines" which updated >> all the samples to match the new convention. However, any out-of-tree >> direct trampolines written before this commit would be broken by this >> change. >> >> Was this ABI break intentional? The commit message mentions matching >> "the state when ftrace was entered" but doesn't explain why breaking >> compatibility was necessary. RISC-V uses the opposite convention >> (t0 = traced func, ra = parent), which the original LoongArch code >> matched. > > The patch "LoongArch: ftrace: Refactor register restoration in > ftrace_common_return" changed the conventions for t0 and ra after > entering the direct call function, which is exactly the reason forthe > changes made in this patch. I'll try to adjust the kinds of ABI breakage AI comments on. It did catch the other related changes from this series, but the additional commentary wasn't useful. Thanks, Chris
© 2016 - 2025 Red Hat, Inc.