[PATCH v4 1/7] LoongArch: ftrace: Refactor register restoration in ftrace_common_return

Chenghao Duan posted 7 patches 1 day, 17 hours ago
[PATCH v4 1/7] LoongArch: ftrace: Refactor register restoration in ftrace_common_return
Posted by Chenghao Duan 1 day, 17 hours ago
Refactor the register restoration sequence in the ftrace_common_return
function to clearly distinguish between the logic of normal returns and
direct call returns in function tracing scenarios. The logic is as
follows:
1. In the case of a normal return, the execution flow returns to the
traced function, and ftrace must ensure that the register data is
consistent with the state when the function was entered.
ra = parent return address; t0 = traced function return address.

2. In the case of a direct call return, the execution flow jumps to the
custom trampoline function, and ftrace must ensure that the register
data is consistent with the state when ftrace was entered.
ra = traced function return address; t0 = parent return address.

Fixes: 9cdc3b6a299c ("LoongArch: ftrace: Add direct call support")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
 arch/loongarch/kernel/mcount_dyn.S | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/loongarch/kernel/mcount_dyn.S b/arch/loongarch/kernel/mcount_dyn.S
index d6b474ad1d5e..5729c20e5b8b 100644
--- a/arch/loongarch/kernel/mcount_dyn.S
+++ b/arch/loongarch/kernel/mcount_dyn.S
@@ -94,7 +94,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
  * at the callsite, so there is no need to restore the T series regs.
  */
 ftrace_common_return:
-	PTR_L		ra, sp, PT_R1
 	PTR_L		a0, sp, PT_R4
 	PTR_L		a1, sp, PT_R5
 	PTR_L		a2, sp, PT_R6
@@ -104,12 +103,17 @@ ftrace_common_return:
 	PTR_L		a6, sp, PT_R10
 	PTR_L		a7, sp, PT_R11
 	PTR_L		fp, sp, PT_R22
-	PTR_L		t0, sp, PT_ERA
 	PTR_L		t1, sp, PT_R13
-	PTR_ADDI	sp, sp, PT_SIZE
 	bnez		t1, .Ldirect
+
+	PTR_L		ra, sp, PT_R1
+	PTR_L		t0, sp, PT_ERA
+	PTR_ADDI	sp, sp, PT_SIZE
 	jr		t0
 .Ldirect:
+	PTR_L		t0, sp, PT_R1
+	PTR_L		ra, sp, PT_ERA
+	PTR_ADDI	sp, sp, PT_SIZE
 	jr		t1
 SYM_CODE_END(ftrace_common)
 
@@ -161,6 +165,8 @@ SYM_CODE_END(return_to_handler)
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 SYM_CODE_START(ftrace_stub_direct_tramp)
 	UNWIND_HINT_UNDEFINED
-	jr		t0
+	move		t1, ra
+	move		ra, t0
+	jr		t1
 SYM_CODE_END(ftrace_stub_direct_tramp)
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
-- 
2.25.1
Re: [PATCH v4 1/7] LoongArch: ftrace: Refactor register restoration in ftrace_common_return
Posted by bot+bpf-ci@kernel.org 1 day, 16 hours ago
> diff --git a/arch/loongarch/kernel/mcount_dyn.S b/arch/loongarch/kernel/mcount_dyn.S
> index d6b474ad1d5e..5729c20e5b8b 100644
> --- a/arch/loongarch/kernel/mcount_dyn.S
> +++ b/arch/loongarch/kernel/mcount_dyn.S
> @@ -94,7 +94,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
>   * at the callsite, so there is no need to restore the T series regs.
>   */
>  ftrace_common_return:
> -	PTR_L		ra, sp, PT_R1
>  	PTR_L		a0, sp, PT_R4
>  	PTR_L		a1, sp, PT_R5
>  	PTR_L		a2, sp, PT_R6
> @@ -104,12 +103,17 @@ ftrace_common_return:
>  	PTR_L		a6, sp, PT_R10
>  	PTR_L		a7, sp, PT_R11
>  	PTR_L		fp, sp, PT_R22
> -	PTR_L		t0, sp, PT_ERA
>  	PTR_L		t1, sp, PT_R13
> -	PTR_ADDI	sp, sp, PT_SIZE
>  	bnez		t1, .Ldirect
> +
> +	PTR_L		ra, sp, PT_R1
> +	PTR_L		t0, sp, PT_ERA
> +	PTR_ADDI	sp, sp, PT_SIZE
>  	jr		t0
>  .Ldirect:
> +	PTR_L		t0, sp, PT_R1
> +	PTR_L		ra, sp, PT_ERA
                     ^^^^^^^^^^^^^^^^^^^
Does this break existing ftrace direct trampolines?

Before this change, direct trampolines received:
  ra = parent return address (from PT_R1)
  t0 = traced function address (from PT_ERA)

After this change, direct trampolines receive:
  t0 = parent return address (from PT_R1)
  ra = traced function address (from PT_ERA)

The sample trampolines in samples/ftrace/ were written for the old
convention. For example, samples/ftrace/ftrace-direct.c does:

  my_tramp:
    st.d  $t0, $sp, 8    # Save what it thinks is traced func addr
    st.d  $ra, $sp, 16   # Save what it thinks is parent addr
    bl    my_direct_func
    ld.d  $t0, $sp, 8    # Restore
    ld.d  $ra, $sp, 16   # Restore
    jr    $t0            # Jump to what it thinks is traced function

With the new convention, the sample would save parent address in the
t0 slot, restore it to t0, and then jump to the parent instead of the
traced function, skipping the traced function entirely.

This was fixed later in commit 055e392f3be7 "LoongArch: ftrace: Adjust
register stack restore order in direct call trampolines" which updated
all the samples to match the new convention. However, any out-of-tree
direct trampolines written before this commit would be broken by this
change.

Was this ABI break intentional? The commit message mentions matching
"the state when ftrace was entered" but doesn't explain why breaking
compatibility was necessary. RISC-V uses the opposite convention
(t0 = traced func, ra = parent), which the original LoongArch code
matched.

> +	PTR_ADDI	sp, sp, PT_SIZE
>  	jr		t1
>  SYM_CODE_END(ftrace_common)

[ ... ]

> @@ -161,6 +165,8 @@ SYM_CODE_END(return_to_handler)
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>  SYM_CODE_START(ftrace_stub_direct_tramp)
>  	UNWIND_HINT_UNDEFINED
> -	jr		t0
> +	move		t1, ra
> +	move		ra, t0
> +	jr		t1
>  SYM_CODE_END(ftrace_stub_direct_tramp)
>  #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */


---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20293821855
Re: [PATCH v4 1/7] LoongArch: ftrace: Refactor register restoration in ftrace_common_return
Posted by Chenghao Duan 22 hours ago
On Wed, Dec 17, 2025 at 06:55:00AM +0000, bot+bpf-ci@kernel.org wrote:
> > diff --git a/arch/loongarch/kernel/mcount_dyn.S b/arch/loongarch/kernel/mcount_dyn.S
> > index d6b474ad1d5e..5729c20e5b8b 100644
> > --- a/arch/loongarch/kernel/mcount_dyn.S
> > +++ b/arch/loongarch/kernel/mcount_dyn.S
> > @@ -94,7 +94,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
> >   * at the callsite, so there is no need to restore the T series regs.
> >   */
> >  ftrace_common_return:
> > -	PTR_L		ra, sp, PT_R1
> >  	PTR_L		a0, sp, PT_R4
> >  	PTR_L		a1, sp, PT_R5
> >  	PTR_L		a2, sp, PT_R6
> > @@ -104,12 +103,17 @@ ftrace_common_return:
> >  	PTR_L		a6, sp, PT_R10
> >  	PTR_L		a7, sp, PT_R11
> >  	PTR_L		fp, sp, PT_R22
> > -	PTR_L		t0, sp, PT_ERA
> >  	PTR_L		t1, sp, PT_R13
> > -	PTR_ADDI	sp, sp, PT_SIZE
> >  	bnez		t1, .Ldirect
> > +
> > +	PTR_L		ra, sp, PT_R1
> > +	PTR_L		t0, sp, PT_ERA
> > +	PTR_ADDI	sp, sp, PT_SIZE
> >  	jr		t0
> >  .Ldirect:
> > +	PTR_L		t0, sp, PT_R1
> > +	PTR_L		ra, sp, PT_ERA
>                      ^^^^^^^^^^^^^^^^^^^
> Does this break existing ftrace direct trampolines?
> 
> Before this change, direct trampolines received:
>   ra = parent return address (from PT_R1)
>   t0 = traced function address (from PT_ERA)
> 
> After this change, direct trampolines receive:
>   t0 = parent return address (from PT_R1)
>   ra = traced function address (from PT_ERA)
> 
> The sample trampolines in samples/ftrace/ were written for the old
> convention. For example, samples/ftrace/ftrace-direct.c does:
> 
>   my_tramp:
>     st.d  $t0, $sp, 8    # Save what it thinks is traced func addr
>     st.d  $ra, $sp, 16   # Save what it thinks is parent addr
>     bl    my_direct_func
>     ld.d  $t0, $sp, 8    # Restore
>     ld.d  $ra, $sp, 16   # Restore
>     jr    $t0            # Jump to what it thinks is traced function
> 
> With the new convention, the sample would save parent address in the
> t0 slot, restore it to t0, and then jump to the parent instead of the
> traced function, skipping the traced function entirely.
> 
> This was fixed later in commit 055e392f3be7 "LoongArch: ftrace: Adjust
> register stack restore order in direct call trampolines" which updated
> all the samples to match the new convention. However, any out-of-tree
> direct trampolines written before this commit would be broken by this
> change.
> 
> Was this ABI break intentional? The commit message mentions matching
> "the state when ftrace was entered" but doesn't explain why breaking
> compatibility was necessary. RISC-V uses the opposite convention
> (t0 = traced func, ra = parent), which the original LoongArch code
> matched.

The patch "LoongArch: ftrace: Refactor register restoration in
ftrace_common_return" changed the conventions for t0 and ra after
entering the direct call function, which is exactly the reason forthe
changes made in this patch.

Chenghao

> 
> > +	PTR_ADDI	sp, sp, PT_SIZE
> >  	jr		t1
> >  SYM_CODE_END(ftrace_common)
> 
> [ ... ]
> 
> > @@ -161,6 +165,8 @@ SYM_CODE_END(return_to_handler)
> >  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
> >  SYM_CODE_START(ftrace_stub_direct_tramp)
> >  	UNWIND_HINT_UNDEFINED
> > -	jr		t0
> > +	move		t1, ra
> > +	move		ra, t0
> > +	jr		t1
> >  SYM_CODE_END(ftrace_stub_direct_tramp)
> >  #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
> 
> 
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
> 
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20293821855
Re: [PATCH v4 1/7] LoongArch: ftrace: Refactor register restoration in ftrace_common_return
Posted by Chris Mason 8 hours ago
On 12/17/25 8:26 PM, Chenghao Duan wrote:
> On Wed, Dec 17, 2025 at 06:55:00AM +0000, bot+bpf-ci@kernel.org wrote:
>>> diff --git a/arch/loongarch/kernel/mcount_dyn.S b/arch/loongarch/kernel/mcount_dyn.S
>>> index d6b474ad1d5e..5729c20e5b8b 100644
>>> --- a/arch/loongarch/kernel/mcount_dyn.S
>>> +++ b/arch/loongarch/kernel/mcount_dyn.S
>>> @@ -94,7 +94,6 @@ SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
>>>   * at the callsite, so there is no need to restore the T series regs.
>>>   */
>>>  ftrace_common_return:
>>> -	PTR_L		ra, sp, PT_R1
>>>  	PTR_L		a0, sp, PT_R4
>>>  	PTR_L		a1, sp, PT_R5
>>>  	PTR_L		a2, sp, PT_R6
>>> @@ -104,12 +103,17 @@ ftrace_common_return:
>>>  	PTR_L		a6, sp, PT_R10
>>>  	PTR_L		a7, sp, PT_R11
>>>  	PTR_L		fp, sp, PT_R22
>>> -	PTR_L		t0, sp, PT_ERA
>>>  	PTR_L		t1, sp, PT_R13
>>> -	PTR_ADDI	sp, sp, PT_SIZE
>>>  	bnez		t1, .Ldirect
>>> +
>>> +	PTR_L		ra, sp, PT_R1
>>> +	PTR_L		t0, sp, PT_ERA
>>> +	PTR_ADDI	sp, sp, PT_SIZE
>>>  	jr		t0
>>>  .Ldirect:
>>> +	PTR_L		t0, sp, PT_R1
>>> +	PTR_L		ra, sp, PT_ERA
>>                      ^^^^^^^^^^^^^^^^^^^
>> Does this break existing ftrace direct trampolines?
>>
>> Before this change, direct trampolines received:
>>   ra = parent return address (from PT_R1)
>>   t0 = traced function address (from PT_ERA)
>>
>> After this change, direct trampolines receive:
>>   t0 = parent return address (from PT_R1)
>>   ra = traced function address (from PT_ERA)
>>
>> The sample trampolines in samples/ftrace/ were written for the old
>> convention. For example, samples/ftrace/ftrace-direct.c does:
>>
>>   my_tramp:
>>     st.d  $t0, $sp, 8    # Save what it thinks is traced func addr
>>     st.d  $ra, $sp, 16   # Save what it thinks is parent addr
>>     bl    my_direct_func
>>     ld.d  $t0, $sp, 8    # Restore
>>     ld.d  $ra, $sp, 16   # Restore
>>     jr    $t0            # Jump to what it thinks is traced function
>>
>> With the new convention, the sample would save parent address in the
>> t0 slot, restore it to t0, and then jump to the parent instead of the
>> traced function, skipping the traced function entirely.
>>
>> This was fixed later in commit 055e392f3be7 "LoongArch: ftrace: Adjust
>> register stack restore order in direct call trampolines" which updated
>> all the samples to match the new convention. However, any out-of-tree
>> direct trampolines written before this commit would be broken by this
>> change.
>>
>> Was this ABI break intentional? The commit message mentions matching
>> "the state when ftrace was entered" but doesn't explain why breaking
>> compatibility was necessary. RISC-V uses the opposite convention
>> (t0 = traced func, ra = parent), which the original LoongArch code
>> matched.
> 
> The patch "LoongArch: ftrace: Refactor register restoration in
> ftrace_common_return" changed the conventions for t0 and ra after
> entering the direct call function, which is exactly the reason forthe
> changes made in this patch.

I'll try to adjust the kinds of ABI breakage AI comments on.  It did
catch the other related changes from this series, but the additional
commentary wasn't useful.

Thanks,
Chris