[PATCH v2 3/4] LoongArch: BPF: Enhance trampoline support for kernel and module tracing

Posted by Chenghao Duan 2 months ago
This patch addresses two main issues in the LoongArch BPF trampoline
implementation, along with several related optimizations:

1. BPF-to-BPF call handling:
 - Modify build_prologue() so that the value of the return address
 register ra is saved to t0 before the reserved slot that jumps to the
 trampoline, as sketched below.
 - This keeps the return address handling correct when one BPF program
 calls another BPF program.
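
 The JITed prologue then begins roughly as follows (an illustrative
 sketch of the emitted instruction stream, not verbatim JIT output):

   move  $t0, $ra     # save the caller's return address first
   nop                # 5 NOPs reserved for the long jump that
   nop                # bpf_arch_text_poke() patches in when a
   nop                # trampoline is attached
   nop
   nop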

2. Enable module function tracing support:
 - Remove the previous restriction that blocked tracing of kernel
 module functions; the gist of the change is sketched below.
 - Fix the issue that previously caused kernel lockups when attempting
 to trace module functions.
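
 The underlying change is that module text is now treated like kernel
 text when computing the traced function's real entry point (an
 illustrative sketch mirroring the hunk further down):

   /* ftrace jump at a kernel or module function entry skips 2 NOPs */
   if (is_kernel_text((unsigned long)orig_call) ||
       is_module_text_address((unsigned long)orig_call))
           orig_call += LOONGARCH_FENTRY_NBYTES;
   /* direct jump at a BPF program entry skips 5 NOPs */
   else if (is_bpf_text_address((unsigned long)orig_call))
           orig_call += LOONGARCH_BPF_FENTRY_NBYTES;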

3. Related function optimizations:
 - Adjust the tail-call jump offset to account for the extra
   instruction added to the prologue.
 - Enhance bpf_arch_text_poke() so that it accurately locates BPF
 program entry points.
 - Refine the trampoline return logic so that the register state is
 correct when returning to either the traced function or its parent
 function, as sketched below.
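
 A rough sketch of the new trampoline tail (illustrative, not verbatim
 JIT output): with BPF_TRAMP_F_SKIP_FRAME set, the trampoline returns
 straight to the traced function's caller:

   move  $ra, $t0        # t0 holds the caller's return address
   jirl  $zero, $t0, 0   # return to the parent function

 and without BPF_TRAMP_F_SKIP_FRAME it resumes in the traced function
 while handing the caller's return address back in ra:

   move  $t1, $ra        # ra points back into the traced function
   move  $ra, $t0        # restore the caller's return address
   jirl  $zero, $t1, 0   # continue in the traced function body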

Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
---
 arch/loongarch/net/bpf_jit.c | 38 +++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 8dc58781b8eb..0c16a1b18e8f 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -139,6 +139,7 @@ static void build_prologue(struct jit_ctx *ctx)
 	stack_adjust = round_up(stack_adjust, 16);
 	stack_adjust += bpf_stack_adjust;
 
+	move_reg(ctx, LOONGARCH_GPR_T0, LOONGARCH_GPR_RA);
 	/* Reserve space for the move_imm + jirl instruction */
 	for (i = 0; i < LOONGARCH_LONG_JUMP_NINSNS; i++)
 		emit_insn(ctx, nop);
@@ -238,7 +239,7 @@ static void __build_epilogue(struct jit_ctx *ctx, bool is_tail_call)
 		 * Call the next bpf prog and skip the first instruction
 		 * of TCC initialization.
 		 */
-		emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T3, 6);
+		emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T3, 7);
 	}
 }
 
@@ -1265,7 +1266,7 @@ static int emit_jump_or_nops(void *target, void *ip, u32 *insns, bool is_call)
 		return 0;
 	}
 
-	return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_T0 : LOONGARCH_GPR_ZERO, (u64)target);
+	return emit_jump_and_link(&ctx, is_call ? LOONGARCH_GPR_RA : LOONGARCH_GPR_ZERO, (u64)target);
 }
 
 static int emit_call(struct jit_ctx *ctx, u64 addr)
@@ -1289,6 +1290,10 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
 		       void *new_addr)
 {
 	int ret;
+	unsigned long size = 0;
+	unsigned long offset = 0;
+	char namebuf[KSYM_NAME_LEN];
+	void *image = NULL;
 	bool is_call;
 	u32 old_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
 	u32 new_insns[LOONGARCH_LONG_JUMP_NINSNS] = {[0 ... 4] = INSN_NOP};
@@ -1296,9 +1301,18 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type old_t,
 	/* Only poking bpf text is supported. Since kernel function entry
 	 * is set up by ftrace, we rely on ftrace to poke kernel functions.
 	 */
-	if (!is_bpf_text_address((unsigned long)ip))
+	if (!__bpf_address_lookup((unsigned long)ip, &size, &offset, namebuf))
 		return -ENOTSUPP;
 
+	image = ip - offset;
+	/* zero offset means we're poking bpf prog entry */
+	if (offset == 0)
+		/* skip to the nop instruction in bpf prog entry:
+		 * move t0, ra
+		 * nop
+		 */
+		ip = image + LOONGARCH_INSN_SIZE;
+
 	is_call = old_t == BPF_MOD_CALL;
 	ret = emit_jump_or_nops(old_addr, ip, old_insns, is_call);
 	if (ret)
@@ -1622,14 +1636,12 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
 
 	/* To traced function */
 	/* Ftrace jump skips 2 NOP instructions */
-	if (is_kernel_text((unsigned long)orig_call))
+	if (is_kernel_text((unsigned long)orig_call) ||
+	    is_module_text_address((unsigned long)orig_call))
 		orig_call += LOONGARCH_FENTRY_NBYTES;
 	/* Direct jump skips 5 NOP instructions */
 	else if (is_bpf_text_address((unsigned long)orig_call))
 		orig_call += LOONGARCH_BPF_FENTRY_NBYTES;
-	/* Module tracing not supported - cause kernel lockups */
-	else if (is_module_text_address((unsigned long)orig_call))
-		return -ENOTSUPP;
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
 		move_addr(ctx, LOONGARCH_GPR_A0, (const u64)im);
@@ -1722,12 +1734,16 @@ static int __arch_prepare_bpf_trampoline(struct jit_ctx *ctx, struct bpf_tramp_i
 		emit_insn(ctx, ldd, LOONGARCH_GPR_FP, LOONGARCH_GPR_SP, 0);
 		emit_insn(ctx, addid, LOONGARCH_GPR_SP, LOONGARCH_GPR_SP, 16);
 
-		if (flags & BPF_TRAMP_F_SKIP_FRAME)
+		if (flags & BPF_TRAMP_F_SKIP_FRAME) {
 			/* return to parent function */
-			emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_RA, 0);
-		else
-			/* return to traced function */
+			move_reg(ctx, LOONGARCH_GPR_RA, LOONGARCH_GPR_T0);
 			emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T0, 0);
+		} else {
+			/* return to traced function */
+			move_reg(ctx, LOONGARCH_GPR_T1, LOONGARCH_GPR_RA);
+			move_reg(ctx, LOONGARCH_GPR_RA, LOONGARCH_GPR_T0);
+			emit_insn(ctx, jirl, LOONGARCH_GPR_ZERO, LOONGARCH_GPR_T1, 0);
+		}
 	}
 
 	ret = ctx->idx;
-- 
2.25.1
Re: [PATCH v2 3/4] LoongArch: BPF: Enhance trampoline support for kernel and module tracing
Posted by Tiezhu Yang 1 month, 3 weeks ago
On 12/12/25 17:11, Chenghao Duan wrote:
> This patch addresses two main issues in the LoongArch BPF trampoline
> implementation:
> 
> 1. BPF-to-BPF call handling:
>   - Modify the build_prologue function to ensure that the value of the
>   return address register ra is saved to t0 before entering the
>   trampoline operation.
>   - This ensures that the return address handling logic is accurate and
>   error-free when a BPF program calls another BPF program.
> 
> 2. Enable Module Function Tracing Support:
>   - Remove the previous restrictions that blocked the tracing of kernel
>   module functions.
>   - Fix the issue that previously caused kernel lockups when attempting
>   to trace module functions
> 
> 3. Related Function Optimizations:
>   - Adjust the jump offset of tail calls to ensure correct instruction
>     alignment.
>   - Enhance the bpf_arch_text_poke() function to enable accurate location
>   of BPF program entry points.
>   - Refine the trampoline return logic to ensure that the register data
>   is correct when returning to both the traced function and the parent
>   function.
> 
> Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>

As described in the commit message, your changes cover several
different kinds of content; thanks for the fixes and optimizations.

In order to avoid introducing bugs along the way, please separate each
logical change into its own patch. Each patch should make an easily
understood change that can be verified by reviewers, and each patch
should be justifiable on its own merits.

The current patch #4 can be put after the current patch #2 as a
preparation for the bpf patches.

Furthermore, it would be better to put the related test cases in the
commit message of each patch rather than in the cover letter, so that
it is easy to verify what each patch affects and so that the
information is recorded in the git log.

Also, please add a Fixes tag to each patch if possible.

Thanks,
Tiezhu
Re: [PATCH v2 3/4] LoongArch: BPF: Enhance trampoline support for kernel and module tracing
Posted by Chenghao Duan 1 month, 3 weeks ago
On Sun, Dec 14, 2025 at 08:36:16PM +0800, Tiezhu Yang wrote:
> On 12/12/25 17:11, Chenghao Duan wrote:
> > [...]
> 
> As described in the commit message, your changes cover several
> different kinds of content; thanks for the fixes and optimizations.
> 
> In order to avoid introducing bugs along the way, please separate each
> logical change into its own patch. Each patch should make an easily
> understood change that can be verified by reviewers, and each patch
> should be justifiable on its own merits.
> 
> The current patch #4 can be put after the current patch #2 as a
> preparation for the bpf patches.
> 

Got it. I will incorporate your suggestions in the next version.

> Furthermore, it would be better to put the related test cases in the
> commit message of each patch rather than in the cover letter, so that
> it is easy to verify what each patch affects and so that the
> information is recorded in the git log.

I fully agree with your suggestions. In fact, the current three patches
(excluding 0002-ftrace-samples-xxx.patch) are all fixes for the failing
module_attach test cases. The test items included in the cover letter
(0000-xxx.patch) are intended to verify that the trampoline-related
test cases pass after the current changes. I will follow your advice
and place the relevant test cases in the commit messages of the
corresponding patches in the next version.

Chenghao

> 
> Also, please add a Fixes tag to each patch if possible.
> 
> Thanks,
> Tiezhu