From: Abhishek Dubey <adubey@linux.ibm.com>

In the conventional stack frame, tail_call_cnt sits after the
NVR save area (BPF_PPC_STACK_SAVE), whereas in the trampoline
frame its offset falls after the stack alignment padding. BPF
JIT logic becomes needlessly complex when it has to handle this
frame-sensitive offset calculation for tail_call_cnt; having
the same offset in both frames is the desired objective.
The trampoline frame has no BPF_PPC_STACK_SAVE area, and
introducing one solely to align the tail_call_cnt offset would
under-utilize that extra memory. A further complication is the
variable alignment padding sitting at the bottom of the
trampoline frame, which needs additional handling when computing
the tail_call_cnt offset.

Address the above issues by moving tail_call_cnt to the bottom
of the stack frame, at offset 0, for both types of frames. This
saves the additional bytes BPF_PPC_STACK_SAVE would require in
the trampoline frame, and a common offset computation for
tail_call_cnt then serves both frames.
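
For illustration only, here is a standalone user-space sketch (not
part of the patch) that plugs in the constants from this series and
checks that tail_call_cnt lands BPF_PPC_TAILCALL bytes below the
previous sp in both layouts. STACK_FRAME_MIN_SIZE = 32 (the ELFv2
value; 112 on ELFv1, per the "frame header 32/112" layout comment)
and stack_size = 512 are example assumptions:

	#include <assert.h>
	#include <stdio.h>

	#define BPF_PPC_TAILCALL	8
	#define BPF_PPC_STACK_SAVE	(6 * 8)
	#define BPF_PPC_STACK_LOCALS	24
	#define STACK_FRAME_MIN_SIZE	32	/* assumed: ELFv2 */
	#define BPF_PPC_STACKFRAME	(STACK_FRAME_MIN_SIZE + \
					 BPF_PPC_STACK_LOCALS + \
					 BPF_PPC_STACK_SAVE + \
					 BPF_PPC_TAILCALL)

	int main(void)
	{
		int stack_size = 512;	/* example BPF stack size */

		/* Own frame: offsets are positive wrt r1 */
		int local = STACK_FRAME_MIN_SIZE + stack_size;
		int tcc = local + BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE;

		/* prev sp sits at BPF_PPC_STACKFRAME + stack_size */
		assert(tcc == BPF_PPC_STACKFRAME + stack_size - BPF_PPC_TAILCALL);

		/* Redzone: offsets are negative wrt r1 (== prev sp) */
		local = -(BPF_PPC_TAILCALL + BPF_PPC_STACK_SAVE + BPF_PPC_STACK_LOCALS);
		tcc = local + BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE;
		assert(tcc == -BPF_PPC_TAILCALL);

		printf("tail_call_cnt: %d bytes below prev sp in both layouts\n",
		       BPF_PPC_TAILCALL);
		return 0;
	}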

The changes in this patch are required by the third patch in the
series, where the 'reference to tail_call_info' of the main frame
is copied into the trampoline frame from the previous frame.

Signed-off-by: Abhishek Dubey <adubey@linux.ibm.com>
---
 arch/powerpc/net/bpf_jit.h        |  4 ++++
 arch/powerpc/net/bpf_jit_comp64.c | 31 ++++++++++++++++++++-----------
 2 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
index 8334cd667bba..45d419c0ee73 100644
--- a/arch/powerpc/net/bpf_jit.h
+++ b/arch/powerpc/net/bpf_jit.h
@@ -72,6 +72,10 @@
} } while (0)

#ifdef CONFIG_PPC64
+
+/* for tailcall counter */
+#define BPF_PPC_TAILCALL 8
+
/* If dummy pass (!image), account for maximum possible instructions */
#define PPC_LI64(d, i) do { \
if (!image) \
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 1fe37128c876..39061cd742c1 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -20,13 +20,15 @@
#include "bpf_jit.h"

/*
- * Stack layout:
+ * Stack layout 1:
+ * Layout when setting up our own stack frame.
+ * Note: r1 at bottom, component offsets positive wrt r1.
* Ensure the top half (upto local_tmp_var) stays consistent
* with our redzone usage.
*
* [ prev sp ] <-------------
- * [ nv gpr save area ] 6*8 |
* [ tail_call_cnt ] 8 |
+ * [ nv gpr save area ] 6*8 |
* [ local_tmp_var ] 24 |
* fp (r31) --> [ ebpf stack space ] upto 512 |
* [ frame header ] 32/112 |
@@ -36,10 +38,12 @@
/* for gpr non volatile registers BPG_REG_6 to 10 */
#define BPF_PPC_STACK_SAVE (6*8)
/* for bpf JIT code internal usage */
-#define BPF_PPC_STACK_LOCALS 32
+#define BPF_PPC_STACK_LOCALS 24
/* stack frame excluding BPF stack, ensure this is quadword aligned */
#define BPF_PPC_STACKFRAME (STACK_FRAME_MIN_SIZE + \
- BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE)
+ BPF_PPC_STACK_LOCALS + \
+ BPF_PPC_STACK_SAVE + \
+ BPF_PPC_TAILCALL)

/* BPF register usage */
#define TMP_REG_1 (MAX_BPF_JIT_REG + 0)
@@ -87,27 +91,32 @@ static inline bool bpf_has_stack_frame(struct codegen_context *ctx)
}

/*
+ * Stack layout 2:
* When not setting up our own stackframe, the redzone (288 bytes) usage is:
+ * Note: r1 from prev frame. Component offset negative wrt r1.
*
* [ prev sp ] <-------------
* [ ... ] |
* sp (r1) ---> [ stack pointer ] --------------
- * [ nv gpr save area ] 6*8
* [ tail_call_cnt ] 8
+ * [ nv gpr save area ] 6*8
* [ local_tmp_var ] 24
* [ unused red zone ] 224
*/
static int bpf_jit_stack_local(struct codegen_context *ctx)
{
- if (bpf_has_stack_frame(ctx))
+ if (bpf_has_stack_frame(ctx)) {
+ /* Stack layout 1 */
return STACK_FRAME_MIN_SIZE + ctx->stack_size;
- else
- return -(BPF_PPC_STACK_SAVE + 32);
+ } else {
+ /* Stack layout 2 */
+ return -(BPF_PPC_TAILCALL + BPF_PPC_STACK_SAVE + BPF_PPC_STACK_LOCALS);
+ }
}

static int bpf_jit_stack_tailcallcnt(struct codegen_context *ctx)
{
- return bpf_jit_stack_local(ctx) + 24;
+ return bpf_jit_stack_local(ctx) + BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE;
}

static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg)
@@ -115,7 +124,7 @@ static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg)
if (reg >= BPF_PPC_NVR_MIN && reg < 32)
return (bpf_has_stack_frame(ctx) ?
(BPF_PPC_STACKFRAME + ctx->stack_size) : 0)
- - (8 * (32 - reg));
+ - (8 * (32 - reg)) - BPF_PPC_TAILCALL;

pr_err("BPF JIT is asking about unknown registers");
BUG();
@@ -145,7 +154,7 @@ void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx)
if (ctx->seen & SEEN_TAILCALL) {
EMIT(PPC_RAW_LI(bpf_to_ppc(TMP_REG_1), 0));
/* this goes in the redzone */
- EMIT(PPC_RAW_STD(bpf_to_ppc(TMP_REG_1), _R1, -(BPF_PPC_STACK_SAVE + 8)));
+ EMIT(PPC_RAW_STD(bpf_to_ppc(TMP_REG_1), _R1, -(BPF_PPC_TAILCALL)));
} else {
EMIT(PPC_RAW_NOP());
EMIT(PPC_RAW_NOP());
--
2.48.1

On 14/01/26 5:14 pm, adubey@linux.ibm.com wrote:
> From: Abhishek Dubey <adubey@linux.ibm.com>
>
[ ... ]
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
[ ... ]
> /*
> - * Stack layout:
> + * Stack layout 1:
> + * Layout when setting up our own stack frame.
> + * Note: r1 at bottom, component offsets positive wrt r1.
> * Ensure the top half (upto local_tmp_var) stays consistent
> * with our redzone usage.
> *
> * [ prev sp ] <-------------
> - * [ nv gpr save area ] 6*8 |
> * [ tail_call_cnt ] 8 |
> + * [ nv gpr save area ] 6*8 |
> * [ local_tmp_var ] 24 |
> * fp (r31) --> [ ebpf stack space ] upto 512 |
> * [ frame header ] 32/112 |
[ ... ]
> /*
> + * Stack layout 2:
> * When not setting up our own stackframe, the redzone (288 bytes) usage is:
> + * Note: r1 from prev frame. Component offset negative wrt r1.
> *
> * [ prev sp ] <-------------
> * [ ... ] |
> * sp (r1) ---> [ stack pointer ] --------------
> - * [ nv gpr save area ] 6*8
> * [ tail_call_cnt ] 8
> + * [ nv gpr save area ] 6*8
> * [ local_tmp_var ] 24
> * [ unused red zone ] 224
> */

Calling these stack layout 1 & 2 is inappropriate; the stack
layout is essentially the same. It just shows offsets with
reference to r1 when the stack is set up explicitly vs. when the
redzone is being used...

- Hari

On 14/01/2026 at 12:44, adubey@linux.ibm.com wrote:
> From: Abhishek Dubey <adubey@linux.ibm.com>
>
[ ... ]
> diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
> index 8334cd667bba..45d419c0ee73 100644
> --- a/arch/powerpc/net/bpf_jit.h
> +++ b/arch/powerpc/net/bpf_jit.h
> @@ -72,6 +72,10 @@
> } } while (0)
>
> #ifdef CONFIG_PPC64
> +
> +/* for tailcall counter */
> +#define BPF_PPC_TAILCALL 8

This needs to be defined outside of the CONFIG_PPC64 ifdef
because, from patch 3, it is used in bpf_jit_comp.c, which is
also built for powerpc32.

> +
> /* If dummy pass (!image), account for maximum possible instructions */
> #define PPC_LI64(d, i) do { \
> if (!image) \
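
A minimal sketch of that placement in bpf_jit.h (the 64-bit-only
macros are only indicated here, not reproduced):

	/* for tailcall counter */
	#define BPF_PPC_TAILCALL	8

	#ifdef CONFIG_PPC64
	/* 64-bit-only definitions such as PPC_LI64() stay in the ifdef */
	#endif
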
> diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
> index 8334cd667bba..45d419c0ee73 100644
> --- a/arch/powerpc/net/bpf_jit.h
> +++ b/arch/powerpc/net/bpf_jit.h
[ ... ]
> +/* for tailcall counter */
> +#define BPF_PPC_TAILCALL 8

This new constant defines the tail_call_cnt offset as 8 bytes
from the previous sp.

> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 1fe37128c876..39061cd742c1 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
[ ... ]
> static int bpf_jit_stack_tailcallcnt(struct codegen_context *ctx)
> {
> - return bpf_jit_stack_local(ctx) + 24;
> + return bpf_jit_stack_local(ctx) + BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE;
> }

The tail_call_cnt offset changes from +24 to +72 (24 + 48) relative
to bpf_jit_stack_local().

There appears to be an inconsistency with the trampoline code in
bpf_jit_comp.c. The functions bpf_trampoline_setup_tail_call_cnt()
and bpf_trampoline_restore_tail_call_cnt() use a hardcoded offset:

	int tailcallcnt_offset = 7 * 8; /* = 56 */

The comment says "See bpf_jit_stack_tailcallcnt()", but after this
patch bpf_jit_stack_tailcallcnt() returns an offset that corresponds
to 8 bytes from the previous sp (BPF_PPC_TAILCALL), not 56 bytes.
When BPF_TRAMP_F_TAIL_CALL_CTX is set, will the trampoline read/write
the wrong stack location?

Note: this appears to be fixed later in the same series by commit
6889ec3d3e10 ("powerpc64/bpf: Tailcall handling with trampolines"),
which updates the trampoline code to use BPF_PPC_TAILCALL instead of
the hardcoded 7 * 8.
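
For reference, a standalone sketch (not kernel code) of the offset
arithmetic behind this observation, using only the constants and the
values quoted above:

	#include <stdio.h>

	#define BPF_PPC_TAILCALL	8
	#define BPF_PPC_STACK_SAVE	(6 * 8)
	#define BPF_PPC_STACK_LOCALS	24

	int main(void)
	{
		/* Redzone case before this patch: local = -(48 + 32), tcc at +24 */
		int old_tcc = -(BPF_PPC_STACK_SAVE + 32) + 24;	/* -56 */

		/* After this patch: local = -(8 + 48 + 24), tcc at +72 */
		int new_tcc = -(BPF_PPC_TAILCALL + BPF_PPC_STACK_SAVE +
				BPF_PPC_STACK_LOCALS) +
			      BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE;	/* -8 */

		/* The hardcoded 7 * 8 in the trampoline matches only the old value */
		printf("old tcc at %d (-(7 * 8) = %d), new tcc at %d\n",
		       old_tcc, -(7 * 8), new_tcc);
		return 0;
	}
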
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/20993216190