arch/loongarch/net/bpf_jit.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
(1) Description of Problem:
When testing BPF JIT with the latest compiler toolchains on LoongArch,
there exist some strange failed test cases, dmesg shows something like
this:
# dmesg -t | grep FAIL | head -1
... ret -3 != -3 (0xfffffffd != 0xfffffffd)FAIL ...
(2) Steps to Reproduce:
# echo 1 > /proc/sys/net/core/bpf_jit_enable
# modprobe test_bpf
(3) Additional Info:
There are no failed test cases compiled with the lower version of GCC
such as 13.3.0, while the problems only appear with higher version of
GCC such as 14.2.0.
This is because the problems were hidden by the lower version of GCC
due to there are redundant sign extension instructions generated by
compiler, but with optimization of higher version of GCC, the sign
extension instructions have been removed.
(4) Root Cause Analysis:
The LoongArch architecture does not expose sub-registers, and hold all
32-bit values in a sign-extended format. While BPF, on the other hand,
exposes sub-registers, and use zero-extension (similar to arm64/x86).
This has led to some subtle bugs, where a BPF JITted program has not
sign-extended the a0 register (return value in LoongArch land), passed
the return value up the kernel, for example:
| int from_bpf(void);
|
| long foo(void)
| {
| return from_bpf();
| }
Here, a0 would be 0xffff_ffff, instead of the expected
0xffff_ffff_ffff_ffff.
Internally, the LoongArch JIT uses a5 as a dedicated register for BPF
return values. That is to say, the LoongArch BPF uses a5 for BPF return
values, which are zero-extended, whereas the LoongArch ABI uses a0 which
is sign-extended.
(5) Final Solution:
Keep a5 zero-extended, but explicitly sign-extend a0 (which is used
outside BPF land). Because libbpf currently defines the return value
of an ebpf program as a 32-bit unsigned integer, just use addi.w to
extend bit 31 into bits 63 through 32 of a5 to a0. This is similar
with commit 2f1b0d3d7331 ("riscv, bpf: Sign-extend return values").
Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
arch/loongarch/net/bpf_jit.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index 7dbefd4ba210..dd350cba1252 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -179,7 +179,7 @@ static void __build_epilogue(struct jit_ctx *ctx, bool is_tail_call)
if (!is_tail_call) {
/* Set return value */
- move_reg(ctx, LOONGARCH_GPR_A0, regmap[BPF_REG_0]);
+ emit_insn(ctx, addiw, LOONGARCH_GPR_A0, regmap[BPF_REG_0], 0);
/* Return to the caller */
emit_insn(ctx, jirl, LOONGARCH_GPR_RA, LOONGARCH_GPR_ZERO, 0);
} else {
--
2.42.0
Tiezhu Yang wrote: > (1) Description of Problem: > > When testing BPF JIT with the latest compiler toolchains on LoongArch, > there exist some strange failed test cases, dmesg shows something like > this: > > # dmesg -t | grep FAIL | head -1 > ... ret -3 != -3 (0xfffffffd != 0xfffffffd)FAIL ... > > (2) Steps to Reproduce: > > # echo 1 > /proc/sys/net/core/bpf_jit_enable > # modprobe test_bpf > > (3) Additional Info: > > There are no failed test cases compiled with the lower version of GCC > such as 13.3.0, while the problems only appear with higher version of > GCC such as 14.2.0. > > This is because the problems were hidden by the lower version of GCC > due to there are redundant sign extension instructions generated by > compiler, but with optimization of higher version of GCC, the sign > extension instructions have been removed. > > (4) Root Cause Analysis: > > The LoongArch architecture does not expose sub-registers, and hold all > 32-bit values in a sign-extended format. While BPF, on the other hand, > exposes sub-registers, and use zero-extension (similar to arm64/x86). > > This has led to some subtle bugs, where a BPF JITted program has not > sign-extended the a0 register (return value in LoongArch land), passed > the return value up the kernel, for example: > > | int from_bpf(void); > | > | long foo(void) > | { > | return from_bpf(); > | } > > Here, a0 would be 0xffff_ffff, instead of the expected > 0xffff_ffff_ffff_ffff. > > Internally, the LoongArch JIT uses a5 as a dedicated register for BPF > return values. That is to say, the LoongArch BPF uses a5 for BPF return > values, which are zero-extended, whereas the LoongArch ABI uses a0 which > is sign-extended. > > (5) Final Solution: > > Keep a5 zero-extended, but explicitly sign-extend a0 (which is used > outside BPF land). Because libbpf currently defines the return value > of an ebpf program as a 32-bit unsigned integer, just use addi.w to > extend bit 31 into bits 63 through 32 of a5 to a0. This is similar > with commit 2f1b0d3d7331 ("riscv, bpf: Sign-extend return values"). > > Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") > Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> > --- > arch/loongarch/net/bpf_jit.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c > index 7dbefd4ba210..dd350cba1252 100644 > --- a/arch/loongarch/net/bpf_jit.c > +++ b/arch/loongarch/net/bpf_jit.c > @@ -179,7 +179,7 @@ static void __build_epilogue(struct jit_ctx *ctx, bool is_tail_call) > > if (!is_tail_call) { > /* Set return value */ > - move_reg(ctx, LOONGARCH_GPR_A0, regmap[BPF_REG_0]); > + emit_insn(ctx, addiw, LOONGARCH_GPR_A0, regmap[BPF_REG_0], 0); Not overly familiar with this JIT but just to check this wont be used for BPF 2 BPF calls correct? > /* Return to the caller */ > emit_insn(ctx, jirl, LOONGARCH_GPR_RA, LOONGARCH_GPR_ZERO, 0); > } else { > -- > 2.42.0 > >
On 11/22/2024 08:41 AM, John Fastabend wrote: > Tiezhu Yang wrote: >> (1) Description of Problem: >> >> When testing BPF JIT with the latest compiler toolchains on LoongArch, >> there exist some strange failed test cases, dmesg shows something like >> this: >> >> # dmesg -t | grep FAIL | head -1 >> ... ret -3 != -3 (0xfffffffd != 0xfffffffd)FAIL ... ... >> >> (5) Final Solution: >> >> Keep a5 zero-extended, but explicitly sign-extend a0 (which is used >> outside BPF land). Because libbpf currently defines the return value >> of an ebpf program as a 32-bit unsigned integer, just use addi.w to >> extend bit 31 into bits 63 through 32 of a5 to a0. This is similar >> with commit 2f1b0d3d7331 ("riscv, bpf: Sign-extend return values"). >> >> Fixes: 5dc615520c4d ("LoongArch: Add BPF JIT support") >> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> >> --- >> arch/loongarch/net/bpf_jit.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c >> index 7dbefd4ba210..dd350cba1252 100644 >> --- a/arch/loongarch/net/bpf_jit.c >> +++ b/arch/loongarch/net/bpf_jit.c >> @@ -179,7 +179,7 @@ static void __build_epilogue(struct jit_ctx *ctx, bool is_tail_call) >> >> if (!is_tail_call) { >> /* Set return value */ >> - move_reg(ctx, LOONGARCH_GPR_A0, regmap[BPF_REG_0]); >> + emit_insn(ctx, addiw, LOONGARCH_GPR_A0, regmap[BPF_REG_0], 0); > > Not overly familiar with this JIT but just to check this wont be used > for BPF 2 BPF calls correct? I am not sure I understand your comment correctly, but with and without this patch, the LoongArch JIT uses a5 as a dedicated register for BPF return values, a5 is kept as zero-extended for bpf2bpf, just make a0 (which is used outside BPF land) as sign-extend, all of the test cases in test_bpf.ko passed with "echo 1 > /proc/sys/net/core/bpf_jit_enable". Thanks, Tiezhu
© 2016 - 2024 Red Hat, Inc.