arch/loongarch/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Currently, when compiling with GCC, no "break 0x7" instruction is
generated for zero division because -mno-check-zero-division is used,
but the compiler still generates a "break 0x0" instruction for it.
Here is a simple example:
$ cat test.c
int div(int a)
{
	return a / 0;
}
$ gcc -O2 -S test.c -o test.s
GCC generates "break 0" on LoongArch and "ud2" on x86. objtool decodes
"ud2" as INSN_BUG on x86, so decoding "break 0" as INSN_BUG would fix
the objtool warnings on LoongArch, but that is not the intention.
When "break 0" was decoded as INSN_TRAP in the previous commit, the aim
was to handle "break 0" as a trap. The "break 0" that GCC generates for
zero division is not appropriate; it should generate a break instruction
with a proper bug type. So add the GCC option
-fno-isolate-erroneous-paths-dereference to avoid generating the
unexpected "break 0" instruction for now.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202509200413.7uihAxJ5-lkp@intel.com/
Fixes: baad7830ee9a ("objtool/LoongArch: Mark types based on break immediate code")
Suggested-by: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
---
arch/loongarch/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
index ae419e32f22e..f2a585b4a937 100644
--- a/arch/loongarch/Makefile
+++ b/arch/loongarch/Makefile
@@ -129,7 +129,7 @@ KBUILD_RUSTFLAGS_KERNEL += -Crelocation-model=pie
 LDFLAGS_vmlinux += -static -pie --no-dynamic-linker -z notext $(call ld-option, --apply-dynamic-relocs)
 endif
 
-cflags-y += $(call cc-option, -mno-check-zero-division)
+cflags-y += $(call cc-option, -mno-check-zero-division -fno-isolate-erroneous-paths-dereference)
 
 ifndef CONFIG_KASAN
 cflags-y += -fno-builtin-memcpy -fno-builtin-memmove -fno-builtin-memset
--
2.42.0
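
As a rough source-level picture of what the commit message describes:
when GCC can prove that a divisor is zero on some path, its path
isolation pass (as I understand it) splits that path off and replaces
the division with a trap, which is where the reported "break 0" comes
from. The sketch below writes that transformation out by hand. The
function name div_isolated() is hypothetical, and the explicit
__builtin_trap() call only stands in for what the compiler would emit
itself; it is not taken from the kernel or from GCC's output.

```c
/*
 * Hand-written model of an "isolated" zero-division path.
 * div_isolated() is a hypothetical name used only for illustration.
 */
int div_isolated(int a, int b)
{
	if (b == 0) {
		/*
		 * The erroneous path: instead of a division, the compiler
		 * emits a trap here. Per the thread, GCC currently uses
		 * "break 0" on LoongArch and "ud2" on x86 for this.
		 */
		__builtin_trap();
	}

	/* The normal path keeps the plain division. */
	return a / b;
}
```

With -fno-isolate-erroneous-paths-dereference the compiler is asked not
to perform this split, so no such trap instruction is expected on that
path.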
Hi, Tiezhu,

On Tue, Sep 23, 2025 at 2:17 PM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
>
> Currently, when compiling with GCC, there is no "break 0x7" instruction
> for zero division due to using the option -mno-check-zero-division, but
> the compiler still generates "break 0x0" instruction for zero division.
>
> Here is a simple example:
>
> $ cat test.c
> int div(int a)
> {
> 	return a / 0;
> }
> $ gcc -O2 -S test.c -o test.s
>
> GCC generates "break 0" On LoongArch and "ud2" on x86, objtool decodes
> "ud2" as INSN_BUG for x86, so decode "break 0" as INSN_BUG can fix the
> objtool warnings for LoongArch, but this is not the intention.
>
> When decoding "break 0" as INSN_TRAP in the previous commit, the aim is
> to handle "break 0" as a trap. The generated "break 0" for zero division
> by GCC is not proper, it should generate a break instruction with proper
> bug type, so add the GCC option -fno-isolate-erroneous-paths-dereference
> to avoid generating the unexpected "break 0" instruction for now.

You said that this patch makes performance increase a little. But this
is strange, because -fisolate-erroneous-paths-dereference, rather than
-fno-isolate-erroneous-paths-dereference, is the one considered an
optimization.

Huacai

>
> Reported-by: kernel test robot <lkp@intel.com>
> Closes: https://lore.kernel.org/r/202509200413.7uihAxJ5-lkp@intel.com/
> Fixes: baad7830ee9a ("objtool/LoongArch: Mark types based on break immediate code")
> Suggested-by: WANG Rui <wangrui@loongson.cn>
> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
> ---
> arch/loongarch/Makefile | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile
> index ae419e32f22e..f2a585b4a937 100644
> --- a/arch/loongarch/Makefile
> +++ b/arch/loongarch/Makefile
> @@ -129,7 +129,7 @@ KBUILD_RUSTFLAGS_KERNEL += -Crelocation-model=pie
> LDFLAGS_vmlinux += -static -pie --no-dynamic-linker -z notext $(call ld-option, --apply-dynamic-relocs)
> endif
>
> -cflags-y += $(call cc-option, -mno-check-zero-division)
> +cflags-y += $(call cc-option, -mno-check-zero-division -fno-isolate-erroneous-paths-dereference)
>
> ifndef CONFIG_KASAN
> cflags-y += -fno-builtin-memcpy -fno-builtin-memmove -fno-builtin-memset
> --
> 2.42.0
>
On 2025/9/23 10:32 PM, Huacai Chen wrote:
> Hi, Tiezhu,
>
> On Tue, Sep 23, 2025 at 2:17 PM Tiezhu Yang <yangtiezhu@loongson.cn> wrote:
>>
>> Currently, when compiling with GCC, there is no "break 0x7" instruction
>> for zero division due to using the option -mno-check-zero-division, but
>> the compiler still generates "break 0x0" instruction for zero division.
>>
>> Here is a simple example:
>>
>> $ cat test.c
>> int div(int a)
>> {
>> 	return a / 0;
>> }
>> $ gcc -O2 -S test.c -o test.s
>>
>> GCC generates "break 0" On LoongArch and "ud2" on x86, objtool decodes
>> "ud2" as INSN_BUG for x86, so decode "break 0" as INSN_BUG can fix the
>> objtool warnings for LoongArch, but this is not the intention.
>>
>> When decoding "break 0" as INSN_TRAP in the previous commit, the aim is
>> to handle "break 0" as a trap. The generated "break 0" for zero division
>> by GCC is not proper, it should generate a break instruction with proper
>> bug type, so add the GCC option -fno-isolate-erroneous-paths-dereference
>> to avoid generating the unexpected "break 0" instruction for now.
>
> You said that this patch make performance increase a little. But this
> is strange, because -isolate-erroneous-paths-dereference rather than
> -no-isolate-erroneous-paths-dereference is considered as an
> optimization.

I tested Linux 6.17-rc7 with loongson3_defconfig; there is only a small
improvement (about 0.3%) with "./Run -c 1".
Here are the test steps; anyone who is interested can run them again to
get the actual results in their own environment:

git clone https://github.com/kdlucas/byte-unixbench.git
cd byte-unixbench/UnixBench/
make
./Run -c 1
./Run -c 8

Here is the objdump output for sched_update_scaling() in
kernel/sched/fair.o:

Before:

000000000000bbc8 <sched_update_scaling>:
    bbc8:  1a00000c  pcalau12i  $t0, 0
    bbcc:  1a00000d  pcalau12i  $t1, 0
    bbd0:  02c0018c  addi.d     $t0, $t0, 0
    bbd4:  288001ae  ld.w       $t2, $t1, 0
    bbd8:  24000190  ldptr.w    $t4, $t0, 0
    bbdc:  004081ce  slli.w     $t2, $t2, 0x0
    bbe0:  40006200  beqz       $t4, 96  # bc40 <sched_update_scaling+0x78>
    bbe4:  0280200d  addi.w     $t1, $zero, 8
    bbe8:  0012b9ad  sltu       $t1, $t1, $t2
    bbec:  02802012  addi.w     $t6, $zero, 8
    bbf0:  0013b5cf  masknez    $t3, $t2, $t1
    bbf4:  0013364d  maskeqz    $t1, $t6, $t1
    bbf8:  001535ed  or         $t1, $t3, $t1
    bbfc:  02800811  addi.w     $t5, $zero, 2
    bc00:  004081af  slli.w     $t3, $t1, 0x0
    bc04:  58001611  beq        $t4, $t5, 20  # bc18 <sched_update_scaling+0x50>
    bc08:  400051c0  beqz       $t2, 80  # bc58 <sched_update_scaling+0x90>
    bc0c:  000015ad  clz.w      $t1, $t1
    bc10:  0280800f  addi.w     $t3, $zero, 32
    bc14:  001135ef  sub.w      $t3, $t3, $t1
    bc18:  24000d8d  ldptr.w    $t1, $t0, 12
    bc1c:  00150004  or         $a0, $zero, $zero
    bc20:  00213dad  div.wu     $t1, $t1, $t3
    bc24:  2980418d  st.w       $t1, $t0, 16
    bc28:  4c000020  jirl       $zero, $ra, 0
    bc2c:  03400000  andi       $zero, $zero, 0x0
    bc30:  03400000  andi       $zero, $zero, 0x0
    bc34:  03400000  andi       $zero, $zero, 0x0
    bc38:  03400000  andi       $zero, $zero, 0x0
    bc3c:  03400000  andi       $zero, $zero, 0x0
    bc40:  24000d8d  ldptr.w    $t1, $t0, 12
    bc44:  0280040f  addi.w     $t3, $zero, 1
    bc48:  00150004  or         $a0, $zero, $zero
    bc4c:  00213dad  div.wu     $t1, $t1, $t3
    bc50:  2980418d  st.w       $t1, $t0, 16
    bc54:  4c000020  jirl       $zero, $ra, 0
    bc58:  002a0000  break      0x0
    bc5c:  03400000  andi       $zero, $zero, 0x0

After:

000000000000bbc8 <sched_update_scaling>:
    bbc8:  1a00000c  pcalau12i  $t0, 0
    bbcc:  1a00000d  pcalau12i  $t1, 0
    bbd0:  02c0018c  addi.d     $t0, $t0, 0
    bbd4:  288001ae  ld.w       $t2, $t1, 0
    bbd8:  24000190  ldptr.w    $t4, $t0, 0
    bbdc:  0280040f  addi.w     $t3, $zero, 1
    bbe0:  004081ce  slli.w     $t2, $t2, 0x0
    bbe4:  40003a00  beqz       $t4, 56  # bc1c <sched_update_scaling+0x54>
    bbe8:  0280200d  addi.w     $t1, $zero, 8
    bbec:  0012b9ad  sltu       $t1, $t1, $t2
    bbf0:  02802012  addi.w     $t6, $zero, 8
    bbf4:  0013b5cf  masknez    $t3, $t2, $t1
    bbf8:  0013364d  maskeqz    $t1, $t6, $t1
    bbfc:  001535ed  or         $t1, $t3, $t1
    bc00:  02800811  addi.w     $t5, $zero, 2
    bc04:  004081af  slli.w     $t3, $t1, 0x0
    bc08:  58001611  beq        $t4, $t5, 20  # bc1c <sched_update_scaling+0x54>
    bc0c:  000015ad  clz.w      $t1, $t1
    bc10:  0280800f  addi.w     $t3, $zero, 32
    bc14:  001135ef  sub.w      $t3, $t3, $t1
    bc18:  001339ef  maskeqz    $t3, $t3, $t2
    bc1c:  24000d8d  ldptr.w    $t1, $t0, 12
    bc20:  00150004  or         $a0, $zero, $zero
    bc24:  00213dad  div.wu     $t1, $t1, $t3
    bc28:  2980418d  st.w       $t1, $t0, 16
    bc2c:  4c000020  jirl       $zero, $ra, 0

With this patch there is no beqz instruction guarding the zero-division
path; I guess that affects the performance to some extent. IMO, the
isolate-erroneous-paths-dereference optimization is for error code
paths, not for performance.

Anyway, my initial aim was to check whether there is a performance
regression. Judging from the test results, there is no obvious
difference with this patch.

Thanks,
Tiezhu
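
To make the "After" listing easier to follow, here is a small C model
of how the divisor in $t3 appears to be computed before the div.wu.
This is only my reading of the disassembly, not the kernel source: the
names scaling_factor_after() and min_u32() are hypothetical, and the
mode values 0 and 2 are simply the constants the listing compares $t4
against.

```c
#include <stdint.h>

/* Hypothetical helper: the sltu/masknez/maskeqz/or sequence is a min(). */
static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Model of the divisor ($t3) in the "After" listing.
 * mode        : value loaded into $t4
 * online_cpus : value loaded into $t2
 */
uint32_t scaling_factor_after(uint32_t mode, uint32_t online_cpus)
{
	uint32_t cpus = min_u32(online_cpus, 8);

	if (mode == 0)		/* beqz $t4: divisor preset to 1 */
		return 1;
	if (mode == 2)		/* beq $t4, $t5 with $t5 == 2 */
		return cpus;

	/*
	 * clz.w, then 32 - clz, then maskeqz on $t2: the divisor is
	 * forced to 0 when the CPU count is 0. __builtin_clz(0) is
	 * undefined in C, hence the explicit guard.
	 */
	return cpus ? 32u - (uint32_t)__builtin_clz(cpus) : 0u;
}
```

In this model a CPU count of zero yields a divisor of 0 that falls
straight through to the division, whereas the "Before" listing
branches to the "break 0x0" at bc58 in that case.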
On Tue, 2025-09-23 at 14:17 +0800, Tiezhu Yang wrote:
> Currently, when compiling with GCC, there is no "break 0x7" instruction
> for zero division due to using the option -mno-check-zero-division, but
> the compiler still generates "break 0x0" instruction for zero division.
>
> Here is a simple example:
>
> $ cat test.c
> int div(int a)
> {
> 	return a / 0;
> }
> $ gcc -O2 -S test.c -o test.s
>
> GCC generates "break 0" On LoongArch and "ud2" on x86, objtool decodes
> "ud2" as INSN_BUG for x86, so decode "break 0" as INSN_BUG can fix the
> objtool warnings for LoongArch, but this is not the intention.
>
> When decoding "break 0" as INSN_TRAP in the previous commit, the aim is
> to handle "break 0" as a trap. The generated "break 0" for zero division
> by GCC is not proper, it should generate a break instruction with proper
> bug type, so add the GCC option -fno-isolate-erroneous-paths-dereference
> to avoid generating the unexpected "break 0" instruction for now.

I just proposed that GCC use the same "documented undefined instruction"
as Clang:
https://gcc.gnu.org/pipermail/gcc-patches/2025-September/695981.html.

--
Xi Ruoyao <xry111@xry111.site>
On Tue, Sep 23, 2025 at 8:39 PM Xi Ruoyao <xry111@xry111.site> wrote:
>
> On Tue, 2025-09-23 at 14:17 +0800, Tiezhu Yang wrote:
> > Currently, when compiling with GCC, there is no "break 0x7" instruction
> > for zero division due to using the option -mno-check-zero-division, but
> > the compiler still generates "break 0x0" instruction for zero division.
> >
> > Here is a simple example:
> >
> > $ cat test.c
> > int div(int a)
> > {
> > 	return a / 0;
> > }
> > $ gcc -O2 -S test.c -o test.s
> >
> > GCC generates "break 0" On LoongArch and "ud2" on x86, objtool decodes
> > "ud2" as INSN_BUG for x86, so decode "break 0" as INSN_BUG can fix the
> > objtool warnings for LoongArch, but this is not the intention.
> >
> > When decoding "break 0" as INSN_TRAP in the previous commit, the aim is
> > to handle "break 0" as a trap. The generated "break 0" for zero division
> > by GCC is not proper, it should generate a break instruction with proper
> > bug type, so add the GCC option -fno-isolate-erroneous-paths-dereference
> > to avoid generating the unexpected "break 0" instruction for now.
>
> I just proposed GCC to use the same "documented undefined instruction"
> as Clang:
> https://gcc.gnu.org/pipermail/gcc-patches/2025-September/695981.html.

I have discussed it with Tiezhu offline, and we prefer "break 0x1".

Huacai

>
> --
> Xi Ruoyao <xry111@xry111.site>
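
For readers wondering why the break immediate matters here, the sketch
below shows one way a decoder could classify break instructions by
their immediate. It assumes the LoongArch break encoding carries a
15-bit code in the low bits, with 0x002a0000 being "break 0x0" as in
the objdump output earlier in the thread. The enum and function names
are hypothetical, and treating code 1 as the "bug" type is an
assumption made for illustration; only the "break 0 is a trap" mapping
is stated in the commit message.

```c
#include <stdint.h>

/* Hypothetical classification, loosely mirroring trap vs. bug types. */
enum insn_kind { KIND_OTHER, KIND_TRAP, KIND_BUG };

#define BREAK_OPCODE	0x002a0000u	/* "break 0x0" in the listing */
#define BREAK_OP_MASK	0xffff8000u	/* opcode field, bits 31..15 */
#define BREAK_CODE_MASK	0x00007fffu	/* 15-bit immediate, bits 14..0 */

/* Classify a 32-bit instruction word by its break immediate. */
enum insn_kind classify_insn(uint32_t insn)
{
	if ((insn & BREAK_OP_MASK) != BREAK_OPCODE)
		return KIND_OTHER;	/* not a break instruction */

	switch (insn & BREAK_CODE_MASK) {
	case 0:
		return KIND_TRAP;	/* "break 0" handled as a trap */
	case 1:
		return KIND_BUG;	/* assumed mapping for "break 0x1" */
	default:
		return KIND_OTHER;	/* other codes left unclassified */
	}
}
```

Under these assumptions, a compiler that emitted "break 0x1" for
unreachable or erroneous paths would be classified differently from
the "break 0" it emits today.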