[PATCH v2 03/12] ARM: Clean up definition of ARM_HAS_GROUP_RELOCS

Nathan Chancellor posted 12 patches 1 month, 1 week ago
[PATCH v2 03/12] ARM: Clean up definition of ARM_HAS_GROUP_RELOCS
Posted by Nathan Chancellor 1 month, 1 week ago
Now that the minimum supported version of LLVM for building the kernel
has been bumped to 15.0.0, the first depends line of
ARM_HAS_GROUP_RELOCS is always true, so it can be safely removed.
Combine the !COMPILE_TEST dependency into the 'def_bool' line and update
the comment as well.

Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
---
Cc: Russell King <linux@armlinux.org.uk>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
---
 arch/arm/Kconfig | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index b1f3df39ed40..faf83015b961 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -166,15 +166,12 @@ config ARM
 	  <http://www.arm.linux.org.uk/>.
 
 config ARM_HAS_GROUP_RELOCS
-	def_bool y
-	depends on !LD_IS_LLD || LLD_VERSION >= 140000
-	depends on !COMPILE_TEST
+	def_bool !COMPILE_TEST
 	help
 	  Whether or not to use R_ARM_ALU_PC_Gn or R_ARM_LDR_PC_Gn group
-	  relocations, which have been around for a long time, but were not
-	  supported in LLD until version 14. The combined range is -/+ 256 MiB,
-	  which is usually sufficient, but not for allyesconfig, so we disable
-	  this feature when doing compile testing.
+	  relocations. The combined range is -/+ 256 MiB, which is usually
+	  sufficient, but not for allyesconfig, so we disable this feature
+	  when doing compile testing.
 
 config ARM_DMA_USE_IOMMU
 	bool

-- 
2.50.1
Re: [PATCH v2 03/12] ARM: Clean up definition of ARM_HAS_GROUP_RELOCS
Posted by Arnd Bergmann 1 month, 1 week ago
On Thu, Aug 21, 2025, at 23:15, Nathan Chancellor wrote:
> Now that the minimum supported version of LLVM for building the kernel
> has been bumped to 15.0.0, the first depends line of
> ARM_HAS_GROUP_RELOCS is always true, so it can be safely removed.
> Combine the !COMPILE_TEST dependency into the 'def_bool' line and update
> the comment as well.
>
> Reviewed-by: Kees Cook <kees@kernel.org>
> Signed-off-by: Nathan Chancellor <nathan@kernel.org>

Acked-by: Arnd Bergmann <arnd@arndb.de>

>.
> 
>  config ARM_HAS_GROUP_RELOCS
> -	def_bool y
> -	depends on !LD_IS_LLD || LLD_VERSION >= 140000
> -	depends on !COMPILE_TEST
> +	def_bool !COMPILE_TEST
>  	help
>  	  Whether or not to use R_ARM_ALU_PC_Gn or R_ARM_LDR_PC_Gn group
> -	  relocations, which have been around for a long time, but were not
> -	  supported in LLD until version 14. The combined range is -/+ 256 MiB,
> -	  which is usually sufficient, but not for allyesconfig, so we disable
> -	  this feature when doing compile testing.

The change is obviously correct by itself, but can we revisit the
question of whether the COMPILE_TEST check is still needed?

Trying it out, I single link issue using llvm-21:

ld.lld-21: error: vmlinux.a(arch/arm/kernel/entry-armv.o):(function __bad_stack: .text+0x110): relocation R_ARM_LDR_PC_G2 out of range: 10168 is not in [0, 4095]; references 'overflow_stack_ptr'
>>> defined in vmlinux.a(arch/arm/kernel/traps.o)

which is from this line

arch/arm/kernel/entry-armv.S:   ldr_this_cpu_armv6 ip, overflow_stack_ptr

with the macro expanding to

        .macro          ldr_this_cpu_armv6, rd:req, sym:req
        this_cpu_offset \rd
        .globl          \sym 
        .reloc          .L0_\@, R_ARM_ALU_PC_G0_NC, \sym
        .reloc          .L1_\@, R_ARM_ALU_PC_G1_NC, \sym
        .reloc          .L2_\@, R_ARM_LDR_PC_G2, \sym
        add             \rd, \rd, pc
.L0_\@: sub             \rd, \rd, #4
.L1_\@: sub             \rd, \rd, #0
.L2_\@: ldr             \rd, [\rd, #4]
        .endm

Would it be possible to either change the macro or to move
the overflow_stack_ptr closer in order to completely eliminate
the CONFIG_ARM_HAS_GROUP_RELOCS symbol and have VMAP_STACK
enabled for all CONFIG_MMU builds?

Are there any other build testing issues with ARM_HAS_GROUP_RELOCS
besides the one I saw here?

      Arnd
Re: [PATCH v2 03/12] ARM: Clean up definition of ARM_HAS_GROUP_RELOCS
Posted by Arnd Bergmann 1 month, 1 week ago
On Fri, Aug 22, 2025, at 09:05, Arnd Bergmann wrote:
> On Thu, Aug 21, 2025, at 23:15, Nathan Chancellor wrote:
>
> Would it be possible to either change the macro or to move
> the overflow_stack_ptr closer in order to completely eliminate
> the CONFIG_ARM_HAS_GROUP_RELOCS symbol and have VMAP_STACK
> enabled for all CONFIG_MMU builds?
>
> Are there any other build testing issues with ARM_HAS_GROUP_RELOCS
> besides the one I saw here?

With some more randconfig testing, I did come across a few
configurations that each fail with hundreds of errors like

arm-linux-gnueabi-ld: drivers/crypto/hifn_795x.o(.text+0x99c): overflow whilst splitting 0x10a61854 for group relocation R_ARM_LDR_PC_G2

so I guess we'll have to stick with the current dependency,
at least for ARMv6 and below.

    Arnd
Re: [PATCH v2 03/12] ARM: Clean up definition of ARM_HAS_GROUP_RELOCS
Posted by Ard Biesheuvel 1 month, 1 week ago
On Fri, 22 Aug 2025 at 22:04, Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Aug 22, 2025, at 09:05, Arnd Bergmann wrote:
> > On Thu, Aug 21, 2025, at 23:15, Nathan Chancellor wrote:
> >
> > Would it be possible to either change the macro or to move
> > the overflow_stack_ptr closer in order to completely eliminate
> > the CONFIG_ARM_HAS_GROUP_RELOCS symbol and have VMAP_STACK
> > enabled for all CONFIG_MMU builds?
> >
> > Are there any other build testing issues with ARM_HAS_GROUP_RELOCS
> > besides the one I saw here?
>
> With some more randconfig testing, I did come across a few
> configurations that each fail with hundreds of errors like
>
> arm-linux-gnueabi-ld: drivers/crypto/hifn_795x.o(.text+0x99c): overflow whilst splitting 0x10a61854 for group relocation R_ARM_LDR_PC_G2
>
> so I guess we'll have to stick with the current dependency,
> at least for ARMv6 and below.
>

This is due to LOAD_SYM_ARMV6() (rather than the ldr_this_cpu_armv6
asm macro), which is used to implement get_current() on configs that
use a global variable to store the current task pointer (i.e., non-k
v6 and older). It eliminates the first of two LDRs, which would
pollute the D-cache otherwise, as every occurrence of get_current()
emits a literal into .text carrying the address of the __current
global variable. The D-cache footprint of each such literal is a
cacheline, which never contains other useful data.
(The second LDR is needed and always refers to the same address so it
does not impact D-cache efficiency)

The LOAD_SYM_ARMV6() sequence has a range of 256 MiB, which is
sufficient for any ARM kernel that can be meaningfully used in
production. However, randconfigs may produce kernels that are larger
than this, and so we need the COMPILE_TEST check if we are going to
keep the optimization, and I think it is meaningful enough to do so.