arch/x86/Kconfig | 11 +- arch/x86/Makefile | 20 +-- arch/x86/boot/compressed/misc.c | 14 +-- arch/x86/entry/entry.S | 2 - arch/x86/entry/entry_64.S | 2 +- arch/x86/include/asm/desc.h | 1 - arch/x86/include/asm/elf.h | 3 +- arch/x86/include/asm/percpu.h | 22 ---- arch/x86/include/asm/processor.h | 28 +---- arch/x86/include/asm/stackprotector.h | 36 +----- arch/x86/kernel/Makefile | 2 + arch/x86/kernel/asm-offsets_64.c | 6 - arch/x86/kernel/cpu/common.c | 9 +- arch/x86/kernel/head64.c | 2 +- arch/x86/kernel/head_64.S | 20 ++- arch/x86/kernel/irq_64.c | 1 - arch/x86/kernel/module.c | 15 +++ arch/x86/kernel/setup_percpu.c | 12 +- arch/x86/kernel/vmlinux.lds.S | 35 ------ arch/x86/platform/pvh/head.S | 14 ++- arch/x86/tools/relocs.c | 147 ++-------------------- arch/x86/xen/xen-head.S | 10 +- include/asm-generic/sections.h | 2 +- include/asm-generic/vmlinux.lds.h | 38 +----- include/linux/percpu-defs.h | 12 -- init/Kconfig | 5 - kernel/kallsyms.c | 12 +- mm/percpu.c | 4 +- scripts/gcc-x86_32-has-stack-protector.sh | 8 -- scripts/gcc-x86_64-has-stack-protector.sh | 4 - scripts/kallsyms.c | 72 ++--------- scripts/link-vmlinux.sh | 4 - scripts/min-tool-version.sh | 2 + 33 files changed, 100 insertions(+), 475 deletions(-) delete mode 100755 scripts/gcc-x86_32-has-stack-protector.sh delete mode 100755 scripts/gcc-x86_64-has-stack-protector.sh
Currently, x86-64 uses an unusual percpu layout, where the percpu section is linked at absolute address 0. The reason behind this is that older GCC versions placed the stack protector (if enabled) at a fixed offset from the GS segment base. Since the GS segement is also used for percpu variables, this forced the current layout. GCC since version 8.1 supports a configurable location for the stack protector value, which allows removal of the restriction on how the percpu section is linked. This allows the percpu section to be linked normally, like other architectures. In turn, this allows removal of code that was needed to support the zero-based percpu section. v6: - Rebased to current tip tree - Dropped patches already applied - Fixed typos in commit messages - Added Reviewed-by tags Ard Biesheuvel (1): x86/module: Deal with GOT based stack cookie load on Clang < 17 Brian Gerst (14): x86: Raise minimum GCC version to 8.1 x86/stackprotector: Remove stack protector test scripts x86/boot: Disable stack protector for early boot code x86/pvh: Use fixed_percpu_data for early boot GSBASE x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations x86/stackprotector/64: Convert to normal percpu variable x86/percpu/64: Use relative percpu offsets x86/percpu/64: Remove fixed_percpu_data x86/boot/64: Remove inverse relocations x86/percpu/64: Remove INIT_PER_CPU macros percpu: Remove PER_CPU_FIRST_SECTION percpu: Remove PERCPU_VADDR() percpu: Remove __per_cpu_load kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU arch/x86/Kconfig | 11 +- arch/x86/Makefile | 20 +-- arch/x86/boot/compressed/misc.c | 14 +-- arch/x86/entry/entry.S | 2 - arch/x86/entry/entry_64.S | 2 +- arch/x86/include/asm/desc.h | 1 - arch/x86/include/asm/elf.h | 3 +- arch/x86/include/asm/percpu.h | 22 ---- arch/x86/include/asm/processor.h | 28 +---- arch/x86/include/asm/stackprotector.h | 36 +----- arch/x86/kernel/Makefile | 2 + arch/x86/kernel/asm-offsets_64.c | 6 - arch/x86/kernel/cpu/common.c | 9 +- arch/x86/kernel/head64.c | 2 +- arch/x86/kernel/head_64.S | 20 ++- arch/x86/kernel/irq_64.c | 1 - arch/x86/kernel/module.c | 15 +++ arch/x86/kernel/setup_percpu.c | 12 +- arch/x86/kernel/vmlinux.lds.S | 35 ------ arch/x86/platform/pvh/head.S | 14 ++- arch/x86/tools/relocs.c | 147 ++-------------------- arch/x86/xen/xen-head.S | 10 +- include/asm-generic/sections.h | 2 +- include/asm-generic/vmlinux.lds.h | 38 +----- include/linux/percpu-defs.h | 12 -- init/Kconfig | 5 - kernel/kallsyms.c | 12 +- mm/percpu.c | 4 +- scripts/gcc-x86_32-has-stack-protector.sh | 8 -- scripts/gcc-x86_64-has-stack-protector.sh | 4 - scripts/kallsyms.c | 72 ++--------- scripts/link-vmlinux.sh | 4 - scripts/min-tool-version.sh | 2 + 33 files changed, 100 insertions(+), 475 deletions(-) delete mode 100755 scripts/gcc-x86_32-has-stack-protector.sh delete mode 100755 scripts/gcc-x86_64-has-stack-protector.sh base-commit: b79d90e018587507fd42c4c888956668692ff431 -- 2.47.1
* Brian Gerst <brgerst@gmail.com> wrote: > Currently, x86-64 uses an unusual percpu layout, where the percpu section > is linked at absolute address 0. The reason behind this is that older GCC > versions placed the stack protector (if enabled) at a fixed offset from the > GS segment base. Since the GS segement is also used for percpu variables, > this forced the current layout. > > GCC since version 8.1 supports a configurable location for the stack > protector value, which allows removal of the restriction on how the percpu > section is linked. This allows the percpu section to be linked normally, > like other architectures. In turn, this allows removal of code that was > needed to support the zero-based percpu section. > > v6: > - Rebased to current tip tree > - Dropped patches already applied > - Fixed typos in commit messages > - Added Reviewed-by tags > > Ard Biesheuvel (1): > x86/module: Deal with GOT based stack cookie load on Clang < 17 > > Brian Gerst (14): > x86: Raise minimum GCC version to 8.1 > x86/stackprotector: Remove stack protector test scripts > x86/boot: Disable stack protector for early boot code > x86/pvh: Use fixed_percpu_data for early boot GSBASE > x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations > x86/stackprotector/64: Convert to normal percpu variable > x86/percpu/64: Use relative percpu offsets > x86/percpu/64: Remove fixed_percpu_data > x86/boot/64: Remove inverse relocations > x86/percpu/64: Remove INIT_PER_CPU macros > percpu: Remove PER_CPU_FIRST_SECTION > percpu: Remove PERCPU_VADDR() > percpu: Remove __per_cpu_load > kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU > 33 files changed, 100 insertions(+), 475 deletions(-) > delete mode 100755 scripts/gcc-x86_32-has-stack-protector.sh > delete mode 100755 scripts/gcc-x86_64-has-stack-protector.sh Thank you for doing this series - it all looks pretty good from my side and I've applied it experimentally to tip:x86/asm. I fixed up the trivial details other reviewers and me noticed. Note that the merge is tentative, it might still need a rebase if some fundamental problem comes up - but let's see how testing goes in -next. Thanks, Ingo
On Tue, Feb 18, 2025 at 10:22 AM Ingo Molnar <mingo@kernel.org> wrote: > > > * Brian Gerst <brgerst@gmail.com> wrote: > > > Currently, x86-64 uses an unusual percpu layout, where the percpu section > > is linked at absolute address 0. The reason behind this is that older GCC > > versions placed the stack protector (if enabled) at a fixed offset from the > > GS segment base. Since the GS segement is also used for percpu variables, > > this forced the current layout. > > > > GCC since version 8.1 supports a configurable location for the stack > > protector value, which allows removal of the restriction on how the percpu > > section is linked. This allows the percpu section to be linked normally, > > like other architectures. In turn, this allows removal of code that was > > needed to support the zero-based percpu section. > > > > v6: > > - Rebased to current tip tree > > - Dropped patches already applied > > - Fixed typos in commit messages > > - Added Reviewed-by tags > > > > Ard Biesheuvel (1): > > x86/module: Deal with GOT based stack cookie load on Clang < 17 > > > > Brian Gerst (14): > > x86: Raise minimum GCC version to 8.1 > > x86/stackprotector: Remove stack protector test scripts > > x86/boot: Disable stack protector for early boot code > > x86/pvh: Use fixed_percpu_data for early boot GSBASE > > x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations > > x86/stackprotector/64: Convert to normal percpu variable > > x86/percpu/64: Use relative percpu offsets > > x86/percpu/64: Remove fixed_percpu_data > > x86/boot/64: Remove inverse relocations > > x86/percpu/64: Remove INIT_PER_CPU macros > > percpu: Remove PER_CPU_FIRST_SECTION > > percpu: Remove PERCPU_VADDR() > > percpu: Remove __per_cpu_load > > kallsyms: Remove KALLSYMS_ABSOLUTE_PERCPU > > > 33 files changed, 100 insertions(+), 475 deletions(-) > > delete mode 100755 scripts/gcc-x86_32-has-stack-protector.sh > > delete mode 100755 scripts/gcc-x86_64-has-stack-protector.sh > > Thank you for doing this series - it all looks pretty good from my side > and I've applied it experimentally to tip:x86/asm. I fixed up the trivial > details other reviewers and me noticed. > > Note that the merge is tentative, it might still need a rebase if some > fundamental problem comes up - but let's see how testing goes in -next. I wonder if there would be any benefit if stack canary is put into struct pcpu_hot? Uros.
* Uros Bizjak <ubizjak@gmail.com> wrote: > > Thank you for doing this series - it all looks pretty good from my > > side and I've applied it experimentally to tip:x86/asm. I fixed up > > the trivial details other reviewers and me noticed. > > > > Note that the merge is tentative, it might still need a rebase if > > some fundamental problem comes up - but let's see how testing goes > > in -next. > > I wonder if there would be any benefit if stack canary is put into > struct pcpu_hot? It should definitely be one of the hottest data structures on x86, so moving it there makes sense even if it cannot be measured explicitly. Thanks, Ingo
On Wed, Feb 19, 2025 at 6:47 AM Ingo Molnar <mingo@kernel.org> wrote: > > > * Uros Bizjak <ubizjak@gmail.com> wrote: > > > > Thank you for doing this series - it all looks pretty good from my > > > side and I've applied it experimentally to tip:x86/asm. I fixed up > > > the trivial details other reviewers and me noticed. > > > > > > Note that the merge is tentative, it might still need a rebase if > > > some fundamental problem comes up - but let's see how testing goes > > > in -next. > > > > I wonder if there would be any benefit if stack canary is put into > > struct pcpu_hot? > > It should definitely be one of the hottest data structures on x86, so > moving it there makes sense even if it cannot be measured explicitly. > It would have to be done with linker tricks, since you can't make the compiler use a struct member directly. Brian Gerst
* Brian Gerst <brgerst@gmail.com> wrote: > On Wed, Feb 19, 2025 at 6:47 AM Ingo Molnar <mingo@kernel.org> wrote: > > > > > > * Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > Thank you for doing this series - it all looks pretty good from my > > > > side and I've applied it experimentally to tip:x86/asm. I fixed up > > > > the trivial details other reviewers and me noticed. > > > > > > > > Note that the merge is tentative, it might still need a rebase if > > > > some fundamental problem comes up - but let's see how testing goes > > > > in -next. > > > > > > I wonder if there would be any benefit if stack canary is put into > > > struct pcpu_hot? > > > > It should definitely be one of the hottest data structures on x86, so > > moving it there makes sense even if it cannot be measured explicitly. > > > > It would have to be done with linker tricks, since you can't make the > compiler use a struct member directly. Probably not worth it then? Thanks, Ingo
On Thu, Feb 20, 2025 at 8:26 AM Ingo Molnar <mingo@kernel.org> wrote: > > > * Brian Gerst <brgerst@gmail.com> wrote: > > > On Wed, Feb 19, 2025 at 6:47 AM Ingo Molnar <mingo@kernel.org> wrote: > > > > > > > > > * Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > Thank you for doing this series - it all looks pretty good from my > > > > > side and I've applied it experimentally to tip:x86/asm. I fixed up > > > > > the trivial details other reviewers and me noticed. > > > > > > > > > > Note that the merge is tentative, it might still need a rebase if > > > > > some fundamental problem comes up - but let's see how testing goes > > > > > in -next. > > > > > > > > I wonder if there would be any benefit if stack canary is put into > > > > struct pcpu_hot? > > > > > > It should definitely be one of the hottest data structures on x86, so > > > moving it there makes sense even if it cannot be measured explicitly. > > > > > > > It would have to be done with linker tricks, since you can't make the > > compiler use a struct member directly. > > Probably not worth it then? Actually it wasn't so bad since we already had the hack for __ref_stack_chk_guard. Do you want the patches now or when the dust settles on the original series? Brian Gerst
* Brian Gerst <brgerst@gmail.com> wrote: > On Thu, Feb 20, 2025 at 8:26 AM Ingo Molnar <mingo@kernel.org> wrote: > > > > > > * Brian Gerst <brgerst@gmail.com> wrote: > > > > > On Wed, Feb 19, 2025 at 6:47 AM Ingo Molnar <mingo@kernel.org> wrote: > > > > > > > > > > > > * Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > > > Thank you for doing this series - it all looks pretty good from my > > > > > > side and I've applied it experimentally to tip:x86/asm. I fixed up > > > > > > the trivial details other reviewers and me noticed. > > > > > > > > > > > > Note that the merge is tentative, it might still need a rebase if > > > > > > some fundamental problem comes up - but let's see how testing goes > > > > > > in -next. > > > > > > > > > > I wonder if there would be any benefit if stack canary is put into > > > > > struct pcpu_hot? > > > > > > > > It should definitely be one of the hottest data structures on x86, so > > > > moving it there makes sense even if it cannot be measured explicitly. > > > > > > > > > > It would have to be done with linker tricks, since you can't make the > > > compiler use a struct member directly. > > > > Probably not worth it then? > > Actually it wasn't so bad since we already had the hack for > __ref_stack_chk_guard. Do you want the patches now or when the dust > settles on the original series? We can add it now I suppose, while memories are fresh and all that. Thanks, Ingo
On Wed, Feb 19, 2025 at 2:18 PM Brian Gerst <brgerst@gmail.com> wrote:
>
> On Wed, Feb 19, 2025 at 6:47 AM Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Uros Bizjak <ubizjak@gmail.com> wrote:
> >
> > > > Thank you for doing this series - it all looks pretty good from my
> > > > side and I've applied it experimentally to tip:x86/asm. I fixed up
> > > > the trivial details other reviewers and me noticed.
> > > >
> > > > Note that the merge is tentative, it might still need a rebase if
> > > > some fundamental problem comes up - but let's see how testing goes
> > > > in -next.
> > >
> > > I wonder if there would be any benefit if stack canary is put into
> > > struct pcpu_hot?
> >
> > It should definitely be one of the hottest data structures on x86, so
> > moving it there makes sense even if it cannot be measured explicitly.
> >
>
> It would have to be done with linker tricks, since you can't make the
> compiler use a struct member directly.
Something like the attached patch?
It boots and runs without problems.
However, when building the kernel, I get "Absolute relocations
present" warning with thousands of locations:
RELOCS arch/x86/boot/compressed/vmlinux.relocs
WARNING: Absolute relocations present
Offset Info Type Sym.Value Sym.Name
ffffffff81200826 0003259e00000002 R_X86_64_PC32 ffffffff8350f620
__ref_stack_chk_guard
ffffffff81201493 0003259e00000002 R_X86_64_PC32 ffffffff8350f620
__ref_stack_chk_guard
ffffffff81201714 0003259e00000002 R_X86_64_PC32 ffffffff8350f620
__ref_stack_chk_guard
ffffffff81201d66 0003259e00000002 R_X86_64_PC32 ffffffff8350f620
__ref_stack_chk_guard
...
ffffffff834e2a13 0003259e00000002 R_X86_64_PC32 ffffffff8350f620
__ref_stack_chk_guard
ffffffff834e2a6a 0003259e00000002 R_X86_64_PC32 ffffffff8350f620
__ref_stack_chk_guard
RSTRIP vmlinux
which I don't understand. Looking at the first one:
ffffffff8120081d <force_ibs_eilvt_setup.cold>:
ffffffff8120081d: 48 8b 44 24 08 mov 0x8(%rsp),%rax
ffffffff81200822: 65 48 2b 05 f6 ed 30 sub
%gs:0x230edf6(%rip),%rax # ffffffff8350f620
<__ref_stack_chk_guard>
ffffffff81200829: 02
I don't think this is absolute relocation, see (%rip).
The kernel was compiled with gcc-14.2.1, so clang specific issue was not tested.
Uros.
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 20be5758c2d2..940efc07f2c1 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -691,7 +691,7 @@ SYM_CODE_START(__switch_to_asm)
#ifdef CONFIG_STACKPROTECTOR
movl TASK_stack_canary(%edx), %ebx
- movl %ebx, PER_CPU_VAR(__stack_chk_guard)
+ movl %ebx, PER_CPU_VAR(pcpu_hot + X86_stack_canary)
#endif
/*
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 49d3b222fe99..4f4c0cf4963f 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -193,7 +193,7 @@ SYM_FUNC_START(__switch_to_asm)
#ifdef CONFIG_STACKPROTECTOR
movq TASK_stack_canary(%rsi), %rbx
- movq %rbx, PER_CPU_VAR(__stack_chk_guard)
+ movq %rbx, PER_CPU_VAR(pcpu_hot + X86_stack_canary)
#endif
/*
diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index bf5953883ec3..a4d515cd6a31 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -22,6 +22,9 @@ struct pcpu_hot {
u64 call_depth;
#endif
unsigned long top_of_stack;
+#ifdef CONFIG_STACKPROTECTOR
+ unsigned long stack_canary;
+#endif
void *hardirq_stack_ptr;
u16 softirq_pending;
#ifdef CONFIG_X86_64
diff --git a/arch/x86/include/asm/stackprotector.h b/arch/x86/include/asm/stackprotector.h
index d43fb589fcf6..5e5229ac1c46 100644
--- a/arch/x86/include/asm/stackprotector.h
+++ b/arch/x86/include/asm/stackprotector.h
@@ -20,8 +20,6 @@
#include <linux/sched.h>
-DECLARE_PER_CPU(unsigned long, __stack_chk_guard);
-
/*
* Initialize the stackprotector canary value.
*
@@ -38,12 +36,12 @@ static __always_inline void boot_init_stack_canary(void)
unsigned long canary = get_random_canary();
current->stack_canary = canary;
- this_cpu_write(__stack_chk_guard, canary);
+ this_cpu_write(pcpu_hot.stack_canary, canary);
}
static inline void cpu_init_stack_canary(int cpu, struct task_struct *idle)
{
- per_cpu(__stack_chk_guard, cpu) = idle->stack_canary;
+ per_cpu(pcpu_hot.stack_canary, cpu) = idle->stack_canary;
}
#else /* STACKPROTECTOR */
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index a98020bf31bb..59e8b294cbdc 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -109,6 +109,9 @@ static void __used common(void)
OFFSET(TSS_sp2, tss_struct, x86_tss.sp2);
OFFSET(X86_top_of_stack, pcpu_hot, top_of_stack);
OFFSET(X86_current_task, pcpu_hot, current_task);
+#ifdef CONFIG_STACKPROTECTOR
+ OFFSET(X86_stack_canary, pcpu_hot, stack_canary);
+#endif
#ifdef CONFIG_MITIGATION_CALL_DEPTH_TRACKING
OFFSET(X86_call_depth, pcpu_hot, call_depth);
#endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 21078907af57..f30d6a9c4abd 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -24,7 +24,6 @@
#include <linux/io.h>
#include <linux/syscore_ops.h>
#include <linux/pgtable.h>
-#include <linux/stackprotector.h>
#include <linux/utsname.h>
#include <asm/alternative.h>
@@ -2087,13 +2086,6 @@ void syscall_init(void)
}
#endif /* CONFIG_X86_64 */
-#ifdef CONFIG_STACKPROTECTOR
-DEFINE_PER_CPU(unsigned long, __stack_chk_guard);
-#ifndef CONFIG_SMP
-EXPORT_PER_CPU_SYMBOL(__stack_chk_guard);
-#endif
-#endif
-
/*
* Clear all 6 debug registers:
*/
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 1769a7126224..a35e4ebe032e 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -468,7 +468,7 @@ SECTIONS
"kernel image bigger than KERNEL_IMAGE_SIZE");
/* needed for Clang - see arch/x86/entry/entry.S */
-PROVIDE(__ref_stack_chk_guard = __stack_chk_guard);
+PROVIDE(__ref_stack_chk_guard = pcpu_hot + X86_stack_canary);
#ifdef CONFIG_X86_64
Hi Uros, On Thu, 20 Feb 2025 at 10:51, Uros Bizjak <ubizjak@gmail.com> wrote: > > On Wed, Feb 19, 2025 at 2:18 PM Brian Gerst <brgerst@gmail.com> wrote: > > > > On Wed, Feb 19, 2025 at 6:47 AM Ingo Molnar <mingo@kernel.org> wrote: > > > > > > > > > * Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > Thank you for doing this series - it all looks pretty good from my > > > > > side and I've applied it experimentally to tip:x86/asm. I fixed up > > > > > the trivial details other reviewers and me noticed. > > > > > > > > > > Note that the merge is tentative, it might still need a rebase if > > > > > some fundamental problem comes up - but let's see how testing goes > > > > > in -next. > > > > > > > > I wonder if there would be any benefit if stack canary is put into > > > > struct pcpu_hot? > > > > > > It should definitely be one of the hottest data structures on x86, so > > > moving it there makes sense even if it cannot be measured explicitly. > > > > > > > It would have to be done with linker tricks, since you can't make the > > compiler use a struct member directly. > > Something like the attached patch? > Interesting take. I'd have tried to put the canary at offset 0x0, and simply use pcpu_hot as the guard symbol. > It boots and runs without problems. > > However, when building the kernel, I get "Absolute relocations > present" warning with thousands of locations: > > RELOCS arch/x86/boot/compressed/vmlinux.relocs > WARNING: Absolute relocations present > Offset Info Type Sym.Value Sym.Name > ffffffff81200826 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > __ref_stack_chk_guard > ffffffff81201493 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > __ref_stack_chk_guard > ffffffff81201714 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > __ref_stack_chk_guard > ffffffff81201d66 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > __ref_stack_chk_guard > ... > ffffffff834e2a13 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > __ref_stack_chk_guard > ffffffff834e2a6a 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > __ref_stack_chk_guard > > RSTRIP vmlinux > > which I don't understand. Looking at the first one: > > ffffffff8120081d <force_ibs_eilvt_setup.cold>: > ffffffff8120081d: 48 8b 44 24 08 mov 0x8(%rsp),%rax > ffffffff81200822: 65 48 2b 05 f6 ed 30 sub > %gs:0x230edf6(%rip),%rax # ffffffff8350f620 > <__ref_stack_chk_guard> > ffffffff81200829: 02 > > I don't think this is absolute relocation, see (%rip). > The warning is about the type of __ref_stack_chk_guard, not about the type of the relocation. $ nm vmlinux |grep \\s__ref_sta ffffffff8350c620 A __ref_stack_chk_guard Without your patch: $ nm vmlinux |grep \\s__ref_sta ffffffff834fba10 D __ref_stack_chk_guard
On Thu, Feb 20, 2025 at 11:05 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > Hi Uros, > > On Thu, 20 Feb 2025 at 10:51, Uros Bizjak <ubizjak@gmail.com> wrote: > > > > On Wed, Feb 19, 2025 at 2:18 PM Brian Gerst <brgerst@gmail.com> wrote: > > > > > > On Wed, Feb 19, 2025 at 6:47 AM Ingo Molnar <mingo@kernel.org> wrote: > > > > > > > > > > > > * Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > > > Thank you for doing this series - it all looks pretty good from my > > > > > > side and I've applied it experimentally to tip:x86/asm. I fixed up > > > > > > the trivial details other reviewers and me noticed. > > > > > > > > > > > > Note that the merge is tentative, it might still need a rebase if > > > > > > some fundamental problem comes up - but let's see how testing goes > > > > > > in -next. > > > > > > > > > > I wonder if there would be any benefit if stack canary is put into > > > > > struct pcpu_hot? > > > > > > > > It should definitely be one of the hottest data structures on x86, so > > > > moving it there makes sense even if it cannot be measured explicitly. > > > > > > > > > > It would have to be done with linker tricks, since you can't make the > > > compiler use a struct member directly. > > > > Something like the attached patch? > > > > Interesting take. I'd have tried to put the canary at offset 0x0, and > simply use pcpu_hot as the guard symbol. > > > > It boots and runs without problems. > > > > However, when building the kernel, I get "Absolute relocations > > present" warning with thousands of locations: > > > > RELOCS arch/x86/boot/compressed/vmlinux.relocs > > WARNING: Absolute relocations present > > Offset Info Type Sym.Value Sym.Name > > ffffffff81200826 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > > __ref_stack_chk_guard > > ffffffff81201493 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > > __ref_stack_chk_guard > > ffffffff81201714 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > > __ref_stack_chk_guard > > ffffffff81201d66 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > > __ref_stack_chk_guard > > ... > > ffffffff834e2a13 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > > __ref_stack_chk_guard > > ffffffff834e2a6a 0003259e00000002 R_X86_64_PC32 ffffffff8350f620 > > __ref_stack_chk_guard > > > > RSTRIP vmlinux > > > > which I don't understand. Looking at the first one: > > > > ffffffff8120081d <force_ibs_eilvt_setup.cold>: > > ffffffff8120081d: 48 8b 44 24 08 mov 0x8(%rsp),%rax > > ffffffff81200822: 65 48 2b 05 f6 ed 30 sub > > %gs:0x230edf6(%rip),%rax # ffffffff8350f620 > > <__ref_stack_chk_guard> > > ffffffff81200829: 02 > > > > I don't think this is absolute relocation, see (%rip). > > > > The warning is about the type of __ref_stack_chk_guard, not about the > type of the relocation. Thanks, I got distracted by the text of the warning that mentions relocation. > $ nm vmlinux |grep \\s__ref_sta > ffffffff8350c620 A __ref_stack_chk_guard > > Without your patch: > > $ nm vmlinux |grep \\s__ref_sta > ffffffff834fba10 D __ref_stack_chk_guard Is this a problem in our specific case? While the symbol is absolute, the relocation is still relative, so IMO it should be OK even with your ongoing rip-relative efforts in mind. We can list the symbol in arch/x86/tools/relocs.c to quiet the warning, but I would need some help with auditing the symbol itself. OTOH, we could simply do it your way and put stack canary at the beginning of pcpu_hot structure, with static_assert(offsetof(struct pcpu_hot, stack_canary) == 0)); for good measure. Uros.
On Thu, 20 Feb 2025 at 11:46, Uros Bizjak <ubizjak@gmail.com> wrote: > > On Thu, Feb 20, 2025 at 11:05 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > Hi Uros, > > > > On Thu, 20 Feb 2025 at 10:51, Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > On Wed, Feb 19, 2025 at 2:18 PM Brian Gerst <brgerst@gmail.com> wrote: > > > > > > > > On Wed, Feb 19, 2025 at 6:47 AM Ingo Molnar <mingo@kernel.org> wrote: > > > > > > > > > > * Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > > > > I wonder if there would be any benefit if stack canary is put into > > > > > > struct pcpu_hot? > > > > > > > > > > It should definitely be one of the hottest data structures on x86, so > > > > > moving it there makes sense even if it cannot be measured explicitly. > > > > > > > > > > > > > It would have to be done with linker tricks, since you can't make the > > > > compiler use a struct member directly. > > > > > > > Interesting take. I'd have tried to put the canary at offset 0x0, and > > simply use pcpu_hot as the guard symbol. > > > > > > > It boots and runs without problems. > > > > > > However, when building the kernel, I get "Absolute relocations > > > present" warning with thousands of locations: > > > ... > > > > The warning is about the type of __ref_stack_chk_guard, not about the > > type of the relocation. > > Thanks, I got distracted by the text of the warning that mentions relocation. > > > $ nm vmlinux |grep \\s__ref_sta > > ffffffff8350c620 A __ref_stack_chk_guard > > > > Without your patch: > > > > $ nm vmlinux |grep \\s__ref_sta > > ffffffff834fba10 D __ref_stack_chk_guard > > Is this a problem in our specific case? I don't think so - the whole notion of absolute ELF symbols is rather flaky IME, so I don't think we should be pedantic here. > We can list the symbol in arch/x86/tools/relocs.c to quiet the > warning, but I would need some help with auditing the symbol itself. > > OTOH, we could simply do it your way and put stack canary at the > beginning of pcpu_hot structure, with > > static_assert(offsetof(struct pcpu_hot, stack_canary) == 0)); > > for good measure. I think this would be the most straight-forward if there are no other locality concerns this might interfere with.
On Thu, Feb 20, 2025 at 5:52 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Thu, 20 Feb 2025 at 11:46, Uros Bizjak <ubizjak@gmail.com> wrote: > > > > On Thu, Feb 20, 2025 at 11:05 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > Hi Uros, > > > > > > On Thu, 20 Feb 2025 at 10:51, Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > On Wed, Feb 19, 2025 at 2:18 PM Brian Gerst <brgerst@gmail.com> wrote: > > > > > > > > > > On Wed, Feb 19, 2025 at 6:47 AM Ingo Molnar <mingo@kernel.org> wrote: > > > > > > > > > > > > * Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > > > > > > I wonder if there would be any benefit if stack canary is put into > > > > > > > struct pcpu_hot? > > > > > > > > > > > > It should definitely be one of the hottest data structures on x86, so > > > > > > moving it there makes sense even if it cannot be measured explicitly. > > > > > > > > > > > > > > > > It would have to be done with linker tricks, since you can't make the > > > > > compiler use a struct member directly. > > > > > > > > > > Interesting take. I'd have tried to put the canary at offset 0x0, and > > > simply use pcpu_hot as the guard symbol. > > > > > > > > > > It boots and runs without problems. > > > > > > > > However, when building the kernel, I get "Absolute relocations > > > > present" warning with thousands of locations: > > > > > ... > > > > > > The warning is about the type of __ref_stack_chk_guard, not about the > > > type of the relocation. > > > > Thanks, I got distracted by the text of the warning that mentions relocation. > > > > > $ nm vmlinux |grep \\s__ref_sta > > > ffffffff8350c620 A __ref_stack_chk_guard > > > > > > Without your patch: > > > > > > $ nm vmlinux |grep \\s__ref_sta > > > ffffffff834fba10 D __ref_stack_chk_guard > > > > Is this a problem in our specific case? > > I don't think so - the whole notion of absolute ELF symbols is rather > flaky IME, so I don't think we should be pedantic here. From what I understand it stayed relative because there wasn't a constant added. As soon as you add a constant (which the linker treats as absolute), it becomes absolute. > > We can list the symbol in arch/x86/tools/relocs.c to quiet the > > warning, but I would need some help with auditing the symbol itself. > > > > OTOH, we could simply do it your way and put stack canary at the > > beginning of pcpu_hot structure, with > > > > static_assert(offsetof(struct pcpu_hot, stack_canary) == 0)); > > > > for good measure. > > I think this would be the most straight-forward if there are no other > locality concerns this might interfere with. I'd prefer it at the end of pcpu_hot, that way the disassembler doesn't latch on to the __stack_chk_guard symbol when referencing the other fields of pcpu_hot. Brian Gerst
On Thu, 20 Feb 2025 at 18:24, Brian Gerst <brgerst@gmail.com> wrote: > > On Thu, Feb 20, 2025 at 5:52 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Thu, 20 Feb 2025 at 11:46, Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > OTOH, we could simply do it your way and put stack canary at the > > > beginning of pcpu_hot structure, with > > > > > > static_assert(offsetof(struct pcpu_hot, stack_canary) == 0)); > > > > > > for good measure. > > > > I think this would be the most straight-forward if there are no other > > locality concerns this might interfere with. > > I'd prefer it at the end of pcpu_hot, that way the disassembler > doesn't latch on to the __stack_chk_guard symbol when referencing the > other fields of pcpu_hot. > __stack_chk_guard would no longer exist, only __ref_stack_chk_guard, which would be equal to pcpu_hot. We could just call that __ref_pcpu_hot instead if it might cause confusion otherwise. (We can't use pcpu_hot directly in -mstack-check-guard-symbol= for the same reasons I had to add the indirection via __ref_stack_chk_guard)
On Thu, Feb 20, 2025 at 12:36 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Thu, 20 Feb 2025 at 18:24, Brian Gerst <brgerst@gmail.com> wrote: > > > > On Thu, Feb 20, 2025 at 5:52 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Thu, 20 Feb 2025 at 11:46, Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > OTOH, we could simply do it your way and put stack canary at the > > > > beginning of pcpu_hot structure, with > > > > > > > > static_assert(offsetof(struct pcpu_hot, stack_canary) == 0)); > > > > > > > > for good measure. > > > > > > I think this would be the most straight-forward if there are no other > > > locality concerns this might interfere with. > > > > I'd prefer it at the end of pcpu_hot, that way the disassembler > > doesn't latch on to the __stack_chk_guard symbol when referencing the > > other fields of pcpu_hot. > > > > __stack_chk_guard would no longer exist, only __ref_stack_chk_guard, > which would be equal to pcpu_hot. We could just call that > __ref_pcpu_hot instead if it might cause confusion otherwise. (We > can't use pcpu_hot directly in -mstack-check-guard-symbol= for the > same reasons I had to add the indirection via __ref_stack_chk_guard) That works for me. Brian Gerst
On Thu, Feb 20, 2025 at 12:47 PM Brian Gerst <brgerst@gmail.com> wrote: > > On Thu, Feb 20, 2025 at 12:36 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Thu, 20 Feb 2025 at 18:24, Brian Gerst <brgerst@gmail.com> wrote: > > > > > > On Thu, Feb 20, 2025 at 5:52 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > On Thu, 20 Feb 2025 at 11:46, Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > > > OTOH, we could simply do it your way and put stack canary at the > > > > > beginning of pcpu_hot structure, with > > > > > > > > > > static_assert(offsetof(struct pcpu_hot, stack_canary) == 0)); > > > > > > > > > > for good measure. > > > > > > > > I think this would be the most straight-forward if there are no other > > > > locality concerns this might interfere with. > > > > > > I'd prefer it at the end of pcpu_hot, that way the disassembler > > > doesn't latch on to the __stack_chk_guard symbol when referencing the > > > other fields of pcpu_hot. > > > > > > > __stack_chk_guard would no longer exist, only __ref_stack_chk_guard, > > which would be equal to pcpu_hot. We could just call that > > __ref_pcpu_hot instead if it might cause confusion otherwise. (We > > can't use pcpu_hot directly in -mstack-check-guard-symbol= for the > > same reasons I had to add the indirection via __ref_stack_chk_guard) > > That works for me. Maybe not. One quirk of how GCC implements this is that -mstack-protector-guard=global (used by !SMP builds) ignores the -mstack-protector-guard-symbol option and always uses __stack_chk_guard. That makes things more challenging. Brian Gerst
On Thu, Feb 20, 2025 at 6:59 PM Brian Gerst <brgerst@gmail.com> wrote: > > On Thu, Feb 20, 2025 at 12:47 PM Brian Gerst <brgerst@gmail.com> wrote: > > > > On Thu, Feb 20, 2025 at 12:36 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > On Thu, 20 Feb 2025 at 18:24, Brian Gerst <brgerst@gmail.com> wrote: > > > > > > > > On Thu, Feb 20, 2025 at 5:52 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > > > > > On Thu, 20 Feb 2025 at 11:46, Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > > > > > OTOH, we could simply do it your way and put stack canary at the > > > > > > beginning of pcpu_hot structure, with > > > > > > > > > > > > static_assert(offsetof(struct pcpu_hot, stack_canary) == 0)); > > > > > > > > > > > > for good measure. > > > > > > > > > > I think this would be the most straight-forward if there are no other > > > > > locality concerns this might interfere with. > > > > > > > > I'd prefer it at the end of pcpu_hot, that way the disassembler > > > > doesn't latch on to the __stack_chk_guard symbol when referencing the > > > > other fields of pcpu_hot. > > > > > > > > > > __stack_chk_guard would no longer exist, only __ref_stack_chk_guard, > > > which would be equal to pcpu_hot. We could just call that > > > __ref_pcpu_hot instead if it might cause confusion otherwise. (We > > > can't use pcpu_hot directly in -mstack-check-guard-symbol= for the > > > same reasons I had to add the indirection via __ref_stack_chk_guard) > > > > That works for me. > > Maybe not. One quirk of how GCC implements this is that > -mstack-protector-guard=global (used by !SMP builds) ignores the > -mstack-protector-guard-symbol option and always uses > __stack_chk_guard. That makes things more challenging. Not really. If we put stack_canary as the first member of struct pcpu_hot, we can just alias __stack_chk_guard to struct pcpu_hot in the linker script, and everything starts to magically work, SMP and !SMP. Please see the proposed patch, effectively a three liner, at [1]. [1] https://lore.kernel.org/lkml/20250220200439.4458-1-ubizjak@gmail.com/ Uros.
On Thu, Feb 20, 2025 at 11:52 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Thu, 20 Feb 2025 at 11:46, Uros Bizjak <ubizjak@gmail.com> wrote: > > > > On Thu, Feb 20, 2025 at 11:05 AM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > > > Hi Uros, > > > > > > On Thu, 20 Feb 2025 at 10:51, Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > On Wed, Feb 19, 2025 at 2:18 PM Brian Gerst <brgerst@gmail.com> wrote: > > > > > > > > > > On Wed, Feb 19, 2025 at 6:47 AM Ingo Molnar <mingo@kernel.org> wrote: > > > > > > > > > > > > * Uros Bizjak <ubizjak@gmail.com> wrote: > > > > > > > > > > > > > I wonder if there would be any benefit if stack canary is put into > > > > > > > struct pcpu_hot? > > > > > > > > > > > > It should definitely be one of the hottest data structures on x86, so > > > > > > moving it there makes sense even if it cannot be measured explicitly. > > > > > > > > > > > > > > > > It would have to be done with linker tricks, since you can't make the > > > > > compiler use a struct member directly. > > > > > > > > > > Interesting take. I'd have tried to put the canary at offset 0x0, and > > > simply use pcpu_hot as the guard symbol. > > > > > > > > > > It boots and runs without problems. > > > > > > > > However, when building the kernel, I get "Absolute relocations > > > > present" warning with thousands of locations: > > > > > ... > > > > > > The warning is about the type of __ref_stack_chk_guard, not about the > > > type of the relocation. > > > > Thanks, I got distracted by the text of the warning that mentions relocation. > > > > > $ nm vmlinux |grep \\s__ref_sta > > > ffffffff8350c620 A __ref_stack_chk_guard > > > > > > Without your patch: > > > > > > $ nm vmlinux |grep \\s__ref_sta > > > ffffffff834fba10 D __ref_stack_chk_guard > > > > Is this a problem in our specific case? > > I don't think so - the whole notion of absolute ELF symbols is rather > flaky IME, so I don't think we should be pedantic here. > > > We can list the symbol in arch/x86/tools/relocs.c to quiet the > > warning, but I would need some help with auditing the symbol itself. > > > > OTOH, we could simply do it your way and put stack canary at the > > beginning of pcpu_hot structure, with > > > > static_assert(offsetof(struct pcpu_hot, stack_canary) == 0)); > > > > for good measure. > > I think this would be the most straight-forward if there are no other > locality concerns this might interfere with. OK, let me prepare a patch then. Thanks, Uros.
© 2016 - 2025 Red Hat, Inc.