From: Ard Biesheuvel <ardb@kernel.org>

Clang versions before 17 will not honour -fdirect-access-external-data
for the load of the stack cookie emitted into each function's prologue
and epilogue, and will emit a GOT based reference instead, e.g.,

  4c 8b 2d 00 00 00 00    mov    0x0(%rip),%r13
          18a: R_X86_64_REX_GOTPCRELX     __ref_stack_chk_guard-0x4
  65 49 8b 45 00          mov    %gs:0x0(%r13),%rax

This is inefficient, but at least, the linker will usually follow the
rules of the x86 psABI, and relax the GOT load into a RIP-relative LEA
instruction. This is still suboptimal, as the per-CPU load could use a
RIP-relative reference directly, but at least it gets rid of the first
load from memory.

However, Boris reports that in some cases, when using distro builds of
Clang/LLD 15, the first load gets relaxed into

  49 c7 c6 20 c0 55 86    mov    $0xffffffff8655c020,%r14
          ffffffff8373bf0f: R_X86_64_32S    __ref_stack_chk_guard
  65 49 8b 06             mov    %gs:(%r14),%rax

instead, which is fine in principle, as MOV may be cheaper than LEA on
some micro-architectures. However, such absolute references assume that
the variable in question can be accessed via the kernel virtual mapping,
and this is not guaranteed for the startup code residing in .head.text.

This is therefore a true positive, that was caught using the recently
introduced relocs check for absolute references in the startup code:

  Absolute reference to symbol '__ref_stack_chk_guard' not permitted in .head.text

Work around the issue by disabling the stack protector in the startup
code for Clang versions older than 17.

Fixes: 80d47defddc0 ("x86/stackprotector/64: Convert to normal per-CPU variable")
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/x86/include/asm/init.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index 0e82ebc5d1e1..8b1b1abcef15 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -2,7 +2,11 @@
 #ifndef _ASM_X86_INIT_H
 #define _ASM_X86_INIT_H
 
+#if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 170000
+#define __head	__section(".head.text") __no_sanitize_undefined __no_stack_protector
+#else
 #define __head	__section(".head.text") __no_sanitize_undefined
+#endif
 
 struct x86_mapping_info {
 	void *(*alloc_pgt_page)(void *); /* allocate buf for page table */
--
2.49.0.rc0.332.g42c0ae87b1-goog
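
To make the second listing in the patch description concrete, here is a minimal sketch (not kernel code) of how the seven bytes `49 c7 c6 20 c0 55 86` become the absolute address the disassembler prints: the 32-bit immediate is sign-extended to 64 bits, which is the field an R_X86_64_32S relocation fills in. The helper names are hypothetical. `START_KERNEL_MAP` mirrors the x86-64 `__START_KERNEL_map` base of 0xffffffff80000000, and the offset helper assumes the address lies inside the kernel image mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: same value as the kernel's __START_KERNEL_map on x86-64. */
#define START_KERNEL_MAP 0xffffffff80000000ULL

/*
 * Decode "REX.W C7 /0 imm32" (mov $imm32, %r64) and return the value the
 * CPU loads into the register: the immediate sign-extended to 64 bits.
 */
uint64_t mov_imm32_sext(const uint8_t insn[7])
{
	uint32_t imm;

	/* REX prefix with W set, opcode C7, ModRM = register-direct, /0. */
	assert((insn[0] & 0xf8) == 0x48);
	assert(insn[1] == 0xc7);
	assert((insn[2] & 0xf8) == 0xc0);

	/* The immediate occupies bytes 3..6 in little-endian order. */
	imm = (uint32_t)insn[3] |
	      (uint32_t)insn[4] << 8 |
	      (uint32_t)insn[5] << 16 |
	      (uint32_t)insn[6] << 24;

	return (uint64_t)(int64_t)(int32_t)imm;
}

/*
 * Offset of a kernel virtual address from the base of the kernel image
 * mapping. The address is only usable once that mapping exists, which
 * is what code in .head.text cannot assume.
 */
uint64_t kernel_image_offset(uint64_t virt)
{
	return virt - START_KERNEL_MAP;
}
```

Fed the bytes from the listing, `mov_imm32_sext()` yields 0xffffffff8655c020, which is offset 0x655c020 into the kernel image mapping. A RIP-relative LEA carries no such fixed address, which is why the relocs check only flags the absolute form.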
On Wed, Mar 12, 2025 at 6:27 AM Ard Biesheuvel <ardb+git@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Clang versions before 17 will not honour -fdirect-access-external-data
> for the load of the stack cookie emitted into each function's prologue
> and epilogue, and will emit a GOT based reference instead, e.g.,
>
> 4c 8b 2d 00 00 00 00 mov 0x0(%rip),%r13
> 18a: R_X86_64_REX_GOTPCRELX __ref_stack_chk_guard-0x4
> 65 49 8b 45 00 mov %gs:0x0(%r13),%rax
>
> This is inefficient, but at least, the linker will usually follow the
> rules of the x86 psABI, and relax the GOT load into a RIP-relative LEA
> instruction. This is still suboptimal, as the per-CPU load could use a
> RIP-relative reference directly, but at least it gets rid of the first
> load from memory.
>
> However, Boris reports that in some cases, when using distro builds of
> Clang/LLD 15, the first load gets relaxed into
>
> 49 c7 c6 20 c0 55 86 mov $0xffffffff8655c020,%r14
> ffffffff8373bf0f: R_X86_64_32S __ref_stack_chk_guard
> 65 49 8b 06 mov %gs:(%r14),%rax
>
> instead, which is fine in principle, as MOV may be cheaper than LEA on
> some micro-architectures. However, such absolute references assume that
> the variable in question can be accessed via the kernel virtual mapping,
> and this is not guaranteed for the startup code residing in .head.text.
>
> This is therefore a true positive, that was caught using the recently
> introduced relocs check for absolute references in the startup code:
>
> Absolute reference to symbol '__ref_stack_chk_guard' not permitted in .head.text
>
> Work around the issue by disabling the stack protector in the startup
> code for Clang versions older than 17.
>
> Fixes: 80d47defddc0 ("x86/stackprotector/64: Convert to normal per-CPU variable")
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Brian Gerst <brgerst@gmail.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
> arch/x86/include/asm/init.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
> index 0e82ebc5d1e1..8b1b1abcef15 100644
> --- a/arch/x86/include/asm/init.h
> +++ b/arch/x86/include/asm/init.h
> @@ -2,7 +2,11 @@
> #ifndef _ASM_X86_INIT_H
> #define _ASM_X86_INIT_H
>
> +#if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 170000
> +#define __head __section(".head.text") __no_sanitize_undefined __no_stack_protector
> +#else
> #define __head __section(".head.text") __no_sanitize_undefined
> +#endif
Just disable it for all __head code. This runs long before userspace
can mount a stack smashing attack.
Brian Gerst
On Wed, 12 Mar 2025 at 12:09, Brian Gerst <brgerst@gmail.com> wrote:
>
> On Wed, Mar 12, 2025 at 6:27 AM Ard Biesheuvel <ardb+git@google.com> wrote:
> >
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Clang versions before 17 will not honour -fdirect-access-external-data
> > for the load of the stack cookie emitted into each function's prologue
> > and epilogue, and will emit a GOT based reference instead, e.g.,
> >
> > 4c 8b 2d 00 00 00 00 mov 0x0(%rip),%r13
> > 18a: R_X86_64_REX_GOTPCRELX __ref_stack_chk_guard-0x4
> > 65 49 8b 45 00 mov %gs:0x0(%r13),%rax
> >
> > This is inefficient, but at least, the linker will usually follow the
> > rules of the x86 psABI, and relax the GOT load into a RIP-relative LEA
> > instruction. This is still suboptimal, as the per-CPU load could use a
> > RIP-relative reference directly, but at least it gets rid of the first
> > load from memory.
> >
> > However, Boris reports that in some cases, when using distro builds of
> > Clang/LLD 15, the first load gets relaxed into
> >
> > 49 c7 c6 20 c0 55 86 mov $0xffffffff8655c020,%r14
> > ffffffff8373bf0f: R_X86_64_32S __ref_stack_chk_guard
> > 65 49 8b 06 mov %gs:(%r14),%rax
> >
> > instead, which is fine in principle, as MOV may be cheaper than LEA on
> > some micro-architectures. However, such absolute references assume that
> > the variable in question can be accessed via the kernel virtual mapping,
> > and this is not guaranteed for the startup code residing in .head.text.
> >
> > This is therefore a true positive, that was caught using the recently
> > introduced relocs check for absolute references in the startup code:
> >
> > Absolute reference to symbol '__ref_stack_chk_guard' not permitted in .head.text
> >
> > Work around the issue by disabling the stack protector in the startup
> > code for Clang versions older than 17.
> >
> > Fixes: 80d47defddc0 ("x86/stackprotector/64: Convert to normal per-CPU variable")
> > Cc: Borislav Petkov <bp@alien8.de>
> > Cc: Brian Gerst <brgerst@gmail.com>
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> > arch/x86/include/asm/init.h | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
> > index 0e82ebc5d1e1..8b1b1abcef15 100644
> > --- a/arch/x86/include/asm/init.h
> > +++ b/arch/x86/include/asm/init.h
> > @@ -2,7 +2,11 @@
> > #ifndef _ASM_X86_INIT_H
> > #define _ASM_X86_INIT_H
> >
> > +#if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 170000
> > +#define __head __section(".head.text") __no_sanitize_undefined __no_stack_protector
> > +#else
> > #define __head __section(".head.text") __no_sanitize_undefined
> > +#endif
>
> Just disable it for all __head code. This runs long before userspace
> can mount a stack smashing attack.
>
Not all of it - some code is emitted into .head.text because it is
called early, but it could still be called much later as well.
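
For readers following this exchange, here is a rough userspace sketch of the version gate the patch adds, assuming nothing beyond what the diff shows. All `demo_`-prefixed names are hypothetical stand-ins: `DEMO_CLANG_VERSION` is built from Clang's predefined version macros in place of `CONFIG_CLANG_VERSION`, and `demo_no_stack_protector` stands in for the kernel's `__no_stack_protector`. The section placement is only applied on ELF targets, and whether a canary is emitted at all still depends on the `-fstack-protector*` flags in use.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for CONFIG_CLANG_VERSION: major * 10000 + minor * 100 + patch. */
#ifdef __clang__
# define DEMO_CLANG_VERSION \
	(__clang_major__ * 10000 + __clang_minor__ * 100 + __clang_patchlevel__)
#else
# define DEMO_CLANG_VERSION 0
#endif

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Per-function opt-out, if the compiler provides the attribute. */
#if __has_attribute(no_stack_protector)
# define demo_no_stack_protector __attribute__((no_stack_protector))
#else
# define demo_no_stack_protector
#endif

/* Place the function in a named section on ELF targets only. */
#ifdef __ELF__
# define demo_section __attribute__((section(".head.text")))
#else
# define demo_section
#endif

/* Same shape as the patch: only Clang older than 17 opts out. */
#if defined(__clang__) && DEMO_CLANG_VERSION < 170000
# define demo_head demo_section demo_no_stack_protector
# define DEMO_HEAD_OPTS_OUT 1
#else
# define demo_head demo_section
# define DEMO_HEAD_OPTS_OUT 0
#endif

/*
 * A function with a local array is a typical stack protector candidate.
 * It copies src into dst via a bounce buffer and returns the number of
 * characters copied, truncating to fit both buffers.
 */
demo_head size_t demo_early_copy(char *dst, size_t dst_len, const char *src)
{
	char buf[64];
	size_t n = strlen(src);

	if (dst_len == 0)
		return 0;
	if (n >= sizeof(buf))
		n = sizeof(buf) - 1;
	if (n >= dst_len)
		n = dst_len - 1;

	memcpy(buf, src, n);
	buf[n] = '\0';
	memcpy(dst, buf, n + 1);
	return n;
}
```

With the gate in this form, a function such as `demo_early_copy()` keeps its canary on newer compilers, which matters for code that lands in `.head.text` only because it is called early but may also run much later.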
The following commit has been merged into the x86/core branch of tip:
Commit-ID: 3f5dbafc2d4651020f45309ca85120b6a8162fd9
Gitweb: https://git.kernel.org/tip/3f5dbafc2d4651020f45309ca85120b6a8162fd9
Author: Ard Biesheuvel <ardb@kernel.org>
AuthorDate: Wed, 12 Mar 2025 11:27:41 +01:00
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 19 Mar 2025 11:26:49 +01:00
x86/head/64: Avoid Clang < 17 stack protector in startup code

Clang versions before 17 will not honour -fdirect-access-external-data
for the load of the stack cookie emitted into each function's prologue
and epilogue, and will emit a GOT based reference instead, e.g.,

  4c 8b 2d 00 00 00 00    mov    0x0(%rip),%r13
          18a: R_X86_64_REX_GOTPCRELX     __ref_stack_chk_guard-0x4
  65 49 8b 45 00          mov    %gs:0x0(%r13),%rax

This is inefficient, but at least, the linker will usually follow the
rules of the x86 psABI, and relax the GOT load into a RIP-relative LEA
instruction. This is still suboptimal, as the per-CPU load could use a
RIP-relative reference directly, but at least it gets rid of the first
load from memory.

However, Boris reports that in some cases, when using distro builds of
Clang/LLD 15, the first load gets relaxed into

  49 c7 c6 20 c0 55 86    mov    $0xffffffff8655c020,%r14
          ffffffff8373bf0f: R_X86_64_32S    __ref_stack_chk_guard
  65 49 8b 06             mov    %gs:(%r14),%rax

instead, which is fine in principle, as MOV may be cheaper than LEA on
some micro-architectures. However, such absolute references assume that
the variable in question can be accessed via the kernel virtual mapping,
and this is not guaranteed for the startup code residing in .head.text.

This is therefore a true positive, that was caught using the recently
introduced relocs check for absolute references in the startup code:

  Absolute reference to symbol '__ref_stack_chk_guard' not permitted in .head.text

Work around the issue by disabling the stack protector in the startup
code for Clang versions older than 17.

Fixes: 80d47defddc0 ("x86/stackprotector/64: Convert to normal per-CPU variable")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250312102740.602870-2-ardb+git@google.com
---
arch/x86/include/asm/init.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index 0e82ebc..8b1b1ab 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -2,7 +2,11 @@
 #ifndef _ASM_X86_INIT_H
 #define _ASM_X86_INIT_H
 
+#if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 170000
+#define __head	__section(".head.text") __no_sanitize_undefined __no_stack_protector
+#else
 #define __head	__section(".head.text") __no_sanitize_undefined
+#endif
 
 struct x86_mapping_info {
 	void *(*alloc_pgt_page)(void *); /* allocate buf for page table */
The following commit has been merged into the x86/asm branch of tip:
Commit-ID: 857716c8249ea9ada9d5657062833b6b5ef9fd63
Gitweb: https://git.kernel.org/tip/857716c8249ea9ada9d5657062833b6b5ef9fd63
Author: Ard Biesheuvel <ardb@kernel.org>
AuthorDate: Wed, 12 Mar 2025 11:27:41 +01:00
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 12 Mar 2025 12:08:10 +01:00
x86/head/64: Avoid Clang < 17 stack protector in startup code

Clang versions before 17 will not honour -fdirect-access-external-data
for the load of the stack cookie emitted into each function's prologue
and epilogue, and will emit a GOT based reference instead, e.g.,

  4c 8b 2d 00 00 00 00    mov    0x0(%rip),%r13
          18a: R_X86_64_REX_GOTPCRELX     __ref_stack_chk_guard-0x4
  65 49 8b 45 00          mov    %gs:0x0(%r13),%rax

This is inefficient, but at least, the linker will usually follow the
rules of the x86 psABI, and relax the GOT load into a RIP-relative LEA
instruction. This is still suboptimal, as the per-CPU load could use a
RIP-relative reference directly, but at least it gets rid of the first
load from memory.

However, Boris reports that in some cases, when using distro builds of
Clang/LLD 15, the first load gets relaxed into

  49 c7 c6 20 c0 55 86    mov    $0xffffffff8655c020,%r14
          ffffffff8373bf0f: R_X86_64_32S    __ref_stack_chk_guard
  65 49 8b 06             mov    %gs:(%r14),%rax

instead, which is fine in principle, as MOV may be cheaper than LEA on
some micro-architectures. However, such absolute references assume that
the variable in question can be accessed via the kernel virtual mapping,
and this is not guaranteed for the startup code residing in .head.text.

This is therefore a true positive, that was caught using the recently
introduced relocs check for absolute references in the startup code:

  Absolute reference to symbol '__ref_stack_chk_guard' not permitted in .head.text

Work around the issue by disabling the stack protector in the startup
code for Clang versions older than 17.

Fixes: 80d47defddc0 ("x86/stackprotector/64: Convert to normal per-CPU variable")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20250312102740.602870-2-ardb+git@google.com
---
arch/x86/include/asm/init.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index 0e82ebc..8b1b1ab 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -2,7 +2,11 @@
 #ifndef _ASM_X86_INIT_H
 #define _ASM_X86_INIT_H
 
+#if defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 170000
+#define __head	__section(".head.text") __no_sanitize_undefined __no_stack_protector
+#else
 #define __head	__section(".head.text") __no_sanitize_undefined
+#endif
 
 struct x86_mapping_info {
 	void *(*alloc_pgt_page)(void *); /* allocate buf for page table */