arch/um/kernel/skas/stub_exe.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
The stub_exe could segfault when built with some compilers (e.g. gcc
13.2.0), as SSE instructions which relied on stack alignment could be
generated, but the stack was misaligned.
This seems to be due to the __start entry point being run with a 16-byte
aligned stack, but the x86_64 SYSV ABI wanting the stack to be so
aligned _before_ a function call (so it is misaligned when the function
is entered due to the return address being pushed). The function
prologue then realigns it. Because the entry point is never _called_,
and hence there is no return address, the prologue is therefore actually
misaligning it, and causing the generated movaps instructions to
SIGSEGV. This results in the following error:
start_userspace : expected SIGSTOP, got status = 139
Don't generate this prologue for __start by using
__attribute__((naked)), which resolves the issue.
Fixes: 32e8eaf263d9 ("um: use execveat to create userspace MMs")
Signed-off-by: David Gow <davidgow@google.com>
---
See the discussion here:
https://lore.kernel.org/linux-um/c7c5228e9de1e79dc88b304e28d25f5ffd7e36dd.camel@sipsolutions.net/T/#m90c1c5b6c34ebaaa043b402e97009c5825fd158a
---
arch/um/kernel/skas/stub_exe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/um/kernel/skas/stub_exe.c b/arch/um/kernel/skas/stub_exe.c
index 04f75c577f1a..722ce6267476 100644
--- a/arch/um/kernel/skas/stub_exe.c
+++ b/arch/um/kernel/skas/stub_exe.c
@@ -79,7 +79,7 @@ noinline static void real_init(void)
__builtin_unreachable();
}
-void _start(void)
+__attribute__((naked)) void _start(void)
{
char *alloc;
--
2.47.0.rc1.288.g06298d1525-goog
On Fri, 18 Oct 2024 at 07:10, David Gow <davidgow@google.com> wrote: > > The stub_exe could segfault when built with some compilers (e.g. gcc > 13.2.0), as SSE instructions which relied on stack alignment could be > generated, but the stack was misaligned. > > This seems to be due to the __start entry point being run with a 16-byte > aligned stack, but the x86_64 SYSV ABI wanting the stack to be so > aligned _before_ a function call (so it is misaligned when the function > is entered due to the return address being pushed). The function > prologue then realigns it. Because the entry point is never _called_, > and hence there is no return address, the prologue is therefore actually > misaligning it, and causing the generated movaps instructions to > SIGSEGV. This results in the following error: > start_userspace : expected SIGSTOP, got status = 139 > > Don't generate this prologue for __start by using > __attribute__((naked)), which resolves the issue. > > Fixes: 32e8eaf263d9 ("um: use execveat to create userspace MMs") > Signed-off-by: David Gow <davidgow@google.com> > --- Okay, it turns out this breaks clang: arch/um/kernel/skas/stub_exe.c:84:2: error: non-ASM statement in naked function is not supported And, looking into it, gcc's docs are not encouraging here either: https://gcc.gnu.org/onlinedocs/gcc/x86-Function-Attributes.html This attribute allows the compiler to construct the requisite function declaration, while allowing the body of the function to be assembly code. The specified function will not have prologue/epilogue sequences generated by the compiler. Only basic asm statements can safely be included in naked functions (see Basic Asm — Assembler Instructions Without Operands). While using extended asm or a mixture of basic asm and C code may appear to work, they cannot be depended upon to work reliably and are not supported. My gut feeling is that the "correct" way of doing this is to use an actual crt implementation for __start. I managed to get it working with nolibc on x86_54, but the sheer amount of hackery involved was not exactly encouraging. There are a lot of conflicts between the different headers for a start. The other "correct" way would be to rewrite __start in assembly, which is annoying as it'd be architecture specific, so need a separate 32-bit version. The less-correct-but-working-here way, which I'm tempted by for now, is to get rid of __attribute__((naked)) and just use __attribute__((force_arg_align_pointer)). That's probably the best way of fixing the stack issue, but obviously won't fix any other issues which could arise from __start playing loose with the rules. My current plan is to send out a v2 with force_arg_align_pointer next week, unless someone has a more brilliant idea. Cheers, -- David > > See the discussion here: > https://lore.kernel.org/linux-um/c7c5228e9de1e79dc88b304e28d25f5ffd7e36dd.camel@sipsolutions.net/T/#m90c1c5b6c34ebaaa043b402e97009c5825fd158a > > --- > arch/um/kernel/skas/stub_exe.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/um/kernel/skas/stub_exe.c b/arch/um/kernel/skas/stub_exe.c > index 04f75c577f1a..722ce6267476 100644 > --- a/arch/um/kernel/skas/stub_exe.c > +++ b/arch/um/kernel/skas/stub_exe.c > @@ -79,7 +79,7 @@ noinline static void real_init(void) > __builtin_unreachable(); > } > > -void _start(void) > +__attribute__((naked)) void _start(void) > { > char *alloc; > > -- > 2.47.0.rc1.288.g06298d1525-goog >
On Sat, 2024-10-19 at 13:54 +0800, David Gow wrote: > > Okay, it turns out this breaks clang: > arch/um/kernel/skas/stub_exe.c:84:2: error: non-ASM statement in naked > function is not supported Fun :) Interesting too, I've _definitely_ got some code elsewhere, that's usually compiled with clang, using __attribute__((naked,noinline)). With a quite old clang version, I believe. Oh well, whatever. > And, looking into it, gcc's docs are not encouraging here either: > https://gcc.gnu.org/onlinedocs/gcc/x86-Function-Attributes.html > > This attribute allows the compiler to construct the requisite function > declaration, while allowing the body of the function to be assembly > code. The specified function will not have prologue/epilogue sequences > generated by the compiler. Only basic asm statements can safely be > included in naked functions (see Basic Asm — Assembler Instructions > Without Operands). While using extended asm or a mixture of basic asm > and C code may appear to work, they cannot be depended upon to work > reliably and are not supported. Right ... > My gut feeling is that the "correct" way of doing this is to use an > actual crt implementation for __start. I managed to get it working > with nolibc on x86_54, but the sheer amount of hackery involved was > not exactly encouraging. There are a lot of conflicts between the > different headers for a start. Yeah, that doesn't seem like a lot of fun. > The other "correct" way would be to rewrite __start in assembly, which > is annoying as it'd be architecture specific, so need a separate > 32-bit version. That'd probably be less bad than it sounds, since all the syscall magic macros etc. in there are already architecture specific. > The less-correct-but-working-here way, which I'm tempted by for now, > is to get rid of __attribute__((naked)) and just use > __attribute__((force_arg_align_pointer)). That's probably the best way > of fixing the stack issue, but obviously won't fix any other issues > which could arise from __start playing loose with the rules. > > My current plan is to send out a v2 with force_arg_align_pointer next > week, unless someone has a more brilliant idea. Sounds good to me. We'll find out if we have to iterate more ;) johannes
© 2016 - 2024 Red Hat, Inc.