 arch/alpha/include/asm/ptrace.h      | 64 ++++++++++++++++++++++++++-
 arch/alpha/include/uapi/asm/ptrace.h | 66 +---------------------------
 arch/alpha/kernel/asm-offsets.c      |  4 ++
 arch/alpha/kernel/entry.S            | 24 +++++-----
 arch/alpha/kernel/traps.c            |  2 +-
 arch/alpha/mm/fault.c                |  4 +-
 6 files changed, 81 insertions(+), 83 deletions(-)
This series fixes oopses on Alpha/SMP observed since kernel v6.9. [1]
Thanks to Magnus Lindholm for identifying that remarkably longstanding
bug.

The problem is that GCC expects 16-byte alignment of the incoming stack
since early 2004, as Maciej found out [2]:
  Having actually dug speculatively I can see that the psABI was changed in
  GCC 3.5 with commit e5e10fb4a350 ("re PR target/14539 (128-bit long double
  improperly aligned)") back in Mar 2004, when the stack pointer alignment
  was increased from 8 bytes to 16 bytes, and arch/alpha/kernel/entry.S has
  various suspicious stack pointer adjustments, starting with SP_OFF which
  is not a whole multiple of 16.

Also, as Magnus noted, "ALPHA Calling Standard" [3] required the same:
  D.3.1 Stack Alignment
  This standard requires that stacks be octaword aligned at the time a
  new procedure is invoked.

However:
- the "normal" kernel stack is always misaligned by 8 bytes, thanks to
  the odd number of 64-bit words in 'struct pt_regs', which is the very
  first thing pushed onto the kernel thread stack;
- syscall, fault, interrupt etc. handlers may, or may not, receive aligned
  stack depending on numerous factors.

Somehow we got away with it until recently, when we ended up with
a stack corruption in kernel/smp.c:smp_call_function_single() due to
its use of 32-byte aligned local data and the compiler doing clever
things allocating it on the stack.

Patches 1-2 are preparatory; 3 - the main fix; 4 - fixes remaining
special cases.

Ivan.
[1] https://lore.kernel.org/rcu/CA+=Fv5R9NG+1SHU9QV9hjmavycHKpnNyerQ=Ei90G98ukRcRJA@mail.gmail.com/#r
[2] https://lore.kernel.org/rcu/alpine.DEB.2.21.2501130248010.18889@angie.orcam.me.uk/
[3] https://bitsavers.org/pdf/dec/alpha/Alpha_Calling_Standard_Rev_2.0_19900427.pdf
---
Ivan Kokshaysky (4):
  alpha/uapi: do not expose kernel-only stack frame structures
  alpha: replace hardcoded stack offsets with autogenerated ones
  alpha: make stack 16-byte aligned (most cases)
  alpha: align stack for page fault and user unaligned trap handlers

 arch/alpha/include/asm/ptrace.h      | 64 ++++++++++++++++++++++++++-
 arch/alpha/include/uapi/asm/ptrace.h | 66 +---------------------------
 arch/alpha/kernel/asm-offsets.c      |  4 ++
 arch/alpha/kernel/entry.S            | 24 +++++-----
 arch/alpha/kernel/traps.c            |  2 +-
 arch/alpha/mm/fault.c                |  4 +-
 6 files changed, 81 insertions(+), 83 deletions(-)

-- 
2.39.5
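
The alignment arithmetic described in the cover letter can be illustrated with a small stand-alone C sketch. It is not taken from the series: the structure names, the 31-word layout, and the helper functions are hypothetical, and the explicit padding word is only one conceivable way to reach a 16-byte multiple, not necessarily what the patches do. The compile-time checks mirror the idea of deriving sizes from the compiler instead of hand-maintained constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical frame with an odd number of 64-bit words.
 * This is NOT the real Alpha 'struct pt_regs'; it only models
 * the "odd word count" property mentioned in the cover letter.
 */
struct demo_frame_odd {
	uint64_t regs[31];
};

/* Same frame plus one padding word (assumed fix, for illustration). */
struct demo_frame_padded {
	uint64_t regs[31];
	uint64_t pad;
};

/* Sizes are taken from the compiler, not hardcoded. */
static_assert(sizeof(struct demo_frame_odd) % 16 == 8,
	      "odd word count leaves an 8-byte remainder");
static_assert(sizeof(struct demo_frame_padded) % 16 == 0,
	      "padded frame is a whole multiple of 16 bytes");

/* The stack grows down: pushing a frame subtracts its size from SP. */
static inline uint64_t sp_after_push(uint64_t sp, size_t frame_size)
{
	return sp - frame_size;
}

/* Nonzero if the stack pointer is octaword (16-byte) aligned. */
static inline int is_aligned16(uint64_t sp)
{
	return (sp & 15) == 0;
}
```

Starting from a 16-byte aligned stack top, sp_after_push() with sizeof(struct demo_frame_odd) yields a pointer for which is_aligned16() is false, while the padded variant keeps the pointer aligned for every function called afterwards.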
On Wed, 29 Jan 2025, Ivan Kokshaysky wrote:

> Somehow we got away with it until recently, when we ended up with
> a stack corruption in kernel/smp.c:smp_call_function_single() due to
> its use of 32-byte aligned local data and the compiler doing clever
> things allocating it on the stack.

 Thank you for doing this work.

 I'll review/verify your changes by hand and push them through GCC and
glibc regression testing, which should hopefully pick any fallout without
having it buried among any intermittent failures, and report back.

 However, would you please cc <stable@kernel.org> with your submission, v2
presumably, so as to have these changes backported?

 The thing is I find it quite a grave bug being fixed here, which has been
there for decades and triggering occasionally[1], and it might be the only
way for users of certain older systems to get a kernel with the fix
already applied.  As you may have been aware non-BWX Alpha support has
been removed and while I'm working on bringing it back, it will likely be
missing support for specific models such as Jensen there will be no kernel
developer to look after.  So getting an LTS kernel might be the only way
to get a stable system for some people.

References:

[1] "System fails to boot when CONFIG_SMP=y",
    <https://bugzilla.kernel.org/show_bug.cgi?id=213143>

  Maciej
On Wed, Jan 29, 2025 at 04:02:26PM +0000, Maciej W. Rozycki wrote:
> On Wed, 29 Jan 2025, Ivan Kokshaysky wrote:
>
> > Somehow we got away with it until recently, when we ended up with
> > a stack corruption in kernel/smp.c:smp_call_function_single() due to
> > its use of 32-byte aligned local data and the compiler doing clever
> > things allocating it on the stack.
>
>  Thank you for doing this work.
>
>  I'll review/verify your changes by hand and push them through GCC and
> glibc regression testing, which should hopefully pick any fallout without
> having it buried among any intermittent failures, and report back.

Thanks!

>  However, would you please cc <stable@kernel.org> with your submission, v2
> presumably, so as to have these changes backported?

Sure. As I need to deal with bpf build failure, v2 is inevitable.

>  The thing is I find it quite a grave bug being fixed here, which has been
> there for decades and triggering occasionally[1], and it might be the only
> way for users of certain older systems to get a kernel with the fix
> already applied.  As you may have been aware non-BWX Alpha support has
> been removed and while I'm working on bringing it back, it will likely be
> missing support for specific models such as Jensen there will be no kernel
> developer to look after.  So getting an LTS kernel might be the only way
> to get a stable system for some people.

Yes, I know about your work on non-BWX Alpha and highly appreciate it.

> References:
>
> [1] "System fails to boot when CONFIG_SMP=y",
>     <https://bugzilla.kernel.org/show_bug.cgi?id=213143>
>
>   Maciej

Ivan.
Hi Ivan,

On Wed, 2025-01-29 at 10:43 +0100, Ivan Kokshaysky wrote:
> This series fixes oopses on Alpha/SMP observed since kernel v6.9. [1]
> Thanks to Magnus Lindholm for identifying that remarkably longstanding
> bug.
>
> The problem is that GCC expects 16-byte alignment of the incoming stack
> since early 2004, as Maciej found out [2]:
>   Having actually dug speculatively I can see that the psABI was changed in
>   GCC 3.5 with commit e5e10fb4a350 ("re PR target/14539 (128-bit long double
>   improperly aligned)") back in Mar 2004, when the stack pointer alignment
>   was increased from 8 bytes to 16 bytes, and arch/alpha/kernel/entry.S has
>   various suspicious stack pointer adjustments, starting with SP_OFF which
>   is not a whole multiple of 16.
>
> Also, as Magnus noted, "ALPHA Calling Standard" [3] required the same:
>   D.3.1 Stack Alignment
>   This standard requires that stacks be octaword aligned at the time a
>   new procedure is invoked.
>
> However:
> - the "normal" kernel stack is always misaligned by 8 bytes, thanks to
>   the odd number of 64-bit words in 'struct pt_regs', which is the very
>   first thing pushed onto the kernel thread stack;
> - syscall, fault, interrupt etc. handlers may, or may not, receive aligned
>   stack depending on numerous factors.
>
> Somehow we got away with it until recently, when we ended up with
> a stack corruption in kernel/smp.c:smp_call_function_single() due to
> its use of 32-byte aligned local data and the compiler doing clever
> things allocating it on the stack.
>
> Patches 1-2 are preparatory; 3 - the main fix; 4 - fixes remaining
> special cases.
>
> Ivan.
>
> [1] https://lore.kernel.org/rcu/CA+=Fv5R9NG+1SHU9QV9hjmavycHKpnNyerQ=Ei90G98ukRcRJA@mail.gmail.com/#r
> [2] https://lore.kernel.org/rcu/alpine.DEB.2.21.2501130248010.18889@angie.orcam.me.uk/
> [3] https://bitsavers.org/pdf/dec/alpha/Alpha_Calling_Standard_Rev_2.0_19900427.pdf
> ---
> Ivan Kokshaysky (4):
>   alpha/uapi: do not expose kernel-only stack frame structures
>   alpha: replace hardcoded stack offsets with autogenerated ones
>   alpha: make stack 16-byte aligned (most cases)
>   alpha: align stack for page fault and user unaligned trap handlers
>
>  arch/alpha/include/asm/ptrace.h      | 64 ++++++++++++++++++++++++++-
>  arch/alpha/include/uapi/asm/ptrace.h | 66 +---------------------------
>  arch/alpha/kernel/asm-offsets.c      |  4 ++
>  arch/alpha/kernel/entry.S            | 24 +++++-----
>  arch/alpha/kernel/traps.c            |  2 +-
>  arch/alpha/mm/fault.c                |  4 +-
>  6 files changed, 81 insertions(+), 83 deletions(-)

Thanks a lot for the series! I just applied them on top of Debian's current
6.12.11 kernel in unstable and will thoroughly test the patches.

Will report back the results later this week.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913