From: David Woodhouse <dwmw@amazon.co.uk>
The version of purgatory code shipped by kexec-tools attempts to look
above the top of its stack to find a return address for a kjump, even
in a non-kjump kexec. Since commit 2cacf7f23a02 ("x86/kexec: Fix stack
and handling of re-entry point for ::preserve_context") the page above
the stack might not be there, leading to a fault (which is at least
now caught by the exception-handling code in kexec).
That commit fixed things for the actual kjump path, but no longer
"gratuitously" pushes the unused return address to the stack in the
non-kjump path. Put that *back* in the non-kjump path, to prevent
purgatory from crashing when trying to access it.
Reported-by: Rohan Kakulawaram <rohanka@google.com>
Tested-by: Rohan Kakulawaram <rohanka@google.com>
Fixes: 2cacf7f23a02 ("x86/kexec: Fix stack and handling of re-entry point for ::preserve_context")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
arch/x86/kernel/relocate_kernel_64.S | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 4ffba68dc57b..301c586282aa 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -136,6 +136,8 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	 * %r13 original CR4 when relocate_kernel() was invoked
 	 */
+	/* set return address to 0 if not preserving context */
+	pushq	$0
 	/* store the start address on the stack */
 	pushq	%rdx
--
2.43.0
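
The stack layout that this hunk restores can be pictured with a small C model. This is a sketch, not kernel code: it assumes the kexec target's stack starts empty at the top of a single 4 KiB page and that the final jump to the target is a `ret` that pops the start address. `struct model_stack`, `STACK_SLOTS` and the `model_*` helpers are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_SLOTS 512	/* assumed: one 4 KiB page of 8-byte slots */

/* Hypothetical model of the stack page handed to the kexec target. */
struct model_stack {
	uint64_t slot[STACK_SLOTS];
	size_t sp;	/* index of the top slot; STACK_SLOTS means "empty" */
};

static void model_init(struct model_stack *s)
{
	memset(s->slot, 0, sizeof(s->slot));
	s->sp = STACK_SLOTS;
}

static void model_push(struct model_stack *s, uint64_t value)
{
	assert(s->sp > 0);
	s->slot[--s->sp] = value;
}

static uint64_t model_pop(struct model_stack *s)
{
	assert(s->sp < STACK_SLOTS);
	return s->slot[s->sp++];
}

/* True if reading the slot at sp stays inside the stack page. */
static int model_top_is_mapped(const struct model_stack *s)
{
	return s->sp < STACK_SLOTS;
}

/*
 * Mimic the non-kjump path: optionally push the dummy return address,
 * push the start address, then "ret" to it. Returns the address the
 * model would jump to; s->sp is what the target sees on entry.
 */
static uint64_t model_enter_target(struct model_stack *s, uint64_t start,
				   int push_dummy)
{
	model_init(s);
	if (push_dummy)
		model_push(s, 0);	/* pushq $0 */
	model_push(s, start);		/* pushq %rdx */
	return model_pop(s);		/* ret */
}
```

Calling `model_enter_target()` with `push_dummy` set leaves `sp` on a slot inside the page that holds 0. With it clear, `sp` ends up one slot past the end of the page, which is the access that faults when the page above the stack is not mapped.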
On 4/2/26 03:34, David Woodhouse wrote:
> The version of purgatory code shipped by kexec-tools attempts to look
> above the top of its stack to find a return address for a kjump

This is a bug in kexec-tools, right? Has kexec-tools been fixed?

The purgatory code is injected by userspace, so are you kinda asserting
here that this change in the kernel stack "breaks userspace"?

I guess one little push isn't the end of the world. But, can we please
comment it to this effect:

	/*
	 * Work around a kexec-tools' <version here> purgatory bug that
	 * accesses the stack one long out of bounds. Push a dummy value
	 * to make the access harmless and avoid a fault.
	 */

Without that, we'll be scratching our heads for the next decade about
what this 0 on the stack does. The comment you suggested tells us what
it is doing, but not why.

It all feels kinda icky though. Our stack is an ABI?!?!?!

On Tue, 2026-04-28 at 07:22 -0700, Dave Hansen wrote:
> On 4/2/26 03:34, David Woodhouse wrote:
> > The version of purgatory code shipped by kexec-tools attempts to look
> > above the top of its stack to find a return address for a kjump
>
> This is a bug in kexec-tools, right? Has kexec-tools been fixed?

There isn't any other way to tell the difference between kjump and a
normal kexec, is there?

> The purgatory code is injected by userspace, so are you kinda asserting
> here that this change in the kernel stack "breaks userspace"?

Essentially, yes. Rohan found that kexec broke on a kernel update,
bisected it to my commit, and worked out why.

This isn't just the "kernel stack". This is the stack that the kernel
sets up specifically for the kexec target.

> I guess one little push isn't the end of the world. But, can we please
> comment it to this effect:
>
> 	/*
> 	 * Work around a kexec-tools' <version here> purgatory bug that
> 	 * accesses the stack one long out of bounds. Push a dummy value
> 	 * to make the access harmless and avoid a fault.
> 	 */
>

Sure. Will phrase it thus in v2:

+	/*
+	 * Set return address to 0 if not preserving context. The purgatory
+	 * shipped in kexec-tools will unconditionally look for the return
+	 * address on the stack and set a kexec_jump_back_entry= command
+	 * line option if it's non-zero. There's no other way that it can
+	 * tell a preserve-context (kjump) kexec from a normal one.
+	 */

> Without that, we'll be scratching our heads for the next decade about
> what this 0 on the stack does. The comment you suggested tells us what
> it is doing, but not why.
>
> It all feels kinda icky though. Our stack is an ABI?!?!?!

Well, it's kexec. Where else are we going to put things? Wanna declare
that we're going to use %rbx instead? Or the BIOS Data Area in segment
0x40? :)
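
The purgatory-side decision described in the v2 comment can be sketched in C roughly as follows. This is not the kexec-tools source: `append_jump_back_entry()`, its parameters and its return convention are hypothetical, and the only behaviour taken from the thread is that the slot on the stack is read unconditionally and a `kexec_jump_back_entry=` option is added when it is non-zero.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for purgatory's check. entry_sp points at the
 * slot the stack pointer refers to on entry; it is read unconditionally,
 * so it must be mapped even for a normal (non-kjump) kexec.
 *
 * Returns 1 if the option was appended, 0 if the slot was zero,
 * -1 if the command line buffer is too small.
 */
static int append_jump_back_entry(char *cmdline, size_t size,
				  const uint64_t *entry_sp)
{
	uint64_t ret_addr = entry_sp[0];	/* the access that used to fault */
	size_t used = strlen(cmdline);
	int n;

	assert(used < size);

	if (ret_addr == 0)
		return 0;	/* normal kexec: nothing to jump back to */

	n = snprintf(cmdline + used, size - used,
		     " kexec_jump_back_entry=0x%" PRIx64, ret_addr);
	if (n < 0 || (size_t)n >= size - used) {
		cmdline[used] = '\0';	/* undo a truncated append */
		return -1;
	}
	return 1;
}
```

In this model the pushed 0 is what lets a normal kexec take the `return 0` branch instead of reading from a page that is not there.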