[PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()

David Woodhouse posted 8 patches 9 months, 1 week ago
There is a newer version of this series
[PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by David Woodhouse 9 months, 1 week ago
From: David Woodhouse <dwmw@amazon.co.uk>

A previous commit added __nocfi to machine_kexec() because it makes an
indirect call to relocate_kernel() which lacked CFI type information,
and caused the system to crash.

Use SYM_TYPED_FUNC_START() to ensure that the type information is
present, and remove the __nocfi tag.

I still can't make objtool happy with this in both GCC and Clang builds
at the same time, so not yet for merging; only included in this series
to nerd-snipe the objtool maintainers.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/machine_kexec_64.c   | 2 +-
 arch/x86/kernel/relocate_kernel_64.S | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 7abc7aa0261b..84f59f18dcb6 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -380,7 +380,7 @@ void machine_kexec_cleanup(struct kimage *image)
  * Do not allocate memory (or fail in any way) in machine_kexec().
  * We are past the point of no return, committed to rebooting now.
  */
-void __nocfi machine_kexec(struct kimage *image)
+void machine_kexec(struct kimage *image)
 {
 	unsigned long reloc_start = (unsigned long)__relocate_kernel_start;
 	relocate_kernel_fn *relocate_kernel_ptr;
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 2b7fc59af373..999aca909803 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -5,6 +5,7 @@
  */
 
 #include <linux/linkage.h>
+#include <linux/cfi_types.h>
 #include <linux/stringify.h>
 #include <asm/alternative.h>
 #include <asm/page_types.h>
@@ -59,8 +60,9 @@ SYM_DATA_END(kexec_debug_idt)
 
 	.section .text..relocate_kernel,"ax";
 	.code64
-SYM_CODE_START_NOALIGN(relocate_kernel)
+SYM_TYPED_FUNC_START(relocate_kernel)
 	UNWIND_HINT_END_OF_STACK
+	UNWIND_HINT_FUNC
 	ANNOTATE_NOENDBR
 	/*
 	 * %rdi indirection_page
-- 
2.48.1
Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by Josh Poimboeuf 9 months, 1 week ago
On Wed, Mar 12, 2025 at 02:34:20PM +0000, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> A previous commit added __nocfi to machine_kexec() because it makes an
> indirect call to relocate_kernel() which lacked CFI type information,
> and caused the system to crash.
> 
> Use SYM_TYPED_FUNC_START() to ensure that the type information is
> present, and remove the __nocfi tag.
> 
> I still can't make objtool happy with this in both GCC and Clang builds
> at the same time, so not yet for merging; only included in this series
> to nerd-snipe the objtool maintainers.

Again, I'd recommend just compiling the code in .data..relocate_kernel.
Then objtool doesn't need to care.

Otherwise it would help to know what warnings you're getting with this
patch, and the corresponding .config.

-- 
Josh
Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by David Woodhouse 9 months, 1 week ago
On Fri, 2025-03-14 at 09:07 -0700, Josh Poimboeuf wrote:
> On Wed, Mar 12, 2025 at 02:34:20PM +0000, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > 
> > A previous commit added __nocfi to machine_kexec() because it makes an
> > indirect call to relocate_kernel() which lacked CFI type information,
> > and caused the system to crash.
> > 
> > Use SYM_TYPED_FUNC_START() to ensure that the type information is
> > present, and remove the __nocfi tag.
> > 
> > I still can't make objtool happy with this in both GCC and Clang builds
> > at the same time, so not yet for merging; only included in this series
> > to nerd-snipe the objtool maintainers.
> 
> Again, I'd recommend just compiling the code in .data..relocate_kernel.
> Then objtool doesn't need to care.
> 
> Otherwise it would help to know what warnings you're getting with this
> patch, and the corresponding .config.

ISTR this version is OK with Clang and CONFIG_CFI_CLANG but with GCC I
get this:

vmlinux.o: warning: objtool: relocate_kernel+0x69: unsupported stack register modification

        /* setup a new stack at the end of the physical control page */
        lea     PAGE_SIZE(%rsi), %rsp
  79:   48 8d a6 00 10 00 00    lea    0x1000(%rsi),%rsp


Maybe the answer is to put the UNWIND_HINT_FUNC into #ifdef
CONFIG_CFI_CLANG but that seems wrong.

I'll have another look at putting it in the data section, and see if I
can remember why I didn't want to do that before (and if that's still
relevant now).

Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by Josh Poimboeuf 9 months, 1 week ago
On Fri, Mar 14, 2025 at 05:23:15PM +0000, David Woodhouse wrote:
> ISTR this version is OK with Clang and CONFIG_CFI_CLANG but with GCC I
> get this:
> 
> vmlinux.o: warning: objtool: relocate_kernel+0x69: unsupported stack register modification
> 
>         /* setup a new stack at the end of the physical control page */
>         lea     PAGE_SIZE(%rsi), %rsp
>   79:   48 8d a6 00 10 00 00    lea    0x1000(%rsi),%rsp
> 
> 
> Maybe the answer is to put the UNWIND_HINT_FUNC into #ifdef
> CONFIG_CFI_CLANG but that seems wrong.

The UNWIND_HINT_FUNC definitely looks wrong, why would Clang need it?

> I'll have another look at putting it in the data section, and see if I
> can remember why I didn't want to do that before (and if that's still
> relevant now).

IIRC, the reasons were the patched alternative, and also you wanted to
disassemble (but note that's still possible with gdb).

Here was a patch to make it work:

  https://lore.kernel.org/20241218212326.44qff3i5n6cxuu5d@jpoimboe

-- 
Josh
Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by David Woodhouse 9 months ago
On Fri, 2025-03-14 at 10:52 -0700, Josh Poimboeuf wrote:
> 
> IIRC, the reasons were the patched alternative, and also you wanted to
> disassemble (but note that's still possible with gdb).

It's meaningful output from 'objdump -S' that I miss. But OK.

> Here was a patch to make it work:
> 
>   https://lore.kernel.org/20241218212326.44qff3i5n6cxuu5d@jpoimboe

I've reworked that and the CR4 filtering, and now it doesn't like me
using SYM_TYPED_FUNC_START() to add the CFI information for the
relocate_kernel() function.

  LD      vmlinux.o
vmlinux.o: warning: objtool: bad call to elf_init_reloc_text_sym() for
data symbol .data..relocate_kernel
 ...

  SORTTAB vmlinux
incomplete ORC unwind tables in file: vmlinux
Failed to sort kernel tables

This happens when I build with (clang and) CONFIG_CFI_CLANG, with and
updated version of your patch above, and this on top of it. This is at
https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/kexec-debug

From 0f08a44613764e9b38a1c3332812685b61c99c2e Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw@amazon.co.uk>
Date: Mon, 16 Dec 2024 10:26:24 +0000
Subject: [PATCH] x86/kexec: Add CFI type information to relocate_kernel()

A previous commit added __nocfi to machine_kexec() because it makes an
indirect call to relocate_kernel() which lacked CFI type information,
and caused the system to crash.

Use SYM_TYPED_FUNC_START() to ensure that the type information is
present, and remove the __nocfi tag.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/machine_kexec_64.c   | 2 +-
 arch/x86/kernel/relocate_kernel_64.S | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 016862d2b544..e1f5fc858aee 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -396,7 +396,7 @@ void machine_kexec_cleanup(struct kimage *image)
  * Do not allocate memory (or fail in any way) in machine_kexec().
  * We are past the point of no return, committed to rebooting now.
  */
-void __nocfi machine_kexec(struct kimage *image)
+void machine_kexec(struct kimage *image)
 {
 	unsigned long reloc_start = (unsigned long)__relocate_kernel_start;
 	relocate_kernel_fn *relocate_kernel_ptr;
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 814af7fa6318..c859fbe507e8 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -5,6 +5,7 @@
  */
 
 #include <linux/linkage.h>
+#include <linux/cfi_types.h>
 #include <linux/stringify.h>
 #include <asm/alternative.h>
 #include <asm/page_types.h>
@@ -68,7 +69,7 @@ SYM_DATA_END(kexec_debug_idt)
  * opinions about it.
  */
 	.code64
-SYM_CODE_START_NOALIGN(relocate_kernel)
+SYM_TYPED_FUNC_START(relocate_kernel)
 	/*
 	 * %rdi indirection_page
 	 * %rsi pa_control_page
-- 
2.48.1

Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by Josh Poimboeuf 9 months ago
On Mon, Mar 17, 2025 at 12:40:14PM +0000, David Woodhouse wrote:
> On Fri, 2025-03-14 at 10:52 -0700, Josh Poimboeuf wrote:
> > 
> > IIRC, the reasons were the patched alternative, and also you wanted to
> > disassemble (but note that's still possible with gdb).
> 
> It's meaningful output from 'objdump -S' that I miss. But OK.

FYI, this works:

  objdump -Srw -j .data..relocate_kernel vmlinux.o

-- 
Josh
Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by Josh Poimboeuf 9 months ago
On Mon, Mar 17, 2025 at 05:17:24PM -0700, Josh Poimboeuf wrote:
> On Mon, Mar 17, 2025 at 12:40:14PM +0000, David Woodhouse wrote:
> > On Fri, 2025-03-14 at 10:52 -0700, Josh Poimboeuf wrote:
> > > 
> > > IIRC, the reasons were the patched alternative, and also you wanted to
> > > disassemble (but note that's still possible with gdb).
> > 
> > It's meaningful output from 'objdump -S' that I miss. But OK.
> 
> FYI, this works:
> 
>   objdump -Srw -j .data..relocate_kernel vmlinux.o

... but I see now that it doesn't intersperse the source.  Never mind...

-- 
Josh
Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by David Woodhouse 9 months ago
On Mon, 2025-03-17 at 17:24 -0700, Josh Poimboeuf wrote:
> On Mon, Mar 17, 2025 at 05:17:24PM -0700, Josh Poimboeuf wrote:
> > On Mon, Mar 17, 2025 at 12:40:14PM +0000, David Woodhouse wrote:
> > > On Fri, 2025-03-14 at 10:52 -0700, Josh Poimboeuf wrote:
> > > > 
> > > > IIRC, the reasons were the patched alternative, and also you
> > > > wanted to
> > > > disassemble (but note that's still possible with gdb).
> > > 
> > > It's meaningful output from 'objdump -S' that I miss. But OK.
> > 
> > FYI, this works:
> > 
> >   objdump -Srw -j .data..relocate_kernel vmlinux.o
> 
> ... but I see now that it doesn't intersperse the source.  Never
> mind...

To be fair, the source is assembler. So it isn't *so* hard to work it
out.

But on the whole, I'm not sure the CFI check is worth it.

CFI checks that the caller and callee agree about the prototype of the
function being called. There are two main benefits of this:

 • to protect against attacks where function pointers are substituted
   for gadgets.

 • to protect against genuine bugs, where the caller and the callee
   disagree about the function arguments.

For the relocate_kernel() case I don't think we care much about the
first. Without a CFI prologue, no *other* code can be tricked into
calling relocate_kernel() — and besides, it's in the kernel's data
section and isn't executable anyway until the kexec code copies it to a
page that *is*. And the kexec code itself isn't just following an
arbitrary function pointer; it copies the code into the control page
and invokes it there, so it's unlikely to follow a bogus pointer
either.

It's the *second* benefit which is more relevant to us, ensuring that
the caller and the callee genuinely do agree about the prototype.

But using the SYM_TYPED_FUNC_START() macro doesn't give us that; the
CFI prologue of the asm function is just using the signature emitted by
the *caller* in the weak __kcfi_typeid_relocate_kernel symbol anyway,
so they're never going to disagree. And how *could* the assembler side
build a typeid signature from the asm anyway?

I suspect that just leaving it marked __nocfi is probably the easier
option, unless I can *hardcode* the function signature 0x19835999 in
the CFI prologue in relocate_kernel_64.S to protect against someone
accidentally changing the C-visible 'relocate_kernel_fn' typedef
without changing the corresponding assembler? But honestly, is that
likely?

Looks like this (with your objtool relaxation on top, as well as
removing load_segments() in machine_kexec_64.c if I want the check to
actually emit the warning correctly.).

I just don't think it's worth it...

From 1bbed9c611fd286b68e2c459320910c4fefd4a22 Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw@amazon.co.uk>
Date: Mon, 16 Dec 2024 10:26:24 +0000
Subject: [PATCH] x86/kexec: Add CFI type information to relocate_kernel()

A previous commit added __nocfi to machine_kexec() because it makes an
indirect call to relocate_kernel() which lacked CFI type information,
and caused the system to crash.

Use SYM_TYPED_FUNC_START() to ensure that the type information is
present, and remove the __nocfi tag. And emit the *actual* type
signature (0x19835999) because by default, asm code uses the type
signature emitted by the *caller*, which is never going to differ!

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/kernel/machine_kexec_64.c   |  2 +-
 arch/x86/kernel/relocate_kernel_64.S | 19 ++++++++++++++++++-
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 016862d2b544..e1f5fc858aee 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -396,7 +396,7 @@ void machine_kexec_cleanup(struct kimage *image)
  * Do not allocate memory (or fail in any way) in machine_kexec().
  * We are past the point of no return, committed to rebooting now.
  */
-void __nocfi machine_kexec(struct kimage *image)
+void machine_kexec(struct kimage *image)
 {
 	unsigned long reloc_start = (unsigned long)__relocate_kernel_start;
 	relocate_kernel_fn *relocate_kernel_ptr;
diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 814af7fa6318..428401e55d29 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -5,6 +5,7 @@
  */
 
 #include <linux/linkage.h>
+#include <linux/cfi_types.h>
 #include <linux/stringify.h>
 #include <asm/alternative.h>
 #include <asm/page_types.h>
@@ -67,8 +68,24 @@ SYM_DATA_END(kexec_debug_idt)
  * a data section even of the object file, to prevent objtool from having
  * opinions about it.
  */
+
+#ifdef CONFIG_CFI_CLANG
+/*
+ * The SYM_TYPED_FUNC_START macro emits a CFI prologue for the function,
+ * referencing the __kcfi_typeid_##name symbol which is the signature of
+ * the function prototype. The value of that symbol ends up coming from
+ * a weak symbol emitted by the *caller* (in this case machine_kexec()),
+ * which means it's *entirely* useless for checking that the caller and
+ * callee agree about the prototype of the (asm) function being called.
+ * So, we define the signature *here* for ourselves, and if the C code
+ * ever calls relocate_kernel() in the belief that it has a different
+ * prototype, then the CFI check will trigger as $DEITY intended.
+ */
+	.weak __kcfi_typeid_relocate_kernel
+	.set  __kcfi_typeid_relocate_kernel, 0x19835999
+#endif
 	.code64
-SYM_CODE_START_NOALIGN(relocate_kernel)
+SYM_TYPED_FUNC_START(relocate_kernel)
 	/*
 	 * %rdi indirection_page
 	 * %rsi pa_control_page
-- 
2.48.1


Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by Josh Poimboeuf 9 months ago
On Tue, Mar 18, 2025 at 03:56:36PM +0000, David Woodhouse wrote:
> But on the whole, I'm not sure the CFI check is worth it.
> 
> CFI checks that the caller and callee agree about the prototype of the
> function being called. There are two main benefits of this:
> 
>  • to protect against attacks where function pointers are substituted
>    for gadgets.
> 
>  • to protect against genuine bugs, where the caller and the callee
>    disagree about the function arguments.

AFAIK the first one is the main point of CFI.

> For the relocate_kernel() case I don't think we care much about the
> first. Without a CFI prologue, no *other* code can be tricked into
> calling relocate_kernel()

But for FineIBT the hash is checked on the callee side.  So it loses
FineIBT protection.

> — and besides, it's in the kernel's data
> section and isn't executable anyway until the kexec code copies it to a
> page that *is*.

Does the code get copied immediately before getting called, or can it be
initialized earlier during boot when kdump does its initial setup?

-- 
Josh
Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by David Woodhouse 9 months ago
On Tue, 2025-03-18 at 10:14 -0700, Josh Poimboeuf wrote:
> On Tue, Mar 18, 2025 at 03:56:36PM +0000, David Woodhouse wrote:
> > But on the whole, I'm not sure the CFI check is worth it.
> > 
> > CFI checks that the caller and callee agree about the prototype of the
> > function being called. There are two main benefits of this:
> > 
> >  • to protect against attacks where function pointers are substituted
> >    for gadgets.
> > 
> >  • to protect against genuine bugs, where the caller and the callee
> >    disagree about the function arguments.
> 
> AFAIK the first one is the main point of CFI.

In the general case yes. I just don't think it matters much for
relocate_kernel().

> > For the relocate_kernel() case I don't think we care much about the
> > first. Without a CFI prologue, no *other* code can be tricked into
> > calling relocate_kernel()
> 
> But for FineIBT the hash is checked on the callee side.  So it loses
> FineIBT protection.

Right now the relocate_kernel() code doesn't even have an endbr, does
it? So it isn't a useful gadget?

> > — and besides, it's in the kernel's data
> > section and isn't executable anyway until the kexec code copies it to a
> > page that *is*.
> 
> Does the code get copied immediately before getting called, or can it be
> initialized earlier during boot when kdump does its initial setup?

It's initialized earlier, in machine_kexec_prepare(), and then the page
is set ROX.

Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by Josh Poimboeuf 9 months ago
On Tue, Mar 18, 2025 at 09:06:58PM +0000, David Woodhouse wrote:
> On Tue, 2025-03-18 at 10:14 -0700, Josh Poimboeuf wrote:
> > On Tue, Mar 18, 2025 at 03:56:36PM +0000, David Woodhouse wrote:
> > > For the relocate_kernel() case I don't think we care much about the
> > > first. Without a CFI prologue, no *other* code can be tricked into
> > > calling relocate_kernel()
> > 
> > But for FineIBT the hash is checked on the callee side.  So it loses
> > FineIBT protection.
> 
> Right now the relocate_kernel() code doesn't even have an endbr, does
> it? So it isn't a useful gadget?

In that case wouldn't IBT explode when you indirect call it?  Or is IBT
getting disabled beforehand?

> > > — and besides, it's in the kernel's data
> > > section and isn't executable anyway until the kexec code copies it to a
> > > page that *is*.
> > 
> > Does the code get copied immediately before getting called, or can it be
> > initialized earlier during boot when kdump does its initial setup?
> 
> It's initialized earlier, in machine_kexec_prepare(), and then the page
> is set ROX.

If that happens during boot (like for kdump init) then it'll be in text
the whole time after boot, right?

-- 
Josh
Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by David Woodhouse 9 months ago
On 18 March 2025 22:41:43 GMT, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
>On Tue, Mar 18, 2025 at 09:06:58PM +0000, David Woodhouse wrote:
>> On Tue, 2025-03-18 at 10:14 -0700, Josh Poimboeuf wrote:
>> > On Tue, Mar 18, 2025 at 03:56:36PM +0000, David Woodhouse wrote:
>> > > For the relocate_kernel() case I don't think we care much about the
>> > > first. Without a CFI prologue, no *other* code can be tricked into
>> > > calling relocate_kernel()
>> > 
>> > But for FineIBT the hash is checked on the callee side.  So it loses
>> > FineIBT protection.
>> 
>> Right now the relocate_kernel() code doesn't even have an endbr, does
>> it? So it isn't a useful gadget?
>
>In that case wouldn't IBT explode when you indirect call it?  Or is IBT
>getting disabled beforehand?

Not sure of the details. The machine_kexec() function which is the *caller* is currently marked with the __nocfi tag which stops any software checks. I guess any hardware feature which requires an endbr to be the target of an indirect branch has to already disabled on the way down? What specifically am I looking for, to check that? Or the hardware support has just never worked with kexec, perhaps?

>> > > — and besides, it's in the kernel's data
>> > > section and isn't executable anyway until the kexec code copies it to a
>> > > page that *is*.
>> > 
>> > Does the code get copied immediately before getting called, or can it be
>> > initialized earlier during boot when kdump does its initial setup?
>> 
>> It's initialized earlier, in machine_kexec_prepare(), and then the page
>> is set ROX.
>
>If that happens during boot (like for kdump init) then it'll be in text
>the whole time after boot, right?

In an executable page, yes.
Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by Josh Poimboeuf 9 months ago
On Wed, Mar 19, 2025 at 01:04:20PM +0000, David Woodhouse wrote:
> On 18 March 2025 22:41:43 GMT, Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> >On Tue, Mar 18, 2025 at 09:06:58PM +0000, David Woodhouse wrote:
> >> On Tue, 2025-03-18 at 10:14 -0700, Josh Poimboeuf wrote:
> >> > On Tue, Mar 18, 2025 at 03:56:36PM +0000, David Woodhouse wrote:
> >> > > For the relocate_kernel() case I don't think we care much about the
> >> > > first. Without a CFI prologue, no *other* code can be tricked into
> >> > > calling relocate_kernel()
> >> > 
> >> > But for FineIBT the hash is checked on the callee side.  So it loses
> >> > FineIBT protection.
> >> 
> >> Right now the relocate_kernel() code doesn't even have an endbr, does
> >> it? So it isn't a useful gadget?
> >
> >In that case wouldn't IBT explode when you indirect call it?  Or is IBT
> >getting disabled beforehand?
> 
> Not sure of the details. The machine_kexec() function which is the
> *caller* is currently marked with the __nocfi tag which stops any
> software checks. I guess any hardware feature which requires an endbr
> to be the target of an indirect branch has to already disabled on the
> way down? What specifically am I looking for, to check that? Or the
> hardware support has just never worked with kexec, perhaps?

Looking at machine_kexec(), it calls cet_disable() before the indirect
call.  So yeah, it seems fine for relocate_kernel() to not have a CFI
prologue or ENDBR.

-- 
Josh
Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by David Woodhouse 9 months ago
On Wed, 2025-03-19 at 08:47 -0700, Josh Poimboeuf wrote:
> 
> Looking at machine_kexec(), it calls cet_disable() before the indirect
> call.  So yeah, it seems fine for relocate_kernel() to not have a CFI
> prologue or ENDBR.

Yeah. I'm just going to throw that into a branch and forget it for now.
It's just causing noise around the *actual* stuff we want to get in, to
do the exception handling.

For that, I think the only thing outstanding is to add a userspace test
based on http://david.woodhou.se/loadret.c which invokes the int3 from
the 'payload' to test it. I think this is better than doing it inside
the kernel itself, as it gives us a selftest for the kexec jump
mechanism too.
Re: [PATCH v7 8/8] [DO NOT MERGE] x86/kexec: Add CFI type information to relocate_kernel()
Posted by David Woodhouse 9 months, 1 week ago
On Fri, 2025-03-14 at 10:52 -0700, Josh Poimboeuf wrote:
> On Fri, Mar 14, 2025 at 05:23:15PM +0000, David Woodhouse wrote:
> > ISTR this version is OK with Clang and CONFIG_CFI_CLANG but with GCC I
> > get this:
> > 
> > vmlinux.o: warning: objtool: relocate_kernel+0x69: unsupported stack register modification
> > 
> >         /* setup a new stack at the end of the physical control page */
> >         lea     PAGE_SIZE(%rsi), %rsp
> >   79:   48 8d a6 00 10 00 00    lea    0x1000(%rsi),%rsp
> > 
> > 
> > Maybe the answer is to put the UNWIND_HINT_FUNC into #ifdef
> > CONFIG_CFI_CLANG but that seems wrong.
> 
> The UNWIND_HINT_FUNC definitely looks wrong, why would Clang need it?

I think it's when CONFIG_CFI_CLANG makes the SYM_TYPED_FUNC_START()
macro actually emit the CFI prologue? 

> > I'll have another look at putting it in the data section, and see if I
> > can remember why I didn't want to do that before (and if that's still
> > relevant now).
> 
> IIRC, the reasons were the patched alternative, and also you wanted to
> disassemble (but note that's still possible with gdb).
> 
> Here was a patch to make it work:
> 
>   https://lore.kernel.org/20241218212326.44qff3i5n6cxuu5d@jpoimboe

Yeah, that does seem reasonable. Sorry, I think I missed that before
Christmas. I'll look at rolling it in. This part is kind of orthogonal
to the actual debug support so it's fine to keep it separate.

Thanks.