[PATCH v3 0/2] x86: enforce and cleanup RIP-relative accesses in early boot code

Kevin Loughlin posted 2 patches 1 year, 10 months ago
Only 0 patches received!
arch/x86/coco/core.c               | 22 +++++---
arch/x86/include/asm/mem_encrypt.h | 32 +++++++++--
arch/x86/kernel/head64.c           | 88 +++++++++++++-----------------
arch/x86/kernel/head_64.S          |  4 +-
arch/x86/kernel/sev-shared.c       | 52 +++++++++---------
arch/x86/kernel/sev.c              | 13 +++--
arch/x86/mm/mem_encrypt_identity.c | 50 +++++++++--------
7 files changed, 143 insertions(+), 118 deletions(-)
[PATCH v3 0/2] x86: enforce and cleanup RIP-relative accesses in early boot code
Posted by Kevin Loughlin 1 year, 10 months ago
SEV/SME code can execute prior to page table fixups for kernel
relocation. However, as with global variables accessed in
__startup_64(), the compiler is not required to generate RIP-relative
accesses for SEV/SME global variables, causing certain flavors of SEV
hosts and guests built with clang to crash during boot.

These crashes highlight a broader problem wherein the toolchain does
not guarantee that early x86-64 code executes correctly at any offset.
While Ard has been looking into overhauling the early x86-64 code
going forward [0], the signficant proposed changes are unfortunately
not backport-friendly.

Instead, this patchset continues the approach of fixing the immediate
problem of SEV-SNP boots crashing when built by clang, providing a
backport-friendly set of changes needed to successfully boot SEV-SNP
hosts and guests. In particular, this patchset is a cleanup of V2 [1],
which introduces a macro to force RIP-relative addressing in early
SEV/SME global variable accesses and existing head64 global accesses.

V2 -> V3: Rename RIP_RELATIVE_ADDR(), remove fixup_*(), cleanup style
V1 -> V2: Use GET_RIP_RELATIVE_PTR() macro to avoid -fPIE compilation

[0] https://lore.kernel.org/lkml/20240129180502.4069817-23-ardb+git@google.com/T/
[1] https://lore.kernel.org/lkml/20240111223650.3502633-1-kevinloughlin@google.com/

Kevin Loughlin (2):
  x86/sev: enforce RIP-relative accesses in early SEV/SME code
  x86/head64: Replace pointer fixups with RIP_RELATIVE_ADDR()

 arch/x86/coco/core.c               | 22 +++++---
 arch/x86/include/asm/mem_encrypt.h | 32 +++++++++--
 arch/x86/kernel/head64.c           | 88 +++++++++++++-----------------
 arch/x86/kernel/head_64.S          |  4 +-
 arch/x86/kernel/sev-shared.c       | 52 +++++++++---------
 arch/x86/kernel/sev.c              | 13 +++--
 arch/x86/mm/mem_encrypt_identity.c | 50 +++++++++--------
 7 files changed, 143 insertions(+), 118 deletions(-)

-- 
2.43.0.429.g432eaa2c6b-goog
Re: [PATCH v3 0/2] x86: enforce and cleanup RIP-relative accesses in early boot code
Posted by Borislav Petkov 1 year, 10 months ago
On Tue, Jan 30, 2024 at 10:08:43PM +0000, Kevin Loughlin wrote:
> Instead, this patchset continues the approach of fixing the immediate
> problem of SEV-SNP boots crashing when built by clang, providing a
> backport-friendly set of changes needed to successfully boot SEV-SNP
> hosts and guests.

What use cases are those exactly? How do I reproduce them here?

SNP is not upstream yet and the SEV* code has been out there for a while
now without a single such report so this must be something new happening
due to <raisins>...?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v3 0/2] x86: enforce and cleanup RIP-relative accesses in early boot code
Posted by Jacob Xu 1 year, 10 months ago
On Wed, Jan 31, 2024 at 6:01 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Jan 30, 2024 at 10:08:43PM +0000, Kevin Loughlin wrote:
> > Instead, this patchset continues the approach of fixing the immediate
> > problem of SEV-SNP boots crashing when built by clang, providing a
> > backport-friendly set of changes needed to successfully boot SEV-SNP
> > hosts and guests.
>
> What use cases are those exactly? How do I reproduce them here?
>

We're interested in fixing SEV-SNP guest boots which are currently
broken when using a guest kernel compiled with clang. It seems like
every other user of SEV/SNP linux kernel code uses GCC to compile the
kernel so they've avoided this issue.

> SNP is not upstream yet and the SEV* code has been out there for a while
> now without a single such report so this must be something new happening
> due to <raisins>...?

E.g. Google COS uses clang to compile the kernel and we've made do
with an internal fix for a while. We've unfortunately just been rather
slow to report upstream.


>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v3 0/2] x86: enforce and cleanup RIP-relative accesses in early boot code
Posted by Borislav Petkov 1 year, 10 months ago
On Wed, Jan 31, 2024 at 10:16:55AM -0800, Jacob Xu wrote:
> We're interested in fixing SEV-SNP guest boots which are currently
> broken when using a guest kernel compiled with clang. It seems like
> every other user of SEV/SNP linux kernel code uses GCC to compile the
> kernel so they've avoided this issue.

Lemme give that a try here.

> E.g. Google COS uses clang to compile the kernel and we've made do
> with an internal fix for a while. 

Which means that, theoretically, you could forward-port this internal
fix until the issue is fixed for real, I'd say.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v3 0/2] x86: enforce and cleanup RIP-relative accesses in early boot code
Posted by Kevin Loughlin 1 year, 10 months ago
On Wed, Jan 31, 2024 at 10:30 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Jan 31, 2024 at 10:16:55AM -0800, Jacob Xu wrote:
> > We're interested in fixing SEV-SNP guest boots which are currently
> > broken when using a guest kernel compiled with clang. It seems like
> > every other user of SEV/SNP linux kernel code uses GCC to compile the
> > kernel so they've avoided this issue.
>
> Lemme give that a try here.
>
> > E.g. Google COS uses clang to compile the kernel and we've made do
> > with an internal fix for a while.
>
> Which means that, theoretically, you could forward-port this internal
> fix until the issue is fixed for real, I'd say.

True. I just think it would be better to have an upstream fix for
clang builds of SEV-SNP guests; I believe the first such SEV-SNP code
was merged in 5.19 if I'm not mistaken.
Re: [PATCH v3 0/2] x86: enforce and cleanup RIP-relative accesses in early boot code
Posted by Borislav Petkov 1 year, 10 months ago
On Fri, Feb 02, 2024 at 04:22:02PM -0800, Kevin Loughlin wrote:
> True. I just think it would be better to have an upstream fix for
> clang builds of SEV-SNP guests; I believe the first such SEV-SNP code
> was merged in 5.19 if I'm not mistaken.

SNP host support is not upstream yet. So we'd be supporting something
which is out-of-tree. Lemme see how ugly it'll get...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v3 0/2] x86: enforce and cleanup RIP-relative accesses in early boot code
Posted by Ard Biesheuvel 1 year, 10 months ago
On Sat, 3 Feb 2024 at 11:20, Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Feb 02, 2024 at 04:22:02PM -0800, Kevin Loughlin wrote:
> > True. I just think it would be better to have an upstream fix for
> > clang builds of SEV-SNP guests; I believe the first such SEV-SNP code
> > was merged in 5.19 if I'm not mistaken.
>
> SNP host support is not upstream yet. So we'd be supporting something
> which is out-of-tree. Lemme see how ugly it'll get...
>

The minimal fix doesn't look that bad IMHO. Note that this version is
based on your patch that removes
CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT, and we'd need to see whether
or not to backport that as well.

 arch/x86/include/asm/asm.h         | 13 +++++++++++++
 arch/x86/include/asm/mem_encrypt.h | 13 ++++++++-----
 arch/x86/kernel/sev-shared.c       | 12 ++++++------
 arch/x86/kernel/sev.c              |  9 +++++++--
 arch/x86/mm/mem_encrypt_identity.c | 27 ++++++++++++---------------
 5 files changed, 46 insertions(+), 28 deletions(-)

https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=x86-pie-for-sev-v4a&id=65d0e5f4ed6ca807cdf28a1c5c0389af2c9f9bda
Re: [PATCH v3 0/2] x86: enforce and cleanup RIP-relative accesses in early boot code
Posted by Borislav Petkov 1 year, 10 months ago
On Sat, Feb 03, 2024 at 11:27:19AM +0100, Ard Biesheuvel wrote:
> The minimal fix doesn't look that bad IMHO. Note that this version is
> based on your patch that removes
> CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT, and we'd need to see whether
> or not to backport that as well.

Ok, please send it formally so that I can take a look.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v3 0/2] x86: enforce and cleanup RIP-relative accesses in early boot code
Posted by Ard Biesheuvel 1 year, 10 months ago
On Sat, 3 Feb 2024 at 01:22, Kevin Loughlin <kevinloughlin@google.com> wrote:
>
> On Wed, Jan 31, 2024 at 10:30 AM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Wed, Jan 31, 2024 at 10:16:55AM -0800, Jacob Xu wrote:
> > > We're interested in fixing SEV-SNP guest boots which are currently
> > > broken when using a guest kernel compiled with clang. It seems like
> > > every other user of SEV/SNP linux kernel code uses GCC to compile the
> > > kernel so they've avoided this issue.
> >
> > Lemme give that a try here.
> >
> > > E.g. Google COS uses clang to compile the kernel and we've made do
> > > with an internal fix for a while.
> >
> > Which means that, theoretically, you could forward-port this internal
> > fix until the issue is fixed for real, I'd say.
>
> True. I just think it would be better to have an upstream fix for
> clang builds of SEV-SNP guests; I believe the first such SEV-SNP code
> was merged in 5.19 if I'm not mistaken.

The problem is not only Clang. The problem is that we try to keep the
stable trees working with newer compilers in general, and we are
relying heavily on behavior on the part of the compiler that could
change in the future. Those references that GCC happens to emit as
RIP-relative today even without the workarounds could easily turn into
absolute references on tomorrow's version, given that both are
permitted by the code model under -fno-pic.

I've compared notes with Kevin internally, and we'll get a minimal,
simplified version of these changes into my v4 SEV PIC series so that
we can easily cherry-pick the fixes, either into linux-stable or into
our downstream fork.