Although there was a previous fix to avoid early kernel access to the
EFI config table on Intel systems, the problem can still exist on AMD
systems that support SEV (Secure Encrypted Virtualization). The
command line option "nogbpages" brings this bug to the surface. And
this is what caused the regression with my earlier patch that
attempted to reduce the use of gbpages. This patch series fixes that
problem and restores my earlier patch.
The following 2 commits caused the EFI config table, and the CC_BLOB
entry in that table, to be accessed when enabling SEV at kernel
startup.
commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
earlier during boot")
commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
detection/setup")
These accesses happen before the new kernel establishes its own
identity map, and before establishing a routine to handle page faults.
But the areas referenced are not explicitly added to the kexec
identity map.
This goes unnoticed when these areas happen to be placed close enough
to other areas that are explicitly added to the identity map, but
that is not always the case.
Under certain conditions, for example Intel Atom processors that don't
support 1GB pages, it was found that these areas don't end up mapped,
and the SEV initialization code causes an unrecoverable page fault,
and the kexec fails.
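As a rough illustration of why the page size matters, here is a
userspace sketch (not kernel code). The helper name and the addresses
used with it are made up for the example. It assumes the identity map
for a requested range is built from pages of a single size, so the
range is effectively widened to that page size's alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_2M (2ULL << 20)
#define SZ_1G (1ULL << 30)

/*
 * Hypothetical helper: when [start, end) is identity mapped using pages
 * of page_size, the mapping really spans the enclosing page-aligned
 * range. Return true if addr lands inside that widened span.
 */
static bool covered_by_mapping(uint64_t start, uint64_t end,
			       uint64_t page_size, uint64_t addr)
{
	uint64_t lo, hi;

	/* page_size must be a non-zero power of two */
	assert(page_size && !(page_size & (page_size - 1)));

	lo = start & ~(page_size - 1);				/* round down */
	hi = (end + page_size - 1) & ~(page_size - 1);		/* round up */

	return addr >= lo && addr < hi;
}
```

With an explicitly mapped region at 0x7a000000-0x7a200000 and a table
16 MiB above it at 0x7b000000 (both invented), the table is covered
when page_size is SZ_1G and not covered when it is SZ_2M.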
Tao Liu had offered a patch to put the config table into the kexec
identity map to avoid this problem:
https://lore.kernel.org/all/20230601072043.24439-1-ltao@redhat.com/
But the community chose instead to avoid referencing this memory on
non-AMD systems where the problem was reported.
commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
on non-AMD hardware")
I later wanted to make a different change to kexec identity map
creation, and had this patch accepted:
commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
but it quickly needed to be reverted because of problems on AMD systems.
The reported regression problems on AMD systems were due to the above
mentioned references to the EFI config table. In fact, on the same
systems, the "nogbpages" command line option breaks kexec as well.
So I resubmit Tao Liu's original patch that maps the EFI config
table, add an additional patch by me that ensures that the CC blob is
also mapped (if present), and also resubmit my earlier patch to use
gbpages only when a full GB of space is requested to be mapped.
I do not advocate for removing the earlier, non-AMD fix. With kexec,
two different kernel versions can be in play, and the earlier fix
still covers non-AMD systems when the kexec'd-from kernel doesn't have
these patches applied.
All three of the people who reported regression with my earlier patch
have retested with this patch series and found it to work where my
single patch previously did not. With current kernels, all fail to
kexec when "nogbpages" is on the command line, but all succeed with
"nogbpages" after the series is applied.
Tao Liu (1):
x86/kexec: Add EFI config table identity mapping for kexec kernel
Steve Wahl (2):
x86/kexec: Add EFI Confidential Computing blob to kexec identity
mapping.
x86/mm/ident_map: Use gbpages only where full GB page should be
mapped.
arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
arch/x86/mm/ident_map.c | 23 +++++++--
2 files changed, 95 insertions(+), 10 deletions(-)
--
2.26.2
On Mon, May 20, 2024 at 01:36:30PM -0500, Steve Wahl wrote:
> Although there was a previous fix to avoid early kernel access to the
> EFI config table on Intel systems, the problem can still exist on AMD
> systems that support SEV (Secure Encrypted Virtualization). The
> command line option "nogbpages" brings this bug to the surface. And
> this is what caused the regression with my earlier patch that
> attempted to reduce the use of gbpages. This patch series fixes that
> problem and restores my earlier patch.
>
> The following 2 commits caused the EFI config table, and the CC_BLOB
> entry in that table, to be accessed when enabling SEV at kernel
> startup.
>
> commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> earlier during boot")
> commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> detection/setup")
>
> These accesses happen before the new kernel establishes its own
> identity map, and before establishing a routine to handle page faults.
> But the areas referenced are not explicitly added to the kexec
> identity map.
>
> This goes unnoticed when these areas happen to be placed close enough
> to others areas that are explicitly added to the identity map, but
> that is not always the case.
>
> Under certain conditions, for example Intel Atom processors that don't
> support 1GB pages, it was found that these areas don't end up mapped,
> and the SEV initialization code causes an unrecoverable page fault,
> and the kexec fails.
What does Intel Atom have to do with SEV?!
> Tau Liu had offered a patch to put the config table into the kexec
> identity map to avoid this problem:
>
> https://lore.kernel.org/all/20230601072043.24439-1-ltao@redhat.com/
>
> But the community chose instead to avoid referencing this memory on
> non-AMD systems where the problem was reported.
>
> commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> on non-AMD hardware")
>
> I later wanted to make a different change to kexec identity map
> creation, and had this patch accepted:
>
> commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
>
> but it quickly needed to be reverted because of problems on AMD systems.
>
> The reported regression problems on AMD systems were due to the above
> mentioned references to the EFI config table. In fact, on the same
> systems, the "nogbpages" command line option breaks kexec as well.
>
> So I resubmit Tau Liu's original patch that maps the EFI config
> table, add an additional patch by me that ensures that the CC blob is
> also mapped (if present), and also resubmit my earlier patch to use
> gpbages only when a full GB of space is requested to be mapped.
>
> I do not advocate for removing the earlier, non-AMD fix. With kexec,
> two different kernel versions can be in play, and the earlier fix
> still covers non-AMD systems when the kexec'd-from kernel doesn't have
> these patches applied.
>
> All three of the people who reported regression with my earlier patch
> have retested with this patch series and found it to work where my
> single patch previously did not. With current kernels, all fail to
> kexec when "nogbpages" is on the command line, but all succeed with
> "nogbpages" after the series is applied.
>
> Tao Liu (1):
> x86/kexec: Add EFI config table identity mapping for kexec kernel
>
> Steve Wahl (2):
> x86/kexec: Add EFI Confidential Computing blob to kexec identity
> mapping.
> x86/mm/ident_map: Use gbpages only where full GB page should be
> mapped.
>
> arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> arch/x86/mm/ident_map.c | 23 +++++++--
> 2 files changed, 95 insertions(+), 10 deletions(-)
Anyway, + Ashish who's been dealing with SNP kexec. We have identified one EFI
issue so far:
https://lore.kernel.org/r/20240612135638.298882-2-ardb%2Bgit@google.com
You could give it a try and report back.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Jun 13, 2024 at 05:28:56PM +0200, Borislav Petkov wrote:
Thank you for at least saying something on this!
> On Mon, May 20, 2024 at 01:36:30PM -0500, Steve Wahl wrote:
> > Although there was a previous fix to avoid early kernel access to the
> > EFI config table on Intel systems, the problem can still exist on AMD
> > systems that support SEV (Secure Encrypted Virtualization). The
> > command line option "nogbpages" brings this bug to the surface. And
> > this is what caused the regression with my earlier patch that
> > attempted to reduce the use of gbpages. This patch series fixes that
> > problem and restores my earlier patch.
> >
> > The following 2 commits caused the EFI config table, and the CC_BLOB
> > entry in that table, to be accessed when enabling SEV at kernel
> > startup.
> >
> > commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> > earlier during boot")
> > commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> > detection/setup")
> >
> > These accesses happen before the new kernel establishes its own
> > identity map, and before establishing a routine to handle page faults.
> > But the areas referenced are not explicitly added to the kexec
> > identity map.
> >
> > This goes unnoticed when these areas happen to be placed close enough
> > to others areas that are explicitly added to the identity map, but
> > that is not always the case.
> >
> > Under certain conditions, for example Intel Atom processors that don't
> > support 1GB pages, it was found that these areas don't end up mapped,
> > and the SEV initialization code causes an unrecoverable page fault,
> > and the kexec fails.
>
> What does Intel Atom have to do with SEV?!
The Atom was the prominent example of a platform that the code
introduced for SEV broke. Unfortunately, the fix currently
implemented leaves things still broken for actual AMD SEV capable
processors when nogbpages is used, and this problem is the reason for
the apparent regression when my reduce-use-of-gbpages patch was
accepted (later removed).
Tao Liu's original patch fixed this problem, but was not accepted.
The patch that was accepted does not fix this.
> > Tau Liu had offered a patch to put the config table into the kexec
> > identity map to avoid this problem:
> >
> > https://lore.kernel.org/all/20230601072043.24439-1-ltao@redhat.com/
> >
> > But the community chose instead to avoid referencing this memory on
> > non-AMD systems where the problem was reported.
> >
> > commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> > on non-AMD hardware")
> >
> > I later wanted to make a different change to kexec identity map
> > creation, and had this patch accepted:
> >
> > commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
> >
> > but it quickly needed to be reverted because of problems on AMD systems.
> >
> > The reported regression problems on AMD systems were due to the above
> > mentioned references to the EFI config table. In fact, on the same
> > systems, the "nogbpages" command line option breaks kexec as well.
> >
> > So I resubmit Tau Liu's original patch that maps the EFI config
> > table, add an additional patch by me that ensures that the CC blob is
> > also mapped (if present), and also resubmit my earlier patch to use
> > gpbages only when a full GB of space is requested to be mapped.
> >
> > I do not advocate for removing the earlier, non-AMD fix. With kexec,
> > two different kernel versions can be in play, and the earlier fix
> > still covers non-AMD systems when the kexec'd-from kernel doesn't have
> > these patches applied.
> >
> > All three of the people who reported regression with my earlier patch
> > have retested with this patch series and found it to work where my
> > single patch previously did not. With current kernels, all fail to
> > kexec when "nogbpages" is on the command line, but all succeed with
> > "nogbpages" after the series is applied.
> >
> > Tao Liu (1):
> > x86/kexec: Add EFI config table identity mapping for kexec kernel
> >
> > Steve Wahl (2):
> > x86/kexec: Add EFI Confidential Computing blob to kexec identity
> > mapping.
> > x86/mm/ident_map: Use gbpages only where full GB page should be
> > mapped.
> >
> > arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> > arch/x86/mm/ident_map.c | 23 +++++++--
> > 2 files changed, 95 insertions(+), 10 deletions(-)
>
> Anyway, + Ashish who's been dealing with SNP kexec. We have identified one EFI
> issue so far:
>
> https://lore.kernel.org/r/20240612135638.298882-2-ardb%2Bgit@google.com
>
> You could give it a try and report back.
I will look at it, but a cursory inspection doesn't show anything
that affects what I'm talking about here.
Thanks!
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
On Thu, Jun 13, 2024 at 11:16:36AM -0500, Steve Wahl wrote:
> The Atom was the prominent example of a platform that the code
> introduced for SEV broke. Unfortunately, the fix currently
> implemented leaves things still broken for actual AMD SEV capable
> processors when nogbpages is used,
Ok, how do I reproduce this?
Please give exact step-by-step directions.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Sun, Jun 16, 2024 at 10:25:33PM +0200, Borislav Petkov wrote:
> On Thu, Jun 13, 2024 at 11:16:36AM -0500, Steve Wahl wrote:
> > The Atom was the prominent example of a platform that the code
> > introduced for SEV broke. Unfortunately, the fix currently
> > implemented leaves things still broken for actual AMD SEV capable
> > processors when nogbpages is used,
>
> Ok, how do I reproduce this?
>
> Please give exact step-by-step directions.

The first, hardest step is to locate a system that is AMD based, SEV
capable, with a BIOS that chooses to locate the CC_BLOB at addresses
that do not share a 2M page with other chunks of memory the kernel
currently adds to the kexec identity map. I.e., this is a stroke of
luck, and for all I know could depend on configuration such as memory
size in addition to motherboard and BIOS version. However, it does
not seem to change from boot to boot; a system that has the problem
seems to be consistent about it.

Second, boot linux including the "nogbpages" command line option.

Third, kexec -l <kernel image> --append=<command line options>
--initrd=<initrd>.

Fourth, kexec -e.

Systems that have this problem successfully kexec without the
"nogbpages" parameter, but fail and do a full reboot with the
"nogbpages" parameter.

I wish I could be more exact, filling in <kernel image> and <command
line options> and <initrd> for you, but they must be tailored to the
needs of the particular system.

I do not have direct access to a system with this problem myself. I
have relied on others who reported the problem to reproduce it and
test my fix.

Thanks,
--> Steve Wahl
--
Steve Wahl, Hewlett Packard Enterprise
On Mon, Jun 17, 2024 at 10:10:32AM -0500, Steve Wahl wrote:
> The first, hardest step is locate a system that is AMD based, SEV
> capable, with a BIOS that chooses to locate the CC_BLOB at addresses
> that do not share a 2M page with other chunks of memory the kernel
> currently adds to the kexec identity map. I.e. This is a stroke of
> luck,
Ya think?
It is more likely that I win the lottery than finding such a beast. ;-\
> and for all I know could depend on configuration such as memory
> size in addition to motherboard and BIOS version. However, it does
> not seem to change from boot to boot; as system that has the problem
> seems to be consistent about it.
>
> Second, boot linux including the "nogbpages" command line option.
>
> Third, kexec -l <kernel image> --append=<command line options>
> --initrd=<initrd>.
>
> Fourth, kexec -e.
>
> Systems that have this problem successfully kexec without the
> "nogbpages" parameter, but fail and do a full reboot with the
> "nogbpages" parameter.
>
> I wish I could be more exact,
Yes, this doesn't really explain what the culprit is.
So, your 0th message says:
"But the community chose instead to avoid referencing this memory on
non-AMD systems where the problem was reported.
commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
on non-AMD hardware")"
But that patch fixes !AMD systems.
Now you're basically saying that there are some AMD machines out there where
the EFI config table doesn't get mapped because it is somewhere else, outside
of the range of a 2M page or 1G page.
Or even if it is, "nogbpages" supplied on the cmdline would cause the
overlapping 2M and 1G mapping to not happen, leaving the EFI config table
unmapped.
Am I on the right track here?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Fri, Jun 21, 2024 at 03:17:42PM +0200, Borislav Petkov wrote:
> On Mon, Jun 17, 2024 at 10:10:32AM -0500, Steve Wahl wrote:
> > The first, hardest step is locate a system that is AMD based, SEV
> > capable, with a BIOS that chooses to locate the CC_BLOB at addresses
> > that do not share a 2M page with other chunks of memory the kernel
> > currently adds to the kexec identity map. I.e. This is a stroke of
> > luck,
>
> Ya think?
>
> It is more likely that I win the lottery than finding such a beast. ;-\
Yes, that is the impression I was trying to impart! :-)
> > and for all I know could depend on configuration such as memory
> > size in addition to motherboard and BIOS version. However, it does
> > not seem to change from boot to boot; as system that has the problem
> > seems to be consistent about it.
> >
> > Second, boot linux including the "nogbpages" command line option.
> >
> > Third, kexec -l <kernel image> --append=<command line options>
> > --initrd=<initrd>.
> >
> > Fourth, kexec -e.
> >
> > Systems that have this problem successfully kexec without the
> > "nogbpages" parameter, but fail and do a full reboot with the
> > "nogbpages" parameter.
> >
> > I wish I could be more exact,
>
> Yes, this doesn't really explain what the culprit is.
>
> So, your 0th message says:
>
> "But the community chose instead to avoid referencing this memory on
> non-AMD systems where the problem was reported.
>
> commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> on non-AMD hardware")"
>
> But that patch fixes !AMD systems.
>
> Now you're basically saying that there are some AMD machines out there where
> the EFI config table doesn't get mapped because it is somewhere else, outside
> of the range of a 2M page or 1G page.
I haven't heard of one where using 1G pages doesn't include this EFI
table. But if you switch to 2M pages using "nogbpages", that's when
you get into trouble.
(Also trouble if you switch to using some 2M pages using patch #3,
which is the point of the series.)
> Or even if it is, "nogbpages" supplied on the cmdline would cause the
> "overlapping 2M and 1G mapping to not happen, leaving the EFI config table
> unmapped.
I think so. The EFI config table is being included by luck, not
explicitly; "nogbpages" switches to using 2M pages only, greatly
reducing the range of addresses unintentionally included in the map.
So the EFI config table doesn't end up mapped.
> Am I on the right track here?
I believe so.
Patch #3's intent is/was to reduce the amount of space unintentionally
included in the identity map, to avoid speculation into areas that
cause system halts on HPE's UV hardware. It does so by using 1G pages
for large areas and 2M pages for smaller areas. It should have worked
for systems in general, without being specific to UV, but when it got
into the mainline kernel, it was found to cause a regression on these
AMD systems, and was reverted.
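
To make the rule in patch #3 concrete, here is a minimal userspace
sketch of the decision as described above. It is not the code from the
patch; may_use_gbpage() and its parameters are hypothetical names, and
the only assumption is that a 1G page is chosen solely when the
requested range covers the whole 1G-aligned slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_1G (1ULL << 30)

/*
 * Hypothetical helper, not the kernel's code: decide whether the 1G
 * slot containing addr may be mapped with a single 1G page when the
 * caller asked for [addr, end) to be identity mapped.
 */
static bool may_use_gbpage(bool gbpages_allowed, uint64_t addr, uint64_t end)
{
	uint64_t slot_start = addr & ~(SZ_1G - 1);
	uint64_t slot_end = slot_start + SZ_1G;

	assert(addr < end);

	/* e.g. "nogbpages" on the command line, or no CPU support */
	if (!gbpages_allowed)
		return false;

	/* Use a 1G page only if the request covers the entire slot. */
	return addr == slot_start && end >= slot_end;
}
```

Any request that starts or ends part way through a 1G slot falls back
to 2M pages in this sketch, which is what shrinks the amount of memory
mapped unintentionally.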
The regression turned out to be the EFI config table not being mapped.
Patch #1 fixes this problem for the EFI config table, and patch #2
fixes it for the CC BLOB that is also likely accessed.
These accesses are a problem because they happen prior to establishing
the page fault interrupt handler that would mend the identity map. I
know very little about the AMD SEV feature but reading the code I
think it may be required to do this before setting up that handler.
Thanks,
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
On Mon, Jun 24, 2024 at 10:13:44AM -0500, Steve Wahl wrote:
> These accesses are a problem because they happen prior to establishing
> the page fault interrupt handler that would mend the identity map. I
> know very little about the AMD SEV feature but reading the code I
> think it may be required to do this before setting up that handler.
Yeah, from looking at it, we should be able to establish a #PF handler that
early too but the devil's in the detail, especially in that early boot code.
Lemme poke some things and people...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, Jul 01, 2024 at 04:27:04PM +0200, Borislav Petkov wrote:
> On Mon, Jun 24, 2024 at 10:13:44AM -0500, Steve Wahl wrote:
> > These accesses are a problem because they happen prior to establishing
> > the page fault interrupt handler that would mend the identity map. I
> > know very little about the AMD SEV feature but reading the code I
> > think it may be required to do this before setting up that handler.
>
> Yeah, from looking at it, we should be able to establish a #PF handler that
> early too but the devil's in the detail, especially in that early boot code.
>
> Lemme poke some things and people...
Ard, from EFI perspective and boot services exiting, do you see any potential
issues if we enable a pagefault handler in load_stage1_idt() in
arch/x86/boot/compressed/head_64.S already or is the EFI pagetable not really
"reliable" then?
Would solve the issue in this thread where the EFI config table ends up not
mapped on some hw configurations, elegantly...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, 2 Jul 2024 at 19:45, Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jul 01, 2024 at 04:27:04PM +0200, Borislav Petkov wrote:
> > On Mon, Jun 24, 2024 at 10:13:44AM -0500, Steve Wahl wrote:
> > > These accesses are a problem because they happen prior to establishing
> > > the page fault interrupt handler that would mend the identity map. I
> > > know very little about the AMD SEV feature but reading the code I
> > > think it may be required to do this before setting up that handler.
> >
> > Yeah, from looking at it, we should be able to establish a #PF handler that
> > early too but the devil's in the detail, especially in that early boot code.
> >
> > Lemme poke some things and people...
>
> Ard, from EFI perspective and boot services exiting, do you see any potential
> issues if we enable a pagefault handler in load_stage1_idt() in
> arch/x86/boot/compressed/head_64.S already or is the EFI pagetable not really
> "reliable" then?
>

For the first boot, this shouldn't be needed - EFI maps all of RAM so
I wouldn't expect the PF handler to fire, except when writing to code
regions that were mapped ROX by the firmware. But even then, things
should just keep working, although from a security pov, it would be
better if the r/o regions remain r/o

> Would solve the issue in this thread where the EFI config table ends up not
> mapped on some hw configurations, elegantly...
>

The #PF handler makes sense when entering via the 32-bit entrypoint,
where the asm can only map the lower 4G and is in no position to
reason about where RAM lives.

For kexec on a 64-bit system, I would expect the high-level support
code to be capable of simply mapping all of DRAM 1:1, rather than
playing these games with #PF handlers and on-demand mapping.
On Tue, Jul 02, 2024 at 08:32:22PM +0200, Ard Biesheuvel wrote:
> For kexec on a 64-bit system, I would expect the high-level support
> code to be capable of simply mapping all of DRAM 1:1, rather than
> playing these games with #PF handlers and on-demand mapping.
Yeah, apparently we can't do that on SGI, as Steve said.
I like the aspect that the #PF handler won't fire in the first kernel because
of EFI mapping all RAM. That's good.
So we could try to wire in a #PF handler in stage1, see below.
Steve, I don't have a good idea how to test that. Maybe some of those
reporters you were talking about, would be willing to...
---
diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
index d100284bbef4..a258587c8949 100644
--- a/arch/x86/boot/compressed/idt_64.c
+++ b/arch/x86/boot/compressed/idt_64.c
@@ -32,6 +32,7 @@ void load_stage1_idt(void)
 {
 	boot_idt_desc.address = (unsigned long)boot_idt;
 
+	set_idt_entry(X86_TRAP_PF, boot_page_fault);
 
 	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
 		set_idt_entry(X86_TRAP_VC, boot_stage1_vc);
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, Jul 03, 2024 at 06:14:41PM +0200, Borislav Petkov wrote:
> On Tue, Jul 02, 2024 at 08:32:22PM +0200, Ard Biesheuvel wrote:
> > For kexec on a 64-bit system, I would expect the high-level support
> > code to be capable of simply mapping all of DRAM 1:1, rather than
> > playing these games with #PF handlers and on-demand mapping.
>
> Yeah, apparently we can't do that on SGI, as Steve said.
>
> I like the aspect that the #PF handler won't fire in the first kernel because
> of EFI mapping all RAM. That's good.
>
> So we could try to wire in a #PF handler in stage1, see below.
>
> Steve, I don't have a good idea how to test that. Maybe some of those
> reporters you were talking about, would be willing to...
>
> ---
> diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
> index d100284bbef4..a258587c8949 100644
> --- a/arch/x86/boot/compressed/idt_64.c
> +++ b/arch/x86/boot/compressed/idt_64.c
> @@ -32,6 +32,7 @@ void load_stage1_idt(void)
>  {
>  	boot_idt_desc.address = (unsigned long)boot_idt;
>
> +	set_idt_entry(X86_TRAP_PF, boot_page_fault);
>
>  	if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
>  		set_idt_entry(X86_TRAP_VC, boot_stage1_vc);
>
> --
> Regards/Gruss,
> Boris.
Boris,
Eric Hagberg tested this patch for me and it didn't work. (Thanks,
Eric.)
So I looked at the code more closely, and I don't think
boot_page_fault is going to work prior to the call to
initialize_identity_maps. In the current flow in head_64.S, that
comes after load_stage2_idt, where here we were trying to use it
just after load_stage1_idt, quite a bit earlier.
Is there a reason you want to avoid having these areas already entered
in the identity map setup by kexec?
I can see this could have the appearance of getting out of hand if we
had to continually add things to it. But only those pieces that need
to be referenced before the page fault handler is established actually
require this treatment.
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
On Mon, Jul 08, 2024 at 10:42:47AM -0500, Steve Wahl wrote:
> So I looked at the code more closely, and I don't think boot_page_fault is
> going to work prior to the call to initialize_identity_maps. In the current
> flow in head_64.S, that comes after load_stage2_idt, where here we were
> trying to use it just after load_stage1_idt, quite a bit earlier.
But still after setting up the #PF handlers in both cases. So it can't be
that.
> Is there a reason you want to avoid having these areas already entered
> in the identity map setup by kexec?
Well, imagine my one-liner worked. Can you think of a reason then?
So, theoretically, this should be reproducible in a VM too, I'd say. If we
could manage to get that EFI config table placed at the right address, to be
outside of a 1G page so that it doesn't get covered by a Gb mapping.
Or use "nogbpages" and then maybe perhaps with Ard's help hack up OVMF to do
so. :)
So, can someone with such a box boot with "efi=debug" on the kernel cmdline so
that we can try to reproduce the memory layout in a VM?
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, 8 Jul 2024 at 20:12, Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jul 08, 2024 at 10:42:47AM -0500, Steve Wahl wrote:
> > So I looked at the code more closely, and I don't think boot_page_fault is
> > going to work prior to the call to initialize_identity_maps. In the current
> > flow in head_64.S, that comes after load_stage2_idt, where here we were
> > trying to use it just after load_stage1_idt, quite a bit earlier.
>
> But still after setting up the #PF handlers in both cases. So it can't be
> that.
>
> > Is there a reason you want to avoid having these areas already entered
> > in the identity map setup by kexec?
>
> Well, imagine my one-liner worked. Can you think of a reason then?
>
> So, theoretically, this should be reproducible in a VM too, I'd say. If we
> could manage to get that EFI config table placed at the right address, to be
> outside of a 1G page so that it doesn't get covered by a Gb mapping.
>
> Or use "nogbpages" and then maybe perhaps with Ard's help hack up OVMF to do
> so. :)
>
> So, can someone with such a box boot with "efi=debug" on the kernel cmdline so
> that we can try to reproduce the memory layout in a VM?
>

Happy to assist, but I'm not sure I follow the approach here.

In the context of a confidential VM, I don't think the page fault
handler is ever an acceptable approach. kexec should filter out config
tables that it doesn't recognize, and map the ones that it does (note
that EFI config tables have no standardized header with a length, so
mapping tables it does *not* recognize is not feasible to begin with).

All these games with on-demand paging may have made sense for 64-bit
kernels booting in 32-bit mode (which can only map the first 4G of
RAM), but in a confidential VM context with measurement/attestation
etc I think the cure is worse than the disease.
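
The "map only what is recognized" idea could look roughly like the
following userspace sketch. Everything here is hypothetical: the
struct layouts are simplified stand-ins for the real EFI types, and
the per-table size is assumed to come from a list of known tables
maintained by the caller, since a config table entry itself carries no
length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified types; the real EFI structures differ. */
struct guid { uint8_t b[16]; };
struct cfg_entry { struct guid guid; uint64_t addr; };
struct known_table { struct guid guid; uint64_t size; };
struct range { uint64_t start, end; };

/*
 * Walk the config table entries and emit an address range only for
 * entries whose GUID is in the known list. Unrecognized entries are
 * skipped because their size cannot be determined. Returns the number
 * of ranges written to out (at most out_max).
 */
static size_t collect_known_tables(const struct cfg_entry *tbl, size_t n,
				   const struct known_table *known,
				   size_t nknown,
				   struct range *out, size_t out_max)
{
	size_t count = 0;

	assert(out != NULL || out_max == 0);

	for (size_t i = 0; i < n; i++) {
		for (size_t k = 0; k < nknown; k++) {
			if (memcmp(&tbl[i].guid, &known[k].guid,
				   sizeof(struct guid)) != 0)
				continue;
			if (count < out_max) {
				out[count].start = tbl[i].addr;
				out[count].end = tbl[i].addr + known[k].size;
				count++;
			}
			break;	/* GUID matched, move to the next entry */
		}
	}
	return count;
}
```

The ranges returned would then be handed to whatever routine adds
regions to the kexec identity map.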
On Mon, Jul 08, 2024 at 08:17:43PM +0200, Ard Biesheuvel wrote:
> Happy to assist, but I'm not sure I follow the approach here.
>
> In the context of a confidential VM, I don't think the page fault
> handler is ever an acceptable approach. kexec should filter out config
> tables that it doesn't recognize, and map the ones that it does (note
> that EFI config tables have no standardized header with a length, so
> mapping tables it does *not* recognize is not feasible to begin with).
>
> All these games with on-demand paging may have made sense for 64-bit
> kernels booting in 32-bit mode (which can only map the first 4G of
> RAM), but in a confiidential VM context with measurement/attestation
> etc I think the cure is worse than the disease.
See upthread. I think this is about AMD server machines which support SEV
baremetal and not about SEV-ES/SNP guests which must do attestation.
Steve?
AFAIR, there was some kink that we have to parse the blob regardless which
I didn't like either but I'd need to refresh with Tom and see whether we can
solve it differently after all. Perhaps check X86_FEATURE_HYPERVISOR or so...
Thx for offering to help still - appreciated! :-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, Jul 08, 2024 at 09:07:24PM +0200, Borislav Petkov wrote:
> On Mon, Jul 08, 2024 at 08:17:43PM +0200, Ard Biesheuvel wrote:
> > Happy to assist, but I'm not sure I follow the approach here.
> >
> > In the context of a confidential VM, I don't think the page fault
> > handler is ever an acceptable approach. kexec should filter out config
> > tables that it doesn't recognize, and map the ones that it does (note
> > that EFI config tables have no standardized header with a length, so
> > mapping tables it does *not* recognize is not feasible to begin with).
> >
> > All these games with on-demand paging may have made sense for 64-bit
> > kernels booting in 32-bit mode (which can only map the first 4G of
> > RAM), but in a confiidential VM context with measurement/attestation
> > etc I think the cure is worse than the disease.
>
> See upthread. I think this is about AMD server machines which support SEV
> baremetal and not about SEV-ES/SNP guests which must do attestation.
>
> Steve?
Yes, this is about AMD machines which support SEV, running bare metal.
("Server" is in question, one of my testers is known to be using a
laptop, so the facilities must be present in non-servers as well.)
> AFAIR, there was some kink that we have to parse the blob regardless which
> I didn't like either but I'd need to refresh with Tom and see whether we can
> solve it differently after all. Perhaps check X86_FEATURE_HYPERVISOR or so...
>
> Thx for offering to help still - appreciated! :-)
You asked me to imagine if the one-liner had worked. Yes, it would
have been a magical, easy fix! But things should be as simple as
possible, but no simpler, and that solution is "simpler than
possible".
As far as I can see it, the effort you're putting into finding a
different solution must mean you find something less than desirable
about the solution I have offered. But at this point, I don't
understand why; and lacking that understanding, I'm powerless to help
find alternatives that would be more acceptable.
Having kexec place these portions in the identity map before jumping
to the new kernel more closely mimics the conditions we are under when
entered from the BIOS and bootloader. So it seems to me to be the
logical way to go.
Thanks,
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
On Mon, Jul 08, 2024 at 02:29:05PM -0500, Steve Wahl wrote:
> Yes, this is about AMD machines which support SEV, running bare metal.
> ("Server" is in question, one of my testers is known to be using a
> laptop, so the facilities must be present in non-servers as well.)
No, they can't be. SEV is supported only on server, not on client. This laptop
has a different problem it seems.
> As far as I can see it, the effort you're putting into finding a
> different solution must mean you find something less than desirable
> about the solution I have offered. But at this point, I don't
> understand why;
Why would we parse the CC blob which is destined *solely* for a SEV- *guest*,
when booting the baremetal kernel which is *not* a guest?
This is the solution I'm chasing - don't do something you're not supposed to
or needed to do.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, Jul 08, 2024 at 09:58:10PM +0200, Borislav Petkov wrote:
> On Mon, Jul 08, 2024 at 02:29:05PM -0500, Steve Wahl wrote:
> > Yes, this is about AMD machines which support SEV, running bare metal.
> > ("Server" is in question, one of my testers is known to be using a
> > laptop, so the facilities must be present in non-servers as well.)
>
> No, they can't be. SEV is supported only on server, not on client. This laptop
> has a different problem it seems.
Ahhh. On the laptop, it's not looking *at* the CC blob that's the
problem.
It's looking *for* the CC blob in the EFI config table; the CC blob
probably does not exist in that table on the laptop. But the EFI
config table needs to be identity mapped, to look through it and see
that the CC blob is not there, and the EFI config table is not mapped.
I think the existence of the CC blob in the EFI config table is being
used, more or less, as a flag as to whether we need to do SEV related
code. Without mapping the EFI config table, we can't look for that
blob.
> > As far as I can see it, the effort you're putting into finding a
> > different solution must mean you find something less than desirable
> > about the solution I have offered. But at this point, I don't
> > understand why;
>
> Why would we parse the CC blob which is destined *solely* for a SEV- *guest*,
> when booting the baremetal kernel which is *not* a guest?
>
> This is the solution I'm chasing - don't do something you're not supposed to
> or needed to do.
What you're saying suggests that, maybe, my patch #2 will not be
necessary. The CC blob will never be present except for in a guest.
But can you do a kexec to a new kernel within that guest? If so,
patch #2 might still be necessary.
Anyway, I think the references you're trying to eliminate when they're
not needed are the references used to determine if the SEV feature is
to be used in this specific boot iteration or not.
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
On Mon, 8 Jul 2024 at 22:41, Steve Wahl <steve.wahl@hpe.com> wrote:
>
> On Mon, Jul 08, 2024 at 09:58:10PM +0200, Borislav Petkov wrote:
> > On Mon, Jul 08, 2024 at 02:29:05PM -0500, Steve Wahl wrote:
> > > Yes, this is about AMD machines which support SEV, running bare metal.
> > > ("Server" is in question, one of my testers is known to be using a
> > > laptop, so the facilities must be present in non-servers as well.)
> >
> > No, they can't be. SEV is supported only on server, not on client. This laptop
> > has a different problem it seems.
>
> Ahhh. On the laptop, it's not looking *at* the CC blob that's the
> problem.
>
> It's looking *for* the CC blob in the EFI config table; the CC blob
> probably does not exist in that table on the laptop. But the EFI
> config table needs to be identity mapped, to look through it and see
> that the CC blob is not there, and the EFI config table is not mapped.
>
> I think the existence of the CC blob in the EFI config table is being
> used, more or less, as a flag as to whether we need to do SEV related
> code. Without mapping the EFI config table, we can't look for that
> blob.
>
We have run into this exact problem before - I don't have time to
check lore right now (it's 11pm here) but 'CC blob' and 'EFI config
table' are the keywords that may help you track down the thread.
So first of all, let's define some terminology:
- the EFI system table is the EFI root table that contains some magic
numbers and pointers to various other assets in memory, one of which
is:
- the EFI config table array, which is just a list of (GUID, pointer)
tuples, the length of which is recorded in the EFI system table
- an EFI config table is some asset elsewhere in memory that is
identified by its GUID.
The EFI config table array can grow and shrink at boot time, which is
why it is a separate allocation, as this allows it to be realloc()'ed.
This means any bootloader that intends to map the primary EFI table
should also map the EFI config table array, which may be elsewhere
entirely.
> > > As far as I can see it, the effort you're putting into finding a
> > > different solution must mean you find something less than desirable
> > > about the solution I have offered. But at this point, I don't
> > > understand why;
> >
> > Why would we parse the CC blob which is destined *solely* for a SEV- *guest*,
> > when booting the baremetal kernel which is *not* a guest?
> >
> > This is the solution I'm chasing - don't do something you're not supposed to
> > or needed to do.
>
> What you're saying suggests that, maybe, my patch #2 will not be
> necessary. The CC blob will never be present except for in a guest.
> But can you do a kexec to a new kernel within that guest? If so,
> patch #2 might still be necessary.
>
> Anyway, I think the references you're trying to eliminate when they're
> not needed are the references used to determine if the SEV feature is
> to be used in this specific boot iteration or not.
>
It would be better if we did not have to rely on page fault handling
to map the EFI config table array this early. This is not strictly
related to SEV, but the CC blob happens to be the EFI config table
that is accessed before the page fault handler is installed.
So regardless of how we fix any SEV-guest specific issues, we should
ensure that kexec infrastructure creates the mappings of the EFI
system table and the EFI config table array upfront.
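To make the terminology above concrete, here is a minimal C sketch of a lookup in the config table array. The types and names (`guid_t`, `struct cfg_entry`, `struct systab`, `find_cfg_table`) are simplified, hypothetical stand-ins and not the kernel's or the UEFI specification's definitions. It only illustrates that even finding out a GUID is *absent* requires reading every entry of the array, so the array itself has to be mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the real EFI types. */
typedef struct { uint8_t b[16]; } guid_t;

struct cfg_entry {                  /* one (GUID, pointer) tuple */
	guid_t   guid;
	uint64_t table;             /* address of the table it identifies */
};

struct systab {                     /* only the fields relevant here */
	uint64_t nr_tables;         /* length of the config table array */
	const struct cfg_entry *tables; /* separate allocation from systab */
};

/*
 * Return the address recorded for @guid, or 0 if no entry matches.
 * Every entry is dereferenced on the way, including on the "not found"
 * path, so the whole array must be accessible to the caller.
 */
static uint64_t find_cfg_table(const struct systab *st, const guid_t *guid)
{
	assert(st != NULL && guid != NULL);

	for (uint64_t i = 0; i < st->nr_tables; i++) {
		if (memcmp(&st->tables[i].guid, guid, sizeof(*guid)) == 0)
			return st->tables[i].table;
	}
	return 0;
}
```

Note that the sketch never looks at what an individual table contains; it only walks the tuple list, which matches the "without any regard for the meaning of the individual entries" framing.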
On Mon, Jul 08, 2024 at 11:05:29PM +0200, Ard Biesheuvel wrote:
> The EFI config table array can grow and shrink at boot time, which is
> why it is a separate allocation, as this allows it to be realloc()'ed.
> This means any bootloader that intends to map the primary EFI table
> should also map the EFI config table array, which may be elsewhere
> entirely.
Yap, that rings a bell from a past thread.
> So regardless of how we fix any SEV-guest specific issues, we should
> ensure that kexec infrastructure creates the mappings of the EFI
> system table and the EFI config table array upfront.
Because code in the kernel relies on the presence of those so those should be
mapped automatically and unconditionally?
Or?
As long as we put that somewhere as the thing we do by default, sure, I'm
game.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, Jul 08, 2024 at 11:05:29PM +0200, Ard Biesheuvel wrote:
> On Mon, 8 Jul 2024 at 22:41, Steve Wahl <steve.wahl@hpe.com> wrote:
> >
> > On Mon, Jul 08, 2024 at 09:58:10PM +0200, Borislav Petkov wrote:
> > > On Mon, Jul 08, 2024 at 02:29:05PM -0500, Steve Wahl wrote:
> > > > Yes, this is about AMD machines which support SEV, running bare metal.
> > > > ("Server" is in question, one of my testers is known to be using a
> > > > laptop, so the facilities must be present in non-servers as well.)
> > >
> > > No, they can't be. SEV is supported only on server, not on client. This laptop
> > > has a different problem it seems.
> >
> > Ahhh. On the laptop, it's not looking *at* the CC blob that's the
> > problem.
> >
> > It's looking *for* the CC blob in the EFI config table; the CC blob
> > probably does not exist in that table on the laptop. But the EFI
> > config table needs to be identity mapped, to look through it and see
> > that the CC blob is not there, and the EFI config table is not mapped.
> >
> > I think the existence of the CC blob in the EFI config table is being
> > used, more or less, as a flag as to whether we need to do SEV related
> > code. Without mapping the EFI config table, we can't look for that
> > blob.
> >
>
> We have run into this exact problem before - I don't have time to
> check lore right now (it's 11pm here) but 'CC blob' and 'EFI config
> table' are the keywords that may help you track down the thread.
>
> So first of all, let's define some terminology:
> - the EFI system table is the EFI root table that contains some magic
> numbers and pointers to various other assets in memory, one of which
> is:
> - the EFI config table array, which is just a list of (GUID, pointer)
> tuples, the length of which is recorded in the EFI system table
> - an EFI config table is some asset elsewhere in memory that is
> identified by its GUID.
>
> The EFI config table array can grow and shrink at boot time, which is
> why it is a separate allocation, as this allows it to be realloc()'ed.
> This means any bootloader that intends to map the primary EFI table
> should also map the EFI config table array, which may be elsewhere
> entirely.
>
> > > > As far as I can see it, the effort you're putting into finding a
> > > > different solution must mean you find something less than desirable
> > > > about the solution I have offered. But at this point, I don't
> > > > understand why;
> > >
> > > Why would we parse the CC blob which is destined *solely* for a SEV- *guest*,
> > > when booting the baremetal kernel which is *not* a guest?
> > >
> > > This is the solution I'm chasing - don't do something you're not supposed to
> > > or needed to do.
> >
> > What you're saying suggests that, maybe, my patch #2 will not be
> > necessary. The CC blob will never be present except for in a guest.
> > But can you do a kexec to a new kernel within that guest? If so,
> > patch #2 might still be necessary.
> >
> > Anyway, I think the references you're trying to eliminate when they're
> > not needed are the references used to determine if the SEV feature is
> > to be used in this specific boot iteration or not.
> >
>
> It would be better if we did not have to rely on page fault handling
> to map the EFI config table array this early. This is not strictly
> related to SEV, but the CC blob happens to be the EFI config table
> that is accessed before the page fault handler is installed.
>
> So regardless of how we fix any SEV-guest specific issues, we should
> ensure that kexec infrastructure creates the mappings of the EFI
> system table and the EFI config table array upfront.
I think that's exactly what this patch series does.
The mapping of the EFI system table is/was already present in
map_efi_systab. Patch #1 in this series adds the efi_config_table to
what gets mapped.
Patch #2 adds the CC blob to the identity map as well, if present,
since if present it is also dereferenced before the page fault handler
can be put into place. Given what's been discussed, this patch might
not be necessary; I don't know enough to say whether kexec-ing a new
kernel within a SEV guest makes sense. I'm pretty certain it can
cause no harm, though.
And patch #3 fixes the UV platform problem as discussed, which will
not cause the previous reported regression if patches #1 and #2 are
already in place.
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
On Mon, 8 Jul 2024 at 23:20, Steve Wahl <steve.wahl@hpe.com> wrote:
>
> On Mon, Jul 08, 2024 at 11:05:29PM +0200, Ard Biesheuvel wrote:
> > On Mon, 8 Jul 2024 at 22:41, Steve Wahl <steve.wahl@hpe.com> wrote:
> > >
> > > On Mon, Jul 08, 2024 at 09:58:10PM +0200, Borislav Petkov wrote:
> > > > On Mon, Jul 08, 2024 at 02:29:05PM -0500, Steve Wahl wrote:
> > > > > Yes, this is about AMD machines which support SEV, running bare metal.
> > > > > ("Server" is in question, one of my testers is known to be using a
> > > > > laptop, so the facilities must be present in non-servers as well.)
> > > >
> > > > No, they can't be. SEV is supported only on server, not on client. This laptop
> > > > has a different problem it seems.
> > >
> > > Ahhh. On the laptop, it's not looking *at* the CC blob that's the
> > > problem.
> > >
> > > It's looking *for* the CC blob in the EFI config table; the CC blob
> > > probably does not exist in that table on the laptop. But the EFI
> > > config table needs to be identity mapped, to look through it and see
> > > that the CC blob is not there, and the EFI config table is not mapped.
> > >
> > > I think the existence of the CC blob in the EFI config table is being
> > > used, more or less, as a flag as to whether we need to do SEV related
> > > code. Without mapping the EFI config table, we can't look for that
> > > blob.
> > >
> >
> > We have run into this exact problem before - I don't have time to
> > check lore right now (it's 11pm here) but 'CC blob' and 'EFI config
> > table' are the keywords that may help you track down the thread.
> >
> > So first of all, let's define some terminology:
> > - the EFI system table is the EFI root table that contains some magic
> > numbers and pointers to various other assets in memory, one of which
> > is:
> > - the EFI config table array, which is just a list of (GUID, pointer)
> > tuples, the length of which is recorded in the EFI system table
> > - an EFI config table is some asset elsewhere in memory that is
> > identified by its GUID.
> >
> > The EFI config table array can grow and shrink at boot time, which is
> > why it is a separate allocation, as this allows it to be realloc()'ed.
> > This means any bootloader that intends to map the primary EFI table
> > should also map the EFI config table array, which may be elsewhere
> > entirely.
> >
> > > > > As far as I can see it, the effort you're putting into finding a
> > > > > different solution must mean you find something less than desirable
> > > > > about the solution I have offered. But at this point, I don't
> > > > > understand why;
> > > >
> > > > Why would we parse the CC blob which is destined *solely* for a SEV- *guest*,
> > > > when booting the baremetal kernel which is *not* a guest?
> > > >
> > > > This is the solution I'm chasing - don't do something you're not supposed to
> > > > or needed to do.
> > >
> > > What you're saying suggests that, maybe, my patch #2 will not be
> > > necessary. The CC blob will never be present except for in a guest.
> > > But can you do a kexec to a new kernel within that guest? If so,
> > > patch #2 might still be necessary.
> > >
> > > Anyway, I think the references you're trying to eliminate when they're
> > > not needed are the references used to determine if the SEV feature is
> > > to be used in this specific boot iteration or not.
> > >
> >
> > It would be better if we did not have to rely on page fault handling
> > to map the EFI config table array this early. This is not strictly
> > related to SEV, but the CC blob happens to be the EFI config table
> > that is accessed before the page fault handler is installed.
> >
> > So regardless of how we fix any SEV-guest specific issues, we should
> > ensure that kexec infrastructure creates the mappings of the EFI
> > system table and the EFI config table array upfront.
>
> I think that's exactly what this patch series does.
>
> The mapping of the EFI system table is/was already present in
> map_efi_systab. Patch #1 in this series adds the efi_config_table to
> what gets mapped.
>
Excellent. Please update the commit log to make it very clear that it
is the EFI config table *array* (the GUID/pointer tuple list) that is
being mapped, without any regard for the meaning of the individual
entries.
Also, the patch seems to be lacking your signed-off-by.
> Patch #2 adds the CC blob to the identity map as well, if present,
> since if present it is also dereferenced before the page fault handler
> can be put into place. Given what's been discussed, this patch might
> not be necessary; I don't know enough to say whether kexec-ing a new
> kernel within a SEV guest makes sense. I'm pretty certain it can
> cause no harm, though.
>
I'd prefer it if that is addressed within the context of the SEV guest
work. The memory setup is quite intricate, and dealing with individual
types of EFI config tables is something we should avoid in general. I
still maintain that the best approach would be to map all of DRAM 1:1
instead of mapping patches left and right (as this is what EFI does),
but if we need to do so, let's keep it as generic as we possibly can.
> And patch #3 fixes the UV platform problem as discussed, which will
> not cause the previous reported regression if patches #1 and #2 are
> already in place.
>
I wasn't cc'ed on any of the patches so I don't know exactly what was
discussed.
Please cc me and linux-efi@ on your next revision.
On Tue, Jul 09, 2024 at 08:49:43AM +0200, Ard Biesheuvel wrote:
> On Mon, 8 Jul 2024 at 23:20, Steve Wahl <steve.wahl@hpe.com> wrote:
> >
> > On Mon, Jul 08, 2024 at 11:05:29PM +0200, Ard Biesheuvel wrote:
> > > It would be better if we did not have to rely on page fault handling
> > > to map the EFI config table array this early. This is not strictly
> > > related to SEV, but the CC blob happens to be the EFI config table
> > > that is accessed before the page fault handler is installed.
> > >
> > > So regardless of how we fix any SEV-guest specific issues, we should
> > > ensure that kexec infrastructure creates the mappings of the EFI
> > > system table and the EFI config table array upfront.
> >
> > I think that's exactly what this patch series does.
> >
> > The mapping of the EFI system table is/was already present in
> > map_efi_systab. Patch #1 in this series adds the efi_config_table to
> > what gets mapped.
> >
>
> Excellent. Please update the commit log to make it very clear that it
> is the EFI config table *array* (the GUID/pointer tuple list) that is
> being mapped, without any regard for the meaning of the individual
> entries.
>
> Also, the patch seems to be lacking your signed-off-by.
I wrote in the non-permanent-comment section of that patch:
------------------------------
> I (Steve Wahl) modified the above commit message, but did not modify
> the code. I am not clear if that requires additional Co-developed-by:
> and Signed-off-by: lines. If so, copy them from here:
>
> Co-developed-by: Steve Wahl <steve.wahl@hpe.com>
> Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
------------------------------
I take it you believe I should add them to the for-posterity portion
of the patch. I will do so.
> > Patch #2 adds the CC blob to the identity map as well, if present,
> > since if present it is also dereferenced before the page fault handler
> > can be put into place. Given what's been discussed, this patch might
> > not be necessary; I don't know enough to say whether kexec-ing a new
> > kernel within a SEV guest makes sense. I'm pretty certain it can
> > cause no harm, though.
> >
>
> I'd prefer it if that is addressed within the context of the SEV guest
> work. The memory setup is quite intricate, and dealing with individual
> types of EFI config tables is something we should avoid in general. I
> still maintain that the best approach would be to map all of DRAM 1:1
> instead of mapping patches left and right (as this is what EFI does),
> but if we need to do so, let's keep it as generic as we possibly can.
I understand. It's hard to kick yourself out of proactive mode when
you're battling a problem that you can't reproduce in your own hands.
:-) This is one reason why I kept it as a separate patch in the first
place, though.
If you will, keep in mind for the future that mapping all of DRAM 1:1,
without regards to what areas the BIOS has marked "reserved" in the
E820 tables and such, allows processor speculation into said reserved
areas, which causes system halts on our (SGI UV) platform. Patch #3
is all about this. Mapping all of DRAM *except* areas marked reserved
would work on our platform.
> > And patch #3 fixes the UV platform problem as discussed, which will
> > not cause the previous reported regression if patches #1 and #2 are
> > already in place.
> >
>
> I wasn't cc'ed on any of the patches so I don't know exactly what was
> discussed.
>
> Please cc me and linux-efi@ on your next revision.
Will do.
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
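As a rough illustration of "mapping all of DRAM *except* areas marked reserved", the C sketch below filters a firmware-style memory map down to usable RAM before anything is identity mapped. The structure and names (`struct mem_region`, `select_ram_regions`) are hypothetical and much simpler than the kernel's E820 handling; the type values follow the usual E820 convention (1 = usable RAM, 2 = reserved). It deliberately ignores page-size rounding, which a real implementation would also need to handle so that a large page does not spill into a neighbouring reserved area.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memory-map entry (not the kernel's type). */
enum mem_type { MEM_RAM = 1, MEM_RESERVED = 2 };

struct mem_region {
	uint64_t addr;
	uint64_t size;
	enum mem_type type;
};

/*
 * Copy only the usable-RAM regions of map[] into out[] and return how
 * many were copied (at most out_cap). Reserved regions are skipped, so
 * a caller that identity maps only out[] never creates a mapping that
 * was requested for a reserved area.
 */
static size_t select_ram_regions(const struct mem_region *map, size_t n,
				 struct mem_region *out, size_t out_cap)
{
	size_t count = 0;

	assert(map != NULL && out != NULL);

	for (size_t i = 0; i < n && count < out_cap; i++) {
		if (map[i].type == MEM_RAM)
			out[count++] = map[i];
	}
	return count;
}
```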
On Tue, Jul 09, 2024 at 08:49:43AM +0200, Ard Biesheuvel wrote:
> > Patch #2 adds the CC blob to the identity map as well, if present,
> > since if present it is also dereferenced before the page fault handler
> > can be put into place. Given what's been discussed, this patch might
> > not be necessary; I don't know enough to say whether kexec-ing a new
> > kernel within a SEV guest makes sense. I'm pretty certain it can
> > cause no harm, though.
No, keep it in the bag until it is really needed. No proactive "fixing".
> I'd prefer it if that is addressed within the context of the SEV guest
> work. The memory setup is quite intricate, and dealing with individual
> types of EFI config tables is something we should avoid in general. I
> still maintain that the best approach would be to map all of DRAM 1:1
> instead of mapping patches left and right (as this is what EFI does),
> but if we need to do so, let's keep it as generic as we possibly can.
Sure. There's the kink that coco guests need to accept memory first and
mapping it all is the least performant one. But we can deal with that later.
> I wasn't cc'ed on any of the patches so I don't know exactly what was
> discussed.
>
> Please cc me and linux-efi@ on your next revision.
And please update your commit messages with what was discussed on this thread.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, Jul 09, 2024 at 12:37:42PM +0200, Borislav Petkov wrote:
> On Tue, Jul 09, 2024 at 08:49:43AM +0200, Ard Biesheuvel wrote:
> > > Patch #2 adds the CC blob to the identity map as well, if present,
> > > since if present it is also dereferenced before the page fault handler
> > > can be put into place. Given what's been discussed, this patch might
> > > not be necessary; I don't know enough to say whether kexec-ing a new
> > > kernel within a SEV guest makes sense. I'm pretty certain it can
> > > cause no harm, though.
>
> No, keep it in the bag until it is really needed. No proactive "fixing".
>
> > I'd prefer it if that is addressed within the context of the SEV guest
> > work. The memory setup is quite intricate, and dealing with individual
> > types of EFI config tables is something we should avoid in general. I
> > still maintain that the best approach would be to map all of DRAM 1:1
> > instead of mapping patches left and right (as this is what EFI does),
> > but if we need to do so, let's keep it as generic as we possibly can.
>
> Sure. There's the kink that coco guests need to accept memory first and
> mapping it all is the least performant one. But we can deal with that later.
>
> > I wasn't cc'ed on any of the patches so I don't know exactly what was
> > discussed.
> >
> > Please cc me and linux-efi@ on your next revision.
>
> And please update your commit messages with what was discussed on this thread.
>
> Thx.
Thanks, Boris. I'll give it my best shot.
Next version will leave out the current patch #2, and update the
comments to include this conversation somehow summarized.
I think perhaps the cover letter was also too verbose on the history
and unintentionally hid the information necessary to understand the
situation. I will try to make it more concise.
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
On Tue, Jul 09, 2024 at 10:07:48AM -0500, Steve Wahl wrote:
> I think perhaps the cover letter was also too verbose on the history
> and unintentionally hid the information necessary to understand the
> situation. I will try to make it more concise.
Thanks.
And while we're at it, I think we should do this too.
Which should actually fix your issue too.
---
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
index cd44e120fe53..a838cad72532 100644
--- a/arch/x86/boot/compressed/sev.c
+++ b/arch/x86/boot/compressed/sev.c
@@ -484,6 +484,15 @@ static bool early_snp_init(struct boot_params *bp)
{
struct cc_blob_sev_info *cc_info;
+ /*
+ * Bail out if not running on a hypervisor (HV). If the HV
+ * doesn't set the bit, that's an easy SEV-* guest DOS but that
+ * HV has then bigger problems: the SEV-* guest simply won't
+ * start.
+ */
+ if (!(native_cpuid_ecx(1) & BIT(31)))
+ return false;
+
if (!bp)
return false;
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
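The check in the diff above reads CPUID leaf 1 and tests ECX bit 31, the "hypervisor present" bit. A minimal user-space sketch of the same test follows, assuming GCC or Clang with `<cpuid.h>` on x86; `ecx_has_hypervisor_bit()` and `running_under_hypervisor()` are illustrative names, not kernel functions, and the kernel code uses its own `native_cpuid_ecx()` helper instead.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>              /* GCC/Clang helper for the CPUID instruction */
#endif

/* CPUID.(EAX=1):ECX bit 31 is set by hypervisors for their guests. */
#define HYPERVISOR_BIT (UINT32_C(1) << 31)

/* Pure helper: does this ECX value carry the hypervisor-present bit? */
static bool ecx_has_hypervisor_bit(uint32_t ecx)
{
	return (ecx & HYPERVISOR_BIT) != 0;
}

/*
 * Query the running CPU. Returns false on bare metal, and also on
 * non-x86 builds or when leaf 1 is unavailable (treated as "no HV").
 */
static bool running_under_hypervisor(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;
	return ecx_has_hypervisor_bit(ecx);
#else
	return false;
#endif
}
```

On a bare-metal boot `running_under_hypervisor()` would return false, which corresponds to the early `return false` in the diff: the code never reaches the CC blob lookup.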
On Tue, Jul 09, 2024 at 06:46:20PM +0200, Borislav Petkov wrote:
> On Tue, Jul 09, 2024 at 10:07:48AM -0500, Steve Wahl wrote:
> > I think perhaps the cover letter was also too verbose on the history
> > and unintentionally hid the information necessary to understand the
> > situation. I will try to make it more concise.
>
> Thanks.
>
> And while we're at it, I think we should do this too.
>
> Which should actually fix your issue too.
Ok, that one is interesting. I think you are right, it will fix the
problem as we would bail before calling find_cc_blob(), which is where
we get into trouble. (That call happens just outside the context
displayed by the diff below).
I will add it.
--> Steve
> ---
> diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c
> index cd44e120fe53..a838cad72532 100644
> --- a/arch/x86/boot/compressed/sev.c
> +++ b/arch/x86/boot/compressed/sev.c
> @@ -484,6 +484,15 @@ static bool early_snp_init(struct boot_params *bp)
> {
> struct cc_blob_sev_info *cc_info;
>
> + /*
> + * Bail out if not running on a hypervisor (HV). If the HV
> + * doesn't set the bit, that's an easy SEV-* guest DOS but that
> + * HV has then bigger problems: the SEV-* guest simply won't
> + * start.
> + */
> + if (!(native_cpuid_ecx(1) & BIT(31)))
> + return false;
> +
> if (!bp)
> return false;
>
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
--
Steve Wahl, Hewlett Packard Enterprise
On Tue, Jul 09, 2024 at 11:56:26AM -0500, Steve Wahl wrote:
> I will add it.
No, don't add it. This needs to be tested properly first. I'll do a separate
patch.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, Jul 09, 2024 at 07:58:19PM +0200, Borislav Petkov wrote:
> On Tue, Jul 09, 2024 at 11:56:26AM -0500, Steve Wahl wrote:
> > I will add it.
>
> No, don't add it. This needs to be tested properly first. I'll do a separate
> patch.
OK!
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
On Wed, 3 Jul 2024 at 18:15, Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Jul 02, 2024 at 08:32:22PM +0200, Ard Biesheuvel wrote:
> > For kexec on a 64-bit system, I would expect the high-level support
> > code to be capable of simply mapping all of DRAM 1:1, rather than
> > playing these games with #PF handlers and on-demand mapping.
>
> Yeah, apparently we can't do that on SGI, as Steve said.
>
> I like the aspect that the #PF handler won't fire in the first kernel because
> of EFI mapping all RAM. That's good.
>
It won't fire because the code where this handler is being added is
never even called by EFI boot - it decompresses the kernel from the
EFI stub and jumps straight to its entrypoint.
> So we could try to wire in a #PF handler in stage1, see below.
>
Looks fine to me from EFI boot pov, for the reasons given above.
> Steve, I don't have a good idea how to test that. Maybe some of those
> reporters you were talking about, would be willing to...
>
> ---
> diff --git a/arch/x86/boot/compressed/idt_64.c b/arch/x86/boot/compressed/idt_64.c
> index d100284bbef4..a258587c8949 100644
> --- a/arch/x86/boot/compressed/idt_64.c
> +++ b/arch/x86/boot/compressed/idt_64.c
> @@ -32,6 +32,7 @@ void load_stage1_idt(void)
> {
> boot_idt_desc.address = (unsigned long)boot_idt;
>
> + set_idt_entry(X86_TRAP_PF, boot_page_fault);
>
> if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT))
> set_idt_entry(X86_TRAP_VC, boot_stage1_vc);
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
On Tue, Jul 02, 2024 at 08:32:22PM +0200, Ard Biesheuvel wrote:
> On Tue, 2 Jul 2024 at 19:45, Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Mon, Jul 01, 2024 at 04:27:04PM +0200, Borislav Petkov wrote:
> > > On Mon, Jun 24, 2024 at 10:13:44AM -0500, Steve Wahl wrote:
> > > > These accesses are a problem because they happen prior to establishing
> > > > the page fault interrupt handler that would mend the identity map. I
> > > > know very little about the AMD SEV feature but reading the code I
> > > > think it may be required to do this before setting up that handler.
> > >
> > > Yeah, from looking at it, we should be able to establish a #PF handler that
> > > early too but the devil's in the detail, especially in that early boot code.
> > >
> > > Lemme poke some things and people...
> >
> > Ard, from EFI perspective and boot services exiting, do you see any potential
> > issues if we enable a pagefault handler in load_stage1_idt() in
> > arch/x86/boot/compressed/head_64.S already or is the EFI pagetable not really
> > "reliable" then?
> >
>
> For the first boot, this shouldn't be needed - EFI maps all of RAM so
> I wouldn't expect the PF handler to fire, except when writing to code
> regions that were mapped ROX by the firmware. But even then, things
> should just keep working, although from a security pov, it would be
> better if the r/o regions remain r/o
We are looking at entering from kexec, and the identity map is not
completely filled in with all available RAM, especially if you use
smaller 2M pages to create the identity map. Patches 1 and 2 are
aiming to fill in the parts of the map we potentially use before the
page fault handler is established.
(And my overall problem is kexec creating the identity map with 1G
pages includes areas that are marked "reserved" by the BIOS, causing a
halt when speculatively accessed. This is what patch 3 addresses.)
> > Would solve the issue in this thread where the EFI config table ends up not
> > mapped on some hw configurations, elegantly...
> >
>
> The #PF handler makes sense when entering via the 32-bit entrypoint,
> where the asm can only map the lower 4G and is in no position to
> reason about where RAM lives.
>
> For kexec on a 64-bit system, I would expect the high-level support
> code to be capable of simply mapping all of DRAM 1:1, rather than
> playing these games with #PF handlers and on-demand mapping.
Currently the identity map is selectively created, and at least from
my point of view, patches 1 and 2 add in some parts that are missed
and also not covered by #PF handlers.
--> Steve
--
Steve Wahl, Hewlett Packard Enterprise
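The difference between building the identity map from 2M pages and from 1G pages can be sketched in C as follows. This is a simplified model and not the kernel's ident_map.c logic: `struct range` and `incidentally_mapped()` are hypothetical names, and the model assumes each explicitly requested range is simply rounded outward to the chosen page size. It shows why an area nobody asked for (such as the EFI config table array) can end up mapped with 1G pages but not with 2M pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/* Half-open physical address range [start, end) that was explicitly mapped. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return true if @addr falls inside any explicitly mapped range once
 * that range is rounded outward to @page_size boundaries. @page_size
 * must be a power of two.
 */
static bool incidentally_mapped(const struct range *mapped, size_t n,
				uint64_t addr, uint64_t page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t lo = mapped[i].start & ~(page_size - 1);
		uint64_t hi = (mapped[i].end + page_size - 1) & ~(page_size - 1);

		if (addr >= lo && addr < hi)
			return true;
	}
	return false;
}
```

In this model, with one explicit range at 1G..1G+2M, an address at 0x7f000000 is covered when the map is built from 1G pages and left unmapped when it is built from 2M pages. The same rounding is also what pulls "reserved" areas into a 1G mapping.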
On 6/21/2024 8:17 AM, Borislav Petkov wrote:
> On Mon, Jun 17, 2024 at 10:10:32AM -0500, Steve Wahl wrote:
>> The first, hardest step is to locate a system that is AMD based, SEV
>> capable, with a BIOS that chooses to locate the CC_BLOB at addresses
>> that do not share a 2M page with other chunks of memory the kernel
>> currently adds to the kexec identity map. I.e. This is a stroke of
>> luck,
> Ya think?
>
> It is more likely that I win the lottery than finding such a beast. ;-\
>
>> and for all I know could depend on configuration such as memory
>> size in addition to motherboard and BIOS version. However, it does
>> not seem to change from boot to boot; a system that has the problem
>> seems to be consistent about it.
>>
>> Second, boot linux including the "nogbpages" command line option.
>>
>> Third, kexec -l <kernel image> --append=<command line options>
>> --initrd=<initrd>.
>>
>> Fourth, kexec -e.
>>
>> Systems that have this problem successfully kexec without the
>> "nogbpages" parameter, but fail and do a full reboot with the
>> "nogbpages" parameter.
>>
>> I wish I could be more exact,
> Yes, this doesn't really explain what the culprit is.
>
> So, your 0th message says:
>
> "But the community chose instead to avoid referencing this memory on
> non-AMD systems where the problem was reported.
>
> commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> on non-AMD hardware")"
>
> But that patch fixes !AMD systems.
>
> Now you're basically saying that there are some AMD machines out there where
> the EFI config table doesn't get mapped because it is somewhere else, outside
> of the range of a 2M page or 1G page.
>
> Or even if it is, "nogbpages" supplied on the cmdline would cause the
> overlapping 2M and 1G mapping to not happen, leaving the EFI config table
> unmapped.
From the instructions to reproduce this issue, it looks like it is only
reproducible on some AMD systems with the "nogbpages" parameter supplied
on the kexec-ed kernel's command line, so supplying "nogbpages" is
essential to reproduce this. But "nogbpages", as mentioned, mainly makes
sense on Atom processors, and those are already safe due to the
following commit:
commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
on non-AMD hardware")
The question is why you would want to use "nogbpages" on AMD systems,
and then find one of those systems that does not have the EFI config
table mapped because it lies outside the range of a 2M or 1G page.
Thanks, Ashish
>
> Am I on the right track here?
>
On Fri, Jun 21, 2024 at 02:41:35PM -0500, Kalra, Ashish wrote:
> From the instructions to reproduce this issue, it looks like it is only
> reproducible on some AMD systems with the "nogbpages" parameter supplied
> on the kexec-ed kernel's command line, so supplying "nogbpages" is
> essential to reproduce this. But "nogbpages", as mentioned, mainly makes
> sense on Atom processors, and those are already safe due to the
> following commit:
>
> commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> on non-AMD hardware")
>
> The question is why you would want to use "nogbpages" on AMD systems,
> and then find one of those systems that does not have the EFI config
> table mapped because it lies outside the range of a 2M or 1G page.
"nogbpages" uses 2M pages exclusively and illustrates the same problem
that would/did surface with patch #3 that uses a combination of 2M and
1G pages. This is the source of the regression that caused patch #3
to be reverted after it previously got into the mainline.
I'm not sure there is currently a reason to use it other than this
illustration. If a reason to use it did arise, though, you would hit
this problem.
Thanks,
--> Steve Wahl
--
Steve Wahl, Hewlett Packard Enterprise
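To make the "stroke of luck" concrete: a region that is never explicitly added to the identity map is still reachable if it happens to share a page with something that was added, and a 1G page catches far more neighbours than a 2M page. The following is a rough user-space sketch of that effect only. The helper covered_by_rounding() and the addresses used with it are invented for illustration and do not come from any real system or from the patches.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/*
 * Hypothetical helper: if [start, end) is mapped using pages of
 * page_size, every page touched is mapped in full.  Report whether
 * addr falls inside the resulting rounded-out region.
 */
static bool covered_by_rounding(uint64_t start, uint64_t end,
				uint64_t page_size, uint64_t addr)
{
	uint64_t lo = start & ~(page_size - 1);                 /* round down */
	uint64_t hi = (end + page_size - 1) & ~(page_size - 1); /* round up */

	return addr >= lo && addr < hi;
}
```

For example, with an explicitly mapped range of 0x7f200000-0x7f400000 and a table at 0x7f7ff018, the helper returns true for SZ_1G (the table rides along inside the same 1G page) and false for SZ_2M, which is the situation where the early access would fault.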
Add Tao in the cc list.
On Tue, 21 May 2024 at 02:37, Steve Wahl <steve.wahl@hpe.com> wrote:
>
> Although there was a previous fix to avoid early kernel access to the
> EFI config table on Intel systems, the problem can still exist on AMD
> systems that support SEV (Secure Encrypted Virtualization). The
> command line option "nogbpages" brings this bug to the surface. And
> this is what caused the regression with my earlier patch that
> attempted to reduce the use of gbpages. This patch series fixes that
> problem and restores my earlier patch.
>
> The following 2 commits caused the EFI config table, and the CC_BLOB
> entry in that table, to be accessed when enabling SEV at kernel
> startup.
>
> commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> earlier during boot")
> commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> detection/setup")
>
> These accesses happen before the new kernel establishes its own
> identity map, and before establishing a routine to handle page faults.
> But the areas referenced are not explicitly added to the kexec
> identity map.
>
> This goes unnoticed when these areas happen to be placed close enough
> to other areas that are explicitly added to the identity map, but
> that is not always the case.
>
> Under certain conditions, for example Intel Atom processors that don't
> support 1GB pages, it was found that these areas don't end up mapped,
> and the SEV initialization code causes an unrecoverable page fault,
> and the kexec fails.
>
> Tao Liu had offered a patch to put the config table into the kexec
> identity map to avoid this problem:
>
> https://lore.kernel.org/all/20230601072043.24439-1-ltao@redhat.com/
>
> But the community chose instead to avoid referencing this memory on
> non-AMD systems where the problem was reported.
>
> commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> on non-AMD hardware")
>
> I later wanted to make a different change to kexec identity map
> creation, and had this patch accepted:
>
> commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
>
> but it quickly needed to be reverted because of problems on AMD systems.
>
> The reported regression problems on AMD systems were due to the above
> mentioned references to the EFI config table. In fact, on the same
> systems, the "nogbpages" command line option breaks kexec as well.
>
> So I resubmit Tao Liu's original patch that maps the EFI config
> table, add an additional patch by me that ensures that the CC blob is
> also mapped (if present), and also resubmit my earlier patch to use
> gbpages only when a full GB of space is requested to be mapped.
>
> I do not advocate for removing the earlier, non-AMD fix. With kexec,
> two different kernel versions can be in play, and the earlier fix
> still covers non-AMD systems when the kexec'd-from kernel doesn't have
> these patches applied.
>
> All three of the people who reported regression with my earlier patch
> have retested with this patch series and found it to work where my
> single patch previously did not. With current kernels, all fail to
> kexec when "nogbpages" is on the command line, but all succeed with
> "nogbpages" after the series is applied.
>
> Tao Liu (1):
> x86/kexec: Add EFI config table identity mapping for kexec kernel
>
> Steve Wahl (2):
> x86/kexec: Add EFI Confidential Computing blob to kexec identity
> mapping.
> x86/mm/ident_map: Use gbpages only where full GB page should be
> mapped.
>
> arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> arch/x86/mm/ident_map.c | 23 +++++++--
> 2 files changed, 95 insertions(+), 10 deletions(-)
>
> --
> 2.26.2
>
Cc kexec list as well.
On Thu, 23 May 2024 at 10:52, Dave Young <dyoung@redhat.com> wrote:
>
> Add Tao in the cc list.
Gentle ping. Can someone give me some feedback, please?
Thanks,
Steve Wahl, HPE.
On Thu, May 23, 2024 at 10:54:33AM +0800, Dave Young wrote:
> Cc kexec list as well.
>
> On Thu, 23 May 2024 at 10:52, Dave Young <dyoung@redhat.com> wrote:
> >
> > Add Tao in the cc list.
> >
> > On Tue, 21 May 2024 at 02:37, Steve Wahl <steve.wahl@hpe.com> wrote:
> > >
> > > Although there was a previous fix to avoid early kernel access to the
> > > EFI config table on Intel systems, the problem can still exist on AMD
> > > systems that support SEV (Secure Encrypted Virtualization). The
> > > command line option "nogbpages" brings this bug to the surface. And
> > > this is what caused the regression with my earlier patch that
> > > attempted to reduce the use of gbpages. This patch series fixes that
> > > problem and restores my earlier patch.
> > >
> > > The following 2 commits caused the EFI config table, and the CC_BLOB
> > > entry in that table, to be accessed when enabling SEV at kernel
> > > startup.
> > >
> > > commit ec1c66af3a30 ("x86/compressed/64: Detect/setup SEV/SME features
> > > earlier during boot")
> > > commit c01fce9cef84 ("x86/compressed: Add SEV-SNP feature
> > > detection/setup")
> > >
> > > These accesses happen before the new kernel establishes its own
> > > identity map, and before establishing a routine to handle page faults.
> > > But the areas referenced are not explicitly added to the kexec
> > > identity map.
> > >
> > > This goes unnoticed when these areas happen to be placed close enough
> > > to others areas that are explicitly added to the identity map, but
> > > that is not always the case.
> > >
> > > Under certain conditions, for example Intel Atom processors that don't
> > > support 1GB pages, it was found that these areas don't end up mapped,
> > > and the SEV initialization code causes an unrecoverable page fault,
> > > and the kexec fails.
> > >
> > > Tau Liu had offered a patch to put the config table into the kexec
> > > identity map to avoid this problem:
> > >
> > > https://lore.kernel.org/all/20230601072043.24439-1-ltao@redhat.com/
> > >
> > > But the community chose instead to avoid referencing this memory on
> > > non-AMD systems where the problem was reported.
> > >
> > > commit bee6cf1a80b5 ("x86/sev: Do not try to parse for the CC blob
> > > on non-AMD hardware")
> > >
> > > I later wanted to make a different change to kexec identity map
> > > creation, and had this patch accepted:
> > >
> > > commit d794734c9bbf ("x86/mm/ident_map: Use gbpages only where full GB page should be mapped.")
> > >
> > > but it quickly needed to be reverted because of problems on AMD systems.
> > >
> > > The reported regression problems on AMD systems were due to the above
> > > mentioned references to the EFI config table. In fact, on the same
> > > systems, the "nogbpages" command line option breaks kexec as well.
> > >
> > > So I resubmit Tau Liu's original patch that maps the EFI config
> > > table, add an additional patch by me that ensures that the CC blob is
> > > also mapped (if present), and also resubmit my earlier patch to use
> > > gpbages only when a full GB of space is requested to be mapped.
> > >
> > > I do not advocate for removing the earlier, non-AMD fix. With kexec,
> > > two different kernel versions can be in play, and the earlier fix
> > > still covers non-AMD systems when the kexec'd-from kernel doesn't have
> > > these patches applied.
> > >
> > > All three of the people who reported regression with my earlier patch
> > > have retested with this patch series and found it to work where my
> > > single patch previously did not. With current kernels, all fail to
> > > kexec when "nogbpages" is on the command line, but all succeed with
> > > "nogbpages" after the series is applied.
> > >
> > > Tao Liu (1):
> > > x86/kexec: Add EFI config table identity mapping for kexec kernel
> > >
> > > Steve Wahl (2):
> > > x86/kexec: Add EFI Confidential Computing blob to kexec identity
> > > mapping.
> > > x86/mm/ident_map: Use gbpages only where full GB page should be
> > > mapped.
> > >
> > > arch/x86/kernel/machine_kexec_64.c | 82 ++++++++++++++++++++++++++++--
> > > arch/x86/mm/ident_map.c | 23 +++++++--
> > > 2 files changed, 95 insertions(+), 10 deletions(-)
> > >
> > > --
> > > 2.26.2
> > >
>