[PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2

Frediano Ziglio posted 1 patch 8 months, 2 weeks ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20250217162659.151232-1-frediano.ziglio@cloud.com
There is a newer version of this series
[PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Frediano Ziglio 8 months, 2 weeks ago
Although the code is compiled with the -fpic option, data is not
position independent. As a result, data pointers become invalid if
the code is not relocated properly, which is what happens for
efi_multiboot2, which is called by the multiboot entry code.

Tested by adding
   PrintErrMesg(L"Test message", EFI_BUFFER_TOO_SMALL);
in efi_multiboot2 before the call to efi_arch_edd (this function
can potentially call PrintErrMesg).

Before the patch (XenServer installation on Qemu, xen replaced
with vanilla xen.gz):
  Booting `XenServer (Serial)'Booting `XenServer (Serial)'
  Test message: !!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
  ExceptionData - 0000000000000000  I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
  RIP  - 000000007EE21E9A, CS  - 0000000000000038, RFLAGS - 0000000000210246
  RAX  - 000000007FF0C1B5, RCX - 0000000000000050, RDX - 0000000000000010
  RBX  - 0000000000000000, RSP - 000000007FF0C180, RBP - 000000007FF0C210
  RSI  - FFFF82D040467CE8, RDI - 0000000000000000
  R8   - 000000007FF0C1C8, R9  - 000000007FF0C1C0, R10 - 0000000000000000
  R11  - 0000000000001020, R12 - FFFF82D040467CE8, R13 - 000000007FF0C1B8
  R14  - 000000007EA33328, R15 - 000000007EA332D8
  DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
  GS   - 0000000000000030, SS  - 0000000000000030
  CR0  - 0000000080010033, CR2 - FFFF82D040467CE8, CR3 - 000000007FC01000
  CR4  - 0000000000000668, CR8 - 0000000000000000
  DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
  DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
  GDTR - 000000007F9DB000 0000000000000047, LDTR - 0000000000000000
  IDTR - 000000007F48E018 0000000000000FFF,   TR - 0000000000000000
  FXSAVE_STATE - 000000007FF0BDE0
  !!!! Find image based on IP(0x7EE21E9A) (No PDB)  (ImageBase=000000007EE20000, EntryPoint=000000007EE23935) !!!!

After the patch:
  Booting `XenServer (Serial)'Booting `XenServer (Serial)'
  Test message: Buffer too small
  BdsDxe: loading Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
  BdsDxe: starting Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)

This partially rolls back commit 00d5d5ce23e6.

Fixes: 9180f5365524 ("x86: add multiboot2 protocol support for EFI platforms")
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
---
Changes since v1:
- added "Fixes:" tag;
- fixed cast style change.

Changes since v2:
- wrap long line.

Changes since v3:
- fixed "Fixes:" tag.

Changes since v4:
- use switch instead of tables.

Changes since v5:
- added -fno-jump-tables option.
---
 xen/common/efi/boot.c        | 58 ++++++++++++++++++++++++------------
 xen/common/efi/efi-common.mk |  1 +
 2 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/xen/common/efi/boot.c b/xen/common/efi/boot.c
index efbec00af9..143b5681ba 100644
--- a/xen/common/efi/boot.c
+++ b/xen/common/efi/boot.c
@@ -287,33 +287,53 @@ static bool __init match_guid(const EFI_GUID *guid1, const EFI_GUID *guid2)
 /* generic routine for printing error messages */
 static void __init PrintErrMesg(const CHAR16 *mesg, EFI_STATUS ErrCode)
 {
-    static const CHAR16* const ErrCodeToStr[] __initconstrel = {
-        [~EFI_ERROR_MASK & EFI_NOT_FOUND]           = L"Not found",
-        [~EFI_ERROR_MASK & EFI_NO_MEDIA]            = L"The device has no media",
-        [~EFI_ERROR_MASK & EFI_MEDIA_CHANGED]       = L"Media changed",
-        [~EFI_ERROR_MASK & EFI_DEVICE_ERROR]        = L"Device error",
-        [~EFI_ERROR_MASK & EFI_VOLUME_CORRUPTED]    = L"Volume corrupted",
-        [~EFI_ERROR_MASK & EFI_ACCESS_DENIED]       = L"Access denied",
-        [~EFI_ERROR_MASK & EFI_OUT_OF_RESOURCES]    = L"Out of resources",
-        [~EFI_ERROR_MASK & EFI_VOLUME_FULL]         = L"Volume is full",
-        [~EFI_ERROR_MASK & EFI_SECURITY_VIOLATION]  = L"Security violation",
-        [~EFI_ERROR_MASK & EFI_CRC_ERROR]           = L"CRC error",
-        [~EFI_ERROR_MASK & EFI_COMPROMISED_DATA]    = L"Compromised data",
-        [~EFI_ERROR_MASK & EFI_BUFFER_TOO_SMALL]    = L"Buffer too small",
-    };
-    EFI_STATUS ErrIdx = ErrCode & ~EFI_ERROR_MASK;
-
     StdOut = StdErr;
     PrintErr(mesg);
     PrintErr(L": ");
 
-    if( (ErrIdx < ARRAY_SIZE(ErrCodeToStr)) && ErrCodeToStr[ErrIdx] )
-        mesg = ErrCodeToStr[ErrIdx];
-    else
+    switch (ErrCode)
     {
+    case EFI_NOT_FOUND:
+        mesg = L"Not found";
+        break;
+    case EFI_NO_MEDIA:
+        mesg = L"The device has no media";
+        break;
+    case EFI_MEDIA_CHANGED:
+        mesg = L"Media changed";
+        break;
+    case EFI_DEVICE_ERROR:
+        mesg = L"Device error";
+        break;
+    case EFI_VOLUME_CORRUPTED:
+        mesg = L"Volume corrupted";
+        break;
+    case EFI_ACCESS_DENIED:
+        mesg = L"Access denied";
+        break;
+    case EFI_OUT_OF_RESOURCES:
+        mesg = L"Out of resources";
+        break;
+    case EFI_VOLUME_FULL:
+        mesg = L"Volume is full";
+        break;
+    case EFI_SECURITY_VIOLATION:
+        mesg = L"Security violation";
+        break;
+    case EFI_CRC_ERROR:
+        mesg = L"CRC error";
+        break;
+    case EFI_COMPROMISED_DATA:
+        mesg = L"Compromised data";
+        break;
+    case EFI_BUFFER_TOO_SMALL:
+        mesg = L"Buffer too small";
+        break;
+    default:
         PrintErr(L"ErrCode: ");
         DisplayUint(ErrCode, 0);
         mesg = NULL;
+        break;
     }
     blexit(mesg);
 }
diff --git a/xen/common/efi/efi-common.mk b/xen/common/efi/efi-common.mk
index 23cafcf20c..06b1c19674 100644
--- a/xen/common/efi/efi-common.mk
+++ b/xen/common/efi/efi-common.mk
@@ -2,6 +2,7 @@ EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
 EFIOBJ-$(CONFIG_COMPAT) += compat.o
 
 CFLAGS-y += -fshort-wchar
+CFLAGS-y += -fno-jump-tables
 CFLAGS-y += -iquote $(srctree)/common/efi
 CFLAGS-y += -iquote $(srcdir)
 
-- 
2.34.1
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Marek Marczykowski-Górecki 8 months, 1 week ago
On Mon, Feb 17, 2025 at 04:26:59PM +0000, Frediano Ziglio wrote:
> Although the code is compiled with the -fpic option, data is not
> position independent. As a result, data pointers become invalid if
> the code is not relocated properly, which is what happens for
> efi_multiboot2, which is called by the multiboot entry code.
> 
> Tested by adding
>    PrintErrMesg(L"Test message", EFI_BUFFER_TOO_SMALL);
> in efi_multiboot2 before the call to efi_arch_edd (this function
> can potentially call PrintErrMesg).
> 
> Before the patch (XenServer installation on Qemu, xen replaced
> with vanilla xen.gz):
>   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
>   Test message: !!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
>   ExceptionData - 0000000000000000  I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
>   RIP  - 000000007EE21E9A, CS  - 0000000000000038, RFLAGS - 0000000000210246
>   RAX  - 000000007FF0C1B5, RCX - 0000000000000050, RDX - 0000000000000010
>   RBX  - 0000000000000000, RSP - 000000007FF0C180, RBP - 000000007FF0C210
>   RSI  - FFFF82D040467CE8, RDI - 0000000000000000
>   R8   - 000000007FF0C1C8, R9  - 000000007FF0C1C0, R10 - 0000000000000000
>   R11  - 0000000000001020, R12 - FFFF82D040467CE8, R13 - 000000007FF0C1B8
>   R14  - 000000007EA33328, R15 - 000000007EA332D8
>   DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
>   GS   - 0000000000000030, SS  - 0000000000000030
>   CR0  - 0000000080010033, CR2 - FFFF82D040467CE8, CR3 - 000000007FC01000
>   CR4  - 0000000000000668, CR8 - 0000000000000000
>   DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
>   DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
>   GDTR - 000000007F9DB000 0000000000000047, LDTR - 0000000000000000
>   IDTR - 000000007F48E018 0000000000000FFF,   TR - 0000000000000000
>   FXSAVE_STATE - 000000007FF0BDE0
>   !!!! Find image based on IP(0x7EE21E9A) (No PDB)  (ImageBase=000000007EE20000, EntryPoint=000000007EE23935) !!!!
> 
> After the patch:
>   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
>   Test message: Buffer too small
>   BdsDxe: loading Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
>   BdsDxe: starting Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
> 
> This partially rolls back commit 00d5d5ce23e6.
> 
> Fixes: 9180f5365524 ("x86: add multiboot2 protocol support for EFI platforms")
> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>

I tried testing this patch, but it seems I cannot reproduce the original
failure...

I did as the commit message suggests here:
https://gitlab.com/xen-project/people/marmarek/xen/-/commit/ca3d6911c448eb886990f33d4380b5646617a982

With blexit() in PrintErrMesg(), it went back to the bootloader, so I'm
sure this code path was reached. But with blexit() commented out, Xen
started correctly both with and without this patch... The branch I used
is here:
https://gitlab.com/xen-project/people/marmarek/xen/-/commits/automation-tests?ref_type=heads

Is there some extra condition needed to reproduce the issue? Maybe it
depends on the compiler version? I guess I can also try on QEMU, but based
on the description, I would expect it to crash in any case.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Frediano Ziglio 8 months, 1 week ago
On Fri, Feb 21, 2025 at 8:20 PM Marek Marczykowski-Górecki
<marmarek@invisiblethingslab.com> wrote:
>
> I tried testing this patch, but it seems I cannot reproduce the original
> failure...
>
> I did as the commit message suggests here:
> https://gitlab.com/xen-project/people/marmarek/xen/-/commit/ca3d6911c448eb886990f33d4380b5646617a982
>
> With blexit() in PrintErrMesg(), it went back to the bootloader, so I'm
> sure this code path was reached. But with blexit() commented out, Xen
> started correctly both with and without this patch... The branch I used
> is here:
> https://gitlab.com/xen-project/people/marmarek/xen/-/commits/automation-tests?ref_type=heads
>
> Is there some extra condition needed to reproduce the issue? Maybe it
> depends on the compiler version? I guess I can also try on QEMU, but based
> on the description, I would expect it to crash in any case.
>

Did you see the correct message in both cases?
Did you use Grub or direct EFI?

With Grub and without this patch you won't see the message; with Grub
and the patch applied you see the correct message.

Frediano

Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Marek Marczykowski-Górecki 8 months, 1 week ago
On Mon, Feb 24, 2025 at 12:57:13PM +0000, Frediano Ziglio wrote:
> Did you see the correct message in both cases?
> Did you use Grub or direct EFI?
> 
> With Grub and without this patch you won't see the message; with Grub
> and the patch applied you see the correct message.

I did use grub, and indeed I didn't see the message.
But in the case where it was supposed to crash (with the added
PrintErrMesg(), blexit() commented out, and without your patch) it did
_not_ crash and continued to a normal boot. Is that #PF non-fatal here?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Frediano Ziglio 8 months, 1 week ago
On Mon, Feb 24, 2025 at 1:16 PM Marek Marczykowski-Górecki
<marmarek@invisiblethingslab.com> wrote:
>
> I did use grub, and indeed I didn't see the message.
> But in the case where it was supposed to crash (with the added
> PrintErrMesg(), blexit() commented out, and without your patch) it did
> _not_ crash and continued to a normal boot. Is that #PF non-fatal here?
>

Hi,
   I tried again with my test environment.
With the PrintErrMesg line added before the efi_arch_edd call, I get a #PF
and, in my case, the system hangs. With the fix applied, the machine
reboots and I can see the message in the logs.
I'm testing with Xen started inside Qemu with EFI firmware, and xen.gz
compiled as an ELF file. The host system is Ubuntu 22.04.5 LTS. Gcc is
version 11.4.

Regards,
   Frediano
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Marek Marczykowski-Górecki 8 months, 1 week ago
On Mon, Feb 24, 2025 at 02:31:00PM +0000, Frediano Ziglio wrote:
> On Mon, Feb 24, 2025 at 1:16 PM Marek Marczykowski-Górecki
> <marmarek@invisiblethingslab.com> wrote:
> >
> > On Mon, Feb 24, 2025 at 12:57:13PM +0000, Frediano Ziglio wrote:
> > > On Fri, Feb 21, 2025 at 8:20 PM Marek Marczykowski-Górecki
> > > <marmarek@invisiblethingslab.com> wrote:
> > > >
> > > > On Mon, Feb 17, 2025 at 04:26:59PM +0000, Frediano Ziglio wrote:
> > > > > Although code is compiled with -fpic option data is not position
> > > > > independent. This causes data pointer to become invalid if
> > > > > code is not relocated properly which is what happens for
> > > > > efi_multiboot2 which is called by multiboot entry code.
> > > > >
> > > > > Code tested adding
> > > > >    PrintErrMesg(L"Test message", EFI_BUFFER_TOO_SMALL);
> > > > > in efi_multiboot2 before calling efi_arch_edd (this function
> > > > > can potentially call PrintErrMesg).
> > > > >
> > > > > Before the patch (XenServer installation on Qemu, xen replaced
> > > > > with vanilla xen.gz):
> > > > >   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
> > > > >   Test message: !!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
> > > > >   ExceptionData - 0000000000000000  I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
> > > > >   RIP  - 000000007EE21E9A, CS  - 0000000000000038, RFLAGS - 0000000000210246
> > > > >   RAX  - 000000007FF0C1B5, RCX - 0000000000000050, RDX - 0000000000000010
> > > > >   RBX  - 0000000000000000, RSP - 000000007FF0C180, RBP - 000000007FF0C210
> > > > >   RSI  - FFFF82D040467CE8, RDI - 0000000000000000
> > > > >   R8   - 000000007FF0C1C8, R9  - 000000007FF0C1C0, R10 - 0000000000000000
> > > > >   R11  - 0000000000001020, R12 - FFFF82D040467CE8, R13 - 000000007FF0C1B8
> > > > >   R14  - 000000007EA33328, R15 - 000000007EA332D8
> > > > >   DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
> > > > >   GS   - 0000000000000030, SS  - 0000000000000030
> > > > >   CR0  - 0000000080010033, CR2 - FFFF82D040467CE8, CR3 - 000000007FC01000
> > > > >   CR4  - 0000000000000668, CR8 - 0000000000000000
> > > > >   DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
> > > > >   DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
> > > > >   GDTR - 000000007F9DB000 0000000000000047, LDTR - 0000000000000000
> > > > >   IDTR - 000000007F48E018 0000000000000FFF,   TR - 0000000000000000
> > > > >   FXSAVE_STATE - 000000007FF0BDE0
> > > > >   !!!! Find image based on IP(0x7EE21E9A) (No PDB)  (ImageBase=000000007EE20000, EntryPoint=000000007EE23935) !!!!
> > > > >
> > > > > After the patch:
> > > > >   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
> > > > >   Test message: Buffer too small
> > > > >   BdsDxe: loading Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
> > > > >   BdsDxe: starting Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
> > > > >
> > > > > This partially rollback commit 00d5d5ce23e6.
> > > > >
> > > > > Fixes: 9180f5365524 ("x86: add multiboot2 protocol support for EFI platforms")
> > > > > Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
> > > >
> > > > I tried testing this patch, but it seems I cannot reproduce the original
> > > > failure...
> > > >
> > > > I did as the commit message suggests here:
> > > > https://gitlab.com/xen-project/people/marmarek/xen/-/commit/ca3d6911c448eb886990f33d4380b5646617a982
> > > >
> > > > With blexit() in PrintErrMesg(), it went back to the bootloader, so I'm
> > > > sure this code path was reached. But with blexit() commented out, Xen
> > > > started correctly both with and without this patch... The branch I used
> > > > is here:
> > > > https://gitlab.com/xen-project/people/marmarek/xen/-/commits/automation-tests?ref_type=heads
> > > >
> > > > Are there some extra condition to reproduce the issue? Maybe it depends
> > > > on the compiler version? I guess I can try also on QEMU, but based on
> > > > the description, I would expect it to crash in any case.
> > > >
> > >
> > > Did you see the correct message in both cases?
> > > Did you use Grub or direct EFI?
> > >
> > > With Grub and without this patch you won't see the message, with grub
> > > with the patch you see the correct message.
> >
> > I did use grub, and I didn't see the message indeed.
> > But in the case it was supposed to crash (with added PrintErrMesg(),
> > commented out blexit and without your patch) it did _not_ crashed and
> > continued to normal boot. Is that #PF non-fatal here?
> >
> 
> Hi,
>    I tried again with my test environment.
> Added the PrintErrMesg line before efi_arch_edd call, I got a #PF, in
> my case the system hangs. With the fix patch machine is rebooting and
> I can see the message in the logs.
> I'm trying with Xen starting inside Qemu, EFI firmware, xen.gz
> compiled as ELF file. Host system is an Ubuntu 22.04.5 LTS. Gcc is
> version 11.4.

My test was wrong: commenting out blexit made the "mesg" variable unused.
After fixing that, I can reproduce it on both QEMU and real hardware:
without your patch it crashes, and with your patch it works just fine.
While there may be more places with a similar issue, this patch clearly
improves the situation, so:

Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
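
For readers following the thread, the patch description's point (code is built with -fpic, but data is not position independent) can be illustrated with a small C sketch. This is not the Xen code: the enum, the strings and both function names are hypothetical, and the comments describe typical compiler behaviour rather than a guarantee. The table form stores string addresses in data, so it relies on relocations having been applied. The switch form takes the addresses in code, which a -fpic build can usually compute relative to the instruction pointer. A compiler may still lower a dense switch to a jump table, which is presumably why -fno-jump-tables comes up in this thread.

```c
#include <stddef.h>

/* Hypothetical status codes, standing in for EFI_STATUS values. */
enum status { ST_OK, ST_BUFFER_TOO_SMALL, ST_NOT_FOUND };

/*
 * Table form: the array lives in a data section and every slot holds the
 * absolute address of a string. Each slot needs a relocation, so the
 * pointers are only valid once the image has been relocated.
 */
static const char *const msg_table[] = {
    [ST_OK]               = "Success",
    [ST_BUFFER_TOO_SMALL] = "Buffer too small",
    [ST_NOT_FOUND]        = "Not found",
};

static const char *msg_from_table(enum status st)
{
    if ( (size_t)st >= sizeof(msg_table) / sizeof(msg_table[0]) )
        return "Unknown";

    return msg_table[st];
}

/*
 * Switch form: each return takes the address of a string literal directly
 * in code. With -fpic this is typically an instruction-pointer-relative
 * address computation, so no pointer stored in data is involved.
 */
static const char *msg_from_switch(enum status st)
{
    switch ( st )
    {
    case ST_OK:               return "Success";
    case ST_BUFFER_TOO_SMALL: return "Buffer too small";
    case ST_NOT_FOUND:        return "Not found";
    }

    return "Unknown";
}
```

Both helpers return the same text for the same code (for example "Buffer too small" for ST_BUFFER_TOO_SMALL); the difference is only in where the string addresses are kept, which is what matters when the image runs at an address it was not linked for.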
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Jan Beulich 7 months, 4 weeks ago
On 26.02.2025 19:54, Marek Marczykowski-Górecki wrote:
> My test was wrong, commenting out blexit made "mesg" variable unused.
> After fixing that, I can reproduce it on both QEMU and real hardware:
> without your patch it crashes and with your patch it works just fine.
> While there may be more places with similar issue, this patch clearly
> improves the situation, so:
> 
> Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

This had to be reverted, for breaking the build with old Clang. See the
respective Matrix conversation.

Jan


Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Frediano Ziglio 7 months, 4 weeks ago
On Thu, Mar 6, 2025 at 2:26 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 26.02.2025 19:54, Marek Marczykowski-Górecki wrote:
> > My test was wrong, commenting out blexit made "mesg" variable unused.
> > After fixing that, I can reproduce it on both QEMU and real hardware:
> > without your patch it crashes and with your patch it works just fine.
> > While there may be more places with similar issue, this patch clearly
> > improves the situation, so:
> >
> > Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
>
> This had to be reverted, for breaking the build with old Clang. See the
> respective Matrix conversation.
>
> Jan
>

To sum up, the failure is:

    clang: error: unknown argument: '-fno-jump-tables'

Frediano
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Frediano Ziglio 7 months, 2 weeks ago
On Thu, Mar 6, 2025 at 3:02 PM Frediano Ziglio
<frediano.ziglio@cloud.com> wrote:
>
> On Thu, Mar 6, 2025 at 2:26 PM Jan Beulich <jbeulich@suse.com> wrote:
> >
> > On 26.02.2025 19:54, Marek Marczykowski-Górecki wrote:
> > > My test was wrong, commenting out blexit made "mesg" variable unused.
> > > After fixing that, I can reproduce it on both QEMU and real hardware:
> > > without your patch it crashes and with your patch it works just fine.
> > > While there may be more places with similar issue, this patch clearly
> > > improves the situation, so:
> > >
> > > Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> >
> > This had to be reverted, for breaking the build with old Clang. See the
> > respective Matrix conversation.
> >
> > Jan
> >
>
> To sum up the failure is:
>
>     clang: error: unknown argument: '-fno-jump-tables'
>

Now that the minimum clang version supports this option, can this
change be applied?

Frediano
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Jan Beulich 7 months, 2 weeks ago
On 20.03.2025 15:33, Frediano Ziglio wrote:
> On Thu, Mar 6, 2025 at 3:02 PM Frediano Ziglio
> <frediano.ziglio@cloud.com> wrote:
>>
>> On Thu, Mar 6, 2025 at 2:26 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>
>>> On 26.02.2025 19:54, Marek Marczykowski-Górecki wrote:
>>>> My test was wrong, commenting out blexit made "mesg" variable unused.
>>>> After fixing that, I can reproduce it on both QEMU and real hardware:
>>>> without your patch it crashes and with your patch it works just fine.
>>>> While there may be more places with similar issue, this patch clearly
>>>> improves the situation, so:
>>>>
>>>> Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
>>>
>>> This had to be reverted, for breaking the build with old Clang. See the
>>> respective Matrix conversation.
>>
>> To sum up the failure is:
>>
>>     clang: error: unknown argument: '-fno-jump-tables'
> 
> Now that the minimum clang version supports this option, can this
> change be applied?

Not sure. I for one would expect that we actively reject building with
too old tool chains then, which is yet to be carried out. Plus I think
you'd want to re-submit, with all tags dropped. The change was wrong to
go in at that earlier point, and hence any such tags weren't quite
accurate.

Jan

Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Frediano Ziglio 7 months, 2 weeks ago
On Thu, Mar 20, 2025 at 3:15 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 20.03.2025 15:33, Frediano Ziglio wrote:
> > On Thu, Mar 6, 2025 at 3:02 PM Frediano Ziglio
> > <frediano.ziglio@cloud.com> wrote:
> >>
> >> On Thu, Mar 6, 2025 at 2:26 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>
> >>> On 26.02.2025 19:54, Marek Marczykowski-Górecki wrote:
> >>>> On Mon, Feb 24, 2025 at 02:31:00PM +0000, Frediano Ziglio wrote:
> >>>>> On Mon, Feb 24, 2025 at 1:16 PM Marek Marczykowski-Górecki
> >>>>> <marmarek@invisiblethingslab.com> wrote:
> >>>>>>
> >>>>>> On Mon, Feb 24, 2025 at 12:57:13PM +0000, Frediano Ziglio wrote:
> >>>>>>> On Fri, Feb 21, 2025 at 8:20 PM Marek Marczykowski-Górecki
> >>>>>>> <marmarek@invisiblethingslab.com> wrote:
> >>>>>>>>
> >>>>>>>> On Mon, Feb 17, 2025 at 04:26:59PM +0000, Frediano Ziglio wrote:
> >>>>>>>>> Although code is compiled with -fpic option data is not position
> >>>>>>>>> independent. This causes data pointer to become invalid if
> >>>>>>>>> code is not relocated properly which is what happens for
> >>>>>>>>> efi_multiboot2 which is called by multiboot entry code.
> >>>>>>>>>
> >>>>>>>>> Code tested adding
> >>>>>>>>>    PrintErrMesg(L"Test message", EFI_BUFFER_TOO_SMALL);
> >>>>>>>>> in efi_multiboot2 before calling efi_arch_edd (this function
> >>>>>>>>> can potentially call PrintErrMesg).
> >>>>>>>>>
> >>>>>>>>> Before the patch (XenServer installation on Qemu, xen replaced
> >>>>>>>>> with vanilla xen.gz):
> >>>>>>>>>   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
> >>>>>>>>>   Test message: !!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
> >>>>>>>>>   ExceptionData - 0000000000000000  I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
> >>>>>>>>>   RIP  - 000000007EE21E9A, CS  - 0000000000000038, RFLAGS - 0000000000210246
> >>>>>>>>>   RAX  - 000000007FF0C1B5, RCX - 0000000000000050, RDX - 0000000000000010
> >>>>>>>>>   RBX  - 0000000000000000, RSP - 000000007FF0C180, RBP - 000000007FF0C210
> >>>>>>>>>   RSI  - FFFF82D040467CE8, RDI - 0000000000000000
> >>>>>>>>>   R8   - 000000007FF0C1C8, R9  - 000000007FF0C1C0, R10 - 0000000000000000
> >>>>>>>>>   R11  - 0000000000001020, R12 - FFFF82D040467CE8, R13 - 000000007FF0C1B8
> >>>>>>>>>   R14  - 000000007EA33328, R15 - 000000007EA332D8
> >>>>>>>>>   DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
> >>>>>>>>>   GS   - 0000000000000030, SS  - 0000000000000030
> >>>>>>>>>   CR0  - 0000000080010033, CR2 - FFFF82D040467CE8, CR3 - 000000007FC01000
> >>>>>>>>>   CR4  - 0000000000000668, CR8 - 0000000000000000
> >>>>>>>>>   DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
> >>>>>>>>>   DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
> >>>>>>>>>   GDTR - 000000007F9DB000 0000000000000047, LDTR - 0000000000000000
> >>>>>>>>>   IDTR - 000000007F48E018 0000000000000FFF,   TR - 0000000000000000
> >>>>>>>>>   FXSAVE_STATE - 000000007FF0BDE0
> >>>>>>>>>   !!!! Find image based on IP(0x7EE21E9A) (No PDB)  (ImageBase=000000007EE20000, EntryPoint=000000007EE23935) !!!!
> >>>>>>>>>
> >>>>>>>>> After the patch:
> >>>>>>>>>   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
> >>>>>>>>>   Test message: Buffer too small
> >>>>>>>>>   BdsDxe: loading Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
> >>>>>>>>>   BdsDxe: starting Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
> >>>>>>>>>
> >>>>>>>>> This partially rollback commit 00d5d5ce23e6.
> >>>>>>>>>
> >>>>>>>>> Fixes: 9180f5365524 ("x86: add multiboot2 protocol support for EFI platforms")
> >>>>>>>>> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
> >>>>>>>>
> >>>>>>>> I tried testing this patch, but it seems I cannot reproduce the original
> >>>>>>>> failure...
> >>>>>>>>
> >>>>>>>> I did as the commit message suggests here:
> >>>>>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/commit/ca3d6911c448eb886990f33d4380b5646617a982
> >>>>>>>>
> >>>>>>>> With blexit() in PrintErrMesg(), it went back to the bootloader, so I'm
> >>>>>>>> sure this code path was reached. But with blexit() commented out, Xen
> >>>>>>>> started correctly both with and without this patch... The branch I used
> >>>>>>>> is here:
> >>>>>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/commits/automation-tests?ref_type=heads
> >>>>>>>>
> >>>>>>>> Are there some extra condition to reproduce the issue? Maybe it depends
> >>>>>>>> on the compiler version? I guess I can try also on QEMU, but based on
> >>>>>>>> the description, I would expect it to crash in any case.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Did you see the correct message in both cases?
> >>>>>>> Did you use Grub or direct EFI?
> >>>>>>>
> >>>>>>> With Grub and without this patch you won't see the message, with grub
> >>>>>>> with the patch you see the correct message.
> >>>>>>
> >>>>>> I did use grub, and I didn't see the message indeed.
> >>>>>> But in the case it was supposed to crash (with added PrintErrMesg(),
> >>>>>> commented out blexit and without your patch) it did _not_ crashed and
> >>>>>> continued to normal boot. Is that #PF non-fatal here?
> >>>>>>
> >>>>>
> >>>>> Hi,
> >>>>>    I tried again with my test environment.
> >>>>> Added the PrintErrMesg line before efi_arch_edd call, I got a #PF, in
> >>>>> my case the system hangs. With the fix patch machine is rebooting and
> >>>>> I can see the message in the logs.
> >>>>> I'm trying with Xen starting inside Qemu, EFI firmware, xen.gz
> >>>>> compiled as ELF file. Host system is an Ubuntu 22.04.5 LTS. Gcc is
> >>>>> version 11.4.
> >>>>
> >>>> My test was wrong, commenting out blexit made "mesg" variable unused.
> >>>> After fixing that, I can reproduce it on both QEMU and real hardware:
> >>>> without your patch it crashes and with your patch it works just fine.
> >>>> While there may be more places with similar issue, this patch clearly
> >>>> improves the situation, so:
> >>>>
> >>>> Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> >>>
> >>> This had to be reverted, for breaking the build with old Clang. See the
> >>> respective Matrix conversation.
> >>
> >> To sum up the failure is:
> >>
> >>     clang: error: unknown argument: '-fno-jump-tables'
> >
> > Now that the minimum clang version supports this option, can this
> > change be applied?
>
> Not sure. I for one would expect that we actively reject building with
> too old tool chains then, which is yet to be carried out. Plus I think
> you'd want to re-submit, with all tags dropped. The change was wrong to
> go in at that earlier point, and hence any such tags weren't quite
> accurate.
>
> Jan

Hi,
  I'm not sure what you mean by "tags" in the above sentence. Git tags?
I'm not sure we need to carry on using old tool chains if we decide to
bump the minimum versions.

Frediano
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Jan Beulich 7 months, 2 weeks ago
On 20.03.2025 21:10, Frediano Ziglio wrote:
> On Thu, Mar 20, 2025 at 3:15 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 20.03.2025 15:33, Frediano Ziglio wrote:
>>> On Thu, Mar 6, 2025 at 3:02 PM Frediano Ziglio
>>> <frediano.ziglio@cloud.com> wrote:
>>>>
>>>> On Thu, Mar 6, 2025 at 2:26 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>
>>>>> On 26.02.2025 19:54, Marek Marczykowski-Górecki wrote:
>>>>>> On Mon, Feb 24, 2025 at 02:31:00PM +0000, Frediano Ziglio wrote:
>>>>>>> On Mon, Feb 24, 2025 at 1:16 PM Marek Marczykowski-Górecki
>>>>>>> <marmarek@invisiblethingslab.com> wrote:
>>>>>>>>
>>>>>>>> On Mon, Feb 24, 2025 at 12:57:13PM +0000, Frediano Ziglio wrote:
>>>>>>>>> On Fri, Feb 21, 2025 at 8:20 PM Marek Marczykowski-Górecki
>>>>>>>>> <marmarek@invisiblethingslab.com> wrote:
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 17, 2025 at 04:26:59PM +0000, Frediano Ziglio wrote:
>>>>>>>>>>> Although code is compiled with -fpic option data is not position
>>>>>>>>>>> independent. This causes data pointer to become invalid if
>>>>>>>>>>> code is not relocated properly which is what happens for
>>>>>>>>>>> efi_multiboot2 which is called by multiboot entry code.
>>>>>>>>>>>
>>>>>>>>>>> Code tested adding
>>>>>>>>>>>    PrintErrMesg(L"Test message", EFI_BUFFER_TOO_SMALL);
>>>>>>>>>>> in efi_multiboot2 before calling efi_arch_edd (this function
>>>>>>>>>>> can potentially call PrintErrMesg).
>>>>>>>>>>>
>>>>>>>>>>> Before the patch (XenServer installation on Qemu, xen replaced
>>>>>>>>>>> with vanilla xen.gz):
>>>>>>>>>>>   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
>>>>>>>>>>>   Test message: !!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
>>>>>>>>>>>   ExceptionData - 0000000000000000  I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
>>>>>>>>>>>   RIP  - 000000007EE21E9A, CS  - 0000000000000038, RFLAGS - 0000000000210246
>>>>>>>>>>>   RAX  - 000000007FF0C1B5, RCX - 0000000000000050, RDX - 0000000000000010
>>>>>>>>>>>   RBX  - 0000000000000000, RSP - 000000007FF0C180, RBP - 000000007FF0C210
>>>>>>>>>>>   RSI  - FFFF82D040467CE8, RDI - 0000000000000000
>>>>>>>>>>>   R8   - 000000007FF0C1C8, R9  - 000000007FF0C1C0, R10 - 0000000000000000
>>>>>>>>>>>   R11  - 0000000000001020, R12 - FFFF82D040467CE8, R13 - 000000007FF0C1B8
>>>>>>>>>>>   R14  - 000000007EA33328, R15 - 000000007EA332D8
>>>>>>>>>>>   DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
>>>>>>>>>>>   GS   - 0000000000000030, SS  - 0000000000000030
>>>>>>>>>>>   CR0  - 0000000080010033, CR2 - FFFF82D040467CE8, CR3 - 000000007FC01000
>>>>>>>>>>>   CR4  - 0000000000000668, CR8 - 0000000000000000
>>>>>>>>>>>   DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
>>>>>>>>>>>   DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
>>>>>>>>>>>   GDTR - 000000007F9DB000 0000000000000047, LDTR - 0000000000000000
>>>>>>>>>>>   IDTR - 000000007F48E018 0000000000000FFF,   TR - 0000000000000000
>>>>>>>>>>>   FXSAVE_STATE - 000000007FF0BDE0
>>>>>>>>>>>   !!!! Find image based on IP(0x7EE21E9A) (No PDB)  (ImageBase=000000007EE20000, EntryPoint=000000007EE23935) !!!!
>>>>>>>>>>>
>>>>>>>>>>> After the patch:
>>>>>>>>>>>   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
>>>>>>>>>>>   Test message: Buffer too small
>>>>>>>>>>>   BdsDxe: loading Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
>>>>>>>>>>>   BdsDxe: starting Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
>>>>>>>>>>>
>>>>>>>>>>> This partially rollback commit 00d5d5ce23e6.
>>>>>>>>>>>
>>>>>>>>>>> Fixes: 9180f5365524 ("x86: add multiboot2 protocol support for EFI platforms")
>>>>>>>>>>> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
>>>>>>>>>>
>>>>>>>>>> I tried testing this patch, but it seems I cannot reproduce the original
>>>>>>>>>> failure...
>>>>>>>>>>
>>>>>>>>>> I did as the commit message suggests here:
>>>>>>>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/commit/ca3d6911c448eb886990f33d4380b5646617a982
>>>>>>>>>>
>>>>>>>>>> With blexit() in PrintErrMesg(), it went back to the bootloader, so I'm
>>>>>>>>>> sure this code path was reached. But with blexit() commented out, Xen
>>>>>>>>>> started correctly both with and without this patch... The branch I used
>>>>>>>>>> is here:
>>>>>>>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/commits/automation-tests?ref_type=heads
>>>>>>>>>>
>>>>>>>>>> Are there some extra condition to reproduce the issue? Maybe it depends
>>>>>>>>>> on the compiler version? I guess I can try also on QEMU, but based on
>>>>>>>>>> the description, I would expect it to crash in any case.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Did you see the correct message in both cases?
>>>>>>>>> Did you use Grub or direct EFI?
>>>>>>>>>
>>>>>>>>> With Grub and without this patch you won't see the message, with grub
>>>>>>>>> with the patch you see the correct message.
>>>>>>>>
>>>>>>>> I did use grub, and I didn't see the message indeed.
>>>>>>>> But in the case it was supposed to crash (with added PrintErrMesg(),
>>>>>>>> commented out blexit and without your patch) it did _not_ crashed and
>>>>>>>> continued to normal boot. Is that #PF non-fatal here?
>>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>    I tried again with my test environment.
>>>>>>> Added the PrintErrMesg line before efi_arch_edd call, I got a #PF, in
>>>>>>> my case the system hangs. With the fix patch machine is rebooting and
>>>>>>> I can see the message in the logs.
>>>>>>> I'm trying with Xen starting inside Qemu, EFI firmware, xen.gz
>>>>>>> compiled as ELF file. Host system is an Ubuntu 22.04.5 LTS. Gcc is
>>>>>>> version 11.4.
>>>>>>
>>>>>> My test was wrong, commenting out blexit made "mesg" variable unused.
>>>>>> After fixing that, I can reproduce it on both QEMU and real hardware:
>>>>>> without your patch it crashes and with your patch it works just fine.
>>>>>> While there may be more places with similar issue, this patch clearly
>>>>>> improves the situation, so:
>>>>>>
>>>>>> Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
>>>>>
>>>>> This had to be reverted, for breaking the build with old Clang. See the
>>>>> respective Matrix conversation.
>>>>
>>>> To sum up the failure is:
>>>>
>>>>     clang: error: unknown argument: '-fno-jump-tables'
>>>
>>> Now that the minimum clang version supports this option, can this
>>> change be applied?
>>
>> Not sure. I for one would expect that we actively reject building with
>> too old tool chains then, which is yet to be carried out. Plus I think
>> you'd want to re-submit, with all tags dropped. The change was wrong to
>> go in at that earlier point, and hence any such tags weren't quite
>> accurate.
> 
>   not sure what you intend with "tags" in the above sentence. Git tags ?

Acks and R-b-s.

> Not sure we need to carry on using old tool chains if we decide to
> bump the minimal versions.

I fear I don't understand this remark in this context. In any event,
Andrew meanwhile has sent a patch to the effect of what my comment was
saying.

Jan
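
As an illustration of "actively reject building with too old tool chains", a minimal C sketch of a compile-time check follows. This is not Andrew's patch. The macro DEMO_MIN_CLANG_MAJOR and its value are hypothetical placeholders, and demo_toolchain_accepted() exists only so the fragment has something to call.

```c
/*
 * Hypothetical placeholder: the real minimum is whatever the project
 * documents, not necessarily this number.
 */
#define DEMO_MIN_CLANG_MAJOR 11

/* Stop the build early instead of failing later on an unknown option. */
#if defined(__clang__) && (__clang_major__ < DEMO_MIN_CLANG_MAJOR)
# error "Clang is older than DEMO_MIN_CLANG_MAJOR, refusing to build"
#endif

/*
 * Hypothetical helper: if this translation unit compiled at all, the
 * version check above did not fire.
 */
int demo_toolchain_accepted(void)
{
    return 1;
}
```

With a check of this kind, an unsupported compiler stops with a clear message rather than with "unknown argument: '-fno-jump-tables'".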

Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Jan Beulich 8 months, 1 week ago
On 26.02.2025 19:54, Marek Marczykowski-Górecki wrote:
> On Mon, Feb 24, 2025 at 02:31:00PM +0000, Frediano Ziglio wrote:
>> On Mon, Feb 24, 2025 at 1:16 PM Marek Marczykowski-Górecki
>> <marmarek@invisiblethingslab.com> wrote:
>>>
>>> On Mon, Feb 24, 2025 at 12:57:13PM +0000, Frediano Ziglio wrote:
>>>> On Fri, Feb 21, 2025 at 8:20 PM Marek Marczykowski-Górecki
>>>> <marmarek@invisiblethingslab.com> wrote:
>>>>>
>>>>> On Mon, Feb 17, 2025 at 04:26:59PM +0000, Frediano Ziglio wrote:
>>>>>> Although code is compiled with -fpic option data is not position
>>>>>> independent. This causes data pointer to become invalid if
>>>>>> code is not relocated properly which is what happens for
>>>>>> efi_multiboot2 which is called by multiboot entry code.
>>>>>>
>>>>>> Code tested adding
>>>>>>    PrintErrMesg(L"Test message", EFI_BUFFER_TOO_SMALL);
>>>>>> in efi_multiboot2 before calling efi_arch_edd (this function
>>>>>> can potentially call PrintErrMesg).
>>>>>>
>>>>>> Before the patch (XenServer installation on Qemu, xen replaced
>>>>>> with vanilla xen.gz):
>>>>>>   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
>>>>>>   Test message: !!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
>>>>>>   ExceptionData - 0000000000000000  I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
>>>>>>   RIP  - 000000007EE21E9A, CS  - 0000000000000038, RFLAGS - 0000000000210246
>>>>>>   RAX  - 000000007FF0C1B5, RCX - 0000000000000050, RDX - 0000000000000010
>>>>>>   RBX  - 0000000000000000, RSP - 000000007FF0C180, RBP - 000000007FF0C210
>>>>>>   RSI  - FFFF82D040467CE8, RDI - 0000000000000000
>>>>>>   R8   - 000000007FF0C1C8, R9  - 000000007FF0C1C0, R10 - 0000000000000000
>>>>>>   R11  - 0000000000001020, R12 - FFFF82D040467CE8, R13 - 000000007FF0C1B8
>>>>>>   R14  - 000000007EA33328, R15 - 000000007EA332D8
>>>>>>   DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
>>>>>>   GS   - 0000000000000030, SS  - 0000000000000030
>>>>>>   CR0  - 0000000080010033, CR2 - FFFF82D040467CE8, CR3 - 000000007FC01000
>>>>>>   CR4  - 0000000000000668, CR8 - 0000000000000000
>>>>>>   DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
>>>>>>   DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
>>>>>>   GDTR - 000000007F9DB000 0000000000000047, LDTR - 0000000000000000
>>>>>>   IDTR - 000000007F48E018 0000000000000FFF,   TR - 0000000000000000
>>>>>>   FXSAVE_STATE - 000000007FF0BDE0
>>>>>>   !!!! Find image based on IP(0x7EE21E9A) (No PDB)  (ImageBase=000000007EE20000, EntryPoint=000000007EE23935) !!!!
>>>>>>
>>>>>> After the patch:
>>>>>>   Booting `XenServer (Serial)'Booting `XenServer (Serial)'
>>>>>>   Test message: Buffer too small
>>>>>>   BdsDxe: loading Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
>>>>>>   BdsDxe: starting Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
>>>>>>
>>>>>> This partially rollback commit 00d5d5ce23e6.
>>>>>>
>>>>>> Fixes: 9180f5365524 ("x86: add multiboot2 protocol support for EFI platforms")
>>>>>> Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
>>>>>
>>>>> I tried testing this patch, but it seems I cannot reproduce the original
>>>>> failure...
>>>>>
>>>>> I did as the commit message suggests here:
>>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/commit/ca3d6911c448eb886990f33d4380b5646617a982
>>>>>
>>>>> With blexit() in PrintErrMesg(), it went back to the bootloader, so I'm
>>>>> sure this code path was reached. But with blexit() commented out, Xen
>>>>> started correctly both with and without this patch... The branch I used
>>>>> is here:
>>>>> https://gitlab.com/xen-project/people/marmarek/xen/-/commits/automation-tests?ref_type=heads
>>>>>
>>>>> Are there some extra condition to reproduce the issue? Maybe it depends
>>>>> on the compiler version? I guess I can try also on QEMU, but based on
>>>>> the description, I would expect it to crash in any case.
>>>>>
>>>>
>>>> Did you see the correct message in both cases?
>>>> Did you use Grub or direct EFI?
>>>>
>>>> With Grub and without this patch you won't see the message, with grub
>>>> with the patch you see the correct message.
>>>
>>> I did use grub, and I didn't see the message indeed.
>>> But in the case it was supposed to crash (with added PrintErrMesg(),
>>> commented out blexit and without your patch) it did _not_ crashed and
>>> continued to normal boot. Is that #PF non-fatal here?
>>>
>>
>> Hi,
>>    I tried again with my test environment.
>> Added the PrintErrMesg line before efi_arch_edd call, I got a #PF, in
>> my case the system hangs. With the fix patch machine is rebooting and
>> I can see the message in the logs.
>> I'm trying with Xen starting inside Qemu, EFI firmware, xen.gz
>> compiled as ELF file. Host system is an Ubuntu 22.04.5 LTS. Gcc is
>> version 11.4.
> 
> My test was wrong, commenting out blexit made "mesg" variable unused.
> After fixing that, I can reproduce it on both QEMU and real hardware:
> without your patch it crashes and with your patch it works just fine.
> While there may be more places with similar issue, this patch clearly
> improves the situation, so:
> 
> Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>

While I would have preferred my comment to be addressed, I'll accept the
maintainer ack to have the patch go in as-is. Andrew, what about your
comment? Can you accept "this is an improvement" as enough of a reason
for it to go in despite not fully addressing the underlying issue? (In
case of no reply until after my upcoming vacation, I'll take this as
silent agreement and put the patch in.)

Jan

Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Jan Beulich 8 months, 2 weeks ago
On 17.02.2025 17:26, Frediano Ziglio wrote:
> --- a/xen/common/efi/efi-common.mk
> +++ b/xen/common/efi/efi-common.mk
> @@ -2,6 +2,7 @@ EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
>  EFIOBJ-$(CONFIG_COMPAT) += compat.o
>  
>  CFLAGS-y += -fshort-wchar
> +CFLAGS-y += -fno-jump-tables
>  CFLAGS-y += -iquote $(srctree)/common/efi
>  CFLAGS-y += -iquote $(srcdir)

Do source files other than boot.c really need this? Do any other files outside
of efi/ maybe also need this (iirc this point was made along with the v5 comment
you got)?

Jan
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Andrew Cooper 8 months, 2 weeks ago
On 17/02/2025 4:31 pm, Jan Beulich wrote:
> On 17.02.2025 17:26, Frediano Ziglio wrote:
>> --- a/xen/common/efi/efi-common.mk
>> +++ b/xen/common/efi/efi-common.mk
>> @@ -2,6 +2,7 @@ EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
>>  EFIOBJ-$(CONFIG_COMPAT) += compat.o
>>  
>>  CFLAGS-y += -fshort-wchar
>> +CFLAGS-y += -fno-jump-tables
>>  CFLAGS-y += -iquote $(srctree)/common/efi
>>  CFLAGS-y += -iquote $(srcdir)
> Do source files other than boot.c really need this? Do any other files outside
> of efi/ maybe also need this (iirc this point was made along with the v5 comment
> you got)?

Every TU reachable from efi_multiboot2() needs this, because all of
those have logic executed at an incorrect virtual address.

~Andrew
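
For readers following the thread, the concern can be pictured with a minimal C sketch. It is not the code from the patch: demo_status, demo_table, demo_str_table() and demo_str_switch() are hypothetical names, and the strings are placeholders. The table form keeps an array of pointers in data, so each slot holds an absolute address that is only right when the image runs where it was linked or relocated to. The switch form returns string literals directly, which a -fpic build on x86-64 typically addresses relative to the instruction pointer. The patch adds -fno-jump-tables so that the compiler does not in turn implement the switch through a table.

```c
#include <stddef.h>

/* Hypothetical status codes, stand-ins for the EFI_STATUS values. */
enum demo_status {
    DEMO_OK = 0,
    DEMO_NOT_FOUND,
    DEMO_NO_MEDIA,
    DEMO_BUFFER_TOO_SMALL,
    DEMO_STATUS_COUNT
};

/*
 * Table form: the array lives in a data section and every slot stores
 * the absolute address of a string.
 */
static const char *const demo_table[DEMO_STATUS_COUNT] = {
    [DEMO_NOT_FOUND]        = "Not found",
    [DEMO_NO_MEDIA]         = "Media not found",
    [DEMO_BUFFER_TOO_SMALL] = "Buffer too small",
};

const char *demo_str_table(unsigned int code)
{
    /* Reads a pointer out of data, then returns it. */
    if ( code < DEMO_STATUS_COUNT && demo_table[code] != NULL )
        return demo_table[code];

    return "Unknown error";
}

/* Switch form: no pointer array is written in the source. */
const char *demo_str_switch(unsigned int code)
{
    switch ( code )
    {
    case DEMO_NOT_FOUND:        return "Not found";
    case DEMO_NO_MEDIA:         return "Media not found";
    case DEMO_BUFFER_TOO_SMALL: return "Buffer too small";
    default:                    return "Unknown error";
    }
}
```

Both functions return the same strings for every input. They differ only in whether a pointer has to be loaded from data first.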
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Frediano Ziglio 8 months, 2 weeks ago
On Mon, Feb 17, 2025 at 4:41 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
> On 17/02/2025 4:31 pm, Jan Beulich wrote:
> > On 17.02.2025 17:26, Frediano Ziglio wrote:
> >> --- a/xen/common/efi/efi-common.mk
> >> +++ b/xen/common/efi/efi-common.mk
> >> @@ -2,6 +2,7 @@ EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
> >>  EFIOBJ-$(CONFIG_COMPAT) += compat.o
> >>
> >>  CFLAGS-y += -fshort-wchar
> >> +CFLAGS-y += -fno-jump-tables
> >>  CFLAGS-y += -iquote $(srctree)/common/efi
> >>  CFLAGS-y += -iquote $(srcdir)
> > Do source files other than boot.c really need this? Do any other files outside
> > of efi/ maybe also need this (iirc this point was made along with the v5 comment
> > you got)?
>
> Every TU reachable from efi_multiboot2() needs this, because all of
> those have logic executed at an incorrect virtual address.
>
> ~Andrew

I assume all the files in the efi directory deal with EFI boot, so
it's good to have the option enabled for the files inside the
directory. The only exception seems to be runtime.c.
As for external dependencies, I'm not sure how to detect them (besides
looking at all symbols).

Frediano
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Jan Beulich 8 months, 2 weeks ago
On 17.02.2025 17:52, Frediano Ziglio wrote:
> On Mon, Feb 17, 2025 at 4:41 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>
>> On 17/02/2025 4:31 pm, Jan Beulich wrote:
>>> On 17.02.2025 17:26, Frediano Ziglio wrote:
>>>> --- a/xen/common/efi/efi-common.mk
>>>> +++ b/xen/common/efi/efi-common.mk
>>>> @@ -2,6 +2,7 @@ EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
>>>>  EFIOBJ-$(CONFIG_COMPAT) += compat.o
>>>>
>>>>  CFLAGS-y += -fshort-wchar
>>>> +CFLAGS-y += -fno-jump-tables
>>>>  CFLAGS-y += -iquote $(srctree)/common/efi
>>>>  CFLAGS-y += -iquote $(srcdir)
>>> Do source files other than boot.c really need this? Do any other files outside
>>> of efi/ maybe also need this (iirc this point was made along with the v5 comment
>>> you got)?
>>
>> Every TU reachable from efi_multiboot2() needs this, because all of
>> those have logic executed at an incorrect virtual address.
>>
>> ~Andrew
> 
> I assume all the files in efi directory will deal with EFI boot so
> it's good to have the option enabled for the files inside the
> directory. The only exception seems runtime.c file.

And compat.c, being a wrapper around runtime.c.

> About external dependencies not sure how to detect them (beside
> looking at all symbols).

Which raises the question whether we don't need to turn on that option
globally, without (as we have it now) any condition.

Jan

Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Frediano Ziglio 8 months, 2 weeks ago
On Mon, Feb 17, 2025 at 4:56 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 17.02.2025 17:52, Frediano Ziglio wrote:
> > On Mon, Feb 17, 2025 at 4:41 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> >>
> >> On 17/02/2025 4:31 pm, Jan Beulich wrote:
> >>> On 17.02.2025 17:26, Frediano Ziglio wrote:
> >>>> --- a/xen/common/efi/efi-common.mk
> >>>> +++ b/xen/common/efi/efi-common.mk
> >>>> @@ -2,6 +2,7 @@ EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
> >>>>  EFIOBJ-$(CONFIG_COMPAT) += compat.o
> >>>>
> >>>>  CFLAGS-y += -fshort-wchar
> >>>> +CFLAGS-y += -fno-jump-tables
> >>>>  CFLAGS-y += -iquote $(srctree)/common/efi
> >>>>  CFLAGS-y += -iquote $(srcdir)
> >>> Do source files other than boot.c really need this? Do any other files outside
> >>> of efi/ maybe also need this (iirc this point was made along with the v5 comment
> >>> you got)?
> >>
> >> Every TU reachable from efi_multiboot2() needs this, because all of
> >> those have logic executed at an incorrect virtual address.
> >>
> >> ~Andrew
> >
> > I assume all the files in efi directory will deal with EFI boot so
> > it's good to have the option enabled for the files inside the
> > directory. The only exception seems runtime.c file.
>
> And compat.c, being a wrapper around runtime.c.
>
> > About external dependencies not sure how to detect them (beside
> > looking at all symbols).
>
> Which raises the question whether we don't need to turn on that option
> globally, without (as we have it now) any condition.
>
> Jan

I would avoid adding an option that could potentially affect performance
so badly at the moment.
These changes are pretty contained.
I would merge this patch and check any external dependencies as a follow-up.

Frediano
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Jan Beulich 8 months, 2 weeks ago
On 19.02.2025 17:34, Frediano Ziglio wrote:
> On Mon, Feb 17, 2025 at 4:56 PM Jan Beulich <jbeulich@suse.com> wrote:
>> On 17.02.2025 17:52, Frediano Ziglio wrote:
>>> On Mon, Feb 17, 2025 at 4:41 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>> On 17/02/2025 4:31 pm, Jan Beulich wrote:
>>>>> On 17.02.2025 17:26, Frediano Ziglio wrote:
>>>>>> --- a/xen/common/efi/efi-common.mk
>>>>>> +++ b/xen/common/efi/efi-common.mk
>>>>>> @@ -2,6 +2,7 @@ EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
>>>>>>  EFIOBJ-$(CONFIG_COMPAT) += compat.o
>>>>>>
>>>>>>  CFLAGS-y += -fshort-wchar
>>>>>> +CFLAGS-y += -fno-jump-tables
>>>>>>  CFLAGS-y += -iquote $(srctree)/common/efi
>>>>>>  CFLAGS-y += -iquote $(srcdir)
>>>>> Do source files other than boot.c really need this? Do any other files outside
>>>>> of efi/ maybe also need this (iirc this point was made along with the v5 comment
>>>>> you got)?
>>>>
>>>> Every TU reachable from efi_multiboot2() needs this, because all of
>>>> those have logic executed at an incorrect virtual address.
>>>
>>> I assume all the files in efi directory will deal with EFI boot so
>>> it's good to have the option enabled for the files inside the
>>> directory. The only exception seems runtime.c file.
>>
>> And compat.c, being a wrapper around runtime.c.
>>
>>> About external dependencies not sure how to detect them (beside
>>> looking at all symbols).
>>
>> Which raises the question whether we don't need to turn on that option
>> globally, without (as we have it now) any condition.
> 
> I would avoid adding a potential option that could affect performances
> so badly at the moment.
> These changes are pretty contained.
> I would merge this patch and check any external dependencies as a follow up.

Well. It's a judgement call to the maintainers. If I were them, I'd demand
that Andrew's remark be addressed, one way or another.

Jan

Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Frediano Ziglio 8 months, 1 week ago
On Thu, Feb 20, 2025 at 7:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 19.02.2025 17:34, Frediano Ziglio wrote:
> > On Mon, Feb 17, 2025 at 4:56 PM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 17.02.2025 17:52, Frediano Ziglio wrote:
> >>> On Mon, Feb 17, 2025 at 4:41 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> >>>> On 17/02/2025 4:31 pm, Jan Beulich wrote:
> >>>>> On 17.02.2025 17:26, Frediano Ziglio wrote:
> >>>>>> --- a/xen/common/efi/efi-common.mk
> >>>>>> +++ b/xen/common/efi/efi-common.mk
> >>>>>> @@ -2,6 +2,7 @@ EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
> >>>>>>  EFIOBJ-$(CONFIG_COMPAT) += compat.o
> >>>>>>
> >>>>>>  CFLAGS-y += -fshort-wchar
> >>>>>> +CFLAGS-y += -fno-jump-tables
> >>>>>>  CFLAGS-y += -iquote $(srctree)/common/efi
> >>>>>>  CFLAGS-y += -iquote $(srcdir)
> >>>>> Do source files other than boot.c really need this? Do any other files outside
> >>>>> of efi/ maybe also need this (iirc this point was made along with the v5 comment
> >>>>> you got)?
> >>>>
> >>>> Every TU reachable from efi_multiboot2() needs this, because all of
> >>>> those have logic executed at an incorrect virtual address.
> >>>
> >>> I assume all the files in efi directory will deal with EFI boot so
> >>> it's good to have the option enabled for the files inside the
> >>> directory. The only exception seems runtime.c file.
> >>
> >> And compat.c, being a wrapper around runtime.c.
> >>
> >>> About external dependencies not sure how to detect them (beside
> >>> looking at all symbols).
> >>
> >> Which raises the question whether we don't need to turn on that option
> >> globally, without (as we have it now) any condition.
> >
> > I would avoid adding a potential option that could affect performances
> > so badly at the moment.
> > These changes are pretty contained.
> > I would merge this patch and check any external dependencies as a follow up.
>
> Well. It's a judgement call to the maintainers. If I were them, I'd demand
> that Andrew's remark be addressed, one way or another.
>
> Jan

I think I did, but only Andrew can confirm it.

Frediano
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Frediano Ziglio 8 months, 2 weeks ago
On Mon, Feb 17, 2025 at 4:56 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 17.02.2025 17:52, Frediano Ziglio wrote:
> > On Mon, Feb 17, 2025 at 4:41 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> >>
> >> On 17/02/2025 4:31 pm, Jan Beulich wrote:
> >>> On 17.02.2025 17:26, Frediano Ziglio wrote:
> >>>> --- a/xen/common/efi/efi-common.mk
> >>>> +++ b/xen/common/efi/efi-common.mk
> >>>> @@ -2,6 +2,7 @@ EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
> >>>>  EFIOBJ-$(CONFIG_COMPAT) += compat.o
> >>>>
> >>>>  CFLAGS-y += -fshort-wchar
> >>>> +CFLAGS-y += -fno-jump-tables
> >>>>  CFLAGS-y += -iquote $(srctree)/common/efi
> >>>>  CFLAGS-y += -iquote $(srcdir)
> >>> Do source files other than boot.c really need this? Do any other files outside
> >>> of efi/ maybe also need this (iirc this point was made along with the v5 comment
> >>> you got)?
> >>
> >> Every TU reachable from efi_multiboot2() needs this, because all of
> >> those have logic executed at an incorrect virtual address.
> >>
> >> ~Andrew
> >
> > I assume all the files in efi directory will deal with EFI boot so
> > it's good to have the option enabled for the files inside the
> > directory. The only exception seems runtime.c file.
>
> And compat.c, being a wrapper around runtime.c.
>
> > About external dependencies not sure how to detect them (beside
> > looking at all symbols).
>
> Which raises the question whether we don't need to turn on that option
> globally, without (as we have it now) any condition.
>
> Jan

Are you suggesting enabling that option for all of Xen? That would
potentially decrease performance; we have a lot of switches in the code.

Frediano
Re: [PATCH v6] Avoid crash calling PrintErrMesg from efi_multiboot2
Posted by Andrew Cooper 8 months, 2 weeks ago
On 18/02/2025 12:05 pm, Frediano Ziglio wrote:
> On Mon, Feb 17, 2025 at 4:56 PM Jan Beulich <jbeulich@suse.com> wrote:
>> On 17.02.2025 17:52, Frediano Ziglio wrote:
>>> On Mon, Feb 17, 2025 at 4:41 PM Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>> On 17/02/2025 4:31 pm, Jan Beulich wrote:
>>>>> On 17.02.2025 17:26, Frediano Ziglio wrote:
>>>>>> --- a/xen/common/efi/efi-common.mk
>>>>>> +++ b/xen/common/efi/efi-common.mk
>>>>>> @@ -2,6 +2,7 @@ EFIOBJ-y := boot.init.o pe.init.o ebmalloc.o runtime.o
>>>>>>  EFIOBJ-$(CONFIG_COMPAT) += compat.o
>>>>>>
>>>>>>  CFLAGS-y += -fshort-wchar
>>>>>> +CFLAGS-y += -fno-jump-tables
>>>>>>  CFLAGS-y += -iquote $(srctree)/common/efi
>>>>>>  CFLAGS-y += -iquote $(srcdir)
>>>>> Do source files other than boot.c really need this? Do any other files outside
>>>>> of efi/ maybe also need this (iirc this point was made along with the v5 comment
>>>>> you got)?
>>>> Every TU reachable from efi_multiboot2() needs this, because all of
>>>> those have logic executed at an incorrect virtual address.
>>>>
>>>> ~Andrew
>>> I assume all the files in efi directory will deal with EFI boot so
>>> it's good to have the option enabled for the files inside the
>>> directory. The only exception seems runtime.c file.
>> And compat.c, being a wrapper around runtime.c.
>>
>>> About external dependencies not sure how to detect them (beside
>>> looking at all symbols).
>> Which raises the question whether we don't need to turn on that option
>> globally, without (as we have it now) any condition.
>>
>> Jan
> Are you saying enabling that option for all Xen? That potentially
> would decrease performances, we have a lot of switches in the code.

-fno-jump-tables is active by default whenever INDIRECT_THUNK is
enabled, and when CET-IBT is enabled to work around a GCC bug/misfeature.

With speculation protections active, indirect branches are *far* slower
than alternative ways of expressing a switch statement.

~Andrew
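
The "alternative ways of expressing a switch statement" can be pictured with a small C sketch. The function names and return values are hypothetical and chosen only for illustration. With jump tables disabled, a compiler may lower a dense switch to compares and direct branches, roughly as the hand-written decision tree below does. The code actually generated depends on the compiler and its options.

```c
/*
 * Hypothetical dispatcher: a dense switch like this is the kind a
 * compiler may otherwise lower to a jump table plus an indirect branch.
 */
int demo_dispatch_switch(unsigned int op)
{
    switch ( op )
    {
    case 0:  return 10;
    case 1:  return 11;
    case 2:  return 12;
    case 3:  return 13;
    case 4:  return 14;
    case 5:  return 15;
    default: return -1;
    }
}

/*
 * Same mapping written as a decision tree: only compares and direct
 * branches, no table lookup and no indirect branch.
 */
int demo_dispatch_tree(unsigned int op)
{
    if ( op < 3 )
    {
        if ( op == 0 )
            return 10;
        if ( op == 1 )
            return 11;
        return 12;
    }

    if ( op == 3 )
        return 13;
    if ( op == 4 )
        return 14;
    if ( op == 5 )
        return 15;

    return -1;
}
```

Both functions compute the same mapping. The second form shows the shape of code that avoids an indirect branch, which is the point made above about speculation protections.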