RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server

Justin He posted 2 patches 2 years, 7 months ago
RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
Posted by Justin He 2 years, 7 months ago

> -----Original Message-----
> From: Ard Biesheuvel <ardb@kernel.org>
> Sent: Tuesday, February 7, 2023 5:04 PM
> To: Justin He <Justin.He@arm.com>
> Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>;
> Jason A. Donenfeld <Jason@zx2c4.com>; nd <nd@arm.com>
> Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> 
> On Tue, 7 Feb 2023 at 10:03, Justin He <Justin.He@arm.com> wrote:
> >
> > Hi Ard
> >
> > > -----Original Message-----
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > > Sent: Tuesday, February 7, 2023 4:54 PM
> > > To: Justin He <Justin.He@arm.com>
> > > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; Alexandru Elisei
> > > <Alexandru.Elisei@arm.com>; Jason A. Donenfeld <Jason@zx2c4.com>; nd
> > > <nd@arm.com>
> > > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> > >
> > > On Tue, 7 Feb 2023 at 09:49, Justin He <Justin.He@arm.com> wrote:
> > > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > [..]
> > > > > > The root cause of the hung IMO might be similar to commit
> > > > > > 550b33cfd445296868a478e8413ffb2e963eed32
> > > > > > Author: Ard Biesheuvel <ardb@kernel.org>
> > > > > > Date:   Thu Nov 10 10:36:20 2022 +0100
> > > > > >
> > > > > >     arm64: efi: Force the use of SetVirtualAddressMap() on
> > > > > > Altra machines
> > > > > >
> > > > > > Do you agree with the idea if I add Ampere ”eMAG” machine into
> > > > > > the list of Using SetVirtualAddressMap() forcibly?
> > > > > >
> > > > > > > Please note that even in previous kernel patch, the efibootmgr -t 10
> > > > > > > will make kernel hung if I passed "efi=novamap" to the boot parameter.
> > > > > > >
> > > > > >
> > > > > > Interesting. What does dmidecode return for the family in the type 1
> > > > > > record?
> > > >
> > > > # dmidecode |grep -i family
> > > >         Family: eMAG
> > > >         Family: ARMv8
> > > >
> > > > The full dmidecode log is at https://pastebin.com/M3MAJtUG
> > > >
> > >
> > > OK please try this:
> > >
> > > > diff --git a/drivers/firmware/efi/libstub/arm64.c b/drivers/firmware/efi/libstub/arm64.c
> > > > index ff2d18c42ee74979..fae930dec82be7c6 100644
> > > > --- a/drivers/firmware/efi/libstub/arm64.c
> > > > +++ b/drivers/firmware/efi/libstub/arm64.c
> > > > @@ -22,7 +22,8 @@ static bool system_needs_vamap(void)
> > > >          * Ampere Altra machines crash in SetTime() if SetVirtualAddressMap()
> > > >          * has not been called prior.
> > > >          */
> > > > -       if (!type1_family || strcmp(type1_family, "Altra"))
> > > > +       if (!type1_family ||
> > > > +           (strcmp(type1_family, "Altra") && strcmp(type1_family, "eMAG")))
> > > >                 return false;
> > > >
> > > >         efi_warn("Working around broken SetVirtualAddressMap()\n");
> >
> > Yes, it works on my eMAG server: the kernel boots.
> > Other than efibootmgr failure. But I noticed this efibootmgr failure
> > even before Commit d3549a938b7 ("avoid SetVirtualAddressMap() when
> > possible ")
> >
> > > root@:~/linux# efibootmgr -t 9; efibootmgr -t 5;
> > > Could not set Timeout: Input/output error
> > > Could not set Timeout: Input/output error
> >
> 
> Do you get any [Firmware Bug] lines in the kernel log?

No, 
I built the kernel based on:
commit d2d11f342b179f1894a901f143ec7c008caba43e (HEAD -> master, origin/master, origin/HEAD)
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Feb 5 17:17:10 2023 -0800

    Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Are you worried about your sync exception fixup patch? I think it has been included.


--
Cheers,
Justin (Jia He)


Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
Posted by Ard Biesheuvel 2 years, 7 months ago
On Tue, 7 Feb 2023 at 10:08, Justin He <Justin.He@arm.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Ard Biesheuvel <ardb@kernel.org>
> > Sent: Tuesday, February 7, 2023 5:04 PM
> > To: Justin He <Justin.He@arm.com>
> > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>;
> > Jason A. Donenfeld <Jason@zx2c4.com>; nd <nd@arm.com>
> > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> >
> > On Tue, 7 Feb 2023 at 10:03, Justin He <Justin.He@arm.com> wrote:
> > >
> > > Hi Ard
> > >
> > > > -----Original Message-----
> > > > From: Ard Biesheuvel <ardb@kernel.org>
> > > > Sent: Tuesday, February 7, 2023 4:54 PM
> > > > To: Justin He <Justin.He@arm.com>
> > > > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > > > linux-kernel@vger.kernel.org; Alexandru Elisei
> > > > <Alexandru.Elisei@arm.com>; Jason A. Donenfeld <Jason@zx2c4.com>; nd
> > > > <nd@arm.com>
> > > > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> > > >
> > > > On Tue, 7 Feb 2023 at 09:49, Justin He <Justin.He@arm.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > [..]
> > > > > > > The root cause of the hung IMO might be similar to commit
> > > > > > > 550b33cfd445296868a478e8413ffb2e963eed32
> > > > > > > Author: Ard Biesheuvel <ardb@kernel.org>
> > > > > > > Date:   Thu Nov 10 10:36:20 2022 +0100
> > > > > > >
> > > > > > >     arm64: efi: Force the use of SetVirtualAddressMap() on
> > > > > > > Altra machines
> > > > > > >
> > > > > > > Do you agree with the idea if I add Ampere ”eMAG” machine into
> > > > > > > the list of Using SetVirtualAddressMap() forcibly?
> > > > > > >
> > > > > > > > Please note that even in previous kernel patch, the efibootmgr -t 10
> > > > > > > > will make kernel hung if I passed "efi=novamap" to the boot parameter.
> > > > > > > >
> > > > > > >
> > > > > > > Interesting. What does dmidecode return for the family in the type 1
> > > > > > > record?
> > > > >
> > > > > # dmidecode |grep -i family
> > > > >         Family: eMAG
> > > > >         Family: ARMv8
> > > > >
> > > > > The full dmidecode log is at https://pastebin.com/M3MAJtUG
> > > > >
> > > >
> > > > OK please try this:
> > > >
> > > > > diff --git a/drivers/firmware/efi/libstub/arm64.c b/drivers/firmware/efi/libstub/arm64.c
> > > > > index ff2d18c42ee74979..fae930dec82be7c6 100644
> > > > > --- a/drivers/firmware/efi/libstub/arm64.c
> > > > > +++ b/drivers/firmware/efi/libstub/arm64.c
> > > > > @@ -22,7 +22,8 @@ static bool system_needs_vamap(void)
> > > > >          * Ampere Altra machines crash in SetTime() if SetVirtualAddressMap()
> > > > >          * has not been called prior.
> > > > >          */
> > > > > -       if (!type1_family || strcmp(type1_family, "Altra"))
> > > > > +       if (!type1_family ||
> > > > > +           (strcmp(type1_family, "Altra") && strcmp(type1_family, "eMAG")))
> > > > >                 return false;
> > > > >
> > > > >         efi_warn("Working around broken SetVirtualAddressMap()\n");
> > >
> > > Yes, it works on my eMAG server: the kernel boots.
> > > Other than efibootmgr failure. But I noticed this efibootmgr failure
> > > even before Commit d3549a938b7 ("avoid SetVirtualAddressMap() when
> > > possible ")
> > >
> > > > root@:~/linux# efibootmgr -t 9; efibootmgr -t 5;
> > > > Could not set Timeout: Input/output error
> > > > Could not set Timeout: Input/output error
> > >
> >
> > Do you get any [Firmware Bug] lines in the kernel log?
>
> No,
> I built the kernel based on:
> commit d2d11f342b179f1894a901f143ec7c008caba43e (HEAD -> master, origin/master, origin/HEAD)
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Sun Feb 5 17:17:10 2023 -0800
>
>     Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
>
> Are you worried about your sync exception fixup patch? I think it has been included.
>


I would just like to understand why setvariable is still broken for you.


RE: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
Posted by Justin He 2 years, 7 months ago
Hi Ard

> -----Original Message-----
> From: Ard Biesheuvel <ardb@kernel.org>
> Sent: Tuesday, February 7, 2023 5:09 PM
> To: Justin He <Justin.He@arm.com>
> Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> linux-kernel@vger.kernel.org; Alexandru Elisei <Alexandru.Elisei@arm.com>;
> Jason A. Donenfeld <Jason@zx2c4.com>; nd <nd@arm.com>
> Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> 
> On Tue, 7 Feb 2023 at 10:08, Justin He <Justin.He@arm.com> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > > Sent: Tuesday, February 7, 2023 5:04 PM
> > > To: Justin He <Justin.He@arm.com>
> > > Cc: Huacai Chen <chenhuacai@kernel.org>; linux-efi@vger.kernel.org;
> > > linux-kernel@vger.kernel.org; Alexandru Elisei
> > > <Alexandru.Elisei@arm.com>; Jason A. Donenfeld <Jason@zx2c4.com>; nd
> > > <nd@arm.com>
> > > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag server
> > >
> > > On Tue, 7 Feb 2023 at 10:03, Justin He <Justin.He@arm.com> wrote:
> > > >
> > > > Hi Ard
> > > >
> > > > > -----Original Message-----
> > > > > From: Ard Biesheuvel <ardb@kernel.org>
> > > > > Sent: Tuesday, February 7, 2023 4:54 PM
> > > > > To: Justin He <Justin.He@arm.com>
> > > > > Cc: Huacai Chen <chenhuacai@kernel.org>;
> > > > > linux-efi@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > > > Alexandru Elisei <Alexandru.Elisei@arm.com>; Jason A. Donenfeld
> > > > > <Jason@zx2c4.com>; nd <nd@arm.com>
> > > > > Subject: Re: [PATCH 0/2] Fix boot hang issue on Ampere Emag
> > > > > server
> > > > >
> > > > > On Tue, 7 Feb 2023 at 09:49, Justin He <Justin.He@arm.com> wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > [..]
> > > > > > > > The root cause of the hung IMO might be similar to commit
> > > > > > > > 550b33cfd445296868a478e8413ffb2e963eed32
> > > > > > > > Author: Ard Biesheuvel <ardb@kernel.org>
> > > > > > > > Date:   Thu Nov 10 10:36:20 2022 +0100
> > > > > > > >
> > > > > > > >     arm64: efi: Force the use of SetVirtualAddressMap() on
> > > > > > > > Altra machines
> > > > > > > >
> > > > > > > > Do you agree with the idea if I add Ampere ”eMAG” machine
> > > > > > > > into the list of Using SetVirtualAddressMap() forcibly?
> > > > > > > >
> > > > > > > > > Please note that even in previous kernel patch, the efibootmgr -t 10
> > > > > > > > > will make kernel hung if I passed "efi=novamap" to the boot parameter.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Interesting. What does dmidecode return for the family in the type 1
> > > > > > > > record?
> > > > > >
> > > > > > # dmidecode |grep -i family
> > > > > >         Family: eMAG
> > > > > >         Family: ARMv8
> > > > > >
> > > > > > The full dmidecode log is at https://pastebin.com/M3MAJtUG
> > > > > >
> > > > >
> > > > > OK please try this:
> > > > >
> > > > > > diff --git a/drivers/firmware/efi/libstub/arm64.c b/drivers/firmware/efi/libstub/arm64.c
> > > > > > index ff2d18c42ee74979..fae930dec82be7c6 100644
> > > > > > --- a/drivers/firmware/efi/libstub/arm64.c
> > > > > > +++ b/drivers/firmware/efi/libstub/arm64.c
> > > > > > @@ -22,7 +22,8 @@ static bool system_needs_vamap(void)
> > > > > >          * Ampere Altra machines crash in SetTime() if SetVirtualAddressMap()
> > > > > >          * has not been called prior.
> > > > > >          */
> > > > > > -       if (!type1_family || strcmp(type1_family, "Altra"))
> > > > > > +       if (!type1_family ||
> > > > > > +           (strcmp(type1_family, "Altra") && strcmp(type1_family, "eMAG")))
> > > > > >                 return false;
> > > > > >
> > > > > >         efi_warn("Working around broken SetVirtualAddressMap()\n");
> > > >
> > > > Yes, it works on my eMAG server: the kernel boots.
> > > > Other than efibootmgr failure. But I noticed this efibootmgr
> > > > failure even before Commit d3549a938b7 ("avoid
> > > > SetVirtualAddressMap() when possible ")
> > > >
> > > > > root@:~/linux# efibootmgr -t 9; efibootmgr -t 5;
> > > > > Could not set Timeout: Input/output error
> > > > > Could not set Timeout: Input/output error
> > > >
> > >
> > > Do you get any [Firmware Bug] lines in the kernel log?
> >
> > No,
> > I built the kernel based on:
> > commit d2d11f342b179f1894a901f143ec7c008caba43e (HEAD -> master,
> > origin/master, origin/HEAD)
> > Author: Linus Torvalds <torvalds@linux-foundation.org>
> > Date:   Sun Feb 5 17:17:10 2023 -0800
> >
> >     Merge branch 'fixes' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
> >
> > > Are you worried about your sync exception fixup patch? I think it has been included.
> >
> 
> 
> I would just like to understand why setvariable is still broken for you.

On an Ampere *Altra* server, the family name does not seem to match what your efi_get_smbios_string(1, family) check expects: the type 1 record reports "Server" rather than "Altra".

dmidecode |grep -i family -C 10

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: GIGABYTE
        Product Name: R272-P30-00
        Version: 0100
        Serial Number: TS2035812A0022
        UUID: 00000000-0000-4000-8000-B42E99AFEF62
        Wake-up Type: Power Switch
        SKU Number: 01234567890123456789AB
        Family: Server

The full dmidecode info of Altra is at
https://pastebin.com/HQLE1yYv
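
For illustration only (this is not part of the proposed patch): a minimal userspace sketch of the family-string match from the diff above, assuming "Altra" and "eMAG" are the only strings on the list. family_needs_vamap() is a hypothetical name; the real check is system_needs_vamap() in the EFI stub, which gets its string from efi_get_smbios_string().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the comparison in the diff: return true only
 * when the SMBIOS type 1 family string is one of the listed names.
 * The two names are taken from the diff in this thread.
 */
static bool family_needs_vamap(const char *type1_family)
{
	static const char *const families[] = { "Altra", "eMAG" };
	size_t i;

	/* No type 1 family string available: no workaround. */
	if (!type1_family)
		return false;

	/* strcmp() returns 0 on an exact, case-sensitive match. */
	for (i = 0; i < sizeof(families) / sizeof(families[0]); i++) {
		if (strcmp(type1_family, families[i]) == 0)
			return true;
	}

	return false;
}
```

With the strings seen in this thread, family_needs_vamap("eMAG") and family_needs_vamap("Altra") match, while family_needs_vamap("Server"), the value reported by the GIGABYTE R272-P30-00 above, does not. A check on the family string alone would therefore not cover that Altra machine.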

--
Cheers,
Justin (Jia He)