Read uCode from memory has better performance than from flash.
But it needs extra effort to let BSP copy uCode from flash to
memory. Also BSP already enable cache in SEC phase, so it use
less time to relocate uCode from flash to memory. After
verification, if system has more than one processor, it will
reduce some time if load uCode from memory.
This change enable this optimization.
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Eric Dong <eric.dong@intel.com>
---
UefiCpuPkg/Library/MpInitLib/MpLib.c | 34 +++++++++++++++++++++++++++++++++-
1 file changed, 33 insertions(+), 1 deletion(-)
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 108eea0a6f..c3cd6d7d51 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -1520,6 +1520,7 @@ MpInitLibInitialize (
UINTN ApResetVectorSize;
UINTN BackupBufferAddr;
UINTN ApIdtBase;
+ VOID *MicrocodePatchInRam;
OldCpuMpData = GetCpuMpDataFromGuidedHob ();
if (OldCpuMpData == NULL) {
@@ -1587,8 +1588,39 @@ MpInitLibInitialize (
CpuMpData->SwitchBspFlag = FALSE;
CpuMpData->CpuData = (CPU_AP_DATA *) (CpuMpData + 1);
CpuMpData->CpuInfoInHob = (UINT64) (UINTN) (CpuMpData->CpuData + MaxLogicalProcessorNumber);
- CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
CpuMpData->MicrocodePatchRegionSize = PcdGet64 (PcdCpuMicrocodePatchRegionSize);
+ //
+ // If platform has more than one CPU, relocate microcode to memory to reduce
+ // loading microcode time.
+ //
+ MicrocodePatchInRam = NULL;
+ if (MaxLogicalProcessorNumber > 1) {
+ MicrocodePatchInRam = AllocatePages (
+ EFI_SIZE_TO_PAGES (
+ (UINTN)CpuMpData->MicrocodePatchRegionSize
+ )
+ );
+ ASSERT (MicrocodePatchInRam != NULL);
+ }
+ if (MicrocodePatchInRam == NULL) {
+ //
+ // there is only one processor, or no microcode patch is available, or
+ // memory allocation failed
+ //
+ CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
+ } else {
+ //
+ // there are multiple processors, and a microcode patch is available, and
+ // memory allocation succeeded
+ //
+ CopyMem (
+ MicrocodePatchInRam,
+ (VOID *)(UINTN)PcdGet64 (PcdCpuMicrocodePatchAddress),
+ (UINTN)CpuMpData->MicrocodePatchRegionSize
+ );
+ CpuMpData->MicrocodePatchAddress = (UINTN)MicrocodePatchInRam;
+ }
+
InitializeSpinLock(&CpuMpData->MpLock);
//
--
2.15.0.windows.1
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
On 07/12/18 12:49, Eric Dong wrote:
> Read uCode from memory has better performance than from flash.
> But it needs extra effort to let BSP copy uCode from flash to
> memory. Also BSP already enable cache in SEC phase, so it use
> less time to relocate uCode from flash to memory. After
> verification, if system has more than one processor, it will
> reduce some time if load uCode from memory.
>
> This change enable this optimization.
>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Eric Dong <eric.dong@intel.com>
> ---
> UefiCpuPkg/Library/MpInitLib/MpLib.c | 34 +++++++++++++++++++++++++++++++++-
> 1 file changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> index 108eea0a6f..c3cd6d7d51 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> @@ -1520,6 +1520,7 @@ MpInitLibInitialize (
> UINTN ApResetVectorSize;
> UINTN BackupBufferAddr;
> UINTN ApIdtBase;
> + VOID *MicrocodePatchInRam;
>
> OldCpuMpData = GetCpuMpDataFromGuidedHob ();
> if (OldCpuMpData == NULL) {
> @@ -1587,8 +1588,39 @@ MpInitLibInitialize (
> CpuMpData->SwitchBspFlag = FALSE;
> CpuMpData->CpuData = (CPU_AP_DATA *) (CpuMpData + 1);
> CpuMpData->CpuInfoInHob = (UINT64) (UINTN) (CpuMpData->CpuData + MaxLogicalProcessorNumber);
> - CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
> CpuMpData->MicrocodePatchRegionSize = PcdGet64 (PcdCpuMicrocodePatchRegionSize);
> + //
> + // If platform has more than one CPU, relocate microcode to memory to reduce
> + // loading microcode time.
> + //
> + MicrocodePatchInRam = NULL;
> + if (MaxLogicalProcessorNumber > 1) {
> + MicrocodePatchInRam = AllocatePages (
> + EFI_SIZE_TO_PAGES (
> + (UINTN)CpuMpData->MicrocodePatchRegionSize
> + )
> + );
> + ASSERT (MicrocodePatchInRam != NULL);
> + }
> + if (MicrocodePatchInRam == NULL) {
> + //
> + // there is only one processor, or no microcode patch is available, or
> + // memory allocation failed
> + //
> + CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
> + } else {
> + //
> + // there are multiple processors, and a microcode patch is available, and
> + // memory allocation succeeded
> + //
> + CopyMem (
> + MicrocodePatchInRam,
> + (VOID *)(UINTN)PcdGet64 (PcdCpuMicrocodePatchAddress),
> + (UINTN)CpuMpData->MicrocodePatchRegionSize
> + );
> + CpuMpData->MicrocodePatchAddress = (UINTN)MicrocodePatchInRam;
> + }
> +
> InitializeSpinLock(&CpuMpData->MpLock);
>
> //
>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
On 07/12/18 23:59, Laszlo Ersek wrote:
> On 07/12/18 12:49, Eric Dong wrote:
>> Read uCode from memory has better performance than from flash.
>> But it needs extra effort to let BSP copy uCode from flash to
>> memory. Also BSP already enable cache in SEC phase, so it use
>> less time to relocate uCode from flash to memory. After
>> verification, if system has more than one processor, it will
>> reduce some time if load uCode from memory.
>>
>> This change enable this optimization.
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
>> Contributed-under: TianoCore Contribution Agreement 1.1
>> Signed-off-by: Eric Dong <eric.dong@intel.com>
>> ---
>> UefiCpuPkg/Library/MpInitLib/MpLib.c | 34 +++++++++++++++++++++++++++++++++-
>> 1 file changed, 33 insertions(+), 1 deletion(-)
>>
>> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> index 108eea0a6f..c3cd6d7d51 100644
>> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
>> @@ -1520,6 +1520,7 @@ MpInitLibInitialize (
>> UINTN ApResetVectorSize;
>> UINTN BackupBufferAddr;
>> UINTN ApIdtBase;
>> + VOID *MicrocodePatchInRam;
>>
>> OldCpuMpData = GetCpuMpDataFromGuidedHob ();
>> if (OldCpuMpData == NULL) {
>> @@ -1587,8 +1588,39 @@ MpInitLibInitialize (
>> CpuMpData->SwitchBspFlag = FALSE;
>> CpuMpData->CpuData = (CPU_AP_DATA *) (CpuMpData + 1);
>> CpuMpData->CpuInfoInHob = (UINT64) (UINTN) (CpuMpData->CpuData + MaxLogicalProcessorNumber);
>> - CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
>> CpuMpData->MicrocodePatchRegionSize = PcdGet64 (PcdCpuMicrocodePatchRegionSize);
>> + //
>> + // If platform has more than one CPU, relocate microcode to memory to reduce
>> + // loading microcode time.
>> + //
>> + MicrocodePatchInRam = NULL;
>> + if (MaxLogicalProcessorNumber > 1) {
>> + MicrocodePatchInRam = AllocatePages (
>> + EFI_SIZE_TO_PAGES (
>> + (UINTN)CpuMpData->MicrocodePatchRegionSize
>> + )
>> + );
>> + ASSERT (MicrocodePatchInRam != NULL);
>> + }
>> + if (MicrocodePatchInRam == NULL) {
>> + //
>> + // there is only one processor, or no microcode patch is available, or
>> + // memory allocation failed
>> + //
>> + CpuMpData->MicrocodePatchAddress = PcdGet64 (PcdCpuMicrocodePatchAddress);
>> + } else {
>> + //
>> + // there are multiple processors, and a microcode patch is available, and
>> + // memory allocation succeeded
>> + //
>> + CopyMem (
>> + MicrocodePatchInRam,
>> + (VOID *)(UINTN)PcdGet64 (PcdCpuMicrocodePatchAddress),
>> + (UINTN)CpuMpData->MicrocodePatchRegionSize
>> + );
>> + CpuMpData->MicrocodePatchAddress = (UINTN)MicrocodePatchInRam;
>> + }
>> +
>> InitializeSpinLock(&CpuMpData->MpLock);
>>
>> //
>>
>
> Reviewed-by: Laszlo Ersek <lersek@redhat.com>
>
Sorry, I have to take that back -- please do not commit this patch.
For this version of the patch:
Nacked-by: Laszlo Ersek <lersek@redhat.com>
The ASSERT() is wrong. Again, in the code above, AllocatePages() can
return NULL not only because of memory allocation failure, but also
because the number of pages to allocate can be zero! If the platform has
no microcode patch to apply.
I knew this full well when I suggested the code, but then I forgot about
it when you mentioned the ASSERT(). I think you also knew about it, and
forgot about it too. :)
In particular, this patch would make it *impossible* to boot OVMF with
multiple processors, because OVMF *never* provides a microcode update.
So, please remove the ASSERT.
Alternatively, you could modify the ASSERT() like this:
//
// if we attempt actual memory allocation, we expect it to succeed
//
ASSERT (
(CpuMpData->MicrocodePatchRegionSize == 0) ||
(MicrocodePatchInRam != NULL)
);
Thanks,
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Hi Laszlo,
> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of
> Laszlo Ersek
> Sent: Friday, July 13, 2018 6:14 AM
> To: Dong, Eric <eric.dong@intel.com>; edk2-devel@lists.01.org
> Cc: Ni, Ruiyu <ruiyu.ni@intel.com>
> Subject: Re: [edk2] [Patch v2 1/3] UefiCpuPkg/MpInitLib: Relocate uCode to
> memory to save time.
>
> On 07/12/18 23:59, Laszlo Ersek wrote:
> > On 07/12/18 12:49, Eric Dong wrote:
> >> Read uCode from memory has better performance than from flash.
> >> But it needs extra effort to let BSP copy uCode from flash to memory.
> >> Also BSP already enable cache in SEC phase, so it use less time to
> >> relocate uCode from flash to memory. After verification, if system
> >> has more than one processor, it will reduce some time if load uCode
> >> from memory.
> >>
> >> This change enable this optimization.
> >>
> >> Cc: Laszlo Ersek <lersek@redhat.com>
> >> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> >> Contributed-under: TianoCore Contribution Agreement 1.1
> >> Signed-off-by: Eric Dong <eric.dong@intel.com>
> >> ---
> >> UefiCpuPkg/Library/MpInitLib/MpLib.c | 34
> >> +++++++++++++++++++++++++++++++++-
> >> 1 file changed, 33 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> >> b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> >> index 108eea0a6f..c3cd6d7d51 100644
> >> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> >> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> >> @@ -1520,6 +1520,7 @@ MpInitLibInitialize (
> >> UINTN ApResetVectorSize;
> >> UINTN BackupBufferAddr;
> >> UINTN ApIdtBase;
> >> + VOID *MicrocodePatchInRam;
> >>
> >> OldCpuMpData = GetCpuMpDataFromGuidedHob ();
> >> if (OldCpuMpData == NULL) {
> >> @@ -1587,8 +1588,39 @@ MpInitLibInitialize (
> >> CpuMpData->SwitchBspFlag = FALSE;
> >> CpuMpData->CpuData = (CPU_AP_DATA *) (CpuMpData + 1);
> >> CpuMpData->CpuInfoInHob = (UINT64) (UINTN) (CpuMpData-
> >CpuData + MaxLogicalProcessorNumber);
> >> - CpuMpData->MicrocodePatchAddress = PcdGet64
> (PcdCpuMicrocodePatchAddress);
> >> CpuMpData->MicrocodePatchRegionSize = PcdGet64
> >> (PcdCpuMicrocodePatchRegionSize);
> >> + //
> >> + // If platform has more than one CPU, relocate microcode to memory
> >> + to reduce // loading microcode time.
> >> + //
> >> + MicrocodePatchInRam = NULL;
> >> + if (MaxLogicalProcessorNumber > 1) {
> >> + MicrocodePatchInRam = AllocatePages (
> >> + EFI_SIZE_TO_PAGES (
> >> + (UINTN)CpuMpData->MicrocodePatchRegionSize
> >> + )
> >> + );
> >> + ASSERT (MicrocodePatchInRam != NULL); } if
> >> + (MicrocodePatchInRam == NULL) {
> >> + //
> >> + // there is only one processor, or no microcode patch is available, or
> >> + // memory allocation failed
> >> + //
> >> + CpuMpData->MicrocodePatchAddress = PcdGet64
> >> + (PcdCpuMicrocodePatchAddress); } else {
> >> + //
> >> + // there are multiple processors, and a microcode patch is available,
> and
> >> + // memory allocation succeeded
> >> + //
> >> + CopyMem (
> >> + MicrocodePatchInRam,
> >> + (VOID *)(UINTN)PcdGet64 (PcdCpuMicrocodePatchAddress),
> >> + (UINTN)CpuMpData->MicrocodePatchRegionSize
> >> + );
> >> + CpuMpData->MicrocodePatchAddress = (UINTN)MicrocodePatchInRam;
> >> + }
> >> +
> >> InitializeSpinLock(&CpuMpData->MpLock);
> >>
> >> //
> >>
> >
> > Reviewed-by: Laszlo Ersek <lersek@redhat.com>
> >
>
> Sorry, I have to take that back -- please do not commit this patch.
>
> For this version of the patch:
>
> Nacked-by: Laszlo Ersek <lersek@redhat.com>
>
> The ASSERT() is wrong. Again, in the code above, AllocatePages() can return
> NULL not only because of memory allocation failure, but also because the
> number of pages to allocate can be zero! If the platform has no microcode
> patch to apply.
>
> I knew this full well when I suggested the code, but then I forgot about it when
> you mentioned the ASSERT(). I think you also knew about it, and forgot about
> it too. :)
>
> In particular, this patch would make it *impossible* to boot OVMF with
> multiple processors, because OVMF *never* provides a microcode update.
>
> So, please remove the ASSERT.
>
Agree, I removed ASSERT code and send V3 patches.
> Alternatively, you could modify the ASSERT() like this:
>
> //
> // if we attempt actual memory allocation, we expect it to succeed
> //
> ASSERT (
> (CpuMpData->MicrocodePatchRegionSize == 0) ||
> (MicrocodePatchInRam != NULL)
> );
>
> Thanks,
> Laszlo
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
© 2016 - 2026 Red Hat, Inc.