[edk2-devel] [PATCH] UefiCpuPkg/MpInitLib: Add check for CR3/GDT/IDT.

Dong, Eric posted 1 patch 2 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/edk2 tags/patchew/20200902004353.1515-1-eric.dong@intel.com
UefiCpuPkg/Library/MpInitLib/DxeMpLib.c |   8 +-
UefiCpuPkg/Library/MpInitLib/MpLib.c    | 108 +++++++++++++++++++-----
UefiCpuPkg/Library/MpInitLib/MpLib.h    |   6 +-
3 files changed, 97 insertions(+), 25 deletions(-)

[edk2-devel] [PATCH] UefiCpuPkg/MpInitLib: Add check for CR3/GDT/IDT.

Posted by Dong, Eric 2 weeks ago
AP needs to run from real mode to 32bit mode to LONG mode. Page table
(pointed by CR3) and GDT are necessary to set up to correct value when
CPU execution mode is switched to LONG mode.
AP uses the same location page table (CR3) and GDT as what BSP uses.
But when the page table or GDT is above 4GB, it's impossible for CPU
to use because GDTR.base and CR3 are 32bits before switching to LONG
mode.
This patch adds check for the CR3, GDT.Base and IDT.Base to not above
32 bits restriction.

Signed-off-by: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
---
 UefiCpuPkg/Library/MpInitLib/DxeMpLib.c |   8 +-
 UefiCpuPkg/Library/MpInitLib/MpLib.c    | 108 +++++++++++++++++++-----
 UefiCpuPkg/Library/MpInitLib/MpLib.h    |   6 +-
 3 files changed, 97 insertions(+), 25 deletions(-)

diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
index 2c00d72dde..27f12a75a8 100644
--- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
@@ -393,13 +393,19 @@ MpInitChangeApLoopCallback (
   )
 {
   CPU_MP_DATA               *CpuMpData;
+  EFI_STATUS                Status;
 
   CpuMpData = GetCpuMpData ();
   CpuMpData->PmCodeSegment = GetProtectedModeCS ();
   CpuMpData->Pm16CodeSegment = GetProtectedMode16CS ();
   CpuMpData->ApLoopMode = PcdGet8 (PcdCpuApLoopMode);
   mNumberToFinish = CpuMpData->CpuCount - 1;
-  WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL, TRUE);
+  Status = WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL, TRUE);
+  if (EFI_ERROR (Status)) {
+    DEBUG ((DEBUG_ERROR, "ERROR :: %a() Change Ap Loop Mode failed!\n", __FUNCTION__));
+    return;
+  }
+
   while (mNumberToFinish > 0) {
     CpuPause ();
   }
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 07426274f6..21b17a7b40 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -470,7 +470,7 @@ GetProcessorNumber (
 
   @return  CPU count detected
 **/
-UINTN
+EFI_STATUS
 CollectProcessorCount (
   IN CPU_MP_DATA         *CpuMpData
   )
@@ -478,12 +478,17 @@ CollectProcessorCount (
   UINTN                  Index;
   CPU_INFO_IN_HOB        *CpuInfoInHob;
   BOOLEAN                X2Apic;
+  EFI_STATUS             Status;
 
   //
   // Send 1st broadcast IPI to APs to wakeup APs
   //
   CpuMpData->InitFlag = ApInitConfig;
-  WakeUpAP (CpuMpData, TRUE, 0, NULL, NULL, TRUE);
+  Status = WakeUpAP (CpuMpData, TRUE, 0, NULL, NULL, TRUE);
+  if (EFI_ERROR (Status)) {
+    return Status;
+  }
+
   CpuMpData->InitFlag = ApInitDone;
   ASSERT (CpuMpData->CpuCount <= PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
   //
@@ -520,7 +525,11 @@ CollectProcessorCount (
     //
     // Wakeup all APs to enable x2APIC mode
     //
-    WakeUpAP (CpuMpData, TRUE, 0, ApFuncEnableX2Apic, NULL, TRUE);
+    Status = WakeUpAP (CpuMpData, TRUE, 0, ApFuncEnableX2Apic, NULL, TRUE);
+    if (EFI_ERROR (Status)) {
+      return Status;
+    }
+
     //
     // Wait for all known APs finished
     //
@@ -546,7 +555,7 @@ CollectProcessorCount (
 
   DEBUG ((DEBUG_INFO, "MpInitLib: Find %d processors in system.\n", CpuMpData->CpuCount));
 
-  return CpuMpData->CpuCount;
+  return EFI_SUCCESS;
 }
 
 /**
@@ -990,7 +999,7 @@ WaitApWakeup (
   @param[in] CpuMpData          Pointer to CPU MP Data
 
 **/
-VOID
+EFI_STATUS
 FillExchangeInfoData (
   IN CPU_MP_DATA               *CpuMpData
   )
@@ -1001,6 +1010,35 @@ FillExchangeInfoData (
   IA32_CR4                         Cr4;
 
   ExchangeInfo                  = CpuMpData->MpCpuExchangeInfo;
+  ExchangeInfo->Cr3             = AsmReadCr3 ();
+  if (ExchangeInfo->Cr3 > 0xFFFFFFFF) {
+    //
+    // AP needs to run from real mode to 32bit mode to LONG mode. Page table
+    // (pointed by CR3) and GDT are necessary to set up to correct value when
+    // CPU execution mode is switched to LONG mode.
+    // AP uses the same location page table (CR3) and GDT as what BSP uses.
+    // But when the page table or GDT is above 4GB, it's impossible for CPU
+    // to use because GDTR.base and CR3 are 32bits before switching to LONG
+    // mode.
+    // Here add check for the CR3, GDT.Base and IDT.Base to not above 32 bits
+    // limitation.
+    //
+    return EFI_UNSUPPORTED;
+  }
+
+  //
+  // Get the BSP's data of GDT and IDT
+  //
+  AsmReadGdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->GdtrProfile);
+  if (ExchangeInfo->GdtrProfile.Base > 0xFFFFFFFF) {
+    return EFI_UNSUPPORTED;
+  }
+
+  AsmReadIdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->IdtrProfile);
+  if (ExchangeInfo->IdtrProfile.Base > 0xFFFFFFFF) {
+    return EFI_UNSUPPORTED;
+  }
+
   ExchangeInfo->Lock            = 0;
   ExchangeInfo->StackStart      = CpuMpData->Buffer;
   ExchangeInfo->StackSize       = CpuMpData->CpuApStackSize;
@@ -1009,9 +1047,6 @@ FillExchangeInfoData (
 
   ExchangeInfo->CodeSegment     = AsmReadCs ();
   ExchangeInfo->DataSegment     = AsmReadDs ();
-
-  ExchangeInfo->Cr3             = AsmReadCr3 ();
-
   ExchangeInfo->CFunction       = (UINTN) ApWakeupFunction;
   ExchangeInfo->ApIndex         = 0;
   ExchangeInfo->NumApsExecuting = 0;
@@ -1037,13 +1072,6 @@ FillExchangeInfoData (
 
   ExchangeInfo->SevEsIsEnabled  = CpuMpData->SevEsIsEnabled;
   ExchangeInfo->GhcbBase        = (UINTN) CpuMpData->GhcbBase;
-
-  //
-  // Get the BSP's data of GDT and IDT
-  //
-  AsmReadGdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->GdtrProfile);
-  AsmReadIdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->IdtrProfile);
-
   //
   // Find a 32-bit code segment
   //
@@ -1084,6 +1112,8 @@ FillExchangeInfoData (
                          (UINT32)ExchangeInfo->ModeOffset -
                          (UINT32)CpuMpData->AddressMap.ModeTransitionOffset;
   ExchangeInfo->ModeHighSegment = (UINT16)ExchangeInfo->CodeSegment;
+
+  return EFI_SUCCESS;
 }
 
 /**
@@ -1308,8 +1338,12 @@ SetSevEsJumpTable (
   @param[in] Procedure          The function to be invoked by AP
   @param[in] ProcedureArgument  The argument to be passed into AP function
   @param[in] WakeUpDisabledAps  Whether need to wake up disabled APs in broadcast mode.
+
+  @retval EFI_SUCCESS           Wake up the AP success.
+  @retval EFI_UNSUPPORTED       Invalid CR3, IDT, GDT value caused fail to wake up AP.
+
 **/
-VOID
+EFI_STATUS
 WakeUpAP (
   IN CPU_MP_DATA               *CpuMpData,
   IN BOOLEAN                   Broadcast,
@@ -1324,6 +1358,7 @@ WakeUpAP (
   CPU_AP_DATA                      *CpuData;
   BOOLEAN                          ResetVectorRequired;
   CPU_INFO_IN_HOB                  *CpuInfoInHob;
+  EFI_STATUS                       Status;
 
   CpuMpData->FinishedCount = 0;
   ResetVectorRequired = FALSE;
@@ -1333,7 +1368,10 @@ WakeUpAP (
     ResetVectorRequired = TRUE;
     AllocateResetVector (CpuMpData);
     AllocateSevEsAPMemory (CpuMpData);
-    FillExchangeInfoData (CpuMpData);
+    Status = FillExchangeInfoData (CpuMpData);
+    if (EFI_ERROR (Status)) {
+      return Status;
+    }
     SaveLocalApicTimerSetting (CpuMpData);
   }
 
@@ -1500,6 +1538,8 @@ WakeUpAP (
   // S3SmmInitDone Ppi.
   //
   CpuMpData->WakeUpByInitSipiSipi = (CpuMpData->ApLoopMode == ApInHltLoop);
+
+  return EFI_SUCCESS;
 }
 
 /**
@@ -1945,6 +1985,7 @@ MpInitLibInitialize (
   UINTN                    ApResetVectorSize;
   UINTN                    BackupBufferAddr;
   UINTN                    ApIdtBase;
+  EFI_STATUS               Status;
 
   OldCpuMpData = GetCpuMpDataFromGuidedHob ();
   if (OldCpuMpData == NULL) {
@@ -2067,7 +2108,10 @@ MpInitLibInitialize (
       //
       // Wakeup all APs and calculate the processor count in system
       //
-      CollectProcessorCount (CpuMpData);
+      Status = CollectProcessorCount (CpuMpData);
+      if (EFI_ERROR (Status)) {
+        return Status;
+      }
     }
   } else {
     //
@@ -2118,7 +2162,11 @@ MpInitLibInitialize (
       //
       CpuMpData->InitFlag = ApInitReconfig;
     }
-    WakeUpAP (CpuMpData, TRUE, 0, ApInitializeSync, CpuMpData, TRUE);
+    Status = WakeUpAP (CpuMpData, TRUE, 0, ApInitializeSync, CpuMpData, TRUE);
+    if (EFI_ERROR (Status)) {
+      return Status;
+    }
+
     //
     // Wait for all APs finished initialization
     //
@@ -2262,6 +2310,7 @@ SwitchBSPWorker (
   MSR_IA32_APIC_BASE_REGISTER  ApicBaseMsr;
   BOOLEAN                      OldInterruptState;
   BOOLEAN                      OldTimerInterruptState;
+  EFI_STATUS                   Status;
 
   //
   // Save and Disable Local APIC timer interrupt
@@ -2333,7 +2382,10 @@ SwitchBSPWorker (
   //
   // Need to wakeUp AP (future BSP).
   //
-  WakeUpAP (CpuMpData, FALSE, ProcessorNumber, FutureBSPProc, CpuMpData, TRUE);
+  Status = WakeUpAP (CpuMpData, FALSE, ProcessorNumber, FutureBSPProc, CpuMpData, TRUE);
+  if (EFI_ERROR (Status)) {
+    return Status;
+  }
 
   AsmExchangeRole (&CpuMpData->BSPInfo, &CpuMpData->APInfo);
 
@@ -2669,14 +2721,21 @@ StartupAllCPUsWorker (
   CpuMpData->WaitEvent     = WaitEvent;
 
   if (!SingleThread) {
-    WakeUpAP (CpuMpData, TRUE, 0, Procedure, ProcedureArgument, FALSE);
+    Status = WakeUpAP (CpuMpData, TRUE, 0, Procedure, ProcedureArgument, FALSE);
+    if (EFI_ERROR (Status)) {
+      return Status;
+    }
   } else {
     for (ProcessorNumber = 0; ProcessorNumber < ProcessorCount; ProcessorNumber++) {
       if (ProcessorNumber == CallerNumber) {
         continue;
       }
       if (CpuMpData->CpuData[ProcessorNumber].Waiting) {
-        WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
+        Status = WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
+        if (EFI_ERROR (Status)) {
+          return Status;
+        }
+
         break;
       }
     }
@@ -2795,7 +2854,10 @@ StartupThisAPWorker (
   CpuData->ExpectedTime = CalculateTimeout (TimeoutInMicroseconds, &CpuData->CurrentTime);
   CpuData->TotalTime    = 0;
 
-  WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
+  Status = WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
+  if (EFI_ERROR (Status)) {
+    return Status;
+  }
 
   //
   // If WaitEvent is NULL, execute in blocking mode.
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h
index 02652eaae1..9cf5c0f9b4 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
@@ -459,8 +459,12 @@ GetSevEsAPMemory (
   @param[in] Procedure          The function to be invoked by AP
   @param[in] ProcedureArgument  The argument to be passed into AP function
   @param[in] WakeUpDisabledAps  Whether need to wake up disabled APs in broadcast mode.
+
+  @retval EFI_SUCCESS           Wake up the AP success.
+  @retval EFI_UNSUPPORTED       Invalid CR3, IDT, GDT value caused fail to wake up AP.
+
 **/
-VOID
+EFI_STATUS
 WakeUpAP (
   IN CPU_MP_DATA               *CpuMpData,
   IN BOOLEAN                   Broadcast,
-- 
2.23.0.windows.1


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#64926): https://edk2.groups.io/g/devel/message/64926
Mute This Topic: https://groups.io/mt/76573401/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [PATCH] UefiCpuPkg/MpInitLib: Add check for CR3/GDT/IDT.

Posted by Laszlo Ersek 2 weeks ago
Hi Eric,

On 09/02/20 02:43, Dong, Eric wrote:
> AP needs to run from real mode to 32bit mode to LONG mode. Page table
> (pointed by CR3) and GDT are necessary to set up to correct value when
> CPU execution mode is switched to LONG mode.
> AP uses the same location page table (CR3) and GDT as what BSP uses.
> But when the page table or GDT is above 4GB, it's impossible for CPU
> to use because GDTR.base and CR3 are 32bits before switching to LONG
> mode.
> This patch adds check for the CR3, GDT.Base and IDT.Base to not above
> 32 bits restriction.
> 
> Signed-off-by: Eric Dong <eric.dong@intel.com>
> Cc: Ray Ni <ray.ni@intel.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> ---
>  UefiCpuPkg/Library/MpInitLib/DxeMpLib.c |   8 +-
>  UefiCpuPkg/Library/MpInitLib/MpLib.c    | 108 +++++++++++++++++++-----
>  UefiCpuPkg/Library/MpInitLib/MpLib.h    |   6 +-
>  3 files changed, 97 insertions(+), 25 deletions(-)

(1) This is not for edk2-stable202008, correct?


(2) If I understand correctly, this patch does not solve the problem
when any one of the IDT, GDT, and CR3, are out of 32-bit address space.
Instead, the patch makes the failure symptoms more graceful.

Is that correct?

If so, can you explain in the commit message what happens without the
patch, versus with the patch, when the IDT / GDT / CR3 are out of 32-bit
address space?

Like, if we abandon MpInitChangeApLoopCallback() mid-way, then (AIUI)
the AP "pen" (HLT loop) will not be relocated to reserved memory at
ExitBootServces(). I don't think that simply giving up can end well for
the OS!


(3) Can you remind us in the commit message where the IDT / GDT / page
tables are actually allocated? That is, can you please name the
component that is usually responsible for keeping the allocations in the
32-bit address space?

And once we know where the IDT / GDT / page tables come from --
shouldn't we rather modify those modules, to allocate IDT / GDT / page
tables in the 32-bit address space?

More below:

> diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> index 2c00d72dde..27f12a75a8 100644
> --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> @@ -393,13 +393,19 @@ MpInitChangeApLoopCallback (
>    )
>  {
>    CPU_MP_DATA               *CpuMpData;
> +  EFI_STATUS                Status;
>  
>    CpuMpData = GetCpuMpData ();
>    CpuMpData->PmCodeSegment = GetProtectedModeCS ();
>    CpuMpData->Pm16CodeSegment = GetProtectedMode16CS ();
>    CpuMpData->ApLoopMode = PcdGet8 (PcdCpuApLoopMode);
>    mNumberToFinish = CpuMpData->CpuCount - 1;
> -  WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL, TRUE);
> +  Status = WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL, TRUE);
> +  if (EFI_ERROR (Status)) {
> +    DEBUG ((DEBUG_ERROR, "ERROR :: %a() Change Ap Loop Mode failed!\n", __FUNCTION__));
> +    return;
> +  }
> +
>    while (mNumberToFinish > 0) {
>      CpuPause ();
>    }
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> index 07426274f6..21b17a7b40 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> @@ -470,7 +470,7 @@ GetProcessorNumber (
>  
>    @return  CPU count detected
>  **/
> -UINTN
> +EFI_STATUS
>  CollectProcessorCount (
>    IN CPU_MP_DATA         *CpuMpData
>    )
> @@ -478,12 +478,17 @@ CollectProcessorCount (
>    UINTN                  Index;
>    CPU_INFO_IN_HOB        *CpuInfoInHob;
>    BOOLEAN                X2Apic;
> +  EFI_STATUS             Status;
>  
>    //
>    // Send 1st broadcast IPI to APs to wakeup APs
>    //
>    CpuMpData->InitFlag = ApInitConfig;
> -  WakeUpAP (CpuMpData, TRUE, 0, NULL, NULL, TRUE);
> +  Status = WakeUpAP (CpuMpData, TRUE, 0, NULL, NULL, TRUE);
> +  if (EFI_ERROR (Status)) {
> +    return Status;
> +  }
> +
>    CpuMpData->InitFlag = ApInitDone;
>    ASSERT (CpuMpData->CpuCount <= PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
>    //
> @@ -520,7 +525,11 @@ CollectProcessorCount (
>      //
>      // Wakeup all APs to enable x2APIC mode
>      //
> -    WakeUpAP (CpuMpData, TRUE, 0, ApFuncEnableX2Apic, NULL, TRUE);
> +    Status = WakeUpAP (CpuMpData, TRUE, 0, ApFuncEnableX2Apic, NULL, TRUE);
> +    if (EFI_ERROR (Status)) {
> +      return Status;
> +    }
> +
>      //
>      // Wait for all known APs finished
>      //
> @@ -546,7 +555,7 @@ CollectProcessorCount (
>  
>    DEBUG ((DEBUG_INFO, "MpInitLib: Find %d processors in system.\n", CpuMpData->CpuCount));
>  
> -  return CpuMpData->CpuCount;
> +  return EFI_SUCCESS;
>  }
>  
>  /**
> @@ -990,7 +999,7 @@ WaitApWakeup (
>    @param[in] CpuMpData          Pointer to CPU MP Data
>  
>  **/
> -VOID
> +EFI_STATUS
>  FillExchangeInfoData (
>    IN CPU_MP_DATA               *CpuMpData
>    )
> @@ -1001,6 +1010,35 @@ FillExchangeInfoData (
>    IA32_CR4                         Cr4;
>  
>    ExchangeInfo                  = CpuMpData->MpCpuExchangeInfo;
> +  ExchangeInfo->Cr3             = AsmReadCr3 ();
> +  if (ExchangeInfo->Cr3 > 0xFFFFFFFF) {
> +    //
> +    // AP needs to run from real mode to 32bit mode to LONG mode. Page table
> +    // (pointed by CR3) and GDT are necessary to set up to correct value when
> +    // CPU execution mode is switched to LONG mode.
> +    // AP uses the same location page table (CR3) and GDT as what BSP uses.
> +    // But when the page table or GDT is above 4GB, it's impossible for CPU
> +    // to use because GDTR.base and CR3 are 32bits before switching to LONG
> +    // mode.
> +    // Here add check for the CR3, GDT.Base and IDT.Base to not above 32 bits
> +    // limitation.
> +    //
> +    return EFI_UNSUPPORTED;
> +  }
> +
> +  //
> +  // Get the BSP's data of GDT and IDT
> +  //
> +  AsmReadGdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->GdtrProfile);
> +  if (ExchangeInfo->GdtrProfile.Base > 0xFFFFFFFF) {
> +    return EFI_UNSUPPORTED;
> +  }
> +
> +  AsmReadIdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->IdtrProfile);
> +  if (ExchangeInfo->IdtrProfile.Base > 0xFFFFFFFF) {
> +    return EFI_UNSUPPORTED;
> +  }
> +
>    ExchangeInfo->Lock            = 0;
>    ExchangeInfo->StackStart      = CpuMpData->Buffer;
>    ExchangeInfo->StackSize       = CpuMpData->CpuApStackSize;

(4) Style: I think we should use

  >= BASE_4GB

rather than

  > 0xFFFFFFFF

(5) Correctness: the conditions above only check the base addresses of
the tables. However I think we should check whether the *last byte* in
each table fits in 32-bits.

(6) How about a different approach: in case we realize the IDT / GDT /
page tables are "out of reach", can we simply copy them under 4GB, and
pass the *copy* to the APs? (I guess this would be difficult with the
page table hierarchy.)

--*--

All in all I'm really surprised (and concerned!) that we need to add an
after-the-fact check for this situation. This situation should never
occur, in my opinion.

Basically the idea seems to be that FillExchangeInfoData() can now fail,
and because of that, WakeUpAP() can now fail too. And we attempt to
propagate that failure out to the callers of WakeUpAP().

But WakeUpAP() is called from *lots* of places, which currently assume
that WakeUpAP() can not fail. I don't think it may even be possible to
cleanly handle WakeUpAP() failure in some of those places. I mentioned
MpInitChangeApLoopCallback() in point (2) above, but
CollectProcessorCount() is a good example too. If
CollectProcessorCount() is abandoned early, then we won't know how many
CPUs are in the system, xAPIC / x2APIC mode will not be set, etc. So
basically MP services in the system will be unusable. I don't think we
should even attempt booting at all, in that case.

I think we should look into point (3) more deeply; that is, figure out
where the page tables / IDT / GDT are allocated in the first place, and
make sure *in those modules* that the allocation is safe.


Hmm wait -- if MpInitLibInitialize() fails (for example because
CollectProcessorCount() fails), then we hit ASSERT_EFI_ERROR()s anyway!
In the following locations:

- InitializeMpSupport() function in "UefiCpuPkg/CpuDxe/CpuMp.c",

- MemoryDiscoveredPpiNotifyCallback() function in
"UefiCpuPkg/CpuMpPei/CpuPaging.c" -- via InitializeCpuMpWorker()

So it seems like this condition is impossible to mitigate, and we're not
going to boot even with this patch (in NOOPT/DEBUG builds) if the
condition occurs (and RELEASE builds will just crash or misbehave badly).

However, in that case, I don't think we should complicate the code with
a bunch of error propagation. We identify the problem in
FillExchangeInfoData(). *If* we cannot fix the agents that are
responsible for the GDT / IDT / page table allocations in the first
place, *then* we should ASSERT() in FillExchangeInfoData() immediately.
It's a serious platform misconfiguration, and we shouldn't even attempt
to continue.

Consider it from the following angle too: if this condition causes
MpInitLibInitialize() to fail, then we'll never reach functions such as
SwitchBSPWorker() and MpInitChangeApLoopCallback() -- so why complicate
them with new branches (that don't do the right thing anyway)?


Is it perhaps the case that the open source code in edk2 *itself* is
safe, but you have some proprietary firmware platforms that use
MpInitLib *but* do not satisfy the 32-bit allocation requirements? How
and where was the problem actually experienced?

Thanks,
Laszlo



> @@ -1009,9 +1047,6 @@ FillExchangeInfoData (
>  
>    ExchangeInfo->CodeSegment     = AsmReadCs ();
>    ExchangeInfo->DataSegment     = AsmReadDs ();
> -
> -  ExchangeInfo->Cr3             = AsmReadCr3 ();
> -
>    ExchangeInfo->CFunction       = (UINTN) ApWakeupFunction;
>    ExchangeInfo->ApIndex         = 0;
>    ExchangeInfo->NumApsExecuting = 0;
> @@ -1037,13 +1072,6 @@ FillExchangeInfoData (
>  
>    ExchangeInfo->SevEsIsEnabled  = CpuMpData->SevEsIsEnabled;
>    ExchangeInfo->GhcbBase        = (UINTN) CpuMpData->GhcbBase;
> -
> -  //
> -  // Get the BSP's data of GDT and IDT
> -  //
> -  AsmReadGdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->GdtrProfile);
> -  AsmReadIdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->IdtrProfile);
> -
>    //
>    // Find a 32-bit code segment
>    //
> @@ -1084,6 +1112,8 @@ FillExchangeInfoData (
>                           (UINT32)ExchangeInfo->ModeOffset -
>                           (UINT32)CpuMpData->AddressMap.ModeTransitionOffset;
>    ExchangeInfo->ModeHighSegment = (UINT16)ExchangeInfo->CodeSegment;
> +
> +  return EFI_SUCCESS;
>  }
>  
>  /**
> @@ -1308,8 +1338,12 @@ SetSevEsJumpTable (
>    @param[in] Procedure          The function to be invoked by AP
>    @param[in] ProcedureArgument  The argument to be passed into AP function
>    @param[in] WakeUpDisabledAps  Whether need to wake up disabled APs in broadcast mode.
> +
> +  @retval EFI_SUCCESS           Wake up the AP success.
> +  @retval EFI_UNSUPPORTED       Invalid CR3, IDT, GDT value caused fail to wake up AP.
> +
>  **/
> -VOID
> +EFI_STATUS
>  WakeUpAP (
>    IN CPU_MP_DATA               *CpuMpData,
>    IN BOOLEAN                   Broadcast,
> @@ -1324,6 +1358,7 @@ WakeUpAP (
>    CPU_AP_DATA                      *CpuData;
>    BOOLEAN                          ResetVectorRequired;
>    CPU_INFO_IN_HOB                  *CpuInfoInHob;
> +  EFI_STATUS                       Status;
>  
>    CpuMpData->FinishedCount = 0;
>    ResetVectorRequired = FALSE;
> @@ -1333,7 +1368,10 @@ WakeUpAP (
>      ResetVectorRequired = TRUE;
>      AllocateResetVector (CpuMpData);
>      AllocateSevEsAPMemory (CpuMpData);
> -    FillExchangeInfoData (CpuMpData);
> +    Status = FillExchangeInfoData (CpuMpData);
> +    if (EFI_ERROR (Status)) {
> +      return Status;
> +    }
>      SaveLocalApicTimerSetting (CpuMpData);
>    }
>  
> @@ -1500,6 +1538,8 @@ WakeUpAP (
>    // S3SmmInitDone Ppi.
>    //
>    CpuMpData->WakeUpByInitSipiSipi = (CpuMpData->ApLoopMode == ApInHltLoop);
> +
> +  return EFI_SUCCESS;
>  }
>  
>  /**
> @@ -1945,6 +1985,7 @@ MpInitLibInitialize (
>    UINTN                    ApResetVectorSize;
>    UINTN                    BackupBufferAddr;
>    UINTN                    ApIdtBase;
> +  EFI_STATUS               Status;
>  
>    OldCpuMpData = GetCpuMpDataFromGuidedHob ();
>    if (OldCpuMpData == NULL) {
> @@ -2067,7 +2108,10 @@ MpInitLibInitialize (
>        //
>        // Wakeup all APs and calculate the processor count in system
>        //
> -      CollectProcessorCount (CpuMpData);
> +      Status = CollectProcessorCount (CpuMpData);
> +      if (EFI_ERROR (Status)) {
> +        return Status;
> +      }
>      }
>    } else {
>      //
> @@ -2118,7 +2162,11 @@ MpInitLibInitialize (
>        //
>        CpuMpData->InitFlag = ApInitReconfig;
>      }
> -    WakeUpAP (CpuMpData, TRUE, 0, ApInitializeSync, CpuMpData, TRUE);
> +    Status = WakeUpAP (CpuMpData, TRUE, 0, ApInitializeSync, CpuMpData, TRUE);
> +    if (EFI_ERROR (Status)) {
> +      return Status;
> +    }
> +
>      //
>      // Wait for all APs finished initialization
>      //
> @@ -2262,6 +2310,7 @@ SwitchBSPWorker (
>    MSR_IA32_APIC_BASE_REGISTER  ApicBaseMsr;
>    BOOLEAN                      OldInterruptState;
>    BOOLEAN                      OldTimerInterruptState;
> +  EFI_STATUS                   Status;
>  
>    //
>    // Save and Disable Local APIC timer interrupt
> @@ -2333,7 +2382,10 @@ SwitchBSPWorker (
>    //
>    // Need to wakeUp AP (future BSP).
>    //
> -  WakeUpAP (CpuMpData, FALSE, ProcessorNumber, FutureBSPProc, CpuMpData, TRUE);
> +  Status = WakeUpAP (CpuMpData, FALSE, ProcessorNumber, FutureBSPProc, CpuMpData, TRUE);
> +  if (EFI_ERROR (Status)) {
> +    return Status;
> +  }
>  
>    AsmExchangeRole (&CpuMpData->BSPInfo, &CpuMpData->APInfo);
>  
> @@ -2669,14 +2721,21 @@ StartupAllCPUsWorker (
>    CpuMpData->WaitEvent     = WaitEvent;
>  
>    if (!SingleThread) {
> -    WakeUpAP (CpuMpData, TRUE, 0, Procedure, ProcedureArgument, FALSE);
> +    Status = WakeUpAP (CpuMpData, TRUE, 0, Procedure, ProcedureArgument, FALSE);
> +    if (EFI_ERROR (Status)) {
> +      return Status;
> +    }
>    } else {
>      for (ProcessorNumber = 0; ProcessorNumber < ProcessorCount; ProcessorNumber++) {
>        if (ProcessorNumber == CallerNumber) {
>          continue;
>        }
>        if (CpuMpData->CpuData[ProcessorNumber].Waiting) {
> -        WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
> +        Status = WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
> +        if (EFI_ERROR (Status)) {
> +          return Status;
> +        }
> +
>          break;
>        }
>      }
> @@ -2795,7 +2854,10 @@ StartupThisAPWorker (
>    CpuData->ExpectedTime = CalculateTimeout (TimeoutInMicroseconds, &CpuData->CurrentTime);
>    CpuData->TotalTime    = 0;
>  
> -  WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
> +  Status = WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
> +  if (EFI_ERROR (Status)) {
> +    return Status;
> +  }
>  
>    //
>    // If WaitEvent is NULL, execute in blocking mode.
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h
> index 02652eaae1..9cf5c0f9b4 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
> @@ -459,8 +459,12 @@ GetSevEsAPMemory (
>    @param[in] Procedure          The function to be invoked by AP
>    @param[in] ProcedureArgument  The argument to be passed into AP function
>    @param[in] WakeUpDisabledAps  Whether need to wake up disabled APs in broadcast mode.
> +
> +  @retval EFI_SUCCESS           Wake up the AP success.
> +  @retval EFI_UNSUPPORTED       Invalid CR3, IDT, GDT value caused fail to wake up AP.
> +
>  **/
> -VOID
> +EFI_STATUS
>  WakeUpAP (
>    IN CPU_MP_DATA               *CpuMpData,
>    IN BOOLEAN                   Broadcast,
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#64959): https://edk2.groups.io/g/devel/message/64959
Mute This Topic: https://groups.io/mt/76573401/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [PATCH] UefiCpuPkg/MpInitLib: Add check for CR3/GDT/IDT.

Posted by Dong, Eric 2 weeks ago
Hi Laszlo,

Thanks for your detail review and good comments, add my reply in below.

This issue was reported by tester. They use a shell application to do the test. In the test, it first moves the page table to above 4G range then calls StartUpThisAp to let AP run the test procedure.  The system will hang during the test.

Add more comments in below inline.

From: Laszlo Ersek <lersek@redhat.com<mailto:lersek@redhat.com>>
Sent: Wednesday, September 2, 2020 3:43 PM
To: devel@edk2.groups.io<mailto:devel@edk2.groups.io>; Dong, Eric <eric.dong@intel.com<mailto:eric.dong@intel.com>>
Cc: Ni, Ray <ray.ni@intel.com<mailto:ray.ni@intel.com>>
Subject: Re: [edk2-devel] [PATCH] UefiCpuPkg/MpInitLib: Add check for CR3/GDT/IDT.

Hi Eric,

On 09/02/20 02:43, Dong, Eric wrote:
> AP needs to run from real mode to 32bit mode to LONG mode. Page table
> (pointed by CR3) and GDT are necessary to set up to correct value when
> CPU execution mode is switched to LONG mode.
> AP uses the same location page table (CR3) and GDT as what BSP uses.
> But when the page table or GDT is above 4GB, it's impossible for CPU
> to use because GDTR.base and CR3 are 32bits before switching to LONG
> mode.
> This patch adds check for the CR3, GDT.Base and IDT.Base to not above
> 32 bits restriction.
>
> Signed-off-by: Eric Dong <eric.dong@intel.com<mailto:eric.dong@intel.com>>
> Cc: Ray Ni <ray.ni@intel.com<mailto:ray.ni@intel.com>>
> Cc: Laszlo Ersek <lersek@redhat.com<mailto:lersek@redhat.com>>
> ---
>  UefiCpuPkg/Library/MpInitLib/DxeMpLib.c |   8 +-
>  UefiCpuPkg/Library/MpInitLib/MpLib.c    | 108 +++++++++++++++++++-----
>  UefiCpuPkg/Library/MpInitLib/MpLib.h    |   6 +-
>  3 files changed, 97 insertions(+), 25 deletions(-)

(1) This is not for edk2-stable202008, correct?
[Eric] Yes, just a bug fix for the issue I received recently.



(2) If I understand correctly, this patch does not solve the problem
when any one of the IDT, GDT, and CR3, are out of 32-bit address space.
Instead, the patch makes the failure symptoms more graceful.

Is that correct?
[Eric] yes, current result is the system hang, I just add error handling to avoid hang.


If so, can you explain in the commit message what happens without the
patch, versus with the patch, when the IDT / GDT / CR3 are out of 32-bit
address space?
[Eric] got it, will include it in next version patch.


Like, if we abandon MpInitChangeApLoopCallback() mid-way, then (AIUI)
the AP "pen" (HLT loop) will not be relocated to reserved memory at
ExitBootServces(). I don't think that simply giving up can end well for
the OS!
[Eric] agree. I don’t find a good way to handle it, so I add an error console log
In it. I will also add  “ASSERT (FALSE)” code to highlight the error.

(3) Can you remind us in the commit message where the IDT / GDT / page
tables are actually allocated? That is, can you please name the
component that is usually responsible for keeping the allocations in the
32-bit address space?

And once we know where the IDT / GDT / page tables come from --
shouldn't we rather modify those modules, to allocate IDT / GDT / page
tables in the 32-bit address space?
[Eric] This issue trigged by a shell application, so I can’t provide the module or driver that cause this failure.


More below:

> diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> index 2c00d72dde..27f12a75a8 100644
> --- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
> @@ -393,13 +393,19 @@ MpInitChangeApLoopCallback (
>    )
>  {
>    CPU_MP_DATA               *CpuMpData;
> +  EFI_STATUS                Status;
>
>    CpuMpData = GetCpuMpData ();
>    CpuMpData->PmCodeSegment = GetProtectedModeCS ();
>    CpuMpData->Pm16CodeSegment = GetProtectedMode16CS ();
>    CpuMpData->ApLoopMode = PcdGet8 (PcdCpuApLoopMode);
>    mNumberToFinish = CpuMpData->CpuCount - 1;
> -  WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL, TRUE);
> +  Status = WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL, TRUE);
> +  if (EFI_ERROR (Status)) {
> +    DEBUG ((DEBUG_ERROR, "ERROR :: %a() Change Ap Loop Mode failed!\n", __FUNCTION__));
> +    return;
> +  }
> +
>    while (mNumberToFinish > 0) {
>      CpuPause ();
>    }
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> index 07426274f6..21b17a7b40 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
> @@ -470,7 +470,7 @@ GetProcessorNumber (
>
>    @return  CPU count detected
>  **/
> -UINTN
> +EFI_STATUS
>  CollectProcessorCount (
>    IN CPU_MP_DATA         *CpuMpData
>    )
> @@ -478,12 +478,17 @@ CollectProcessorCount (
>    UINTN                  Index;
>    CPU_INFO_IN_HOB        *CpuInfoInHob;
>    BOOLEAN                X2Apic;
> +  EFI_STATUS             Status;
>
>    //
>    // Send 1st broadcast IPI to APs to wakeup APs
>    //
>    CpuMpData->InitFlag = ApInitConfig;
> -  WakeUpAP (CpuMpData, TRUE, 0, NULL, NULL, TRUE);
> +  Status = WakeUpAP (CpuMpData, TRUE, 0, NULL, NULL, TRUE);
> +  if (EFI_ERROR (Status)) {
> +    return Status;
> +  }
> +
>    CpuMpData->InitFlag = ApInitDone;
>    ASSERT (CpuMpData->CpuCount <= PcdGet32 (PcdCpuMaxLogicalProcessorNumber));
>    //
> @@ -520,7 +525,11 @@ CollectProcessorCount (
>      //
>      // Wakeup all APs to enable x2APIC mode
>      //
> -    WakeUpAP (CpuMpData, TRUE, 0, ApFuncEnableX2Apic, NULL, TRUE);
> +    Status = WakeUpAP (CpuMpData, TRUE, 0, ApFuncEnableX2Apic, NULL, TRUE);
> +    if (EFI_ERROR (Status)) {
> +      return Status;
> +    }
> +
>      //
>      // Wait for all known APs finished
>      //
> @@ -546,7 +555,7 @@ CollectProcessorCount (
>
>    DEBUG ((DEBUG_INFO, "MpInitLib: Find %d processors in system.\n", CpuMpData->CpuCount));
>
> -  return CpuMpData->CpuCount;
> +  return EFI_SUCCESS;
>  }
>
>  /**
> @@ -990,7 +999,7 @@ WaitApWakeup (
>    @param[in] CpuMpData          Pointer to CPU MP Data
>
>  **/
> -VOID
> +EFI_STATUS
>  FillExchangeInfoData (
>    IN CPU_MP_DATA               *CpuMpData
>    )
> @@ -1001,6 +1010,35 @@ FillExchangeInfoData (
>    IA32_CR4                         Cr4;
>
>    ExchangeInfo                  = CpuMpData->MpCpuExchangeInfo;
> +  ExchangeInfo->Cr3             = AsmReadCr3 ();
> +  if (ExchangeInfo->Cr3 > 0xFFFFFFFF) {
> +    //
> +    // AP needs to run from real mode to 32bit mode to LONG mode. Page table
> +    // (pointed by CR3) and GDT are necessary to set up to correct value when
> +    // CPU execution mode is switched to LONG mode.
> +    // AP uses the same location page table (CR3) and GDT as what BSP uses.
> +    // But when the page table or GDT is above 4GB, it's impossible for CPU
> +    // to use because GDTR.base and CR3 are 32bits before switching to LONG
> +    // mode.
> +    // Here add check for the CR3, GDT.Base and IDT.Base to not above 32 bits
> +    // limitation.
> +    //
> +    return EFI_UNSUPPORTED;
> +  }
> +
> +  //
> +  // Get the BSP's data of GDT and IDT
> +  //
> +  AsmReadGdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->GdtrProfile);
> +  if (ExchangeInfo->GdtrProfile.Base > 0xFFFFFFFF) {
> +    return EFI_UNSUPPORTED;
> +  }
> +
> +  AsmReadIdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->IdtrProfile);
> +  if (ExchangeInfo->IdtrProfile.Base > 0xFFFFFFFF) {
> +    return EFI_UNSUPPORTED;
> +  }
> +
>    ExchangeInfo->Lock            = 0;
>    ExchangeInfo->StackStart      = CpuMpData->Buffer;
>    ExchangeInfo->StackSize       = CpuMpData->CpuApStackSize;

(4) Style: I think we should use

  >= BASE_4GB

rather than

  > 0xFFFFFFFF
[Eric] got it, will update it in next version patch.


(5) Correctness: the conditions above only check the base addresses of
the tables. However I think we should check whether the *last byte* in
each table fits in 32-bits.
[Eric] got it, will update it in next version patch.

(6) How about a different approach: in case we realize the IDT / GDT /
page tables are "out of reach", can we simply copy them under 4GB, and
pass the *copy* to the APs? (I guess this would be difficult with the
page table hierarchy.)
[Eric] This is the first time we met this issue and it was reported by a shell
application that simulates the usage modal. We are not clear about the
real impact of this case. Also, we have a release schedule pressure and
this patch used one of the options to avoid the hang. Maybe we need to
enhance it when we met the real issue later.

--*--

All in all I'm really surprised (and concerned!) that we need to add an
after-the-fact check for this situation. This situation should never
occur, in my opinion.

Basically the idea seems to be that FillExchangeInfoData() can now fail,
and because of that, WakeUpAP() can now fail too. And we attempt to
propagate that failure out to the callers of WakeUpAP().

But WakeUpAP() is called from *lots* of places, which currently assume
that WakeUpAP() can not fail. I don't think it may even be possible to
cleanly handle WakeUpAP() failure in some of those places. I mentioned
MpInitChangeApLoopCallback() in point (2) above, but
CollectProcessorCount() is a good example too. If
CollectProcessorCount() is abandoned early, then we won't know how many
CPUs are in the system, xAPIC / x2APIC mode will not be set, etc. So
basically MP services in the system will be unusable. I don't think we
should even attempt booting at all, in that case.

I think we should look into point (3) more deeply; that is, figure out
where the page tables / IDT / GDT are allocated in the first place, and
make sure *in those modules* that the allocation is safe.


Hmm wait -- if MpInitLibInitialize() fails (for example because
CollectProcessorCount() fails), then we hit ASSERT_EFI_ERROR()s anyway!
In the following locations:

- InitializeMpSupport() function in "UefiCpuPkg/CpuDxe/CpuMp.c",

- MemoryDiscoveredPpiNotifyCallback() function in
"UefiCpuPkg/CpuMpPei/CpuPaging.c" -- via InitializeCpuMpWorker()

So it seems like this condition is impossible to mitigate, and we're not
going to boot even with this patch (in NOOPT/DEBUG builds) if the
condition occurs (and RELEASE builds will just crash or misbehave badly).

However, in that case, I don't think we should complicate the code with
a bunch of error propagation. We identify the problem in
FillExchangeInfoData(). *If* we cannot fix the agents that are
responsible for the GDT / IDT / page table allocations in the first
place, *then* we should ASSERT() in FillExchangeInfoData() immediately.
It's a serious platform misconfiguration, and we shouldn't even attempt
to continue.

Consider it from the following angle too: if this condition causes
MpInitLibInitialize() to fail, then we'll never reach functions such as
SwitchBSPWorker() and MpInitChangeApLoopCallback() -- so why complicate
them with new branches (that don't do the right thing anyway)?


Is it perhaps the case that the open source code in edk2 *itself* is
safe, but you have some proprietary firmware platforms that use
MpInitLib *but* do not satisfy the 32-bit allocation requirements? How
and where was the problem actually experienced?
[Eric] This issue was found by a shell application, not during boot. So the
Code flow is totally different with your analysis.

Thanks,
Eric

Thanks,
Laszlo



> @@ -1009,9 +1047,6 @@ FillExchangeInfoData (
>
>    ExchangeInfo->CodeSegment     = AsmReadCs ();
>    ExchangeInfo->DataSegment     = AsmReadDs ();
> -
> -  ExchangeInfo->Cr3             = AsmReadCr3 ();
> -
>    ExchangeInfo->CFunction       = (UINTN) ApWakeupFunction;
>    ExchangeInfo->ApIndex         = 0;
>    ExchangeInfo->NumApsExecuting = 0;
> @@ -1037,13 +1072,6 @@ FillExchangeInfoData (
>
>    ExchangeInfo->SevEsIsEnabled  = CpuMpData->SevEsIsEnabled;
>    ExchangeInfo->GhcbBase        = (UINTN) CpuMpData->GhcbBase;
> -
> -  //
> -  // Get the BSP's data of GDT and IDT
> -  //
> -  AsmReadGdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->GdtrProfile);
> -  AsmReadIdtr ((IA32_DESCRIPTOR *) &ExchangeInfo->IdtrProfile);
> -
>    //
>    // Find a 32-bit code segment
>    //
> @@ -1084,6 +1112,8 @@ FillExchangeInfoData (
>                           (UINT32)ExchangeInfo->ModeOffset -
>                           (UINT32)CpuMpData->AddressMap.ModeTransitionOffset;
>    ExchangeInfo->ModeHighSegment = (UINT16)ExchangeInfo->CodeSegment;
> +
> +  return EFI_SUCCESS;
>  }
>
>  /**
> @@ -1308,8 +1338,12 @@ SetSevEsJumpTable (
>    @param[in] Procedure          The function to be invoked by AP
>    @param[in] ProcedureArgument  The argument to be passed into AP function
>    @param[in] WakeUpDisabledAps  Whether need to wake up disabled APs in broadcast mode.
> +
> +  @retval EFI_SUCCESS           Wake up the AP success.
> +  @retval EFI_UNSUPPORTED       Invalid CR3, IDT, GDT value caused fail to wake up AP.
> +
>  **/
> -VOID
> +EFI_STATUS
>  WakeUpAP (
>    IN CPU_MP_DATA               *CpuMpData,
>    IN BOOLEAN                   Broadcast,
> @@ -1324,6 +1358,7 @@ WakeUpAP (
>    CPU_AP_DATA                      *CpuData;
>    BOOLEAN                          ResetVectorRequired;
>    CPU_INFO_IN_HOB                  *CpuInfoInHob;
> +  EFI_STATUS                       Status;
>
>    CpuMpData->FinishedCount = 0;
>    ResetVectorRequired = FALSE;
> @@ -1333,7 +1368,10 @@ WakeUpAP (
>      ResetVectorRequired = TRUE;
>      AllocateResetVector (CpuMpData);
>      AllocateSevEsAPMemory (CpuMpData);
> -    FillExchangeInfoData (CpuMpData);
> +    Status = FillExchangeInfoData (CpuMpData);
> +    if (EFI_ERROR (Status)) {
> +      return Status;
> +    }
>      SaveLocalApicTimerSetting (CpuMpData);
>    }
>
> @@ -1500,6 +1538,8 @@ WakeUpAP (
>    // S3SmmInitDone Ppi.
>    //
>    CpuMpData->WakeUpByInitSipiSipi = (CpuMpData->ApLoopMode == ApInHltLoop);
> +
> +  return EFI_SUCCESS;
>  }
>
>  /**
> @@ -1945,6 +1985,7 @@ MpInitLibInitialize (
>    UINTN                    ApResetVectorSize;
>    UINTN                    BackupBufferAddr;
>    UINTN                    ApIdtBase;
> +  EFI_STATUS               Status;
>
>    OldCpuMpData = GetCpuMpDataFromGuidedHob ();
>    if (OldCpuMpData == NULL) {
> @@ -2067,7 +2108,10 @@ MpInitLibInitialize (
>        //
>        // Wakeup all APs and calculate the processor count in system
>        //
> -      CollectProcessorCount (CpuMpData);
> +      Status = CollectProcessorCount (CpuMpData);
> +      if (EFI_ERROR (Status)) {
> +        return Status;
> +      }
>      }
>    } else {
>      //
> @@ -2118,7 +2162,11 @@ MpInitLibInitialize (
>        //
>        CpuMpData->InitFlag = ApInitReconfig;
>      }
> -    WakeUpAP (CpuMpData, TRUE, 0, ApInitializeSync, CpuMpData, TRUE);
> +    Status = WakeUpAP (CpuMpData, TRUE, 0, ApInitializeSync, CpuMpData, TRUE);
> +    if (EFI_ERROR (Status)) {
> +      return Status;
> +    }
> +
>      //
>      // Wait for all APs finished initialization
>      //
> @@ -2262,6 +2310,7 @@ SwitchBSPWorker (
>    MSR_IA32_APIC_BASE_REGISTER  ApicBaseMsr;
>    BOOLEAN                      OldInterruptState;
>    BOOLEAN                      OldTimerInterruptState;
> +  EFI_STATUS                   Status;
>
>    //
>    // Save and Disable Local APIC timer interrupt
> @@ -2333,7 +2382,10 @@ SwitchBSPWorker (
>    //
>    // Need to wakeUp AP (future BSP).
>    //
> -  WakeUpAP (CpuMpData, FALSE, ProcessorNumber, FutureBSPProc, CpuMpData, TRUE);
> +  Status = WakeUpAP (CpuMpData, FALSE, ProcessorNumber, FutureBSPProc, CpuMpData, TRUE);
> +  if (EFI_ERROR (Status)) {
> +    return Status;
> +  }
>
>    AsmExchangeRole (&CpuMpData->BSPInfo, &CpuMpData->APInfo);
>
> @@ -2669,14 +2721,21 @@ StartupAllCPUsWorker (
>    CpuMpData->WaitEvent     = WaitEvent;
>
>    if (!SingleThread) {
> -    WakeUpAP (CpuMpData, TRUE, 0, Procedure, ProcedureArgument, FALSE);
> +    Status = WakeUpAP (CpuMpData, TRUE, 0, Procedure, ProcedureArgument, FALSE);
> +    if (EFI_ERROR (Status)) {
> +      return Status;
> +    }
>    } else {
>      for (ProcessorNumber = 0; ProcessorNumber < ProcessorCount; ProcessorNumber++) {
>        if (ProcessorNumber == CallerNumber) {
>          continue;
>        }
>        if (CpuMpData->CpuData[ProcessorNumber].Waiting) {
> -        WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
> +        Status = WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
> +        if (EFI_ERROR (Status)) {
> +          return Status;
> +        }
> +
>          break;
>        }
>      }
> @@ -2795,7 +2854,10 @@ StartupThisAPWorker (
>    CpuData->ExpectedTime = CalculateTimeout (TimeoutInMicroseconds, &CpuData->CurrentTime);
>    CpuData->TotalTime    = 0;
>
> -  WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
> +  Status = WakeUpAP (CpuMpData, FALSE, ProcessorNumber, Procedure, ProcedureArgument, TRUE);
> +  if (EFI_ERROR (Status)) {
> +    return Status;
> +  }
>
>    //
>    // If WaitEvent is NULL, execute in blocking mode.
> diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h
> index 02652eaae1..9cf5c0f9b4 100644
> --- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
> +++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
> @@ -459,8 +459,12 @@ GetSevEsAPMemory (
>    @param[in] Procedure          The function to be invoked by AP
>    @param[in] ProcedureArgument  The argument to be passed into AP function
>    @param[in] WakeUpDisabledAps  Whether need to wake up disabled APs in broadcast mode.
> +
> +  @retval EFI_SUCCESS           Wake up the AP success.
> +  @retval EFI_UNSUPPORTED       Invalid CR3, IDT, GDT value caused fail to wake up AP.
> +
>  **/
> -VOID
> +EFI_STATUS
>  WakeUpAP (
>    IN CPU_MP_DATA               *CpuMpData,
>    IN BOOLEAN                   Broadcast,
>

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#64990): https://edk2.groups.io/g/devel/message/64990
Mute This Topic: https://groups.io/mt/76573401/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [PATCH] UefiCpuPkg/MpInitLib: Add check for CR3/GDT/IDT.

Posted by Laszlo Ersek 2 weeks ago
On 09/03/20 03:47, Dong, Eric wrote:
> Hi Laszlo,
>
> Thanks for your detail review and good comments, add my reply in
> below.
>
> This issue was reported by tester. They use a shell application to do
> the test. In the test, it first moves the page table to above 4G range
> then calls StartUpThisAp to let AP run the test procedure.  The system
> will hang during the test.
>
> Add more comments in below inline.

Thanks. Let me summarize my understanding:

(1) During normal boot, and normal MP services PPI usage, and normal MP
    services protocol usage, there is no issue.

(2) Right now, we have not identified the exact edk2 core modules and
    locations that are responsible for allocating the IDT / GDT / page
    tables. Nonetheless, the allocations are all placed in the 32-bit
    address space, and so there is no problem with AP startup, during
    normal usage (see point (1)).

(3) There is a UEFI shell application that isn't more closely identified
    in this discussion. It first moves at least one of the IDT / GDT /
    page tables above 4GB, and then calls some member functions of the
    MP services protocol. This causes a hang, experienced during the
    execution of the UEFI shell application.

(4) The present patch recognizes the issue in FillExchangeInfoData(),
    and returns an error. In the relevant use case (UEFI shell
    application), this causes the called MP services protocol member
    function to fail, synchronously. This means the application still
    doesn't work, but there is no hang at least.

Is my understanding correct?

If it is, then I have the following comments:

- Please file a TianoCore BZ for this issue, and capture the use case in
  it (= UEFI shell application that changes the IDT / GDT / page tables
  base). Please feel free to copy parts of the above description, if you
  think that's useful.

- I'm quite doubtful that the use case is valid, in the first place. A
  UEFI application is supposed to consume the services described in the
  UEFI specification. Messing with the page tables is something that a
  UEFI application should *not* do, in my opinion. Such page table
  manipulation is expected to interfere with various DXE drivers in the
  platform firmware.

- Another comment on the UEFI shell application, and the present patch,
  is that their *combination* will still break ExitBootServices().
  Assume that the patch is applied, and the UEFI shell application is
  invoked. The application now fails gracefully, and exits. Then we
  attempt to boot an OS (this is a valid thing to do). Because the
  application moved the IDT / GDT / page tables "out of range",
  MpInitChangeApLoopCallback() will do the wrong thing.

- Most importantly: in MpInitLib, there is a *large* amount of call
  sites, and a large number of call paths that lead to WakeUpAP(), and
  ultimately to FillExchangeInfoData(). This means that changing the
  return type of FillExchangeInfoData() from VOID to EFI_STATUS has a
  "ripple effect" -- many call paths would have to deal with error
  checking, error propagation, and resource release (!!!) along those
  error paths.

  - For example, your current patch leaks resources in WakeUpAP(), when
    FillExchangeInfoData() fails -- see AllocateResetVector() and
    AllocateSevEsAPMemory().

  - For another example, your current patch does not handle several
    WakeUpAP() call sites, such as the ones in
    ResetProcessorToIdleState() and CheckAllAPs().

  Whereas in reality, from the applications point of view, we only need
  the MP services protocol member functions to fail cleanly. Therefore
  we should address this problem *early*; that is, *much less deep* in
  the call stack.


I suggest the following:

- introduce a feature PCD (default value FALSE)

- modify the following functions in MpInitLib:
  - MpInitLibGetNumberOfProcessors
  - MpInitLibGetProcessorInfo
  - MpInitLibStartupAllAPs
  - MpInitLibStartupThisAP
  - MpInitLibSwitchBSP
  - MpInitLibEnableDisableAP
  - MpInitLibWhoAmI

- each modified function should check the feature PCD *very early*. If
  the PCD is true, then the function should pre-emptively verify the IDT
  / GDT / root page table  location. If any one of those objects is
  outside of the 32-bit address space, then the function should fail
  immediately.

- We need to consider the event handlers in DxeMpInitLib:

  - CheckApsStatus() is not a problem, because the above short-circuits
    in the public MpInitLib functions will guarantee that
    "mStopCheckAllApsStatus" is always TRUE.

  - For exit-boot-services and legacy-boot, we need to modify
    MpInitChangeApLoopCallback() however. MpInitChangeApLoopCallback()
    should do the same as the above-listed functions (check the PCD,
    check the IDT / GDT / root page table), and if one of them is above
    4GB, then MpInitChangeApLoopCallback() should hang *hard* -- call
    CpuDeadLoop() --, even in RELEASE builds.

    This is because the UEFI application in question makes OS boot
    impossible, so we should stop the system as early as we realize
    that.

- Of course, I suggest a helper function to encapsulate the PCD check,
  and the GDT / IDT / CR3 checks.

This will let most platforms ignore the special UEFI shell application
use case (which I do feel is invalid). In addition, platforms that do
want to handle this use case, can set the PCD to TRUE for the CpuDxe
module *only* (so that CpuMpPei is not penalized). Finally, this
approach keeps the code clean; long call paths to FillExchangeInfoData()
are not made more complex.

Thanks
Laszlo


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#64997): https://edk2.groups.io/g/devel/message/64997
Mute This Topic: https://groups.io/mt/76573401/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [PATCH] UefiCpuPkg/MpInitLib: Add check for CR3/GDT/IDT.

Posted by Dong, Eric 2 weeks ago
Hi Laszlo,

Thanks for your comments, I send out v2 change.  I agree with you suggest adding PCD to control this check. Detail see v2 change.

Add other comments inline below.

From: Laszlo Ersek <lersek@redhat.com>
Sent: Thursday, September 3, 2020 3:59 PM
To: Dong, Eric <eric.dong@intel.com>; devel@edk2.groups.io
Cc: Ni, Ray <ray.ni@intel.com>
Subject: Re: [edk2-devel] [PATCH] UefiCpuPkg/MpInitLib: Add check for CR3/GDT/IDT.

On 09/03/20 03:47, Dong, Eric wrote:
> Hi Laszlo,
>
> Thanks for your detail review and good comments, add my reply in
> below.
>
> This issue was reported by tester. They use a shell application to do
> the test. In the test, it first moves the page table to above 4G range
> then calls StartUpThisAp to let AP run the test procedure.  The system
> will hang during the test.
>
> Add more comments in below inline.

Thanks. Let me summarize my understanding:

(1) During normal boot, and normal MP services PPI usage, and normal MP
    services protocol usage, there is no issue.
[Eric] Correct.

(2) Right now, we have not identified the exact edk2 core modules and
    locations that are responsible for allocating the IDT / GDT / page
    tables.
[Eric] I don’t check the IDT/GDT yet.
For page table, it created by DxeIpl driver and used for later boot phase.

Nonetheless, the allocations are all placed in the 32-bit
    address space, and so there is no problem with AP startup, during
    normal usage (see point (1)).
[Eric] Correct.


(3) There is a UEFI shell application that isn't more closely identified
    in this discussion. It first moves at least one of the IDT / GDT /
    page tables above 4GB, and then calls some member functions of the
    MP services protocol. This causes a hang, experienced during the
    execution of the UEFI shell application.
[Eric] Correct.


(4) The present patch recognizes the issue in FillExchangeInfoData(),
    and returns an error. In the relevant use case (UEFI shell
    application), this causes the called MP services protocol member
    function to fail, synchronously. This means the application still
    doesn't work, but there is no hang at least.
[Eric] Correct.

Is my understanding correct?

If it is, then I have the following comments:

- Please file a TianoCore BZ for this issue, and capture the use case in
  it (= UEFI shell application that changes the IDT / GDT / page tables
  base). Please feel free to copy parts of the above description, if you
  think that's useful.

- I'm quite doubtful that the use case is valid, in the first place. A
  UEFI application is supposed to consume the services described in the
  UEFI specification. Messing with the page tables is something that a
  UEFI application should *not* do, in my opinion. Such page table
  manipulation is expected to interfere with various DXE drivers in the
  platform firmware.

- Another comment on the UEFI shell application, and the present patch,
  is that their *combination* will still break ExitBootServices().
  Assume that the patch is applied, and the UEFI shell application is
  invoked. The application now fails gracefully, and exits. Then we
  attempt to boot an OS (this is a valid thing to do). Because the
  application moved the IDT / GDT / page tables "out of range",
  MpInitChangeApLoopCallback() will do the wrong thing.

- Most importantly: in MpInitLib, there is a *large* amount of call
  sites, and a large number of call paths that lead to WakeUpAP(), and
  ultimately to FillExchangeInfoData(). This means that changing the
  return type of FillExchangeInfoData() from VOID to EFI_STATUS has a
  "ripple effect" -- many call paths would have to deal with error
  checking, error propagation, and resource release (!!!) along those
  error paths.

  - For example, your current patch leaks resources in WakeUpAP(), when
    FillExchangeInfoData() fails -- see AllocateResetVector() and
    AllocateSevEsAPMemory().

  - For another example, your current patch does not handle several
    WakeUpAP() call sites, such as the ones in
    ResetProcessorToIdleState() and CheckAllAPs().

  Whereas in reality, from the applications point of view, we only need
  the MP services protocol member functions to fail cleanly. Therefore
  we should address this problem *early*; that is, *much less deep* in
  the call stack.


I suggest the following:

- introduce a feature PCD (default value FALSE)

- modify the following functions in MpInitLib:

  - MpInitLibStartupAllAPs
  - MpInitLibStartupThisAP
  - MpInitLibSwitchBSP
  - MpInitLibEnableDisableAP
[Eric] I found above 4 API need to wake up AP to do the task, so I add code to check the CR3/GDT/IDT.
Below 3 API does not need to wake up AP, so I don’t add the check.

  - MpInitLibGetNumberOfProcessors
  - MpInitLibGetProcessorInfo
  - MpInitLibWhoAmI


- each modified function should check the feature PCD *very early*. If
  the PCD is true, then the function should pre-emptively verify the IDT
  / GDT / root page table  location. If any one of those objects is
  outside of the 32-bit address space, then the function should fail
  immediately.

- We need to consider the event handlers in DxeMpInitLib:

  - CheckApsStatus() is not a problem, because the above short-circuits
    in the public MpInitLib functions will guarantee that
    "mStopCheckAllApsStatus" is always TRUE.

  - For exit-boot-services and legacy-boot, we need to modify
    MpInitChangeApLoopCallback() however. MpInitChangeApLoopCallback()
    should do the same as the above-listed functions (check the PCD,
    check the IDT / GDT / root page table), and if one of them is above
    4GB, then MpInitChangeApLoopCallback() should hang *hard* -- call
    CpuDeadLoop() --, even in RELEASE builds.

    This is because the UEFI application in question makes OS boot
    impossible, so we should stop the system as early as we realize
    that.

- Of course, I suggest a helper function to encapsulate the PCD check,
  and the GDT / IDT / CR3 checks.

[Eric] good suggestion, I will follow up to provide v2 changes.

This will let most platforms ignore the special UEFI shell application
use case (which I do feel is invalid). In addition, platforms that do
want to handle this use case, can set the PCD to TRUE for the CpuDxe
module *only* (so that CpuMpPei is not penalized). Finally, this
approach keeps the code clean; long call paths to FillExchangeInfoData()
are not made more complex.

Thanks
Laszlo

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#65008): https://edk2.groups.io/g/devel/message/65008
Mute This Topic: https://groups.io/mt/76573401/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [PATCH] UefiCpuPkg/MpInitLib: Add check for CR3/GDT/IDT.

Posted by Laszlo Ersek 2 weeks ago
On 09/03/20 17:21, Dong, Eric wrote:

> - modify the following functions in MpInitLib:
> 
>   - MpInitLibStartupAllAPs
>   - MpInitLibStartupThisAP
>   - MpInitLibSwitchBSP
>   - MpInitLibEnableDisableAP
> [Eric] I found above 4 API need to wake up AP to do the task, so I add code to check the CR3/GDT/IDT.
> Below 3 API does not need to wake up AP, so I don’t add the check.
> 
>   - MpInitLibGetNumberOfProcessors
>   - MpInitLibGetProcessorInfo
>   - MpInitLibWhoAmI

That's even better, thanks!
Laszlo


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#65010): https://edk2.groups.io/g/devel/message/65010
Mute This Topic: https://groups.io/mt/76573401/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-