[edk2-devel] [PATCH v8 08/10] OvmfPkg/CpuHotplugSmm: add EjectCpu()

Ankur Arora posted 10 patches 3 years, 8 months ago
There is a newer version of this series
[edk2-devel] [PATCH v8 08/10] OvmfPkg/CpuHotplugSmm: add EjectCpu()
Posted by Ankur Arora 3 years, 8 months ago
Add EjectCpu(), which handles the CPU ejection, and provides a holding
area for said CPUs. It is called via SmmCpuFeaturesRendezvousExit(),
at the tail end of the SMI handling.

Also UnplugCpus() now stashes QEMU Selectors of CPUs which need to be
ejected in CPU_HOT_EJECT_DATA.QemuSelectorMap. This is used by
EjectCpu() to identify CPUs marked for ejection.

Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Aaron Young <aaron.young@oracle.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---

Notes:
    Address these review comments from v6:
     (1) s/CpuEject/EjectCpu/g
     (2) Ensure that the added include is in sorted order.
     (3) Switch to a cheaper CpuSleep() based loop instead of
      CpuDeadLoop().  Also add the CpuLib LibraryClass.
     (4) Remove the nested else clause
     (5) Use Laszlo's much clearer comment when we try to map multiple
      QemuSelector to the same ProcessorNum.
     (6a) Fix indentation of the debug print in the block in (5).
     (6b,6c,6d) Fix printf types for ProcessorNum, use FMT_APIC_ID for
      APIC_ID and 0x%Lx for QemuSelector[].
     () As discussed elsewhere add an DEBUG_INFO print logging the
      correspondence between ProcessorNum, APIC_ID, QemuSelector.
     (7a,7b) Use EFI_ALREADY_STARTED instead of EFI_INVALID_PARAMETER and
      document it in the UnplugCpus() comment block.
     ()  As discussed elsewhere, add the import statement for
      PcdCpuHotEjectDataAddress.
     (9) Use Laszlo's comment in the PcdGet64(PcdCpuHotEjectDataAddress)
      description block.
     (10) Change mCpuHotEjectData init state checks from ASSERT to ones
      consistent with similar checks for mCpuHotPlugData.
     (11-14) Get rid of mCpuHotEjectData init loop: moved to a prior
      patch so it can be done at allocation time.
     (15) s/SmmCpuFeaturesSmiRendezvousExit/SmmCpuFeaturesRendezvousExit/
     (16,17) Document the ordering requirements of
      mCpuHotEjectData->Handler, and mCpuHotEjectData->QemuSelectorMap.

    Not addressed:
     (8) Not removing the EjectCount variable as I'd like to minimize
      stores/loads to CPU_HOT_EJECT_DATA->Handler and so would like to do this
      a single time at the end of the iteration.  (It is safe to write multiple
      times to the handler in UnplugCpus() but given the ordering concerns
      around it, it seems cleaner to not access it unnecessarily.)

 OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf |   2 +
 OvmfPkg/CpuHotplugSmm/CpuHotplug.c      | 157 ++++++++++++++++++++++++++++++--
 2 files changed, 151 insertions(+), 8 deletions(-)

diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
index 04322b0d7855..ebcc7e2ac63a 100644
--- a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
+++ b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
@@ -40,6 +40,7 @@ [Packages]
 [LibraryClasses]
   BaseLib
   BaseMemoryLib
+  CpuLib
   DebugLib
   LocalApicLib
   MmServicesTableLib
@@ -54,6 +55,7 @@ [Protocols]
 
 [Pcd]
   gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress                ## CONSUMES
+  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress              ## CONSUMES
   gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase             ## CONSUMES
 
 [FeaturePcd]
diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
index f07b5072749a..851e2b28aad9 100644
--- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
+++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
@@ -10,10 +10,12 @@
 #include <IndustryStandard/Q35MchIch9.h>     // ICH9_APM_CNT
 #include <IndustryStandard/QemuCpuHotplug.h> // QEMU_CPUHP_CMD_GET_PENDING
 #include <Library/BaseLib.h>                 // CpuDeadLoop()
+#include <Library/CpuLib.h>                  // CpuSleep()
 #include <Library/DebugLib.h>                // ASSERT()
 #include <Library/MmServicesTableLib.h>      // gMmst
 #include <Library/PcdLib.h>                  // PcdGetBool()
 #include <Library/SafeIntLib.h>              // SafeUintnSub()
+#include <Pcd/CpuHotEjectData.h>             // CPU_HOT_EJECT_DATA
 #include <Protocol/MmCpuIo.h>                // EFI_MM_CPU_IO_PROTOCOL
 #include <Protocol/SmmCpuService.h>          // EFI_SMM_CPU_SERVICE_PROTOCOL
 #include <Uefi/UefiBaseType.h>               // EFI_STATUS
@@ -32,11 +34,12 @@ STATIC EFI_MM_CPU_IO_PROTOCOL *mMmCpuIo;
 //
 STATIC EFI_SMM_CPU_SERVICE_PROTOCOL *mMmCpuService;
 //
-// This structure is a communication side-channel between the
+// These structures serve as communication side-channels between the
 // EFI_SMM_CPU_SERVICE_PROTOCOL consumer (i.e., this driver) and provider
 // (i.e., PiSmmCpuDxeSmm).
 //
 STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
+STATIC CPU_HOT_EJECT_DATA *mCpuHotEjectData;
 //
 // SMRAM arrays for fetching the APIC IDs of processors with pending events (of
 // known event types), for the time of just one MMI.
@@ -188,18 +191,72 @@ RevokeNewSlot:
 }
 
 /**
+  CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit()
+  on each CPU at exit from SMM.
+
+  If, the executing CPU is not being ejected, nothing to be done.
+  If, the executing CPU is being ejected, wait in a halted loop
+  until ejected.
+
+  @param[in] ProcessorNum      ProcessorNum denotes the CPU exiting SMM,
+                               and will be used as an index into
+                               CPU_HOT_EJECT_DATA->QemuSelectorMap. It is
+                               identical to the processor handle number in
+                               EFI_SMM_CPU_SERVICE_PROTOCOL.
+**/
+VOID
+EFIAPI
+EjectCpu (
+  IN UINTN ProcessorNum
+  )
+{
+  UINT64 QemuSelector;
+
+  QemuSelector = mCpuHotEjectData->QemuSelectorMap[ProcessorNum];
+  if (QemuSelector == CPU_EJECT_QEMU_SELECTOR_INVALID) {
+    return;
+  }
+
+  //
+  // CPU(s) being unplugged get here from SmmCpuFeaturesRendezvousExit()
+  // after having been cleared to exit the SMI by the BSP and thus have
+  // no SMM processing remaining.
+  //
+  // Given that we cannot allow them to escape to the guest, we pen them
+  // here until the BSP tells QEMU to unplug them.
+  //
+  for (;;) {
+    DisableInterrupts ();
+    CpuSleep ();
+  }
+}
+
+/**
   Process to be hot-unplugged CPUs, per QemuCpuhpCollectApicIds().
 
   For each such CPU, report the CPU to PiSmmCpuDxeSmm via
-  EFI_SMM_CPU_SERVICE_PROTOCOL. If the to be hot-unplugged CPU is
-  unknown, skip it silently.
+  EFI_SMM_CPU_SERVICE_PROTOCOL and stash the APIC ID for later ejection.
+  If the to be hot-unplugged CPU is unknown, skip it silently.
+
+  Additonally, if we do stash any APIC IDs, also install a CPU eject handler
+  which would handle the ejection.
 
   @param[in] ToUnplugApicIds    The APIC IDs of the CPUs that are about to be
                                 hot-unplugged.
 
+  @param[in] ToUnplugSelector   The QEMU Selectors of the CPUs that are about to
+                                be hot-unplugged.
+
   @param[in] ToUnplugCount      The number of filled-in APIC IDs in
                                 ToUnplugApicIds.
 
+  @retval EFI_ALREADY_STARTED   For the ProcessorNum that
+                                EFI_SMM_CPU_SERVICE_PROTOCOL had assigned to
+                                one of the APIC ID in ToUnplugApicIds,
+                                mCpuHotEjectData->QemuSelectorMap already has
+                                the QemuSelector value stashed. (This should
+                                never happen.)
+
   @retval EFI_SUCCESS           Known APIC IDs have been removed from SMM data
                                 structures.
 
@@ -210,23 +267,36 @@ STATIC
 EFI_STATUS
 UnplugCpus (
   IN APIC_ID                      *ToUnplugApicIds,
+  IN UINT32                       *ToUnplugSelector,
   IN UINT32                       ToUnplugCount
   )
 {
   EFI_STATUS Status;
   UINT32     ToUnplugIdx;
+  UINT32     EjectCount;
   UINTN      ProcessorNum;
 
   ToUnplugIdx = 0;
+  EjectCount = 0;
   while (ToUnplugIdx < ToUnplugCount) {
     APIC_ID    RemoveApicId;
+    UINT32     QemuSelector;
 
     RemoveApicId = ToUnplugApicIds[ToUnplugIdx];
+    QemuSelector = ToUnplugSelector[ToUnplugIdx];
 
     //
-    // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use it to find
-    // the ProcessorNum for the APIC ID to be removed.
+    // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use RemoveApicId
+    // to find the corresponding ProcessorNum for the CPU to be removed.
     //
+    // With this we can establish a 3 way mapping:
+    //    APIC_ID -- ProcessorNum -- QemuSelector
+    //
+    // We stash the ProcessorNum -> QemuSelector mapping so it can later be
+    // used for CPU hot-eject in SmmCpuFeaturesRendezvousExit() context (where
+    // we only have ProcessorNum available.)
+    //
+
     for (ProcessorNum = 0;
          ProcessorNum < mCpuHotPlugData->ArrayLength;
          ProcessorNum++) {
@@ -255,11 +325,64 @@ UnplugCpus (
       return Status;
     }
 
+    if (mCpuHotEjectData->QemuSelectorMap[ProcessorNum] !=
+          CPU_EJECT_QEMU_SELECTOR_INVALID) {
+      //
+      // mCpuHotEjectData->QemuSelectorMap[ProcessorNum] is set to
+      // CPU_EJECT_QEMU_SELECTOR_INVALID when mCpuHotEjectData->QemuSelectorMap
+      // is allocated, and once the subject processsor is ejected.
+      //
+      // Additionally, mMmCpuService->RemoveProcessor(ProcessorNum) invalidates
+      // mCpuHotPlugData->ApicId[ProcessorNum], so a given ProcessorNum can
+      // never match more than one APIC ID and by transitivity, more than one
+      // QemuSelector in a single invocation of UnplugCpus().
+      //
+      DEBUG ((DEBUG_ERROR, "%a: ProcessorNum %Lu maps to QemuSelector 0x%Lx, "
+        "cannot also map to 0x%Lx\n", __FUNCTION__, (UINT64)ProcessorNum,
+        (UINT64)mCpuHotEjectData->QemuSelectorMap[ProcessorNum], QemuSelector));
+
+      Status = EFI_ALREADY_STARTED;
+      return Status;
+    }
+
+    //
+    // Stash the QemuSelector so we can do the actual ejection later.
+    //
+    mCpuHotEjectData->QemuSelectorMap[ProcessorNum] = (UINT64)QemuSelector;
+
+    DEBUG ((DEBUG_INFO, "%a: Started hot-unplug on ProcessorNum %Lu, APIC ID "
+      FMT_APIC_ID ", QemuSelector 0x%Lx\n", __FUNCTION__, (UINT64)ProcessorNum,
+      RemoveApicId, mCpuHotEjectData->QemuSelectorMap[ProcessorNum]));
+
+    EjectCount++;
     ToUnplugIdx++;
   }
 
+  if (EjectCount != 0) {
+    //
+    // We have processors to be ejected; install the handler.
+    //
+    mCpuHotEjectData->Handler = EjectCpu;
+
+    //
+    // The BSP, CPUs to be ejected dereference mCpuHotEjectData->Handler, and
+    // mCpuHotEjectData->QemuSelectorMap[] in SmmCpuFeaturesRendezvousExit().
+    //
+    // Assignments to both of these are ordered-before the BSP's SMI exit signal
+    // which happens via a write to SMM_DISPATCHER_MP_SYNC_DATA->AllCpusInSync.
+    // Dereferences of both are ordered-after the synchronization via
+    // "AllCpusInSync".
+    //
+    // So we are guaranteed that the Handler would see the assignments above.
+    // However, add a MemoryFence() here in-lieu of a compiler barrier to
+    // ensure that the compiler doesn't monkey around with the stores.
+    //
+    MemoryFence ();
+  }
+
   //
-  // We've removed this set of APIC IDs from SMM data structures.
+  // We've removed this set of APIC IDs from SMM data structures and
+  // have installed an ejection handler if needed.
   //
   return EFI_SUCCESS;
 }
@@ -387,7 +510,7 @@ CpuHotplugMmi (
   }
 
   if (ToUnplugCount > 0) {
-    Status = UnplugCpus (mToUnplugApicIds, ToUnplugCount);
+    Status = UnplugCpus (mToUnplugApicIds, mToUnplugSelector, ToUnplugCount);
     if (EFI_ERROR (Status)) {
       goto Fatal;
     }
@@ -458,9 +581,14 @@ CpuHotplugEntry (
 
   //
   // Our DEPEX on EFI_SMM_CPU_SERVICE_PROTOCOL guarantees that PiSmmCpuDxeSmm
-  // has pointed PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM.
+  // has pointed:
+  // - PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM,
+  // - PcdCpuHotEjectDataAddress to CPU_HOT_EJECT_DATA in SMRAM, if the
+  //   possible CPU count is greater than 1.
   //
   mCpuHotPlugData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotPlugDataAddress);
+  mCpuHotEjectData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotEjectDataAddress);
+
   if (mCpuHotPlugData == NULL) {
     Status = EFI_NOT_FOUND;
     DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_PLUG_DATA: %r\n", __FUNCTION__, Status));
@@ -472,6 +600,19 @@ CpuHotplugEntry (
   if (mCpuHotPlugData->ArrayLength == 1) {
     return EFI_UNSUPPORTED;
   }
+
+  if (mCpuHotEjectData == NULL) {
+    Status = EFI_NOT_FOUND;
+  } else if (mCpuHotPlugData->ArrayLength != mCpuHotEjectData->ArrayLength) {
+    Status = EFI_INVALID_PARAMETER;
+  } else {
+    Status = EFI_SUCCESS;
+  }
+  if (EFI_ERROR (Status)) {
+    DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_EJECT_DATA: %r\n", __FUNCTION__, Status));
+    goto Fatal;
+  }
+
   //
   // Allocate the data structures that depend on the possible CPU count.
   //
-- 
2.9.3



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#71919): https://edk2.groups.io/g/devel/message/71919
Mute This Topic: https://groups.io/mt/80819863/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH v8 08/10] OvmfPkg/CpuHotplugSmm: add EjectCpu()
Posted by Laszlo Ersek 3 years, 8 months ago
superficial comments only; the patch is nearly ready:

On 02/22/21 08:19, Ankur Arora wrote:
> Add EjectCpu(), which handles the CPU ejection, and provides a holding
> area for said CPUs. It is called via SmmCpuFeaturesRendezvousExit(),
> at the tail end of the SMI handling.
>
> Also UnplugCpus() now stashes QEMU Selectors of CPUs which need to be
> ejected in CPU_HOT_EJECT_DATA.QemuSelectorMap. This is used by
> EjectCpu() to identify CPUs marked for ejection.
>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: Aaron Young <aaron.young@oracle.com>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>
> Notes:
>     Address these review comments from v6:
>      (1) s/CpuEject/EjectCpu/g
>      (2) Ensure that the added include is in sorted order.
>      (3) Switch to a cheaper CpuSleep() based loop instead of
>       CpuDeadLoop().  Also add the CpuLib LibraryClass.
>      (4) Remove the nested else clause
>      (5) Use Laszlo's much clearer comment when we try to map multiple
>       QemuSelector to the same ProcessorNum.
>      (6a) Fix indentation of the debug print in the block in (5).
>      (6b,6c,6d) Fix printf types for ProcessorNum, use FMT_APIC_ID for
>       APIC_ID and 0x%Lx for QemuSelector[].
>      () As discussed elsewhere add an DEBUG_INFO print logging the
>       correspondence between ProcessorNum, APIC_ID, QemuSelector.
>      (7a,7b) Use EFI_ALREADY_STARTED instead of EFI_INVALID_PARAMETER and
>       document it in the UnplugCpus() comment block.
>      ()  As discussed elsewhere, add the import statement for
>       PcdCpuHotEjectDataAddress.
>      (9) Use Laszlo's comment in the PcdGet64(PcdCpuHotEjectDataAddress)
>       description block.
>      (10) Change mCpuHotEjectData init state checks from ASSERT to ones
>       consistent with similar checks for mCpuHotPlugData.
>      (11-14) Get rid of mCpuHotEjectData init loop: moved to a prior
>       patch so it can be done at allocation time.
>      (15) s/SmmCpuFeaturesSmiRendezvousExit/SmmCpuFeaturesRendezvousExit/
>      (16,17) Document the ordering requirements of
>       mCpuHotEjectData->Handler, and mCpuHotEjectData->QemuSelectorMap.
>
>     Not addressed:
>      (8) Not removing the EjectCount variable as I'd like to minimize
>       stores/loads to CPU_HOT_EJECT_DATA->Handler and so would like to do this
>       a single time at the end of the iteration.  (It is safe to write multiple
>       times to the handler in UnplugCpus() but given the ordering concerns
>       around it, it seems cleaner to not access it unnecessarily.)
>
>  OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf |   2 +
>  OvmfPkg/CpuHotplugSmm/CpuHotplug.c      | 157 ++++++++++++++++++++++++++++++--
>  2 files changed, 151 insertions(+), 8 deletions(-)
>
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
> index 04322b0d7855..ebcc7e2ac63a 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
> @@ -40,6 +40,7 @@ [Packages]
>  [LibraryClasses]
>    BaseLib
>    BaseMemoryLib
> +  CpuLib
>    DebugLib
>    LocalApicLib
>    MmServicesTableLib
> @@ -54,6 +55,7 @@ [Protocols]
>
>  [Pcd]
>    gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress                ## CONSUMES
> +  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress              ## CONSUMES
>    gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase             ## CONSUMES
>
>  [FeaturePcd]
> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> index f07b5072749a..851e2b28aad9 100644
> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
> @@ -10,10 +10,12 @@
>  #include <IndustryStandard/Q35MchIch9.h>     // ICH9_APM_CNT
>  #include <IndustryStandard/QemuCpuHotplug.h> // QEMU_CPUHP_CMD_GET_PENDING
>  #include <Library/BaseLib.h>                 // CpuDeadLoop()
> +#include <Library/CpuLib.h>                  // CpuSleep()
>  #include <Library/DebugLib.h>                // ASSERT()
>  #include <Library/MmServicesTableLib.h>      // gMmst
>  #include <Library/PcdLib.h>                  // PcdGetBool()
>  #include <Library/SafeIntLib.h>              // SafeUintnSub()
> +#include <Pcd/CpuHotEjectData.h>             // CPU_HOT_EJECT_DATA
>  #include <Protocol/MmCpuIo.h>                // EFI_MM_CPU_IO_PROTOCOL
>  #include <Protocol/SmmCpuService.h>          // EFI_SMM_CPU_SERVICE_PROTOCOL
>  #include <Uefi/UefiBaseType.h>               // EFI_STATUS
> @@ -32,11 +34,12 @@ STATIC EFI_MM_CPU_IO_PROTOCOL *mMmCpuIo;
>  //
>  STATIC EFI_SMM_CPU_SERVICE_PROTOCOL *mMmCpuService;
>  //
> -// This structure is a communication side-channel between the
> +// These structures serve as communication side-channels between the
>  // EFI_SMM_CPU_SERVICE_PROTOCOL consumer (i.e., this driver) and provider
>  // (i.e., PiSmmCpuDxeSmm).
>  //
>  STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
> +STATIC CPU_HOT_EJECT_DATA *mCpuHotEjectData;
>  //
>  // SMRAM arrays for fetching the APIC IDs of processors with pending events (of
>  // known event types), for the time of just one MMI.
> @@ -188,18 +191,72 @@ RevokeNewSlot:
>  }
>
>  /**
> +  CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit()
> +  on each CPU at exit from SMM.
> +
> +  If, the executing CPU is not being ejected, nothing to be done.
> +  If, the executing CPU is being ejected, wait in a halted loop
> +  until ejected.
> +
> +  @param[in] ProcessorNum      ProcessorNum denotes the CPU exiting SMM,
> +                               and will be used as an index into
> +                               CPU_HOT_EJECT_DATA->QemuSelectorMap. It is
> +                               identical to the processor handle number in
> +                               EFI_SMM_CPU_SERVICE_PROTOCOL.
> +**/
> +VOID
> +EFIAPI
> +EjectCpu (
> +  IN UINTN ProcessorNum
> +  )
> +{
> +  UINT64 QemuSelector;
> +
> +  QemuSelector = mCpuHotEjectData->QemuSelectorMap[ProcessorNum];
> +  if (QemuSelector == CPU_EJECT_QEMU_SELECTOR_INVALID) {
> +    return;
> +  }
> +
> +  //
> +  // CPU(s) being unplugged get here from SmmCpuFeaturesRendezvousExit()
> +  // after having been cleared to exit the SMI by the BSP and thus have
> +  // no SMM processing remaining.
> +  //
> +  // Given that we cannot allow them to escape to the guest, we pen them
> +  // here until the BSP tells QEMU to unplug them.
> +  //
> +  for (;;) {
> +    DisableInterrupts ();
> +    CpuSleep ();
> +  }
> +}
> +
> +/**
>    Process to be hot-unplugged CPUs, per QemuCpuhpCollectApicIds().
>
>    For each such CPU, report the CPU to PiSmmCpuDxeSmm via
> -  EFI_SMM_CPU_SERVICE_PROTOCOL. If the to be hot-unplugged CPU is
> -  unknown, skip it silently.
> +  EFI_SMM_CPU_SERVICE_PROTOCOL and stash the APIC ID for later ejection.
> +  If the to be hot-unplugged CPU is unknown, skip it silently.
> +
> +  Additonally, if we do stash any APIC IDs, also install a CPU eject handler
> +  which would handle the ejection.

(1) Please update the two APIC ID references above to QEMU CPU selectors
(the commit message has been updated already, correctly).


>
>    @param[in] ToUnplugApicIds    The APIC IDs of the CPUs that are about to be
>                                  hot-unplugged.
>
> +  @param[in] ToUnplugSelector   The QEMU Selectors of the CPUs that are about to
> +                                be hot-unplugged.
> +

(2) Please rename the parameter to "ToUnplugSelectors" (plural), both
here in the comment and in the actual parameter list.


>    @param[in] ToUnplugCount      The number of filled-in APIC IDs in
>                                  ToUnplugApicIds.
>
> +  @retval EFI_ALREADY_STARTED   For the ProcessorNum that
> +                                EFI_SMM_CPU_SERVICE_PROTOCOL had assigned to
> +                                one of the APIC ID in ToUnplugApicIds,
> +                                mCpuHotEjectData->QemuSelectorMap already has
> +                                the QemuSelector value stashed. (This should
> +                                never happen.)
> +

(3) Unfortunately I made a typing error in my v6 review where I
suggested this language: it should say

  one of the APIC ID*s* in ToUnplugApicIds

(emphasis only here, in this comment)


>    @retval EFI_SUCCESS           Known APIC IDs have been removed from SMM data
>                                  structures.
>
> @@ -210,23 +267,36 @@ STATIC
>  EFI_STATUS
>  UnplugCpus (
>    IN APIC_ID                      *ToUnplugApicIds,
> +  IN UINT32                       *ToUnplugSelector,
>    IN UINT32                       ToUnplugCount
>    )
>  {
>    EFI_STATUS Status;
>    UINT32     ToUnplugIdx;
> +  UINT32     EjectCount;
>    UINTN      ProcessorNum;
>
>    ToUnplugIdx = 0;
> +  EjectCount = 0;
>    while (ToUnplugIdx < ToUnplugCount) {
>      APIC_ID    RemoveApicId;
> +    UINT32     QemuSelector;
>
>      RemoveApicId = ToUnplugApicIds[ToUnplugIdx];
> +    QemuSelector = ToUnplugSelector[ToUnplugIdx];
>
>      //
> -    // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use it to find
> -    // the ProcessorNum for the APIC ID to be removed.
> +    // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use RemoveApicId
> +    // to find the corresponding ProcessorNum for the CPU to be removed.
>      //
> +    // With this we can establish a 3 way mapping:
> +    //    APIC_ID -- ProcessorNum -- QemuSelector
> +    //
> +    // We stash the ProcessorNum -> QemuSelector mapping so it can later be
> +    // used for CPU hot-eject in SmmCpuFeaturesRendezvousExit() context (where
> +    // we only have ProcessorNum available.)
> +    //
> +
>      for (ProcessorNum = 0;
>           ProcessorNum < mCpuHotPlugData->ArrayLength;
>           ProcessorNum++) {
> @@ -255,11 +325,64 @@ UnplugCpus (
>        return Status;
>      }
>
> +    if (mCpuHotEjectData->QemuSelectorMap[ProcessorNum] !=
> +          CPU_EJECT_QEMU_SELECTOR_INVALID) {

(4) Invalid indentation; in this (controlling expression) context,
CPU_EJECT_QEMU_SELECTOR_INVALID should line up with "mCpuHotEjectData".


> +      //
> +      // mCpuHotEjectData->QemuSelectorMap[ProcessorNum] is set to
> +      // CPU_EJECT_QEMU_SELECTOR_INVALID when mCpuHotEjectData->QemuSelectorMap
> +      // is allocated, and once the subject processsor is ejected.
> +      //
> +      // Additionally, mMmCpuService->RemoveProcessor(ProcessorNum) invalidates
> +      // mCpuHotPlugData->ApicId[ProcessorNum], so a given ProcessorNum can
> +      // never match more than one APIC ID and by transitivity, more than one
> +      // QemuSelector in a single invocation of UnplugCpus().
> +      //

(5) good comment, but please make it a tiny bit easier to read: please
insert two "em dashes", as follows:

      // never match more than one APIC ID -- and by transitivity, designate
      // more than one QemuSelector -- in a single invocation of UnplugCpus().

(I've also inserted the verb "designate", because "match" doesn't seem
to apply in that context.)


> +      DEBUG ((DEBUG_ERROR, "%a: ProcessorNum %Lu maps to QemuSelector 0x%Lx, "
> +        "cannot also map to 0x%Lx\n", __FUNCTION__, (UINT64)ProcessorNum,
> +        (UINT64)mCpuHotEjectData->QemuSelectorMap[ProcessorNum], QemuSelector));

(6a) "mCpuHotEjectData->QemuSelectorMap[ProcessorNum]" has type UINT64,
so the cast is superfluous

(6b) we log selectors in all other places in decimal, so please use %Lu
here, not 0x%Lx, for logging
"mCpuHotEjectData->QemuSelectorMap[ProcessorNum]"

(6c) "QemuSelector" has type UINT32 (correctly), so we need to log it
with either "0x%x" or "%u". Given my above point about logging selectors
in decimal everywhere else, please use "%u".

      DEBUG ((DEBUG_ERROR, "%a: ProcessorNum %Lu maps to QemuSelector %Lu, "
        "cannot also map to %u\n", __FUNCTION__, (UINT64)ProcessorNum,
        mCpuHotEjectData->QemuSelectorMap[ProcessorNum], QemuSelector));


> +
> +      Status = EFI_ALREADY_STARTED;
> +      return Status;
> +    }

(7) style nit -- this looks better as "return EFI_ALREADY_STARTED".


> +
> +    //
> +    // Stash the QemuSelector so we can do the actual ejection later.
> +    //
> +    mCpuHotEjectData->QemuSelectorMap[ProcessorNum] = (UINT64)QemuSelector;
> +
> +    DEBUG ((DEBUG_INFO, "%a: Started hot-unplug on ProcessorNum %Lu, APIC ID "
> +      FMT_APIC_ID ", QemuSelector 0x%Lx\n", __FUNCTION__, (UINT64)ProcessorNum,
> +      RemoveApicId, mCpuHotEjectData->QemuSelectorMap[ProcessorNum]));

(8a) In the format string, please replace "QemuSelector 0x%Lx" with
"QemuSelector %u" -- the main point is the decimal notation

(8b) As a suggestion, I think we should simplify this DEBUG call by
replacing the "mCpuHotEjectData->QemuSelectorMap[ProcessorNum]" argument
with just "QemuSelector".

    DEBUG ((DEBUG_INFO, "%a: Started hot-unplug on ProcessorNum %Lu, APIC ID "
      FMT_APIC_ID ", QemuSelector %u\n", __FUNCTION__, (UINT64)ProcessorNum,
      RemoveApicId, QemuSelector));


> +
> +    EjectCount++;

(I'm fine with keeping the "EjectCount" variable.)


>      ToUnplugIdx++;
>    }
>
> +  if (EjectCount != 0) {
> +    //
> +    // We have processors to be ejected; install the handler.
> +    //
> +    mCpuHotEjectData->Handler = EjectCpu;
> +
> +    //
> +    // The BSP, CPUs to be ejected dereference mCpuHotEjectData->Handler, and
> +    // mCpuHotEjectData->QemuSelectorMap[] in SmmCpuFeaturesRendezvousExit().
> +    //
> +    // Assignments to both of these are ordered-before the BSP's SMI exit signal
> +    // which happens via a write to SMM_DISPATCHER_MP_SYNC_DATA->AllCpusInSync.
> +    // Dereferences of both are ordered-after the synchronization via
> +    // "AllCpusInSync".
> +    //
> +    // So we are guaranteed that the Handler would see the assignments above.
> +    // However, add a MemoryFence() here in-lieu of a compiler barrier to
> +    // ensure that the compiler doesn't monkey around with the stores.
> +    //
> +    MemoryFence ();
> +  }
> +

(9) Per previous discussion (under patch v8 #7), please replace the last
paragraph of this comment, with a "ReleaseMemoryFence" reference.

The rest looks good.

Thanks!
Laszlo

>    //
> -  // We've removed this set of APIC IDs from SMM data structures.
> +  // We've removed this set of APIC IDs from SMM data structures and
> +  // have installed an ejection handler if needed.
>    //
>    return EFI_SUCCESS;
>  }
> @@ -387,7 +510,7 @@ CpuHotplugMmi (
>    }
>
>    if (ToUnplugCount > 0) {
> -    Status = UnplugCpus (mToUnplugApicIds, ToUnplugCount);
> +    Status = UnplugCpus (mToUnplugApicIds, mToUnplugSelector, ToUnplugCount);
>      if (EFI_ERROR (Status)) {
>        goto Fatal;
>      }
> @@ -458,9 +581,14 @@ CpuHotplugEntry (
>
>    //
>    // Our DEPEX on EFI_SMM_CPU_SERVICE_PROTOCOL guarantees that PiSmmCpuDxeSmm
> -  // has pointed PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM.
> +  // has pointed:
> +  // - PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM,
> +  // - PcdCpuHotEjectDataAddress to CPU_HOT_EJECT_DATA in SMRAM, if the
> +  //   possible CPU count is greater than 1.
>    //
>    mCpuHotPlugData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotPlugDataAddress);
> +  mCpuHotEjectData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotEjectDataAddress);
> +
>    if (mCpuHotPlugData == NULL) {
>      Status = EFI_NOT_FOUND;
>      DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_PLUG_DATA: %r\n", __FUNCTION__, Status));
> @@ -472,6 +600,19 @@ CpuHotplugEntry (
>    if (mCpuHotPlugData->ArrayLength == 1) {
>      return EFI_UNSUPPORTED;
>    }
> +
> +  if (mCpuHotEjectData == NULL) {
> +    Status = EFI_NOT_FOUND;
> +  } else if (mCpuHotPlugData->ArrayLength != mCpuHotEjectData->ArrayLength) {
> +    Status = EFI_INVALID_PARAMETER;
> +  } else {
> +    Status = EFI_SUCCESS;
> +  }
> +  if (EFI_ERROR (Status)) {
> +    DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_EJECT_DATA: %r\n", __FUNCTION__, Status));
> +    goto Fatal;
> +  }
> +
>    //
>    // Allocate the data structures that depend on the possible CPU count.
>    //
>



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#72113): https://edk2.groups.io/g/devel/message/72113
Mute This Topic: https://groups.io/mt/80819863/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH v8 08/10] OvmfPkg/CpuHotplugSmm: add EjectCpu()
Posted by Ankur Arora 3 years, 8 months ago
On 2021-02-23 12:36 p.m., Laszlo Ersek wrote:
> superficial comments only; the patch is nearly ready:
> 
> On 02/22/21 08:19, Ankur Arora wrote:
>> Add EjectCpu(), which handles the CPU ejection, and provides a holding
>> area for said CPUs. It is called via SmmCpuFeaturesRendezvousExit(),
>> at the tail end of the SMI handling.
>>
>> Also UnplugCpus() now stashes QEMU Selectors of CPUs which need to be
>> ejected in CPU_HOT_EJECT_DATA.QemuSelectorMap. This is used by
>> EjectCpu() to identify CPUs marked for ejection.
>>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ard.biesheuvel@arm.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: Aaron Young <aaron.young@oracle.com>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=3132
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>
>> Notes:
>>      Address these review comments from v6:
>>       (1) s/CpuEject/EjectCpu/g
>>       (2) Ensure that the added include is in sorted order.
>>       (3) Switch to a cheaper CpuSleep() based loop instead of
>>        CpuDeadLoop().  Also add the CpuLib LibraryClass.
>>       (4) Remove the nested else clause
>>       (5) Use Laszlo's much clearer comment when we try to map multiple
>>        QemuSelector to the same ProcessorNum.
>>       (6a) Fix indentation of the debug print in the block in (5).
>>       (6b,6c,6d) Fix printf types for ProcessorNum, use FMT_APIC_ID for
>>        APIC_ID and 0x%Lx for QemuSelector[].
>>       () As discussed elsewhere add an DEBUG_INFO print logging the
>>        correspondence between ProcessorNum, APIC_ID, QemuSelector.
>>       (7a,7b) Use EFI_ALREADY_STARTED instead of EFI_INVALID_PARAMETER and
>>        document it in the UnplugCpus() comment block.
>>       ()  As discussed elsewhere, add the import statement for
>>        PcdCpuHotEjectDataAddress.
>>       (9) Use Laszlo's comment in the PcdGet64(PcdCpuHotEjectDataAddress)
>>        description block.
>>       (10) Change mCpuHotEjectData init state checks from ASSERT to ones
>>        consistent with similar checks for mCpuHotPlugData.
>>       (11-14) Get rid of mCpuHotEjectData init loop: moved to a prior
>>        patch so it can be done at allocation time.
>>       (15) s/SmmCpuFeaturesSmiRendezvousExit/SmmCpuFeaturesRendezvousExit/
>>       (16,17) Document the ordering requirements of
>>        mCpuHotEjectData->Handler, and mCpuHotEjectData->QemuSelectorMap.
>>
>>      Not addressed:
>>       (8) Not removing the EjectCount variable as I'd like to minimize
>>        stores/loads to CPU_HOT_EJECT_DATA->Handler and so would like to do this
>>        a single time at the end of the iteration.  (It is safe to write multiple
>>        times to the handler in UnplugCpus() but given the ordering concerns
>>        around it, it seems cleaner to not access it unnecessarily.)
>>
>>   OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf |   2 +
>>   OvmfPkg/CpuHotplugSmm/CpuHotplug.c      | 157 ++++++++++++++++++++++++++++++--
>>   2 files changed, 151 insertions(+), 8 deletions(-)
>>
>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
>> index 04322b0d7855..ebcc7e2ac63a 100644
>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplugSmm.inf
>> @@ -40,6 +40,7 @@ [Packages]
>>   [LibraryClasses]
>>     BaseLib
>>     BaseMemoryLib
>> +  CpuLib
>>     DebugLib
>>     LocalApicLib
>>     MmServicesTableLib
>> @@ -54,6 +55,7 @@ [Protocols]
>>
>>   [Pcd]
>>     gUefiCpuPkgTokenSpaceGuid.PcdCpuHotPlugDataAddress                ## CONSUMES
>> +  gUefiOvmfPkgTokenSpaceGuid.PcdCpuHotEjectDataAddress              ## CONSUMES
>>     gUefiOvmfPkgTokenSpaceGuid.PcdQ35SmramAtDefaultSmbase             ## CONSUMES
>>
>>   [FeaturePcd]
>> diff --git a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> index f07b5072749a..851e2b28aad9 100644
>> --- a/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> +++ b/OvmfPkg/CpuHotplugSmm/CpuHotplug.c
>> @@ -10,10 +10,12 @@
>>   #include <IndustryStandard/Q35MchIch9.h>     // ICH9_APM_CNT
>>   #include <IndustryStandard/QemuCpuHotplug.h> // QEMU_CPUHP_CMD_GET_PENDING
>>   #include <Library/BaseLib.h>                 // CpuDeadLoop()
>> +#include <Library/CpuLib.h>                  // CpuSleep()
>>   #include <Library/DebugLib.h>                // ASSERT()
>>   #include <Library/MmServicesTableLib.h>      // gMmst
>>   #include <Library/PcdLib.h>                  // PcdGetBool()
>>   #include <Library/SafeIntLib.h>              // SafeUintnSub()
>> +#include <Pcd/CpuHotEjectData.h>             // CPU_HOT_EJECT_DATA
>>   #include <Protocol/MmCpuIo.h>                // EFI_MM_CPU_IO_PROTOCOL
>>   #include <Protocol/SmmCpuService.h>          // EFI_SMM_CPU_SERVICE_PROTOCOL
>>   #include <Uefi/UefiBaseType.h>               // EFI_STATUS
>> @@ -32,11 +34,12 @@ STATIC EFI_MM_CPU_IO_PROTOCOL *mMmCpuIo;
>>   //
>>   STATIC EFI_SMM_CPU_SERVICE_PROTOCOL *mMmCpuService;
>>   //
>> -// This structure is a communication side-channel between the
>> +// These structures serve as communication side-channels between the
>>   // EFI_SMM_CPU_SERVICE_PROTOCOL consumer (i.e., this driver) and provider
>>   // (i.e., PiSmmCpuDxeSmm).
>>   //
>>   STATIC CPU_HOT_PLUG_DATA *mCpuHotPlugData;
>> +STATIC CPU_HOT_EJECT_DATA *mCpuHotEjectData;
>>   //
>>   // SMRAM arrays for fetching the APIC IDs of processors with pending events (of
>>   // known event types), for the time of just one MMI.
>> @@ -188,18 +191,72 @@ RevokeNewSlot:
>>   }
>>
>>   /**
>> +  CPU Hot-eject handler, called from SmmCpuFeaturesRendezvousExit()
>> +  on each CPU at exit from SMM.
>> +
>> +  If, the executing CPU is not being ejected, nothing to be done.
>> +  If, the executing CPU is being ejected, wait in a halted loop
>> +  until ejected.
>> +
>> +  @param[in] ProcessorNum      ProcessorNum denotes the CPU exiting SMM,
>> +                               and will be used as an index into
>> +                               CPU_HOT_EJECT_DATA->QemuSelectorMap. It is
>> +                               identical to the processor handle number in
>> +                               EFI_SMM_CPU_SERVICE_PROTOCOL.
>> +**/
>> +VOID
>> +EFIAPI
>> +EjectCpu (
>> +  IN UINTN ProcessorNum
>> +  )
>> +{
>> +  UINT64 QemuSelector;
>> +
>> +  QemuSelector = mCpuHotEjectData->QemuSelectorMap[ProcessorNum];
>> +  if (QemuSelector == CPU_EJECT_QEMU_SELECTOR_INVALID) {
>> +    return;
>> +  }
>> +
>> +  //
>> +  // CPU(s) being unplugged get here from SmmCpuFeaturesRendezvousExit()
>> +  // after having been cleared to exit the SMI by the BSP and thus have
>> +  // no SMM processing remaining.
>> +  //
>> +  // Given that we cannot allow them to escape to the guest, we pen them
>> +  // here until the BSP tells QEMU to unplug them.
>> +  //
>> +  for (;;) {
>> +    DisableInterrupts ();
>> +    CpuSleep ();
>> +  }
>> +}
>> +
>> +/**
>>     Process to be hot-unplugged CPUs, per QemuCpuhpCollectApicIds().
>>
>>     For each such CPU, report the CPU to PiSmmCpuDxeSmm via
>> -  EFI_SMM_CPU_SERVICE_PROTOCOL. If the to be hot-unplugged CPU is
>> -  unknown, skip it silently.
>> +  EFI_SMM_CPU_SERVICE_PROTOCOL and stash the APIC ID for later ejection.
>> +  If the to be hot-unplugged CPU is unknown, skip it silently.
>> +
>> +  Additonally, if we do stash any APIC IDs, also install a CPU eject handler
>> +  which would handle the ejection.
> 
> (1) Please update the two APIC ID references above to QEMU CPU selectors
> (the commit message has been updated already, correctly).
> 
> 
>>
>>     @param[in] ToUnplugApicIds    The APIC IDs of the CPUs that are about to be
>>                                   hot-unplugged.
>>
>> +  @param[in] ToUnplugSelector   The QEMU Selectors of the CPUs that are about to
>> +                                be hot-unplugged.
>> +
> 
> (2) Please rename the parameter to "ToUnplugSelectors" (plural), both
> here in the comment and in the actual parameter list.
> 
> 
>>     @param[in] ToUnplugCount      The number of filled-in APIC IDs in
>>                                   ToUnplugApicIds.
>>
>> +  @retval EFI_ALREADY_STARTED   For the ProcessorNum that
>> +                                EFI_SMM_CPU_SERVICE_PROTOCOL had assigned to
>> +                                one of the APIC ID in ToUnplugApicIds,
>> +                                mCpuHotEjectData->QemuSelectorMap already has
>> +                                the QemuSelector value stashed. (This should
>> +                                never happen.)
>> +
> 
> (3) Unfortunately I made a typing error in my v6 review where I
> suggested this language: it should say
> 
>    one of the APIC ID*s* in ToUnplugApicIds
> 
> (emphasis only here, in this comment)
> 
> 
>>     @retval EFI_SUCCESS           Known APIC IDs have been removed from SMM data
>>                                   structures.
>>
>> @@ -210,23 +267,36 @@ STATIC
>>   EFI_STATUS
>>   UnplugCpus (
>>     IN APIC_ID                      *ToUnplugApicIds,
>> +  IN UINT32                       *ToUnplugSelector,
>>     IN UINT32                       ToUnplugCount
>>     )
>>   {
>>     EFI_STATUS Status;
>>     UINT32     ToUnplugIdx;
>> +  UINT32     EjectCount;
>>     UINTN      ProcessorNum;
>>
>>     ToUnplugIdx = 0;
>> +  EjectCount = 0;
>>     while (ToUnplugIdx < ToUnplugCount) {
>>       APIC_ID    RemoveApicId;
>> +    UINT32     QemuSelector;
>>
>>       RemoveApicId = ToUnplugApicIds[ToUnplugIdx];
>> +    QemuSelector = ToUnplugSelector[ToUnplugIdx];
>>
>>       //
>> -    // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use it to find
>> -    // the ProcessorNum for the APIC ID to be removed.
>> +    // mCpuHotPlugData->ApicId maps ProcessorNum -> ApicId. Use RemoveApicId
>> +    // to find the corresponding ProcessorNum for the CPU to be removed.
>>       //
>> +    // With this we can establish a 3 way mapping:
>> +    //    APIC_ID -- ProcessorNum -- QemuSelector
>> +    //
>> +    // We stash the ProcessorNum -> QemuSelector mapping so it can later be
>> +    // used for CPU hot-eject in SmmCpuFeaturesRendezvousExit() context (where
>> +    // we only have ProcessorNum available.)
>> +    //
>> +
>>       for (ProcessorNum = 0;
>>            ProcessorNum < mCpuHotPlugData->ArrayLength;
>>            ProcessorNum++) {
>> @@ -255,11 +325,64 @@ UnplugCpus (
>>         return Status;
>>       }
>>
>> +    if (mCpuHotEjectData->QemuSelectorMap[ProcessorNum] !=
>> +          CPU_EJECT_QEMU_SELECTOR_INVALID) {
> 
> (4) Invalid indentation; in this (controlling expression) context,
> CPU_EJECT_QEMU_SELECTOR_INVALID should line up with "mCpuHotEjectData".
> 
> 
>> +      //
>> +      // mCpuHotEjectData->QemuSelectorMap[ProcessorNum] is set to
>> +      // CPU_EJECT_QEMU_SELECTOR_INVALID when mCpuHotEjectData->QemuSelectorMap
>> +      // is allocated, and once the subject processsor is ejected.
>> +      //
>> +      // Additionally, mMmCpuService->RemoveProcessor(ProcessorNum) invalidates
>> +      // mCpuHotPlugData->ApicId[ProcessorNum], so a given ProcessorNum can
>> +      // never match more than one APIC ID and by transitivity, more than one
>> +      // QemuSelector in a single invocation of UnplugCpus().
>> +      //
> 
> (5) good comment, but please make it a tiny bit easier to read: please
> insert two "em dashes", as follows:
> 
>        // never match more than one APIC ID -- and by transitivity, designate
>        // more than one QemuSelector -- in a single invocation of UnplugCpus().
> 
> (I've also inserted the verb "designate", because "match" doesn't seem
> to apply in that context.)
> 
> 
>> +      DEBUG ((DEBUG_ERROR, "%a: ProcessorNum %Lu maps to QemuSelector 0x%Lx, "
>> +        "cannot also map to 0x%Lx\n", __FUNCTION__, (UINT64)ProcessorNum,
>> +        (UINT64)mCpuHotEjectData->QemuSelectorMap[ProcessorNum], QemuSelector));
> 
> (6a) "mCpuHotEjectData->QemuSelectorMap[ProcessorNum]" has type UINT64,
> so the cast is superfluous
> 
> (6b) we log selectors in all other places in decimal, so please use %Lu
> here, not 0x%Lx, for logging
> "mCpuHotEjectData->QemuSelectorMap[ProcessorNum]"
> 
> (6c) "QemuSelector" has type UINT32 (correctly), so we need to log it
> with either "0x%x" or "%u". Given my above point about logging selectors
> in decimal everywhere else, please use "%u".
> 
>        DEBUG ((DEBUG_ERROR, "%a: ProcessorNum %Lu maps to QemuSelector %Lu, "
>          "cannot also map to %u\n", __FUNCTION__, (UINT64)ProcessorNum,
>          mCpuHotEjectData->QemuSelectorMap[ProcessorNum], QemuSelector));
> 
> 
>> +
>> +      Status = EFI_ALREADY_STARTED;
>> +      return Status;
>> +    }
> 
> (7) style nit -- this looks better as "return EFI_ALREADY_STARTED".
> 
> 
>> +
>> +    //
>> +    // Stash the QemuSelector so we can do the actual ejection later.
>> +    //
>> +    mCpuHotEjectData->QemuSelectorMap[ProcessorNum] = (UINT64)QemuSelector;
>> +
>> +    DEBUG ((DEBUG_INFO, "%a: Started hot-unplug on ProcessorNum %Lu, APIC ID "
>> +      FMT_APIC_ID ", QemuSelector 0x%Lx\n", __FUNCTION__, (UINT64)ProcessorNum,
>> +      RemoveApicId, mCpuHotEjectData->QemuSelectorMap[ProcessorNum]));
> 
> (8a) In the format string, please replace "QemuSelector 0x%Lx" with
> "QemuSelector %u" -- the main point is the decimal notation
> 
> (8b) As a suggestion, I think we should simplify this DEBUG call by
> replacing the "mCpuHotEjectData->QemuSelectorMap[ProcessorNum]" argument
> with just "QemuSelector".
> 
>      DEBUG ((DEBUG_INFO, "%a: Started hot-unplug on ProcessorNum %Lu, APIC ID "
>        FMT_APIC_ID ", QemuSelector %u\n", __FUNCTION__, (UINT64)ProcessorNum,
>        RemoveApicId, QemuSelector));
> 
> 
>> +
>> +    EjectCount++;
> 
> (I'm fine with keeping the "EjectCount" variable.)
> 
> 
>>       ToUnplugIdx++;
>>     }
>>
>> +  if (EjectCount != 0) {
>> +    //
>> +    // We have processors to be ejected; install the handler.
>> +    //
>> +    mCpuHotEjectData->Handler = EjectCpu;
>> +
>> +    //
>> +    // The BSP, CPUs to be ejected dereference mCpuHotEjectData->Handler, and
>> +    // mCpuHotEjectData->QemuSelectorMap[] in SmmCpuFeaturesRendezvousExit().
>> +    //
>> +    // Assignments to both of these are ordered-before the BSP's SMI exit signal
>> +    // which happens via a write to SMM_DISPATCHER_MP_SYNC_DATA->AllCpusInSync.
>> +    // Dereferences of both are ordered-after the synchronization via
>> +    // "AllCpusInSync".
>> +    //
>> +    // So we are guaranteed that the Handler would see the assignments above.
>> +    // However, add a MemoryFence() here in-lieu of a compiler barrier to
>> +    // ensure that the compiler doesn't monkey around with the stores.
>> +    //
>> +    MemoryFence ();
>> +  }
>> +
> 
> (9) Per previous discussion (under patch v8 #7), please replace the last
> paragraph of this comment, with a "ReleaseMemoryFence" reference.

Will do. And acking the comments above.

Ankur

> 
> The rest looks good.> 
> Thanks!
> Laszlo
> 
>>     //
>> -  // We've removed this set of APIC IDs from SMM data structures.
>> +  // We've removed this set of APIC IDs from SMM data structures and
>> +  // have installed an ejection handler if needed.
>>     //
>>     return EFI_SUCCESS;
>>   }
>> @@ -387,7 +510,7 @@ CpuHotplugMmi (
>>     }
>>
>>     if (ToUnplugCount > 0) {
>> -    Status = UnplugCpus (mToUnplugApicIds, ToUnplugCount);
>> +    Status = UnplugCpus (mToUnplugApicIds, mToUnplugSelector, ToUnplugCount);
>>       if (EFI_ERROR (Status)) {
>>         goto Fatal;
>>       }
>> @@ -458,9 +581,14 @@ CpuHotplugEntry (
>>
>>     //
>>     // Our DEPEX on EFI_SMM_CPU_SERVICE_PROTOCOL guarantees that PiSmmCpuDxeSmm
>> -  // has pointed PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM.
>> +  // has pointed:
>> +  // - PcdCpuHotPlugDataAddress to CPU_HOT_PLUG_DATA in SMRAM,
>> +  // - PcdCpuHotEjectDataAddress to CPU_HOT_EJECT_DATA in SMRAM, if the
>> +  //   possible CPU count is greater than 1.
>>     //
>>     mCpuHotPlugData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotPlugDataAddress);
>> +  mCpuHotEjectData = (VOID *)(UINTN)PcdGet64 (PcdCpuHotEjectDataAddress);
>> +
>>     if (mCpuHotPlugData == NULL) {
>>       Status = EFI_NOT_FOUND;
>>       DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_PLUG_DATA: %r\n", __FUNCTION__, Status));
>> @@ -472,6 +600,19 @@ CpuHotplugEntry (
>>     if (mCpuHotPlugData->ArrayLength == 1) {
>>       return EFI_UNSUPPORTED;
>>     }
>> +
>> +  if (mCpuHotEjectData == NULL) {
>> +    Status = EFI_NOT_FOUND;
>> +  } else if (mCpuHotPlugData->ArrayLength != mCpuHotEjectData->ArrayLength) {
>> +    Status = EFI_INVALID_PARAMETER;
>> +  } else {
>> +    Status = EFI_SUCCESS;
>> +  }
>> +  if (EFI_ERROR (Status)) {
>> +    DEBUG ((DEBUG_ERROR, "%a: CPU_HOT_EJECT_DATA: %r\n", __FUNCTION__, Status));
>> +    goto Fatal;
>> +  }
>> +
>>     //
>>     // Allocate the data structures that depend on the possible CPU count.
>>     //
>>
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#72115): https://edk2.groups.io/g/devel/message/72115
Mute This Topic: https://groups.io/mt/80819863/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-