[edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES

Lendacky, Thomas posted 44 patches 6 years ago
There is a newer version of this series
[edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Lendacky, Thomas 6 years ago
From: Tom Lendacky <thomas.lendacky@amd.com>

BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2198

Typically, an AP is booted using the INIT-SIPI-SIPI sequence. This
sequence is intercepted by the hypervisor, which sets the AP's registers
to the values requested by the sequence. At that point, the hypervisor can
start the AP, which will then begin execution at the appropriate location.

Under SEV-ES, AP booting presents some challenges since the hypervisor is
not allowed to alter the AP's register state. In this situation, we have
to distinguish between the AP's first boot and AP's subsequent boots.

First boot:
 Once the AP's register state has been defined (which is before the guest
 is first booted) it cannot be altered. Should the hypervisor attempt to
 alter the register state, the change would be detected by the hardware
 and the VMRUN instruction would fail. Given this, the first boot for the
 AP is required to begin execution with this initial register state, which
 is typically the reset vector. This prevents the BSP from directing the
 AP startup location through the INIT-SIPI-SIPI sequence.

 To work around this, provide a four-byte field at offset 0xffffffd0 that
 can contain an IP / CS register combination, that if non-zero, causes
 the AP to perform a far jump to that location instead of a near jump to
 EarlyBspInitReal16. Before booting the AP for the first time, the BSP
 should set the IP / CS value for the AP based on the value that would be
 derived from the INIT-SIPI-SIPI sequence.

Subsequent boots:
 Again, the hypervisor cannot alter the AP register state, so a method is
 required to take the AP out of halt state and redirect it to the desired
 IP location. If it is determined that the AP is running in an SEV-ES
 guest, then instead of calling CpuSleep(), a VMGEXIT is issued with the
 AP Reset Hold exit code (0x80000004). The hypervisor will put the AP in
 a halt state, waiting for an INIT-SIPI-SIPI sequence. Once the sequence
 is recognized, the hypervisor will resume the AP. At this point the AP
 must transition from the current 64-bit long mode down to 16-bit real
 mode and begin executing at the derived location from the INIT-SIPI-SIPI
 sequence.

 Another change is around the area of obtaining the (x2)APIC ID during AP
 startup. During AP startup, the AP can't take a #VC exception before the
 AP has established a stack. However, the AP stack is set by using the
 (x2)APIC ID, which is obtained through CPUID instructions. A CPUID
 instruction will cause a #VC, so a different method must be used. The
 GHCB protocol supports a method to obtain CPUID information from the
 hypervisor through the GHCB MSR. This method does not require a stack,
 so it is used to obtain the necessary CPUID information to determine the
 (x2)APIC ID.

The OVMF SEV support is updated to set the SEV-ES active PCD entry
(PcdSevEsActive) when the guest is an SEV-ES guest. Also, the OVMF
support is updated to create its own reset vector routine in order to
supply the far jump field required for an AP first boot.

The new 16-bit protected mode GDT entry is used in order to transition
from 64-bit long mode down to 16-bit real mode.

A new assembler routine is created that takes the AP from 64-bit long mode
to 16-bit real mode.  This is located under 1MB in memory and transitions
from 64-bit long mode to 32-bit compatibility mode to 16-bit protected
mode and finally 16-bit real mode.

Cc: Eric Dong <eric.dong@intel.com>
Cc: Ray Ni <ray.ni@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |   2 +
 UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |   2 +
 UefiCpuPkg/Library/MpInitLib/MpLib.h          |  58 +++++
 UefiCpuPkg/Library/MpInitLib/DxeMpLib.c       |  70 ++++-
 UefiCpuPkg/Library/MpInitLib/MpLib.c          | 235 ++++++++++++++++-
 UefiCpuPkg/Library/MpInitLib/PeiMpLib.c       |  19 ++
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c  |   2 +-
 UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc   |   2 +-
 .../Library/MpInitLib/Ia32/MpFuncs.nasm       |  15 ++
 UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc    |   4 +-
 UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm | 239 ++++++++++++++++++
 11 files changed, 634 insertions(+), 14 deletions(-)

diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
index 557507e9a466..ad5f33451aa3 100644
--- a/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
+++ b/UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf
@@ -51,6 +51,7 @@ [LibraryClasses]
   UefiBootServicesTableLib
   DebugAgentLib
   SynchronizationLib
+  VmgExitLib
 
 [Protocols]
   gEfiTimerArchProtocolGuid                     ## SOMETIMES_CONSUMES
@@ -69,4 +70,5 @@ [Pcd]
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApTargetCstate                   ## SOMETIMES_CONSUMES
   gEfiMdeModulePkgTokenSpaceGuid.PcdCpuStackGuard                  ## CONSUMES
   gEfiMdeModulePkgTokenSpaceGuid.PcdSevEsActive                    ## CONSUMES
+  gEfiMdeModulePkgTokenSpaceGuid.PcdGhcbBase                       ## CONSUMES
 
diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
index f26ffd5a2ef5..4d4799aaa932 100644
--- a/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
+++ b/UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf
@@ -50,6 +50,7 @@ [LibraryClasses]
   UefiCpuLib
   SynchronizationLib
   PeiServicesLib
+  VmgExitLib
 
 [Pcd]
   gUefiCpuPkgTokenSpaceGuid.PcdCpuMaxLogicalProcessorNumber        ## CONSUMES
@@ -60,6 +61,7 @@ [Pcd]
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApLoopMode                       ## CONSUMES
   gUefiCpuPkgTokenSpaceGuid.PcdCpuApTargetCstate                   ## SOMETIMES_CONSUMES
   gEfiMdeModulePkgTokenSpaceGuid.PcdSevEsActive                    ## CONSUMES
+  gEfiMdeModulePkgTokenSpaceGuid.PcdGhcbBase                       ## CONSUMES
 
 [Guids]
   gEdkiiS3SmmInitDoneGuid
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.h b/UefiCpuPkg/Library/MpInitLib/MpLib.h
index 1dfe9a9cd756..4cfb93ee4f77 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.h
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.h
@@ -151,6 +151,11 @@ typedef struct {
   UINT8             *RelocateApLoopFuncAddress;
   UINTN             RelocateApLoopFuncSize;
   UINTN             ModeTransitionOffset;
+  UINTN             SwitchToRealSize;
+  UINTN             SwitchToRealOffset;
+  UINTN             SwitchToRealNoNxOffset;
+  UINTN             SwitchToRealPM16ModeOffset;
+  UINTN             SwitchToRealPM16ModeSize;
 } MP_ASSEMBLY_ADDRESS_MAP;
 
 typedef struct _CPU_MP_DATA  CPU_MP_DATA;
@@ -189,6 +194,8 @@ typedef struct {
   // Enable5LevelPaging indicates whether 5-level paging is enabled in long mode.
   //
   BOOLEAN               Enable5LevelPaging;
+  BOOLEAN               SevEsActive;
+  UINTN                 GhcbBase;
 } MP_CPU_EXCHANGE_INFO;
 
 #pragma pack()
@@ -236,6 +243,7 @@ struct _CPU_MP_DATA {
   UINT8                          ApLoopMode;
   UINT8                          ApTargetCState;
   UINT16                         PmCodeSegment;
+  UINT16                         Pm16CodeSegment;
   CPU_AP_DATA                    *CpuData;
   volatile MP_CPU_EXCHANGE_INFO  *MpCpuExchangeInfo;
 
@@ -262,8 +270,45 @@ struct _CPU_MP_DATA {
   BOOLEAN                        WakeUpByInitSipiSipi;
 
   BOOLEAN                        SevEsActive;
+  UINTN                          SevEsAPBuffer;
+  UINTN                          SevEsAPResetStackStart;
+  CPU_MP_DATA                    *NewCpuMpData;
+
+  UINT64                         GhcbBase;
 };
 
+#define AP_RESET_STACK_SIZE 64
+
+typedef union {
+  struct {
+    UINT16  Rip;
+    UINT16  Segment;
+  } ApStart;
+  UINT32    Uint32;
+} SEV_ES_AP_JMP_FAR;
+
+/**
+  Assembly code to move an AP from long mode to real mode.
+
+  Move an AP from long mode to real mode in preparation to invoking
+  the reset vector.  This is used for SEV-ES guests where a hypervisor
+  is not allowed to set the CS and RIP to point to the reset vector.
+
+  @param[in]  BufferStart  The reset vector target.
+  @param[in]  Code16       16-bit protected mode code segment value.
+  @param[in]  Code32       32-bit protected mode code segment value.
+  @param[in]  StackStart   The start of a stack to be used for transitioning
+                           from long mode to real mode.
+**/
+typedef
+VOID
+(EFIAPI AP_RESET) (
+  IN UINTN    BufferStart,
+  IN UINT16   Code16,
+  IN UINT16   Code32,
+  IN UINTN    StackStart
+  );
+
 extern EFI_GUID mCpuInitMpLibHobGuid;
 
 /**
@@ -369,6 +414,19 @@ GetModeTransitionBuffer (
   IN UINTN                BufferSize
   );
 
+/**
+  Return the address of the SEV-ES AP jump table.
+
+  This buffer is required in order for an SEV-ES guest to transition from
+  UEFI into an OS.
+
+  @retval other   Return SEV-ES AP jump table buffer
+**/
+UINTN
+GetSevEsAPMemory (
+  VOID
+  );
+
 /**
   This function will be called by BSP to wakeup AP.
 
diff --git a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
index b17e287bbf49..8df5b6d919e6 100644
--- a/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/DxeMpLib.c
@@ -12,6 +12,8 @@
 #include <Library/UefiBootServicesTableLib.h>
 #include <Library/DebugAgentLib.h>
 #include <Library/DxeServicesTableLib.h>
+#include <Register/Amd/Fam17Msr.h>
+#include <Register/Amd/Ghcb.h>
 
 #include <Protocol/Timer.h>
 
@@ -145,6 +147,39 @@ GetModeTransitionBuffer (
   return (UINTN)StartAddress;
 }
 
+/**
+  Return the address of the SEV-ES AP jump table.
+
+  This buffer is required in order for an SEV-ES guest to transition from
+  UEFI into an OS.
+
+  @retval other   Return SEV-ES AP jump table buffer
+**/
+UINTN
+GetSevEsAPMemory (
+  VOID
+  )
+{
+  EFI_STATUS            Status;
+  EFI_PHYSICAL_ADDRESS  StartAddress;
+
+  //
+  // Allocate 1 page for AP jump table page
+  //
+  StartAddress = BASE_4GB - 1;
+  Status = gBS->AllocatePages (
+                  AllocateMaxAddress,
+                  EfiReservedMemoryType,
+                  1,
+                  &StartAddress
+                  );
+  ASSERT_EFI_ERROR (Status);
+
+  DEBUG ((DEBUG_INFO, "Dxe: SevEsAPMemory = %lx\n", (UINTN) StartAddress));
+
+  return (UINTN) StartAddress;
+}
+
 /**
   Checks APs status and updates APs status if needed.
 
@@ -219,6 +254,38 @@ CheckApsStatus (
   }
 }
 
+/**
+  Get Protected mode code segment with 16-bit default addressing
+  from current GDT table.
+
+  @return  Protected mode 16-bit code segment value.
+**/
+UINT16
+GetProtectedMode16CS (
+  VOID
+  )
+{
+  IA32_DESCRIPTOR          GdtrDesc;
+  IA32_SEGMENT_DESCRIPTOR  *GdtEntry;
+  UINTN                    GdtEntryCount;
+  UINT16                   Index;
+
+  Index = (UINT16) -1;
+  AsmReadGdtr (&GdtrDesc);
+  GdtEntryCount = (GdtrDesc.Limit + 1) / sizeof (IA32_SEGMENT_DESCRIPTOR);
+  GdtEntry = (IA32_SEGMENT_DESCRIPTOR *) GdtrDesc.Base;
+  for (Index = 0; Index < GdtEntryCount; Index++) {
+    if (GdtEntry->Bits.L == 0) {
+      if (GdtEntry->Bits.Type > 8 && GdtEntry->Bits.DB == 0) {
+        break;
+      }
+    }
+    GdtEntry++;
+  }
+  ASSERT (Index != GdtEntryCount);
+  return Index * 8;
+}
+
 /**
   Get Protected mode code segment from current GDT table.
 
@@ -239,7 +306,7 @@ GetProtectedModeCS (
   GdtEntry = (IA32_SEGMENT_DESCRIPTOR *) GdtrDesc.Base;
   for (Index = 0; Index < GdtEntryCount; Index++) {
     if (GdtEntry->Bits.L == 0) {
-      if (GdtEntry->Bits.Type > 8 && GdtEntry->Bits.L == 0) {
+      if (GdtEntry->Bits.Type > 8 && GdtEntry->Bits.DB == 1) {
         break;
       }
     }
@@ -301,6 +368,7 @@ MpInitChangeApLoopCallback (
 
   CpuMpData = GetCpuMpData ();
   CpuMpData->PmCodeSegment = GetProtectedModeCS ();
+  CpuMpData->Pm16CodeSegment = GetProtectedMode16CS ();
   CpuMpData->ApLoopMode = PcdGet8 (PcdCpuApLoopMode);
   mNumberToFinish = CpuMpData->CpuCount - 1;
   WakeUpAP (CpuMpData, TRUE, 0, RelocateApLoop, NULL, TRUE);
diff --git a/UefiCpuPkg/Library/MpInitLib/MpLib.c b/UefiCpuPkg/Library/MpInitLib/MpLib.c
index 3e20900ec7bc..43aa9c0a4950 100644
--- a/UefiCpuPkg/Library/MpInitLib/MpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/MpLib.c
@@ -7,6 +7,9 @@
 **/
 
 #include "MpLib.h"
+#include <Library/VmgExitLib.h>
+#include <Register/Amd/Fam17Msr.h>
+#include <Register/Amd/Ghcb.h>
 
 EFI_GUID mCpuInitMpLibHobGuid = CPU_INIT_MP_LIB_HOB_GUID;
 
@@ -288,6 +291,14 @@ GetApLoopMode (
       //
       ApLoopMode = ApInHltLoop;
     }
+
+    if (PcdGetBool (PcdSevEsActive)) {
+      //
+      // For SEV-ES, force AP in Hlt-loop mode in order to use the GHCB
+      // protocol for starting APs
+      //
+      ApLoopMode = ApInHltLoop;
+    }
   }
 
   if (ApLoopMode != ApInMwaitLoop) {
@@ -555,6 +566,108 @@ InitializeApData (
   SetApState (&CpuMpData->CpuData[ProcessorNumber], CpuStateIdle);
 }
 
+/**
+  Get Protected mode code segment with 16-bit default addressing
+  from current GDT table.
+
+  @return  Protected mode 16-bit code segment value.
+**/
+STATIC
+UINT16
+GetProtectedMode16CS (
+  VOID
+  )
+{
+  IA32_DESCRIPTOR          GdtrDesc;
+  IA32_SEGMENT_DESCRIPTOR  *GdtEntry;
+  UINTN                    GdtEntryCount;
+  UINT16                   Index;
+
+  Index = (UINT16) -1;
+  AsmReadGdtr (&GdtrDesc);
+  GdtEntryCount = (GdtrDesc.Limit + 1) / sizeof (IA32_SEGMENT_DESCRIPTOR);
+  GdtEntry = (IA32_SEGMENT_DESCRIPTOR *) GdtrDesc.Base;
+  for (Index = 0; Index < GdtEntryCount; Index++) {
+    if (GdtEntry->Bits.L == 0 &&
+        GdtEntry->Bits.DB == 0 &&
+        GdtEntry->Bits.Type > 8) {
+      break;
+    }
+    GdtEntry++;
+  }
+  ASSERT (Index != GdtEntryCount);
+  return Index * 8;
+}
+
+/**
+  Get Protected mode code segment with 32-bit default addressing
+  from current GDT table.
+
+  @return  Protected mode 32-bit code segment value.
+**/
+STATIC
+UINT16
+GetProtectedMode32CS (
+  VOID
+  )
+{
+  IA32_DESCRIPTOR          GdtrDesc;
+  IA32_SEGMENT_DESCRIPTOR  *GdtEntry;
+  UINTN                    GdtEntryCount;
+  UINT16                   Index;
+
+  Index = (UINT16) -1;
+  AsmReadGdtr (&GdtrDesc);
+  GdtEntryCount = (GdtrDesc.Limit + 1) / sizeof (IA32_SEGMENT_DESCRIPTOR);
+  GdtEntry = (IA32_SEGMENT_DESCRIPTOR *) GdtrDesc.Base;
+  for (Index = 0; Index < GdtEntryCount; Index++) {
+    if (GdtEntry->Bits.L == 0 &&
+        GdtEntry->Bits.DB == 1 &&
+        GdtEntry->Bits.Type > 8) {
+      break;
+    }
+    GdtEntry++;
+  }
+  ASSERT (Index != GdtEntryCount);
+  return Index * 8;
+}
+
+/**
+  Reset an AP when in SEV-ES mode.
+
+  @retval EFI_DEVICE_ERROR        Reset of AP failed.
+**/
+STATIC
+VOID
+MpInitLibSevEsAPReset (
+  GHCB                         *Ghcb,
+  CPU_MP_DATA                  *CpuMpData
+  )
+{
+  UINT16           Code16, Code32;
+  AP_RESET         *APResetFn;
+  UINTN            BufferStart;
+  UINTN            StackStart;
+
+  Code16 = GetProtectedMode16CS ();
+  Code32 = GetProtectedMode32CS ();
+
+  if (CpuMpData->WakeupBufferHigh != 0) {
+    APResetFn = (AP_RESET *) (CpuMpData->WakeupBufferHigh + CpuMpData->AddressMap.SwitchToRealNoNxOffset);
+  } else {
+    APResetFn = (AP_RESET *) (CpuMpData->MpCpuExchangeInfo->BufferStart + CpuMpData->AddressMap.SwitchToRealOffset);
+  }
+
+  BufferStart = CpuMpData->MpCpuExchangeInfo->BufferStart;
+  StackStart = CpuMpData->SevEsAPResetStackStart -
+                 (AP_RESET_STACK_SIZE * GetApicId ());
+
+  //
+  // This call never returns.
+  //
+  APResetFn (BufferStart, Code16, Code32, StackStart);
+}
+
 /**
   This function will be called from AP reset code if BSP uses WakeUpAP.
 
@@ -714,7 +827,28 @@ ApWakeupFunction (
       //
       while (TRUE) {
         DisableInterrupts ();
-        CpuSleep ();
+        if (CpuMpData->SevEsActive) {
+          MSR_SEV_ES_GHCB_REGISTER  Msr;
+          GHCB                      *Ghcb;
+
+          Msr.GhcbPhysicalAddress = AsmReadMsr64 (MSR_SEV_ES_GHCB);
+          Ghcb = Msr.Ghcb;
+
+          VmgInit (Ghcb);
+          VmgExit (Ghcb, SvmExitApResetHold, 0, 0);
+          /*TODO: Check return value to verify SIPI issued */
+
+          //
+          // Awakened in a new phase? Use the new CpuMpData
+          //
+          if (CpuMpData->NewCpuMpData) {
+            CpuMpData = CpuMpData->NewCpuMpData;
+          }
+
+          MpInitLibSevEsAPReset (Ghcb, CpuMpData);
+        } else {
+          CpuSleep ();
+        }
         CpuPause ();
       }
     }
@@ -827,6 +961,9 @@ FillExchangeInfoData (
   ExchangeInfo->Enable5LevelPaging = (BOOLEAN) (Cr4.Bits.LA57 == 1);
   DEBUG ((DEBUG_INFO, "%a: 5-Level Paging = %d\n", gEfiCallerBaseName, ExchangeInfo->Enable5LevelPaging));
 
+  ExchangeInfo->SevEsActive     = CpuMpData->SevEsActive;
+  ExchangeInfo->GhcbBase        = CpuMpData->GhcbBase;
+
   //
   // Get the BSP's data of GDT and IDT
   //
@@ -853,8 +990,9 @@ FillExchangeInfoData (
   // EfiBootServicesCode to avoid page fault if NX memory protection is enabled.
   //
   if (CpuMpData->WakeupBufferHigh != 0) {
-    Size = CpuMpData->AddressMap.RendezvousFunnelSize -
-           CpuMpData->AddressMap.ModeTransitionOffset;
+    Size = CpuMpData->AddressMap.RendezvousFunnelSize +
+             CpuMpData->AddressMap.SwitchToRealSize -
+             CpuMpData->AddressMap.ModeTransitionOffset;
     CopyMem (
       (VOID *)CpuMpData->WakeupBufferHigh,
       CpuMpData->AddressMap.RendezvousFunnelAddress +
@@ -907,7 +1045,8 @@ BackupAndPrepareWakeupBuffer(
   CopyMem (
     (VOID *) CpuMpData->WakeupBuffer,
     (VOID *) CpuMpData->AddressMap.RendezvousFunnelAddress,
-    CpuMpData->AddressMap.RendezvousFunnelSize
+    CpuMpData->AddressMap.RendezvousFunnelSize +
+      CpuMpData->AddressMap.SwitchToRealSize
     );
 }
 
@@ -928,6 +1067,40 @@ RestoreWakeupBuffer(
     );
 }
 
+/**
+  Calculate the size of the reset stack.
+**/
+STATIC
+UINTN
+GetApResetStackSize(
+  VOID
+  )
+{
+  return AP_RESET_STACK_SIZE * PcdGet32(PcdCpuMaxLogicalProcessorNumber);
+}
+
+/**
+  Calculate the size of the reset vector.
+
+  @param[in]  AddressMap  The pointer to Address Map structure.
+**/
+STATIC
+UINTN
+GetApResetVectorSize(
+  IN MP_ASSEMBLY_ADDRESS_MAP  *AddressMap
+  )
+{
+  UINTN  Size;
+
+  Size = ALIGN_VALUE (AddressMap->RendezvousFunnelSize +
+                        AddressMap->SwitchToRealSize +
+                        sizeof (MP_CPU_EXCHANGE_INFO),
+                      CPU_STACK_ALIGNMENT);
+  Size += GetApResetStackSize ();
+
+  return Size;
+}
+
 /**
   Allocate reset vector buffer.
 
@@ -941,16 +1114,22 @@ AllocateResetVector (
   UINTN           ApResetVectorSize;
 
   if (CpuMpData->WakeupBuffer == (UINTN) -1) {
-    ApResetVectorSize = CpuMpData->AddressMap.RendezvousFunnelSize +
-                          sizeof (MP_CPU_EXCHANGE_INFO);
+    ApResetVectorSize = GetApResetVectorSize (&CpuMpData->AddressMap);
 
     CpuMpData->WakeupBuffer      = GetWakeupBuffer (ApResetVectorSize);
     CpuMpData->MpCpuExchangeInfo = (MP_CPU_EXCHANGE_INFO *) (UINTN)
-                    (CpuMpData->WakeupBuffer + CpuMpData->AddressMap.RendezvousFunnelSize);
+                    (CpuMpData->WakeupBuffer +
+                       CpuMpData->AddressMap.RendezvousFunnelSize +
+                       CpuMpData->AddressMap.SwitchToRealSize);
     CpuMpData->WakeupBufferHigh  = GetModeTransitionBuffer (
-                                    CpuMpData->AddressMap.RendezvousFunnelSize -
+                                    CpuMpData->AddressMap.RendezvousFunnelSize +
+                                    CpuMpData->AddressMap.SwitchToRealSize -
                                     CpuMpData->AddressMap.ModeTransitionOffset
                                     );
+    //
+    // The reset stack starts at the end of the buffer.
+    //
+    CpuMpData->SevEsAPResetStackStart = CpuMpData->WakeupBuffer + ApResetVectorSize;
   }
   BackupAndPrepareWakeupBuffer (CpuMpData);
 }
@@ -965,7 +1144,30 @@ FreeResetVector (
   IN CPU_MP_DATA              *CpuMpData
   )
 {
-  RestoreWakeupBuffer (CpuMpData);
+  //
+  // If SEV-ES is active, the reset area is needed for AP parking and
+  // and AP startup in the OS, so the reset area is reserved. Do not
+  // perform the restore as this will overwrite memory which has data
+  // needed by SEV-ES.
+  //
+  if (!CpuMpData->SevEsActive) {
+    RestoreWakeupBuffer (CpuMpData);
+  }
+}
+
+/**
+  Allocate the SEV-ES AP jump table buffer.
+
+  @param[in, out]  CpuMpData  The pointer to CPU MP Data structure.
+**/
+VOID
+AllocateSevEsAPMemory (
+  IN OUT CPU_MP_DATA          *CpuMpData
+  )
+{
+  if (CpuMpData->SevEsAPBuffer == (UINTN) -1) {
+    CpuMpData->SevEsAPBuffer = CpuMpData->SevEsActive ? GetSevEsAPMemory () : 0;
+  }
 }
 
 /**
@@ -1002,6 +1204,7 @@ WakeUpAP (
       CpuMpData->InitFlag   != ApInitDone) {
     ResetVectorRequired = TRUE;
     AllocateResetVector (CpuMpData);
+    AllocateSevEsAPMemory (CpuMpData);
     FillExchangeInfoData (CpuMpData);
     SaveLocalApicTimerSetting (CpuMpData);
   }
@@ -1038,6 +1241,15 @@ WakeUpAP (
       }
     }
     if (ResetVectorRequired) {
+      //
+      // For SEV-ES, set the jump address for initial AP boot
+      //
+      if (CpuMpData->SevEsActive) {
+        SEV_ES_AP_JMP_FAR *JmpFar = (SEV_ES_AP_JMP_FAR *)0xFFFFFFD0;
+
+        JmpFar->ApStart.Rip = 0;
+        JmpFar->ApStart.Segment = (UINT16) (ExchangeInfo->BufferStart >> 4);
+      }
       //
       // Wakeup all APs
       //
@@ -1563,7 +1775,7 @@ MpInitLibInitialize (
   ASSERT (MaxLogicalProcessorNumber != 0);
 
   AsmGetAddressMap (&AddressMap);
-  ApResetVectorSize = AddressMap.RendezvousFunnelSize + sizeof (MP_CPU_EXCHANGE_INFO);
+  ApResetVectorSize = GetApResetVectorSize (&AddressMap);
   ApStackSize = PcdGet32(PcdCpuApStackSize);
   ApLoopMode  = GetApLoopMode (&MonitorFilterSize);
 
@@ -1658,6 +1870,8 @@ MpInitLibInitialize (
   }
   InitializeSpinLock(&CpuMpData->MpLock);
   CpuMpData->SevEsActive      = PcdGetBool (PcdSevEsActive);
+  CpuMpData->SevEsAPBuffer    = (UINTN) -1;
+  CpuMpData->GhcbBase         = PcdGet64 (PcdGhcbBase);
 
   //
   // Make sure no memory usage outside of the allocated buffer.
@@ -1724,6 +1938,7 @@ MpInitLibInitialize (
     // APs have been wakeup before, just get the CPU Information
     // from HOB
     //
+    OldCpuMpData->NewCpuMpData = CpuMpData;
     CpuMpData->CpuCount  = OldCpuMpData->CpuCount;
     CpuMpData->BspNumber = OldCpuMpData->BspNumber;
     CpuMpData->InitFlag  = ApInitReconfig;
diff --git a/UefiCpuPkg/Library/MpInitLib/PeiMpLib.c b/UefiCpuPkg/Library/MpInitLib/PeiMpLib.c
index 3999603c3efc..56956a615b6b 100644
--- a/UefiCpuPkg/Library/MpInitLib/PeiMpLib.c
+++ b/UefiCpuPkg/Library/MpInitLib/PeiMpLib.c
@@ -279,6 +279,25 @@ GetModeTransitionBuffer (
   return 0;
 }
 
+/**
+  Return the address of the SEV-ES AP jump table.
+
+  This buffer is required in order for an SEV-ES guest to transition from
+  UEFI into an OS.
+
+  @retval other   Return SEV-ES AP jump table buffer
+**/
+UINTN
+GetSevEsAPMemory (
+  VOID
+  )
+{
+  //
+  // PEI phase doesn't need to do such transition. So simply return 0.
+  //
+  return 0;
+}
+
 /**
   Checks APs status and updates APs status if needed.
 
diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c b/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c
index 6298571e29b2..28f8e8e133e5 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmFuncsArch.c
@@ -121,7 +121,7 @@ GetProtectedModeCS (
   GdtEntry = (IA32_SEGMENT_DESCRIPTOR *) GdtrDesc.Base;
   for (Index = 0; Index < GdtEntryCount; Index++) {
     if (GdtEntry->Bits.L == 0) {
-      if (GdtEntry->Bits.Type > 8 && GdtEntry->Bits.L == 0) {
+      if (GdtEntry->Bits.Type > 8 && GdtEntry->Bits.DB == 1) {
         break;
       }
     }
diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
index efb1bc2bf7cb..4f5a7c859a56 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpEqu.inc
@@ -19,7 +19,7 @@ CPU_SWITCH_STATE_IDLE         equ        0
 CPU_SWITCH_STATE_STORED       equ        1
 CPU_SWITCH_STATE_LOADED       equ        2
 
-LockLocation                  equ        (RendezvousFunnelProcEnd - RendezvousFunnelProcStart)
+LockLocation                  equ        (SwitchToRealProcEnd - RendezvousFunnelProcStart)
 StackStartAddressLocation     equ        LockLocation + 04h
 StackSizeLocation             equ        LockLocation + 08h
 ApProcedureLocation           equ        LockLocation + 0Ch
diff --git a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
index b74046b76af3..309d53bf3b37 100644
--- a/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/Ia32/MpFuncs.nasm
@@ -215,6 +215,16 @@ CProcedureInvoke:
     jmp        $                 ; Never reach here
 RendezvousFunnelProcEnd:
 
+;-------------------------------------------------------------------------------------
+;SwitchToRealProc procedure follows.
+;NOT USED IN 32 BIT MODE.
+;-------------------------------------------------------------------------------------
+global ASM_PFX(SwitchToRealProc)
+ASM_PFX(SwitchToRealProc):
+SwitchToRealProcStart:
+    jmp        $                 ; Never reach here
+SwitchToRealProcEnd:
+
 ;-------------------------------------------------------------------------------------
 ;  AsmRelocateApLoop (MwaitSupport, ApTargetCState, PmCodeSegment, TopOfApStack, CountTofinish);
 ;-------------------------------------------------------------------------------------
@@ -263,6 +273,11 @@ ASM_PFX(AsmGetAddressMap):
     mov        dword [ebx + 0Ch], AsmRelocateApLoopStart
     mov        dword [ebx + 10h], AsmRelocateApLoopEnd - AsmRelocateApLoopStart
     mov        dword [ebx + 14h], Flat32Start - RendezvousFunnelProcStart
+    mov        dword [ebx + 18h], SwitchToRealProcEnd - SwitchToRealProcStart       ; SwitchToRealSize
+    mov        dword [ebx + 1Ch], SwitchToRealProcStart - RendezvousFunnelProcStart ; SwitchToRealOffset
+    mov        dword [ebx + 20h], SwitchToRealProcStart - Flat32Start               ; SwitchToRealNoNxOffset
+    mov        dword [ebx + 24h], 0                                                 ; SwitchToRealPM16ModeOffset
+    mov        dword [ebx + 28h], 0                                                 ; SwitchToRealPM16ModeSize
 
     popad
     ret
diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc b/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
index 58ef369342a7..245f323f977b 100644
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
+++ b/UefiCpuPkg/Library/MpInitLib/X64/MpEqu.inc
@@ -19,7 +19,7 @@ CPU_SWITCH_STATE_IDLE         equ        0
 CPU_SWITCH_STATE_STORED       equ        1
 CPU_SWITCH_STATE_LOADED       equ        2
 
-LockLocation                  equ        (RendezvousFunnelProcEnd - RendezvousFunnelProcStart)
+LockLocation                  equ        (SwitchToRealProcEnd - RendezvousFunnelProcStart)
 StackStartAddressLocation     equ        LockLocation + 08h
 StackSizeLocation             equ        LockLocation + 10h
 ApProcedureLocation           equ        LockLocation + 18h
@@ -41,3 +41,5 @@ ModeTransitionSegmentLocation       equ  LockLocation + 98h
 ModeHighMemoryLocation              equ  LockLocation + 9Ah
 ModeHighSegmentLocation             equ  LockLocation + 9Eh
 Enable5LevelPagingLocation          equ  LockLocation + 0A0h
+SevEsActiveLocation                 equ  LockLocation + 0A1h
+GhcbBaseLocation                    equ  LockLocation + 0A2h
diff --git a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
index 87f2523e856f..bbc7432740ff 100644
--- a/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
+++ b/UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm
@@ -184,9 +184,97 @@ Releaselock:
     add        edi, StackStartAddressLocation
     add        rax, qword [edi]
     mov        rsp, rax
+
+    lea        edi, [esi + SevEsActiveLocation]
+    cmp        byte [edi], 1        ; SevEsActive
+    jne        CProcedureInvoke
+
+    ;
+    ; program GHCB
+    ;   Each page after the GHCB is a per-CPU page, so the calculation programs
+    ;   a GHCB to be every 8KB.
+    ;
+    mov        eax, SIZE_4KB
+    shl        eax, 1                            ; EAX = SIZE_4K * 2
+    mov        ecx, ebx
+    mul        ecx                               ; EAX = SIZE_4K * 2 * CpuNumber
+    mov        edi, esi
+    add        edi, GhcbBaseLocation
+    add        rax, qword [edi]
+    mov        rdx, rax
+    shr        rdx, 32
+    mov        rcx, 0xc0010130
+    wrmsr
     jmp        CProcedureInvoke
 
 GetApicId:
+    lea        edi, [esi + SevEsActiveLocation]
+    cmp        byte [edi], 1        ; SevEsActive
+    jne        DoCpuid
+
+    ;
+    ; Since we don't have a stack yet, we can't take a #VC
+    ; exception. Use the GHCB protocol to perform the CPUID
+    ; calls.
+    ;
+    mov        rcx, 0xc0010130
+    rdmsr
+    shl        rdx, 32
+    or         rax, rdx
+    mov        rdi, rax             ; RDI now holds the original GHCB GPA
+
+    mov        rdx, 0               ; CPUID function 0
+    mov        rax, 0               ; RAX register requested
+    or         rax, 4
+    wrmsr
+    rep vmmcall
+    rdmsr
+    cmp        edx, 0bh
+    jb         NoX2ApicSevEs        ; CPUID level below CPUID_EXTENDED_TOPOLOGY
+
+    mov        rdx, 0bh             ; CPUID function 0x0b
+    mov        rax, 040000000h      ; RBX register requested
+    or         rax, 4
+    wrmsr
+    rep vmmcall
+    rdmsr
+    test       edx, 0ffffh
+    jz         NoX2ApicSevEs        ; CPUID.0BH:EBX[15:0] is zero
+
+    mov        rdx, 0bh             ; CPUID function 0x0b
+    mov        rax, 0c0000000h      ; RDX register requested
+    or         rax, 4
+    wrmsr
+    rep vmmcall
+    rdmsr
+
+    ; Processor is x2APIC capable; 32-bit x2APIC ID is now in EDX
+    jmp        RestoreGhcb
+
+NoX2ApicSevEs:
+    ; Processor is not x2APIC capable, so get 8-bit APIC ID
+    mov        rdx, 1               ; CPUID function 1
+    mov        rax, 040000000h      ; RBX register requested
+    or         rax, 4
+    wrmsr
+    rep vmmcall
+    rdmsr
+    shr        edx, 24
+
+RestoreGhcb:
+    mov        rbx, rdx             ; Save x2APIC/APIC ID
+
+    mov        rdx, rdi             ; RDI holds the saved GHCB GPA
+    shr        rdx, 32
+    mov        eax, edi
+    wrmsr
+
+    mov        rdx, rbx
+
+    ; x2APIC ID or APIC ID is in EDX
+    jmp        GetProcessorNumber
+
+DoCpuid:
     mov        eax, 0
     cpuid
     cmp        eax, 0bh
@@ -253,12 +341,158 @@ CProcedureInvoke:
 
 RendezvousFunnelProcEnd:
 
+;-------------------------------------------------------------------------------------
+;SwitchToRealProc procedure follows.
+;ALSO THIS PROCEDURE IS EXECUTED BY APs TRANSITIONING TO 16 BIT MODE. HENCE THIS PROC
+;IS IN MACHINE CODE.
+;  SwitchToRealProc (UINTN BufferStart, UINT16 Code16, UINT16 Code32, UINTN StackStart)
+;  rcx - Buffer Start
+;  rdx - Code16 Selector Offset
+;  r8  - Code32 Selector Offset
+;  r9  - Stack Start
+;-------------------------------------------------------------------------------------
+global ASM_PFX(SwitchToRealProc)
+ASM_PFX(SwitchToRealProc):
+SwitchToRealProcStart:
+BITS 64
+    cli
+
+    ;
+    ; Get RDX reset value before changing stacks since the
+    ; new stack won't be able to accomodate a #VC exception.
+    ;
+    push       rax
+    push       rbx
+    push       rcx
+    push       rdx
+
+    mov        rax, 1
+    cpuid
+    mov        rsi, rax                    ; Save off the reset value for RDX
+
+    pop        rdx
+    pop        rcx
+    pop        rbx
+    pop        rax
+
+    ;
+    ; Establish stack below 1MB
+    ;
+    mov        rsp, r9
+
+    ;
+    ; Push ultimate Reset Vector onto the stack
+    ;
+    mov        rax, rcx
+    shr        rax, 4
+    push       word 0x0002                 ; RFLAGS
+    push       ax                          ; CS
+    push       word 0x0000                 ; RIP
+    push       word 0x0000                 ; For alignment, will be discarded
+
+    ;
+    ; Get address of "16-bit operand size" label
+    ;
+    lea        rbx, [PM16Mode]
+
+    ;
+    ; Push addresses used to change to compatibility mode
+    ;
+    lea        rax, [CompatMode]
+    push       r8
+    push       rax
+
+    ;
+    ; Clear R8 - R15, for reset, before going into 32-bit mode
+    ;
+    xor        r8, r8
+    xor        r9, r9
+    xor        r10, r10
+    xor        r11, r11
+    xor        r12, r12
+    xor        r13, r13
+    xor        r14, r14
+    xor        r15, r15
+
+    ;
+    ; Far return into 32-bit mode
+    ;
+o64 retf
+
+BITS 32
+CompatMode:
+    ;
+    ; Set up stack to prepare for exiting protected mode
+    ;
+    push       edx                         ; Code16 CS
+    push       ebx                         ; PM16Mode label address
+
+    ;
+    ; Disable paging
+    ;
+    mov        eax, cr0                    ; Read CR0
+    btr        eax, 31                     ; Set PG=0
+    mov        cr0, eax                    ; Write CR0
+
+    ;
+    ; Disable long mode
+    ;
+    mov        ecx, 0c0000080h             ; EFER MSR number
+    rdmsr                                  ; Read EFER
+    btr        eax, 8                      ; Set LME=0
+    wrmsr                                  ; Write EFER
+
+    ;
+    ; Disable PAE
+    ;
+    mov        eax, cr4                    ; Read CR4
+    btr        eax, 5                      ; Set PAE=0
+    mov        cr4, eax                    ; Write CR4
+
+    mov        edx, esi                    ; Restore RDX reset value
+
+    ;
+    ; Switch to 16-bit operand size
+    ;
+    retf
+
+BITS 16
+    ;
+    ; At entry to this label
+    ;   - RDX will have its reset value
+    ;   - On the top of the stack
+    ;     - Alignment data (two bytes) to be discarded
+    ;     - IP for Real Mode (two bytes)
+    ;     - CS for Real Mode (two bytes)
+    ;
+PM16Mode:
+    mov        eax, cr0                    ; Read CR0
+    btr        eax, 0                      ; Set PE=0
+    mov        cr0, eax                    ; Write CR0
+
+    pop        ax                          ; Discard alignment data
+
+    ;
+    ; Clear registers (except RDX and RSP) before going into 16-bit mode
+    ;
+    xor        eax, eax
+    xor        ebx, ebx
+    xor        ecx, ecx
+    xor        esi, esi
+    xor        edi, edi
+    xor        ebp, ebp
+
+    iret
+
+SwitchToRealProcEnd:
+
 ;-------------------------------------------------------------------------------------
 ;  AsmRelocateApLoop (MwaitSupport, ApTargetCState, PmCodeSegment, TopOfApStack, CountTofinish);
 ;-------------------------------------------------------------------------------------
 global ASM_PFX(AsmRelocateApLoop)
 ASM_PFX(AsmRelocateApLoop):
 AsmRelocateApLoopStart:
+BITS 64
     cli                          ; Disable interrupt before switching to 32-bit mode
     mov        rax, [rsp + 40]   ; CountTofinish
     lock dec   dword [rax]       ; (*CountTofinish)--
@@ -324,6 +558,11 @@ ASM_PFX(AsmGetAddressMap):
     mov        qword [rcx + 18h], rax
     mov        qword [rcx + 20h], AsmRelocateApLoopEnd - AsmRelocateApLoopStart
     mov        qword [rcx + 28h], Flat32Start - RendezvousFunnelProcStart
+    mov        qword [rcx + 30h], SwitchToRealProcEnd - SwitchToRealProcStart          ; SwitchToRealSize
+    mov        qword [rcx + 38h], SwitchToRealProcStart - RendezvousFunnelProcStart    ; SwitchToRealOffset
+    mov        qword [rcx + 40h], SwitchToRealProcStart - Flat32Start                  ; SwitchToRealNoNxOffset
+    mov        qword [rcx + 48h], PM16Mode - RendezvousFunnelProcStart                 ; SwitchToRealPM16ModeOffset
+    mov        qword [rcx + 50h], SwitchToRealProcEnd - PM16Mode                       ; SwitchToRealPM16ModeSize
     ret
 
 ;-------------------------------------------------------------------------------------
-- 
2.17.1


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#47672): https://edk2.groups.io/g/devel/message/47672
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Laszlo Ersek 5 years, 12 months ago
Adding Phil.

I'm looking at this patch only because one thing caught my attention in
the previous one, "OvmfPkg: Add support for SEV-ES AP reset vector
re-directing":

On 09/19/19 21:53, Lendacky, Thomas wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com>
> 
> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2198
> 
> Typically, an AP is booted using the INIT-SIPI-SIPI sequence. This
> sequence is intercepted by the hypervisor, which sets the AP's registers
> to the values requested by the sequence. At that point, the hypervisor can
> start the AP, which will then begin execution at the appropriate location.
> 
> Under SEV-ES, AP booting presents some challenges since the hypervisor is
> not allowed to alter the AP's register state. In this situation, we have
> to distinguish between the AP's first boot and AP's subsequent boots.
> 
> First boot:
>  Once the AP's register state has been defined (which is before the guest
>  is first booted) it cannot be altered. Should the hypervisor attempt to
>  alter the register state, the change would be detected by the hardware
>  and the VMRUN instruction would fail. Given this, the first boot for the
>  AP is required to begin execution with this initial register state, which
>  is typically the reset vector. This prevents the BSP from directing the
>  AP startup location through the INIT-SIPI-SIPI sequence.
> 
>  To work around this, provide a four-byte field at offset 0xffffffd0 that
>  can contain an IP / CS register combination, that if non-zero, causes
>  the AP to perform a far jump to that location instead of a near jump to
>  EarlyBspInitReal16. Before booting the AP for the first time, the BSP
>  should set the IP / CS value for the AP based on the value that would be
>  derived from the INIT-SIPI-SIPI sequence.

I don't understand how this can work: the guest-phys address 0xffffffd0
is backed by read-only pflash in most OVMF deployments.

In addition:

[...]

> @@ -1002,6 +1204,7 @@ WakeUpAP (
>        CpuMpData->InitFlag   != ApInitDone) {
>      ResetVectorRequired = TRUE;
>      AllocateResetVector (CpuMpData);
> +    AllocateSevEsAPMemory (CpuMpData);
>      FillExchangeInfoData (CpuMpData);
>      SaveLocalApicTimerSetting (CpuMpData);
>    }
> @@ -1038,6 +1241,15 @@ WakeUpAP (
>        }
>      }
>      if (ResetVectorRequired) {
> +      //
> +      // For SEV-ES, set the jump address for initial AP boot
> +      //
> +      if (CpuMpData->SevEsActive) {
> +        SEV_ES_AP_JMP_FAR *JmpFar = (SEV_ES_AP_JMP_FAR *)0xFFFFFFD0;
> +
> +        JmpFar->ApStart.Rip = 0;
> +        JmpFar->ApStart.Segment = (UINT16) (ExchangeInfo->BufferStart >> 4);
> +      }

Even if the address is backed by a single "unified" pflash, mapped r/w
-- which we can call a "non-standard OVMF deployment" nowadays --, a
normal store doesn't appear sufficient to me. The first write to pflash
will flip it to "programming mode", and the values stored are supposed
to be pflash commands (not just the raw data we intend to put in place).

See for example the QemuFlashWrite() function in
"OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlash.c". (But, again, that
is used with the pflash chip that hosts the variable store, and is
therefore mapped r/w.)


Taking a step back... I don't think APs execute any code from pflash,
when MpInitLib boots them.

In OVMF, PcdCpuApLoopMode is ApInHltLoop (value 1), therefore
"CpuMpData->WakeUpByInitSipiSipi" should be TRUE, and
"ResetVectorRequired" too should be TRUE, at first AP boot.
Consequently, the reset vector seems to be allocated with
AllocateResetVector().

AllocateResetVector() has separate implementations for PEI and DXE, but
in both cases, it returns RAM. So I don't see where the AP accesses (or
executes) pflash.

Thanks,
Laszlo

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48376): https://edk2.groups.io/g/devel/message/48376
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Lendacky, Thomas 5 years, 12 months ago
On 10/2/19 10:15 AM, Laszlo Ersek via Groups.Io wrote:
> Adding Phil.
> 
> I'm looking at this patch only because one thing caught my attention in
> the previous one, "OvmfPkg: Add support for SEV-ES AP reset vector
> re-directing":
> 
> On 09/19/19 21:53, Lendacky, Thomas wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2198
>>
>> Typically, an AP is booted using the INIT-SIPI-SIPI sequence. This
>> sequence is intercepted by the hypervisor, which sets the AP's registers
>> to the values requested by the sequence. At that point, the hypervisor can
>> start the AP, which will then begin execution at the appropriate location.
>>
>> Under SEV-ES, AP booting presents some challenges since the hypervisor is
>> not allowed to alter the AP's register state. In this situation, we have
>> to distinguish between the AP's first boot and AP's subsequent boots.
>>
>> First boot:
>>  Once the AP's register state has been defined (which is before the guest
>>  is first booted) it cannot be altered. Should the hypervisor attempt to
>>  alter the register state, the change would be detected by the hardware
>>  and the VMRUN instruction would fail. Given this, the first boot for the
>>  AP is required to begin execution with this initial register state, which
>>  is typically the reset vector. This prevents the BSP from directing the
>>  AP startup location through the INIT-SIPI-SIPI sequence.
>>
>>  To work around this, provide a four-byte field at offset 0xffffffd0 that
>>  can contain an IP / CS register combination, that if non-zero, causes
>>  the AP to perform a far jump to that location instead of a near jump to
>>  EarlyBspInitReal16. Before booting the AP for the first time, the BSP
>>  should set the IP / CS value for the AP based on the value that would be
>>  derived from the INIT-SIPI-SIPI sequence.
> 
> I don't understand how this can work: the guest-phys address 0xffffffd0
> is backed by read-only pflash in most OVMF deployments.
> 
> In addition:
> 
> [...]
> 
>> @@ -1002,6 +1204,7 @@ WakeUpAP (
>>        CpuMpData->InitFlag   != ApInitDone) {
>>      ResetVectorRequired = TRUE;
>>      AllocateResetVector (CpuMpData);
>> +    AllocateSevEsAPMemory (CpuMpData);
>>      FillExchangeInfoData (CpuMpData);
>>      SaveLocalApicTimerSetting (CpuMpData);
>>    }
>> @@ -1038,6 +1241,15 @@ WakeUpAP (
>>        }
>>      }
>>      if (ResetVectorRequired) {
>> +      //
>> +      // For SEV-ES, set the jump address for initial AP boot
>> +      //
>> +      if (CpuMpData->SevEsActive) {
>> +        SEV_ES_AP_JMP_FAR *JmpFar = (SEV_ES_AP_JMP_FAR *)0xFFFFFFD0;
>> +
>> +        JmpFar->ApStart.Rip = 0;
>> +        JmpFar->ApStart.Segment = (UINT16) (ExchangeInfo->BufferStart >> 4);
>> +      }
> 
> Even if the address is backed by a single "unified" pflash, mapped r/w
> -- which we can call a "non-standard OVMF deployment" nowadays --, a
> normal store doesn't appear sufficient to me. The first write to pflash
> will flip it to "programming mode", and the values stored are supposed
> to be pflash commands (not just the raw data we intend to put in place).

There is a corresponding patch in Qemu that does not set the
KVM_MEM_READONLY flag for the ROM when starting an SEV-ES guest, thus
allowing the MP library to update the location with desired address.

> 
> See for example the QemuFlashWrite() function in
> "OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlash.c". (But, again, that
> is used with the pflash chip that hosts the variable store, and is
> therefore mapped r/w.)
> 
> 
> Taking a step back... I don't think APs execute any code from pflash,
> when MpInitLib boots them.

Correct, a non-SEV-ES guest will not execute this code because the
INIT-SIPI-SIPI sequence will direct the RIP to the desired location.

Thanks,
Tom

> 
> In OVMF, PcdCpuApLoopMode is ApInHltLoop (value 1), therefore
> "CpuMpData->WakeUpByInitSipiSipi" should be TRUE, and
> "ResetVectorRequired" too should be TRUE, at first AP boot.
> Consequently, the reset vector seems to be allocated with
> AllocateResetVector().
> 
> AllocateResetVector() has separate implementations for PEI and DXE, but
> in both cases, it returns RAM. So I don't see where the AP accesses (or
> executes) pflash.
> 
> Thanks,
> Laszlo
> 
> 
> 

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48388): https://edk2.groups.io/g/devel/message/48388
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Laszlo Ersek 5 years, 12 months ago
On 10/02/19 19:58, Lendacky, Thomas wrote:
> On 10/2/19 10:15 AM, Laszlo Ersek via Groups.Io wrote:
>> Adding Phil.
>>
>> I'm looking at this patch only because one thing caught my attention in
>> the previous one, "OvmfPkg: Add support for SEV-ES AP reset vector
>> re-directing":
>>
>> On 09/19/19 21:53, Lendacky, Thomas wrote:
>>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>>
>>> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2198
>>>
>>> Typically, an AP is booted using the INIT-SIPI-SIPI sequence. This
>>> sequence is intercepted by the hypervisor, which sets the AP's registers
>>> to the values requested by the sequence. At that point, the hypervisor can
>>> start the AP, which will then begin execution at the appropriate location.
>>>
>>> Under SEV-ES, AP booting presents some challenges since the hypervisor is
>>> not allowed to alter the AP's register state. In this situation, we have
>>> to distinguish between the AP's first boot and AP's subsequent boots.
>>>
>>> First boot:
>>>  Once the AP's register state has been defined (which is before the guest
>>>  is first booted) it cannot be altered. Should the hypervisor attempt to
>>>  alter the register state, the change would be detected by the hardware
>>>  and the VMRUN instruction would fail. Given this, the first boot for the
>>>  AP is required to begin execution with this initial register state, which
>>>  is typically the reset vector. This prevents the BSP from directing the
>>>  AP startup location through the INIT-SIPI-SIPI sequence.
>>>
>>>  To work around this, provide a four-byte field at offset 0xffffffd0 that
>>>  can contain an IP / CS register combination, that if non-zero, causes
>>>  the AP to perform a far jump to that location instead of a near jump to
>>>  EarlyBspInitReal16. Before booting the AP for the first time, the BSP
>>>  should set the IP / CS value for the AP based on the value that would be
>>>  derived from the INIT-SIPI-SIPI sequence.
>>
>> I don't understand how this can work: the guest-phys address 0xffffffd0
>> is backed by read-only pflash in most OVMF deployments.
>>
>> In addition:
>>
>> [...]
>>
>>> @@ -1002,6 +1204,7 @@ WakeUpAP (
>>>        CpuMpData->InitFlag   != ApInitDone) {
>>>      ResetVectorRequired = TRUE;
>>>      AllocateResetVector (CpuMpData);
>>> +    AllocateSevEsAPMemory (CpuMpData);
>>>      FillExchangeInfoData (CpuMpData);
>>>      SaveLocalApicTimerSetting (CpuMpData);
>>>    }
>>> @@ -1038,6 +1241,15 @@ WakeUpAP (
>>>        }
>>>      }
>>>      if (ResetVectorRequired) {
>>> +      //
>>> +      // For SEV-ES, set the jump address for initial AP boot
>>> +      //
>>> +      if (CpuMpData->SevEsActive) {
>>> +        SEV_ES_AP_JMP_FAR *JmpFar = (SEV_ES_AP_JMP_FAR *)0xFFFFFFD0;
>>> +
>>> +        JmpFar->ApStart.Rip = 0;
>>> +        JmpFar->ApStart.Segment = (UINT16) (ExchangeInfo->BufferStart >> 4);
>>> +      }
>>
>> Even if the address is backed by a single "unified" pflash, mapped r/w
>> -- which we can call a "non-standard OVMF deployment" nowadays --, a
>> normal store doesn't appear sufficient to me. The first write to pflash
>> will flip it to "programming mode", and the values stored are supposed
>> to be pflash commands (not just the raw data we intend to put in place).
> 
> There is a corresponding patch in Qemu that does not set the
> KVM_MEM_READONLY flag for the ROM when starting an SEV-ES guest, thus
> allowing the MP library to update the location with desired address.

Isn't that a price too high to pay? It seems to allow the memory mapped
contents of the pflash chip to diverge from the host-side file, even if
the QEMU command line specifies that the pflash chip is to be mapped r/o.

With regard to SMM, I think this could even be considered a security
problem -- if the pflash region is not marked as readonly, then the
guest OS could write to it as well (with direct hardware access, like
the code seen above). The write would not trap to QEMU, and QEMU
couldn't prevent the write.

The OS could use this to inject code into the pflash region (e.g., SEC).
Although the host-side pflash file would not be modified, if the OS
performed a warm reboot (or S3 cycle) afterwards, the code it injected
into the flash region would be executed with firmware privileges.

AMD Publication #56421 states that SMM is currently out of scope for
SEV-ES, so I don't think there's a problem right now; I'm just generally
concerned that here we're enabling something I've always considered a
big no-no for SMM builds.

I still have to read your next response though (to my suggestion to use
a fixed RAM address, I believe).

Thanks!
Laszlo

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48426): https://edk2.groups.io/g/devel/message/48426
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Laszlo Ersek 5 years, 12 months ago
On 10/02/19 17:15, Laszlo Ersek wrote:
> Adding Phil.
> 
> I'm looking at this patch only because one thing caught my attention in
> the previous one, "OvmfPkg: Add support for SEV-ES AP reset vector
> re-directing":
> 
> On 09/19/19 21:53, Lendacky, Thomas wrote:
>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>
>> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2198
>>
>> Typically, an AP is booted using the INIT-SIPI-SIPI sequence. This
>> sequence is intercepted by the hypervisor, which sets the AP's registers
>> to the values requested by the sequence. At that point, the hypervisor can
>> start the AP, which will then begin execution at the appropriate location.
>>
>> Under SEV-ES, AP booting presents some challenges since the hypervisor is
>> not allowed to alter the AP's register state. In this situation, we have
>> to distinguish between the AP's first boot and AP's subsequent boots.
>>
>> First boot:
>>  Once the AP's register state has been defined (which is before the guest
>>  is first booted) it cannot be altered. Should the hypervisor attempt to
>>  alter the register state, the change would be detected by the hardware
>>  and the VMRUN instruction would fail. Given this, the first boot for the
>>  AP is required to begin execution with this initial register state, which
>>  is typically the reset vector. This prevents the BSP from directing the
>>  AP startup location through the INIT-SIPI-SIPI sequence.
>>
>>  To work around this, provide a four-byte field at offset 0xffffffd0 that
>>  can contain an IP / CS register combination, that if non-zero, causes
>>  the AP to perform a far jump to that location instead of a near jump to
>>  EarlyBspInitReal16. Before booting the AP for the first time, the BSP
>>  should set the IP / CS value for the AP based on the value that would be
>>  derived from the INIT-SIPI-SIPI sequence.
> 
> I don't understand how this can work: the guest-phys address 0xffffffd0
> is backed by read-only pflash in most OVMF deployments.
> 
> In addition:
> 
> [...]
> 
>> @@ -1002,6 +1204,7 @@ WakeUpAP (
>>        CpuMpData->InitFlag   != ApInitDone) {
>>      ResetVectorRequired = TRUE;
>>      AllocateResetVector (CpuMpData);
>> +    AllocateSevEsAPMemory (CpuMpData);
>>      FillExchangeInfoData (CpuMpData);
>>      SaveLocalApicTimerSetting (CpuMpData);
>>    }
>> @@ -1038,6 +1241,15 @@ WakeUpAP (
>>        }
>>      }
>>      if (ResetVectorRequired) {
>> +      //
>> +      // For SEV-ES, set the jump address for initial AP boot
>> +      //
>> +      if (CpuMpData->SevEsActive) {
>> +        SEV_ES_AP_JMP_FAR *JmpFar = (SEV_ES_AP_JMP_FAR *)0xFFFFFFD0;
>> +
>> +        JmpFar->ApStart.Rip = 0;
>> +        JmpFar->ApStart.Segment = (UINT16) (ExchangeInfo->BufferStart >> 4);
>> +      }
> 
> Even if the address is backed by a single "unified" pflash, mapped r/w
> -- which we can call a "non-standard OVMF deployment" nowadays --, a
> normal store doesn't appear sufficient to me. The first write to pflash
> will flip it to "programming mode", and the values stored are supposed
> to be pflash commands (not just the raw data we intend to put in place).
> 
> See for example the QemuFlashWrite() function in
> "OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlash.c". (But, again, that
> is used with the pflash chip that hosts the variable store, and is
> therefore mapped r/w.)
> 
> 
> Taking a step back... I don't think APs execute any code from pflash,
> when MpInitLib boots them.
> 
> In OVMF, PcdCpuApLoopMode is ApInHltLoop (value 1), therefore
> "CpuMpData->WakeUpByInitSipiSipi" should be TRUE, and
> "ResetVectorRequired" too should be TRUE, at first AP boot.
> Consequently, the reset vector seems to be allocated with
> AllocateResetVector().
> 
> AllocateResetVector() has separate implementations for PEI and DXE, but
> in both cases, it returns RAM. So I don't see where the AP accesses (or
> executes) pflash.

... I believe I understand that this is precisely what cannot work under
SEV-ES -- because we cannot launch an AP at an address that's
dynamically chosen by the firmware (and passed to the hypervisor), like
with INIT-SIPI-SIPI.

And so firmware and hypervisor have to agree upon a *constant* AP reset
vector address, in advance.

We have two options:

- pick the reset vector address *constant* such that it falls into RAM, or

- let the AP reset vector reside in pflash, but then the code in pflash
has to look for a parameter block at a fixed address in RAM.

So in the end, both options require the same -- we need a RAM address
constant that is determined at firmware build time. Either for the reset
vector itself, or for the reset vector's parameter block.

Thanks
Laszlo

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48377): https://edk2.groups.io/g/devel/message/48377
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Lendacky, Thomas 5 years, 12 months ago
On 10/2/19 10:26 AM, Laszlo Ersek wrote:
> On 10/02/19 17:15, Laszlo Ersek wrote:
>> Adding Phil.
>>
>> I'm looking at this patch only because one thing caught my attention in
>> the previous one, "OvmfPkg: Add support for SEV-ES AP reset vector
>> re-directing":
>>
>> On 09/19/19 21:53, Lendacky, Thomas wrote:
>>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>>
>>> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2198
>>>
>>> Typically, an AP is booted using the INIT-SIPI-SIPI sequence. This
>>> sequence is intercepted by the hypervisor, which sets the AP's registers
>>> to the values requested by the sequence. At that point, the hypervisor can
>>> start the AP, which will then begin execution at the appropriate location.
>>>
>>> Under SEV-ES, AP booting presents some challenges since the hypervisor is
>>> not allowed to alter the AP's register state. In this situation, we have
>>> to distinguish between the AP's first boot and AP's subsequent boots.
>>>
>>> First boot:
>>>  Once the AP's register state has been defined (which is before the guest
>>>  is first booted) it cannot be altered. Should the hypervisor attempt to
>>>  alter the register state, the change would be detected by the hardware
>>>  and the VMRUN instruction would fail. Given this, the first boot for the
>>>  AP is required to begin execution with this initial register state, which
>>>  is typically the reset vector. This prevents the BSP from directing the
>>>  AP startup location through the INIT-SIPI-SIPI sequence.
>>>
>>>  To work around this, provide a four-byte field at offset 0xffffffd0 that
>>>  can contain an IP / CS register combination, that if non-zero, causes
>>>  the AP to perform a far jump to that location instead of a near jump to
>>>  EarlyBspInitReal16. Before booting the AP for the first time, the BSP
>>>  should set the IP / CS value for the AP based on the value that would be
>>>  derived from the INIT-SIPI-SIPI sequence.
>>
>> I don't understand how this can work: the guest-phys address 0xffffffd0
>> is backed by read-only pflash in most OVMF deployments.
>>
>> In addition:
>>
>> [...]
>>
>>> @@ -1002,6 +1204,7 @@ WakeUpAP (
>>>        CpuMpData->InitFlag   != ApInitDone) {
>>>      ResetVectorRequired = TRUE;
>>>      AllocateResetVector (CpuMpData);
>>> +    AllocateSevEsAPMemory (CpuMpData);
>>>      FillExchangeInfoData (CpuMpData);
>>>      SaveLocalApicTimerSetting (CpuMpData);
>>>    }
>>> @@ -1038,6 +1241,15 @@ WakeUpAP (
>>>        }
>>>      }
>>>      if (ResetVectorRequired) {
>>> +      //
>>> +      // For SEV-ES, set the jump address for initial AP boot
>>> +      //
>>> +      if (CpuMpData->SevEsActive) {
>>> +        SEV_ES_AP_JMP_FAR *JmpFar = (SEV_ES_AP_JMP_FAR *)0xFFFFFFD0;
>>> +
>>> +        JmpFar->ApStart.Rip = 0;
>>> +        JmpFar->ApStart.Segment = (UINT16) (ExchangeInfo->BufferStart >> 4);
>>> +      }
>>
>> Even if the address is backed by a single "unified" pflash, mapped r/w
>> -- which we can call a "non-standard OVMF deployment" nowadays --, a
>> normal store doesn't appear sufficient to me. The first write to pflash
>> will flip it to "programming mode", and the values stored are supposed
>> to be pflash commands (not just the raw data we intend to put in place).
>>
>> See for example the QemuFlashWrite() function in
>> "OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlash.c". (But, again, that
>> is used with the pflash chip that hosts the variable store, and is
>> therefore mapped r/w.)
>>
>>
>> Taking a step back... I don't think APs execute any code from pflash,
>> when MpInitLib boots them.
>>
>> In OVMF, PcdCpuApLoopMode is ApInHltLoop (value 1), therefore
>> "CpuMpData->WakeUpByInitSipiSipi" should be TRUE, and
>> "ResetVectorRequired" too should be TRUE, at first AP boot.
>> Consequently, the reset vector seems to be allocated with
>> AllocateResetVector().
>>
>> AllocateResetVector() has separate implementations for PEI and DXE, but
>> in both cases, it returns RAM. So I don't see where the AP accesses (or
>> executes) pflash.
> 
> ... I believe I understand that this is precisely what cannot work under
> SEV-ES -- because we cannot launch an AP at an address that's
> dynamically chosen by the firmware (and passed to the hypervisor), like
> with INIT-SIPI-SIPI.

Correct.

> 
> And so firmware and hypervisor have to agree upon a *constant* AP reset
> vector address, in advance.
> 
> We have two options:
> 
> - pick the reset vector address *constant* such that it falls into RAM, or
> 
> - let the AP reset vector reside in pflash, but then the code in pflash
> has to look for a parameter block at a fixed address in RAM.
> 
> So in the end, both options require the same -- we need a RAM address
> constant that is determined at firmware build time. Either for the reset
> vector itself, or for the reset vector's parameter block.

As long as the hypervisor has a way to determine a build time address,
that would be the preferred method. I couldn't come up with a way (or at
least don't know of a way) to allow the hypervisor to discover that build
time address. Hopefully someone else does and we can eliminate the need
to map the ROM read/write and just program the vCPUs initial registers to
the build time value.

I think the address should be for the AP reset vector itself, otherwise
you would have to guarantee that the reset vector parameter block is zero
for the initial BSP boot.

Thanks,
Tom

> 
> Thanks
> Laszlo
> 

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48389): https://edk2.groups.io/g/devel/message/48389
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Laszlo Ersek 5 years, 12 months ago
On 10/02/19 20:07, Lendacky, Thomas wrote:
> On 10/2/19 10:26 AM, Laszlo Ersek wrote:
>> On 10/02/19 17:15, Laszlo Ersek wrote:
>>> Adding Phil.
>>>
>>> I'm looking at this patch only because one thing caught my attention in
>>> the previous one, "OvmfPkg: Add support for SEV-ES AP reset vector
>>> re-directing":
>>>
>>> On 09/19/19 21:53, Lendacky, Thomas wrote:
>>>> From: Tom Lendacky <thomas.lendacky@amd.com>
>>>>
>>>> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=2198
>>>>
>>>> Typically, an AP is booted using the INIT-SIPI-SIPI sequence. This
>>>> sequence is intercepted by the hypervisor, which sets the AP's registers
>>>> to the values requested by the sequence. At that point, the hypervisor can
>>>> start the AP, which will then begin execution at the appropriate location.
>>>>
>>>> Under SEV-ES, AP booting presents some challenges since the hypervisor is
>>>> not allowed to alter the AP's register state. In this situation, we have
>>>> to distinguish between the AP's first boot and AP's subsequent boots.
>>>>
>>>> First boot:
>>>>  Once the AP's register state has been defined (which is before the guest
>>>>  is first booted) it cannot be altered. Should the hypervisor attempt to
>>>>  alter the register state, the change would be detected by the hardware
>>>>  and the VMRUN instruction would fail. Given this, the first boot for the
>>>>  AP is required to begin execution with this initial register state, which
>>>>  is typically the reset vector. This prevents the BSP from directing the
>>>>  AP startup location through the INIT-SIPI-SIPI sequence.
>>>>
>>>>  To work around this, provide a four-byte field at offset 0xffffffd0 that
>>>>  can contain an IP / CS register combination, that if non-zero, causes
>>>>  the AP to perform a far jump to that location instead of a near jump to
>>>>  EarlyBspInitReal16. Before booting the AP for the first time, the BSP
>>>>  should set the IP / CS value for the AP based on the value that would be
>>>>  derived from the INIT-SIPI-SIPI sequence.
>>>
>>> I don't understand how this can work: the guest-phys address 0xffffffd0
>>> is backed by read-only pflash in most OVMF deployments.
>>>
>>> In addition:
>>>
>>> [...]
>>>
>>>> @@ -1002,6 +1204,7 @@ WakeUpAP (
>>>>        CpuMpData->InitFlag   != ApInitDone) {
>>>>      ResetVectorRequired = TRUE;
>>>>      AllocateResetVector (CpuMpData);
>>>> +    AllocateSevEsAPMemory (CpuMpData);
>>>>      FillExchangeInfoData (CpuMpData);
>>>>      SaveLocalApicTimerSetting (CpuMpData);
>>>>    }
>>>> @@ -1038,6 +1241,15 @@ WakeUpAP (
>>>>        }
>>>>      }
>>>>      if (ResetVectorRequired) {
>>>> +      //
>>>> +      // For SEV-ES, set the jump address for initial AP boot
>>>> +      //
>>>> +      if (CpuMpData->SevEsActive) {
>>>> +        SEV_ES_AP_JMP_FAR *JmpFar = (SEV_ES_AP_JMP_FAR *)0xFFFFFFD0;
>>>> +
>>>> +        JmpFar->ApStart.Rip = 0;
>>>> +        JmpFar->ApStart.Segment = (UINT16) (ExchangeInfo->BufferStart >> 4);
>>>> +      }
>>>
>>> Even if the address is backed by a single "unified" pflash, mapped r/w
>>> -- which we can call a "non-standard OVMF deployment" nowadays --, a
>>> normal store doesn't appear sufficient to me. The first write to pflash
>>> will flip it to "programming mode", and the values stored are supposed
>>> to be pflash commands (not just the raw data we intend to put in place).
>>>
>>> See for example the QemuFlashWrite() function in
>>> "OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlash.c". (But, again, that
>>> is used with the pflash chip that hosts the variable store, and is
>>> therefore mapped r/w.)
>>>
>>>
>>> Taking a step back... I don't think APs execute any code from pflash,
>>> when MpInitLib boots them.
>>>
>>> In OVMF, PcdCpuApLoopMode is ApInHltLoop (value 1), therefore
>>> "CpuMpData->WakeUpByInitSipiSipi" should be TRUE, and
>>> "ResetVectorRequired" too should be TRUE, at first AP boot.
>>> Consequently, the reset vector seems to be allocated with
>>> AllocateResetVector().
>>>
>>> AllocateResetVector() has separate implementations for PEI and DXE, but
>>> in both cases, it returns RAM. So I don't see where the AP accesses (or
>>> executes) pflash.
>>
>> ... I believe I understand that this is precisely what cannot work under
>> SEV-ES -- because we cannot launch an AP at an address that's
>> dynamically chosen by the firmware (and passed to the hypervisor), like
>> with INIT-SIPI-SIPI.
> 
> Correct.
> 
>>
>> And so firmware and hypervisor have to agree upon a *constant* AP reset
>> vector address, in advance.
>>
>> We have two options:
>>
>> - pick the reset vector address *constant* such that it falls into RAM, or
>>
>> - let the AP reset vector reside in pflash, but then the code in pflash
>> has to look for a parameter block at a fixed address in RAM.
>>
>> So in the end, both options require the same -- we need a RAM address
>> constant that is determined at firmware build time. Either for the reset
>> vector itself, or for the reset vector's parameter block.
> 
> As long as the hypervisor has a way to determine a build time address,
> that would be the preferred method. I couldn't come up with a way (or at
> least don't know of a way) to allow the hypervisor to discover that build
> time address. Hopefully someone else does and we can eliminate the need
> to map the ROM read/write and just program the vCPUs initial registers to
> the build time value.
> 
> I think the address should be for the AP reset vector itself, otherwise
> you would have to guarantee that the reset vector parameter block is zero
> for the initial BSP boot.

Rough idea:

(1) Introduce a new fixed PCD (UINT32) pointing into RAM -- the AP reset
vector address. You could carve it out from another unused part of
FD.MEMFD, and set the PCD in the FDF files.

(2) In PlatformPei, reserve the area.

(3) In "ResetVectorVtf0.asm", we should introduce a packed structure as
follows, so that it end at label "applicationProcessorEntryPoint":

  UINT32   ApEntryPoint;
  EFI_GUID SevEsFooterGuid;
  UINT16   Size;

I think the assembly source code could populate this structure with DD /
DB / DW pseudo-instructions.

- The "ApEntryPoint" field would be filled from FixedPcdGet32.

- The SevEsFooterGuid field would be open-coded with DB. We'd pick a new
GUID with uuidgen for this.

- The "Size" field would cover the entire structure. I think we should
be able to auto-calculate it for a DW pseudo-instruction, placing a
label at the top of the structure, and then using ($ - ThatLabel) or
similar.

(4) The following information would be fixed (by convention) and known
to QEMU:
- the address of the SevEsFooterGuid field (offset backwards from the
end of flash / 4GB)
- the value of SevEsFooterGuid.

(5) Whenever QEMU finds the GUID, it can check the Size field. Given the
Size field, QEMU can check fields *prepended* to the footer structure.
Incompatible changes to the structure (i.e. any change other than
prepending new fields) require a new GUID to be generated for
SevEsFooterGuid.

(6) Thus QEMU could read ApEntryPoint out of the pflash image.

Thanks
Laszlo

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48427): https://edk2.groups.io/g/devel/message/48427
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Laszlo Ersek 5 years, 12 months ago
On 10/03/19 12:12, Laszlo Ersek wrote:

>   UINT32   ApEntryPoint;
>   EFI_GUID SevEsFooterGuid;
>   UINT16   Size;

It's probably better to reverse the order of "Size" and
"SevEsFooterGuid", like this:

  UINT32   ApEntryPoint;
  UINT16   Size;
  EFI_GUID SevEsFooterGuid;

because then even the "Size" field can be changed (or resized), as a
function of the footer GUID.

Thanks
Laszlo

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48428): https://edk2.groups.io/g/devel/message/48428
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Lendacky, Thomas 5 years, 12 months ago

On 10/3/19 5:32 AM, Laszlo Ersek wrote:
> On 10/03/19 12:12, Laszlo Ersek wrote:
> 
>>   UINT32   ApEntryPoint;
>>   EFI_GUID SevEsFooterGuid;
>>   UINT16   Size;
> 
> It's probably better to reverse the order of "Size" and
> "SevEsFooterGuid", like this:
> 
>   UINT32   ApEntryPoint;
>   UINT16   Size;
>   EFI_GUID SevEsFooterGuid;
> 
> because then even the "Size" field can be changed (or resized), as a
> function of the footer GUID.

Cool, I'll look into doing this and see how it works out.

Thanks!
Tom

> 
> Thanks
> Laszlo
> 

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48436): https://edk2.groups.io/g/devel/message/48436
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Lendacky, Thomas 5 years, 11 months ago
On 10/3/19 10:12 AM, Tom Lendacky wrote:
> 
> 
> On 10/3/19 5:32 AM, Laszlo Ersek wrote:
>> On 10/03/19 12:12, Laszlo Ersek wrote:
>>
>>>   UINT32   ApEntryPoint;
>>>   EFI_GUID SevEsFooterGuid;
>>>   UINT16   Size;
>>
>> It's probably better to reverse the order of "Size" and
>> "SevEsFooterGuid", like this:
>>
>>   UINT32   ApEntryPoint;
>>   UINT16   Size;
>>   EFI_GUID SevEsFooterGuid;
>>
>> because then even the "Size" field can be changed (or resized), as a
>> function of the footer GUID.
> 
> Cool, I'll look into doing this and see how it works out.

Just an update on this idea. This has worked out well, but has a couple of
caveats. Removing the Qemu change to make the flash mapped read-only in
the nested page tables, caused the following:

1. QemuFlashDetected() will attempt to detect how the flash memory device
   behaves. Because it is marked as read-only by the hypervisor, writing
   to the area results in a #NPF for the write-fault. With SEV-ES,
   emulation of the instruction can't be performed (can't read guest
   memory and not provided the faulting instruction bytes), so the vCPU is
   just restarted. This results in an infinite #NPF occurring.

   The solution here was to check for SEV-ES being enabled and just return
   false from QemuFlashDetected(). Any downfalls to doing that?

2. Commit 2db0ccc2d7fe ("UefiCpuPkg: Update CpuExceptionHandlerLib pass
   XCODE5 tool chain") causes a similar situation to #1. It attempts to do
   some address fixups and write to the flash device.

   Reverting that commit fixes the issue. I don't think that will be an
   acceptable solution, though, so need to think about what to do here.

After those two changes, the above method works well.

Thanks,
Tom

> 
> Thanks!
> Tom
> 
>>
>> Thanks
>> Laszlo
>>

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48762): https://edk2.groups.io/g/devel/message/48762
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Andrew Fish via Groups.Io 5 years, 11 months ago

> On Oct 10, 2019, at 4:17 PM, Lendacky, Thomas <thomas.lendacky@amd.com> wrote:
> 
> On 10/3/19 10:12 AM, Tom Lendacky wrote:
>> 
>> 
>> On 10/3/19 5:32 AM, Laszlo Ersek wrote:
>>> On 10/03/19 12:12, Laszlo Ersek wrote:
>>> 
>>>>  UINT32   ApEntryPoint;
>>>>  EFI_GUID SevEsFooterGuid;
>>>>  UINT16   Size;
>>> 
>>> It's probably better to reverse the order of "Size" and
>>> "SevEsFooterGuid", like this:
>>> 
>>>  UINT32   ApEntryPoint;
>>>  UINT16   Size;
>>>  EFI_GUID SevEsFooterGuid;
>>> 
>>> because then even the "Size" field can be changed (or resized), as a
>>> function of the footer GUID.
>> 
>> Cool, I'll look into doing this and see how it works out.
> 
> Just an update on this idea. This has worked out well, but has a couple of
> caveats. Removing the Qemu change to make the flash mapped read-only in
> the nested page tables, caused the following:
> 
> 1. QemuFlashDetected() will attempt to detect how the flash memory device
>   behaves. Because it is marked as read-only by the hypervisor, writing
>   to the area results in a #NPF for the write-fault. With SEV-ES,
>   emulation of the instruction can't be performed (can't read guest
>   memory and not provided the faulting instruction bytes), so the vCPU is
>   just restarted. This results in an infinite #NPF occurring.
> 
>   The solution here was to check for SEV-ES being enabled and just return
>   false from QemuFlashDetected(). Any downfalls to doing that?
> 
> 2. Commit 2db0ccc2d7fe ("UefiCpuPkg: Update CpuExceptionHandlerLib pass
>   XCODE5 tool chain") causes a similar situation to #1. It attempts to do
>   some address fixups and write to the flash device.
> 
>   Reverting that commit fixes the issue. I don't think that will be an
>   acceptable solution, though, so need to think about what to do here.
> 

Did you fill a bugzilla for 2)?

Thanks,

Andrew Fish

> After those two changes, the above method works well.
> 
> Thanks,
> Tom
> 
>> 
>> Thanks!
>> Tom
>> 
>>> 
>>> Thanks
>>> Laszlo
>>> 
> 
> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48766): https://edk2.groups.io/g/devel/message/48766
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Laszlo Ersek 5 years, 11 months ago
On 10/11/19 01:17, Lendacky, Thomas wrote:
> On 10/3/19 10:12 AM, Tom Lendacky wrote:
>>
>>
>> On 10/3/19 5:32 AM, Laszlo Ersek wrote:
>>> On 10/03/19 12:12, Laszlo Ersek wrote:
>>>
>>>>   UINT32   ApEntryPoint;
>>>>   EFI_GUID SevEsFooterGuid;
>>>>   UINT16   Size;
>>>
>>> It's probably better to reverse the order of "Size" and
>>> "SevEsFooterGuid", like this:
>>>
>>>   UINT32   ApEntryPoint;
>>>   UINT16   Size;
>>>   EFI_GUID SevEsFooterGuid;
>>>
>>> because then even the "Size" field can be changed (or resized), as a
>>> function of the footer GUID.
>>
>> Cool, I'll look into doing this and see how it works out.
> 
> Just an update on this idea. This has worked out well, but has a couple of
> caveats. Removing the Qemu change to make the flash mapped read-only in
> the nested page tables, caused the following:
> 
> 1. QemuFlashDetected() will attempt to detect how the flash memory device
>    behaves. Because it is marked as read-only by the hypervisor, writing
>    to the area results in a #NPF for the write-fault. With SEV-ES,
>    emulation of the instruction can't be performed (can't read guest
>    memory and not provided the faulting instruction bytes), so the vCPU is
>    just restarted. This results in an infinite #NPF occurring.
> 
>    The solution here was to check for SEV-ES being enabled and just return
>    false from QemuFlashDetected(). Any downfalls to doing that?

Short-circuiting QemuFlashDetected() on SEV-ES seems appropriate.

However, I don't understand why you return FALSE in that case. You
should return TRUE. If QemuFlashDetected() returns FALSE, then the UEFI
variable store will not be backed by the real pflash chip, it will be
emulated with an \NvVars file on the EFI system partition. That
emulation should really not be used nowadays.

So IMO the right approach here is:
- declare that SEV-ES only targets the "two pflash chips" setup
- return TRUE from QemuFlashDetected() when SEV-ES is on.

> 
> 2. Commit 2db0ccc2d7fe ("UefiCpuPkg: Update CpuExceptionHandlerLib pass
>    XCODE5 tool chain") causes a similar situation to #1. It attempts to do
>    some address fixups and write to the flash device.

That's... stunning.

Commit 2db0ccc2d7fe changes the file

  UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nasm

such that it does in-place binary patching.

This source file is referenced from:

  UefiCpuPkg/Library/CpuExceptionHandlerLib/SecPeiCpuExceptionHandlerLib.inf

as well. Note "SecPei".

That makes the commit buggy, to my eyes, regardless of SEV-ES. Because:

The binary patching appears to occur in the SEC phase as well, i.e. at a
time when the exception handler is located in flash. That's incorrect on
physical hardware too.

Upon re-reading <https://bugzilla.tianocore.org/show_bug.cgi?id=849>,
this commit worked around an XCODE toolchain bug.

Unfortunately, the workaround is not suitable for the SEC phase. (Also
not suitable for the PEI phase, for such PEIMs that still execute from
flash.)

Please open a new bug for UefiCpuPkg in the TianoCore Bugzilla,
reference BZ#849 in the See Also field, and please also make the new bug
block BZ#2198.

(I'll comment on this issue in a different thread too; I'll CC you on it.)

>    Reverting that commit fixes the issue. I don't think that will be an
>    acceptable solution, though, so need to think about what to do here.
> 
> After those two changes, the above method works well.

I'm happy to hear!

Thanks,
Laszlo

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48807): https://edk2.groups.io/g/devel/message/48807
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Andrew Fish via Groups.Io 5 years, 11 months ago
Laszlo,

For 2) this  is very unfortunate. I think the root cause is for those of us who work on x86 hardware day to day we get programed that SEC/PEI is IA32 and DXE is X64, and this can lead to some unfortunate coding outcomes. 

I'm guessing this code probably got ported from the DXE CPU driver or some other place that had no XIP assumptions. One option vs. patching is putting the relocations in the .data section. The only issue with that could be the need to align sections on page boundaries and that may take up too much space in XIP code. Perhaps we could only require the .data section relocations for XCODE, and map them to .text for the other toolchain? 

Thanks,

Andrew Fish

> On Oct 11, 2019, at 1:56 AM, Laszlo Ersek <lersek@redhat.com> wrote:
> 
> On 10/11/19 01:17, Lendacky, Thomas wrote:
>> On 10/3/19 10:12 AM, Tom Lendacky wrote:
>>> 
>>> 
>>> On 10/3/19 5:32 AM, Laszlo Ersek wrote:
>>>> On 10/03/19 12:12, Laszlo Ersek wrote:
>>>> 
>>>>>  UINT32   ApEntryPoint;
>>>>>  EFI_GUID SevEsFooterGuid;
>>>>>  UINT16   Size;
>>>> 
>>>> It's probably better to reverse the order of "Size" and
>>>> "SevEsFooterGuid", like this:
>>>> 
>>>>  UINT32   ApEntryPoint;
>>>>  UINT16   Size;
>>>>  EFI_GUID SevEsFooterGuid;
>>>> 
>>>> because then even the "Size" field can be changed (or resized), as a
>>>> function of the footer GUID.
>>> 
>>> Cool, I'll look into doing this and see how it works out.
>> 
>> Just an update on this idea. This has worked out well, but has a couple of
>> caveats. Removing the Qemu change to make the flash mapped read-only in
>> the nested page tables, caused the following:
>> 
>> 1. QemuFlashDetected() will attempt to detect how the flash memory device
>>   behaves. Because it is marked as read-only by the hypervisor, writing
>>   to the area results in a #NPF for the write-fault. With SEV-ES,
>>   emulation of the instruction can't be performed (can't read guest
>>   memory and not provided the faulting instruction bytes), so the vCPU is
>>   just restarted. This results in an infinite #NPF occurring.
>> 
>>   The solution here was to check for SEV-ES being enabled and just return
>>   false from QemuFlashDetected(). Any downfalls to doing that?
> 
> Short-circuiting QemuFlashDetected() on SEV-ES seems appropriate.
> 
> However, I don't understand why you return FALSE in that case. You
> should return TRUE. If QemuFlashDetected() returns FALSE, then the UEFI
> variable store will not be backed by the real pflash chip, it will be
> emulated with an \NvVars file on the EFI system partition. That
> emulation should really not be used nowadays.
> 
> So IMO the right approach here is:
> - declare that SEV-ES only targets the "two pflash chips" setup
> - return TRUE from QemuFlashDetected() when SEV-ES is on.
> 
>> 
>> 2. Commit 2db0ccc2d7fe ("UefiCpuPkg: Update CpuExceptionHandlerLib pass
>>   XCODE5 tool chain") causes a similar situation to #1. It attempts to do
>>   some address fixups and write to the flash device.
> 
> That's... stunning.
> 
> Commit 2db0ccc2d7fe changes the file
> 
>  UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nasm
> 
> such that it does in-place binary patching.
> 
> This source file is referenced from:
> 
>  UefiCpuPkg/Library/CpuExceptionHandlerLib/SecPeiCpuExceptionHandlerLib.inf
> 
> as well. Note "SecPei".
> 
> That makes the commit buggy, to my eyes, regardless of SEV-ES. Because:
> 
> The binary patching appears to occur in the SEC phase as well, i.e. at a
> time when the exception handler is located in flash. That's incorrect on
> physical hardware too.
> 
> Upon re-reading <https://bugzilla.tianocore.org/show_bug.cgi?id=849 <https://bugzilla.tianocore.org/show_bug.cgi?id=849>>,
> this commit worked around an XCODE toolchain bug.
> 
> Unfortunately, the workaround is not suitable for the SEC phase. (Also
> not suitable for the PEI phase, for such PEIMs that still execute from
> flash.)
> 
> Please open a new bug for UefiCpuPkg in the TianoCore Bugzilla,
> reference BZ#849 in the See Also field, and please also make the new bug
> block BZ#2198.
> 
> (I'll comment on this issue in a different thread too; I'll CC you on it.)
> 
>>   Reverting that commit fixes the issue. I don't think that will be an
>>   acceptable solution, though, so need to think about what to do here.
>> 
>> After those two changes, the above method works well.
> 
> I'm happy to hear!
> 
> Thanks,
> Laszlo
> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48861): https://edk2.groups.io/g/devel/message/48861
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Liming Gao 5 years, 11 months ago
Andrew:
  Can you give more detail on how to update nasm source code to put the 64bit absolute address from .text section to .data section? I will verify it. Now, the patching way doesn't support X64 SEC/PEI. This is a gab in XCODE tool chain.

Thanks
Liming
From: devel@edk2.groups.io [mailto:devel@edk2.groups.io] On Behalf Of Andrew Fish via Groups.Io
Sent: Saturday, October 12, 2019 2:43 PM
To: devel@edk2.groups.io; Laszlo Ersek <lersek@redhat.com>
Cc: Lendacky, Thomas <Thomas.Lendacky@amd.com>; Justen, Jordan L <jordan.l.justen@intel.com>; Ard Biesheuvel <ard.biesheuvel@linaro.org>; Kinney, Michael D <michael.d.kinney@intel.com>; Gao, Liming <liming.gao@intel.com>; Dong, Eric <eric.dong@intel.com>; Ni, Ray <ray.ni@intel.com>; Singh, Brijesh <brijesh.singh@amd.com>; Philippe Mathieu-Daudé <philmd@redhat.com>
Subject: Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES

Laszlo,

For 2) this  is very unfortunate. I think the root cause is for those of us who work on x86 hardware day to day we get programed that SEC/PEI is IA32 and DXE is X64, and this can lead to some unfortunate coding outcomes.

I'm guessing this code probably got ported from the DXE CPU driver or some other place that had no XIP assumptions. One option vs. patching is putting the relocations in the .data section. The only issue with that could be the need to align sections on page boundaries and that may take up too much space in XIP code. Perhaps we could only require the .data section relocations for XCODE, and map them to .text for the other toolchain?

Thanks,

Andrew Fish


On Oct 11, 2019, at 1:56 AM, Laszlo Ersek <lersek@redhat.com<mailto:lersek@redhat.com>> wrote:

On 10/11/19 01:17, Lendacky, Thomas wrote:

On 10/3/19 10:12 AM, Tom Lendacky wrote:



On 10/3/19 5:32 AM, Laszlo Ersek wrote:

On 10/03/19 12:12, Laszlo Ersek wrote:


 UINT32   ApEntryPoint;
 EFI_GUID SevEsFooterGuid;
 UINT16   Size;

It's probably better to reverse the order of "Size" and
"SevEsFooterGuid", like this:

 UINT32   ApEntryPoint;
 UINT16   Size;
 EFI_GUID SevEsFooterGuid;

because then even the "Size" field can be changed (or resized), as a
function of the footer GUID.

Cool, I'll look into doing this and see how it works out.

Just an update on this idea. This has worked out well, but has a couple of
caveats. Removing the Qemu change to make the flash mapped read-only in
the nested page tables, caused the following:

1. QemuFlashDetected() will attempt to detect how the flash memory device
  behaves. Because it is marked as read-only by the hypervisor, writing
  to the area results in a #NPF for the write-fault. With SEV-ES,
  emulation of the instruction can't be performed (can't read guest
  memory and not provided the faulting instruction bytes), so the vCPU is
  just restarted. This results in an infinite #NPF occurring.

  The solution here was to check for SEV-ES being enabled and just return
  false from QemuFlashDetected(). Any downfalls to doing that?

Short-circuiting QemuFlashDetected() on SEV-ES seems appropriate.

However, I don't understand why you return FALSE in that case. You
should return TRUE. If QemuFlashDetected() returns FALSE, then the UEFI
variable store will not be backed by the real pflash chip, it will be
emulated with an \NvVars file on the EFI system partition. That
emulation should really not be used nowadays.

So IMO the right approach here is:
- declare that SEV-ES only targets the "two pflash chips" setup
- return TRUE from QemuFlashDetected() when SEV-ES is on.



2. Commit 2db0ccc2d7fe ("UefiCpuPkg: Update CpuExceptionHandlerLib pass
  XCODE5 tool chain") causes a similar situation to #1. It attempts to do
  some address fixups and write to the flash device.

That's... stunning.

Commit 2db0ccc2d7fe changes the file

 UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nasm

such that it does in-place binary patching.

This source file is referenced from:

 UefiCpuPkg/Library/CpuExceptionHandlerLib/SecPeiCpuExceptionHandlerLib.inf

as well. Note "SecPei".

That makes the commit buggy, to my eyes, regardless of SEV-ES. Because:

The binary patching appears to occur in the SEC phase as well, i.e. at a
time when the exception handler is located in flash. That's incorrect on
physical hardware too.

Upon re-reading <https://bugzilla.tianocore.org/show_bug.cgi?id=849>,
this commit worked around an XCODE toolchain bug.

Unfortunately, the workaround is not suitable for the SEC phase. (Also
not suitable for the PEI phase, for such PEIMs that still execute from
flash.)

Please open a new bug for UefiCpuPkg in the TianoCore Bugzilla,
reference BZ#849 in the See Also field, and please also make the new bug
block BZ#2198.

(I'll comment on this issue in a different thread too; I'll CC you on it.)


  Reverting that commit fixes the issue. I don't think that will be an
  acceptable solution, though, so need to think about what to do here.

After those two changes, the above method works well.

I'm happy to hear!

Thanks,
Laszlo





-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48864): https://edk2.groups.io/g/devel/message/48864
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Andrew Fish via Groups.Io 5 years, 11 months ago
Liming,

Here is a simple example of a global with absolute function address in it. 

1) The text section uses %rip relative addressing. So the 1st goto technique is to convert absolute addressing to PC relative addressing if possible. 
2) The data section can contain absolute addresses. The data section is read/write. As you can see .quad can have a pointer to an absolute address (that would require a relocation).
3) The text section can access data section via PC relative addressing. 
4) While the code looks like it is located together the data section is going to follow the text section and get aligned to section alignment. So in my simple example the data section is 4K from the start of the text section. 
5) If all else fails the assembler will let you put code in the data section, and that code can have relocations, but see 4). 

~/work/Compiler>cat relocation.c
int main();

void *gRelocation = (void *)main;

int main ()
{
  return (int)(unsigned long long)gRelocation;
}
~/work/Compiler>clang -S -Os relocation.c
~/work/Compiler>cat relocation.S
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_main                   ## -- Begin function main
_main:                                  ## @main
	pushq	%rbp
	movq	%rsp, %rbp
	movl	_gRelocation(%rip), %eax
	popq	%rbp
	retq
                                        ## -- End function

	.section	__DATA,__data
	.globl	_gRelocation            ## @gRelocation
	.p2align	3
_gRelocation:
	.quad	_main


.subsections_via_symbols

If you have questions about a specific chunk of code to convert let me know. 

Thanks,

Andrew Fish

> On Oct 12, 2019, at 12:46 AM, Liming Gao <liming.gao@intel.com> wrote:
> 
> Andrew:
>   Can you give more detail on how to update nasm source code to put the 64bit absolute address from .text section to .data section? I will verify it. Now, the patching way doesn’t support X64 SEC/PEI. This is a gab in XCODE tool chain. 
>
> Thanks
> Liming
> From: devel@edk2.groups.io <mailto:devel@edk2.groups.io> [mailto:devel@edk2.groups.io <mailto:devel@edk2.groups.io>] On Behalf Of Andrew Fish via Groups.Io
> Sent: Saturday, October 12, 2019 2:43 PM
> To: devel@edk2.groups.io <mailto:devel@edk2.groups.io>; Laszlo Ersek <lersek@redhat.com <mailto:lersek@redhat.com>>
> Cc: Lendacky, Thomas <Thomas.Lendacky@amd.com <mailto:Thomas.Lendacky@amd.com>>; Justen, Jordan L <jordan.l.justen@intel.com <mailto:jordan.l.justen@intel.com>>; Ard Biesheuvel <ard.biesheuvel@linaro.org <mailto:ard.biesheuvel@linaro.org>>; Kinney, Michael D <michael.d.kinney@intel.com <mailto:michael.d.kinney@intel.com>>; Gao, Liming <liming.gao@intel.com <mailto:liming.gao@intel.com>>; Dong, Eric <eric.dong@intel.com <mailto:eric.dong@intel.com>>; Ni, Ray <ray.ni@intel.com <mailto:ray.ni@intel.com>>; Singh, Brijesh <brijesh.singh@amd.com <mailto:brijesh.singh@amd.com>>; Philippe Mathieu-Daudé <philmd@redhat.com <mailto:philmd@redhat.com>>
> Subject: Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
>
> Laszlo,
>
> For 2) this  is very unfortunate. I think the root cause is for those of us who work on x86 hardware day to day we get programed that SEC/PEI is IA32 and DXE is X64, and this can lead to some unfortunate coding outcomes. 
>
> I'm guessing this code probably got ported from the DXE CPU driver or some other place that had no XIP assumptions. One option vs. patching is putting the relocations in the .data section. The only issue with that could be the need to align sections on page boundaries and that may take up too much space in XIP code. Perhaps we could only require the .data section relocations for XCODE, and map them to .text for the other toolchain? 
>
> Thanks,
>
> Andrew Fish
> 
> 
> On Oct 11, 2019, at 1:56 AM, Laszlo Ersek <lersek@redhat.com <mailto:lersek@redhat.com>> wrote:
>
> On 10/11/19 01:17, Lendacky, Thomas wrote:
> 
> On 10/3/19 10:12 AM, Tom Lendacky wrote:
> 
> 
> 
> On 10/3/19 5:32 AM, Laszlo Ersek wrote:
> 
> On 10/03/19 12:12, Laszlo Ersek wrote:
> 
> 
>  UINT32   ApEntryPoint;
>  EFI_GUID SevEsFooterGuid;
>  UINT16   Size;
> 
> It's probably better to reverse the order of "Size" and
> "SevEsFooterGuid", like this:
> 
>  UINT32   ApEntryPoint;
>  UINT16   Size;
>  EFI_GUID SevEsFooterGuid;
> 
> because then even the "Size" field can be changed (or resized), as a
> function of the footer GUID.
> 
> Cool, I'll look into doing this and see how it works out.
> 
> Just an update on this idea. This has worked out well, but has a couple of
> caveats. Removing the Qemu change to make the flash mapped read-only in
> the nested page tables, caused the following:
> 
> 1. QemuFlashDetected() will attempt to detect how the flash memory device
>   behaves. Because it is marked as read-only by the hypervisor, writing
>   to the area results in a #NPF for the write-fault. With SEV-ES,
>   emulation of the instruction can't be performed (can't read guest
>   memory and not provided the faulting instruction bytes), so the vCPU is
>   just restarted. This results in an infinite #NPF occurring.
> 
>   The solution here was to check for SEV-ES being enabled and just return
>   false from QemuFlashDetected(). Any downfalls to doing that?
> 
> Short-circuiting QemuFlashDetected() on SEV-ES seems appropriate.
> 
> However, I don't understand why you return FALSE in that case. You
> should return TRUE. If QemuFlashDetected() returns FALSE, then the UEFI
> variable store will not be backed by the real pflash chip, it will be
> emulated with an \NvVars file on the EFI system partition. That
> emulation should really not be used nowadays.
> 
> So IMO the right approach here is:
> - declare that SEV-ES only targets the "two pflash chips" setup
> - return TRUE from QemuFlashDetected() when SEV-ES is on.
> 
> 
> 
> 2. Commit 2db0ccc2d7fe ("UefiCpuPkg: Update CpuExceptionHandlerLib pass
>   XCODE5 tool chain") causes a similar situation to #1. It attempts to do
>   some address fixups and write to the flash device.
> 
> That's... stunning.
> 
> Commit 2db0ccc2d7fe changes the file
> 
>  UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ExceptionHandlerAsm.nasm
> 
> such that it does in-place binary patching.
> 
> This source file is referenced from:
> 
>  UefiCpuPkg/Library/CpuExceptionHandlerLib/SecPeiCpuExceptionHandlerLib.inf
> 
> as well. Note "SecPei".
> 
> That makes the commit buggy, to my eyes, regardless of SEV-ES. Because:
> 
> The binary patching appears to occur in the SEC phase as well, i.e. at a
> time when the exception handler is located in flash. That's incorrect on
> physical hardware too.
> 
> Upon re-reading <https://bugzilla.tianocore.org/show_bug.cgi?id=849 <https://bugzilla.tianocore.org/show_bug.cgi?id=849>>,
> this commit worked around an XCODE toolchain bug.
> 
> Unfortunately, the workaround is not suitable for the SEC phase. (Also
> not suitable for the PEI phase, for such PEIMs that still execute from
> flash.)
> 
> Please open a new bug for UefiCpuPkg in the TianoCore Bugzilla,
> reference BZ#849 in the See Also field, and please also make the new bug
> block BZ#2198.
> 
> (I'll comment on this issue in a different thread too; I'll CC you on it.)
> 
> 
>   Reverting that commit fixes the issue. I don't think that will be an
>   acceptable solution, though, so need to think about what to do here.
> 
> After those two changes, the above method works well.
> 
> I'm happy to hear!
> 
> Thanks,
> Laszlo
> 
> 
>
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48870): https://edk2.groups.io/g/devel/message/48870
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Laszlo Ersek 5 years, 11 months ago
On 10/12/19 08:42, Andrew Fish wrote:
> Laszlo,
>
> For 2) this  is very unfortunate. I think the root cause is for those
> of us who work on x86 hardware day to day we get programed that
> SEC/PEI is IA32 and DXE is X64, and this can lead to some unfortunate
> coding outcomes.

First I was confused by this; I didn't understand why the "bitness" of
SEC/PEI mattered here.

But, of course, you are right: if SEC/PEI are 32-bit, then the
problematic relocations are never generated by the compiler in the first
place.

> I'm guessing this code probably got ported from the DXE CPU driver or
> some other place that had no XIP assumptions. One option vs. patching
> is putting the relocations in the .data section. The only issue with
> that could be the need to align sections on page boundaries and that
> may take up too much space in XIP code. Perhaps we could only require
> the .data section relocations for XCODE, and map them to .text for
> the other toolchain?

In SEC, all sections of the binary are located in flash, aren't they?
Why would it help if the relocations were placed in .data? I'm reminded
of global variables in SEC, and those are not writable in SEC either.
(At least on QEMU.) I'm uncertain if this is somehow connected to
Cache-as-RAM, but if it is: QEMU does not support Cache-as-RAM.

Thanks
Laszlo

-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48901): https://edk2.groups.io/g/devel/message/48901
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-

Re: [edk2-devel] [RFC PATCH v2 38/44] UefiCpuPkg: Allow AP booting under SEV-ES
Posted by Andrew Fish via Groups.Io 5 years, 11 months ago

> On Oct 14, 2019, at 6:11 AM, Laszlo Ersek <lersek@redhat.com> wrote:
> 
> On 10/12/19 08:42, Andrew Fish wrote:
>> Laszlo,
>> 
>> For 2) this  is very unfortunate. I think the root cause is for those
>> of us who work on x86 hardware day to day we get programed that
>> SEC/PEI is IA32 and DXE is X64, and this can lead to some unfortunate
>> coding outcomes.
> 
> First I was confused by this; I didn't understand why the "bitness" of
> SEC/PEI mattered here.
> 
> But, of course, you are right: if SEC/PEI are 32-bit, then the
> problematic relocations are never generated by the compiler in the first
> place.
> 

Laszlo,

The other "human" problem is people assume DXE always runs from memory so they think that patching is OK. I've even see people use IA32 to mean PEI on X64 platforms. So the implementation (32-bit PEI) leaks into the architecture. 

>> I'm guessing this code probably got ported from the DXE CPU driver or
>> some other place that had no XIP assumptions. One option vs. patching
>> is putting the relocations in the .data section. The only issue with
>> that could be the need to align sections on page boundaries and that
>> may take up too much space in XIP code. Perhaps we could only require
>> the .data section relocations for XCODE, and map them to .text for
>> the other toolchain?
> 
> In SEC, all sections of the binary are located in flash, aren't they?

Yes. 

> Why would it help if the relocations were placed in .data?

Sorry if I was unclear. The Xcode linker (ld64) enforces a macOS x86_64 ABI that makes it so even the image loader can't write the text section.  Thus you can't link an image that has relocations in the text section, and there is no flag to ld64 to turn this behavior off.  So the workaround is to generate code like the compiler and don't have any relocations in the text section.  Moving the relocation from the text to the data section is needed to make the Xcode linker (ld64) link the code. if there are relocations in the text section then the link will fail on Xcode builds. 

When you are writing assembly code there is no restriction about placing code in the data section. Generally we try to fix issues by converting the code to use PC relative addressing, but a quick and dirty fix is to change the code over to the data section.  The cleaner solution is to have a global in the data section that contains the relocatable address, and that is like the simple example of the C code global I sent out. 

> I'm reminded
> of global variables in SEC, and those are not writable in SEC either.
> (At least on QEMU.) I'm uncertain if this is somehow connected to
> Cache-as-RAM, but if it is: QEMU does not support Cache-as-RAM.
> 

For SEC the relocations would have been applied at build time. The SEC will have been linked at around zero, and when it was placed in the FV it would have been relocated to its XIP location. For code running from memory the EFI PE/COFF loader would apply the relocations as part of loading the image.  But non of the above works if your build fails due to a linker error :(. 

Thanks,

Andrew Fish


> Thanks
> Laszlo


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.

View/Reply Online (#48935): https://edk2.groups.io/g/devel/message/48935
Mute This Topic: https://groups.io/mt/34203585/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub  [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-