RFC: https://bugzilla.tianocore.org/show_bug.cgi?id=3429
When host VMM create the Td guest, the system memory informations are
stored in TdHob, which is a memory region described in Tdx metadata.
The system memory region in TdHob should be accepted before it can be
accessed. So the major task of this patch set is to process the TdHobList
to accept the memory. After that TDVF follow the standard OVMF flow
and jump to PEI phase.
2 PCDs are added for page accepting. They're the default settings of the
chunk size and the Accept page size.
4 Tdx specific libs are used by OvmfPkgX64:
- VmTdExitLib
- TdxLib
- TdxProbeLib
- TdxMailboxLib
TDX only works on X64, so the code is only valid in X64 arch.
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Min Xu <min.m.xu@intel.com>
---
OvmfPkg/OvmfPkg.dec | 6 +
OvmfPkg/OvmfPkgIa32.dsc | 3 +
OvmfPkg/OvmfPkgIa32X64.dsc | 3 +
OvmfPkg/OvmfPkgX64.dsc | 5 +-
OvmfPkg/Sec/IntelTdx.c | 608 +++++++++++++++++++++++++++++++++++++
OvmfPkg/Sec/IntelTdx.h | 33 ++
OvmfPkg/Sec/SecMain.c | 43 ++-
OvmfPkg/Sec/SecMain.inf | 6 +
8 files changed, 700 insertions(+), 7 deletions(-)
create mode 100644 OvmfPkg/Sec/IntelTdx.c
create mode 100644 OvmfPkg/Sec/IntelTdx.h
diff --git a/OvmfPkg/OvmfPkg.dec b/OvmfPkg/OvmfPkg.dec
index d478a4b8e418..a9765f2a60be 100644
--- a/OvmfPkg/OvmfPkg.dec
+++ b/OvmfPkg/OvmfPkg.dec
@@ -349,6 +349,12 @@
## Ignore the VE halt in Tdx
gUefiOvmfPkgTokenSpaceGuid.PcdIgnoreVeHalt|FALSE|BOOLEAN|0x50
+ ## The chunk size of Tdx accept page
+ gUefiOvmfPkgTokenSpaceGuid.PcdTdxAcceptChunkSize|0x2000000|UINT64|0x51
+
+ ## The Tdx accept page size. 0x1000(4k),0x200000(2M)
+ gUefiOvmfPkgTokenSpaceGuid.PcdTdxAcceptPageSize|0x1000|UINT64|0x52
+
[PcdsDynamic, PcdsDynamicEx]
gUefiOvmfPkgTokenSpaceGuid.PcdEmuVariableEvent|0|UINT64|2
gUefiOvmfPkgTokenSpaceGuid.PcdOvmfFlashVariablesEnable|FALSE|BOOLEAN|0x10
diff --git a/OvmfPkg/OvmfPkgIa32.dsc b/OvmfPkg/OvmfPkgIa32.dsc
index 1e1765060e3f..27c70d233c68 100644
--- a/OvmfPkg/OvmfPkgIa32.dsc
+++ b/OvmfPkg/OvmfPkgIa32.dsc
@@ -244,6 +244,9 @@
BaseCryptLib|CryptoPkg/Library/BaseCryptLib/BaseCryptLib.inf
VmgExitLib|UefiCpuPkg/Library/VmgExitLibNull/VmgExitLibNull.inf
VmTdExitLib|UefiCpuPkg/Library/VmTdExitLibNull/VmTdExitLibNull.inf
+ TdxLib|MdePkg/Library/TdxLib/TdxLib.inf
+ TdxProbeLib|MdePkg/Library/TdxProbeLib/TdxProbeLib.inf
+ TdxMailboxLib|OvmfPkg/Library/TdxMailboxLib/TdxMailboxLib.inf
[LibraryClasses.common.SEC]
TimerLib|OvmfPkg/Library/AcpiTimerLib/BaseRomAcpiTimerLib.inf
diff --git a/OvmfPkg/OvmfPkgIa32X64.dsc b/OvmfPkg/OvmfPkgIa32X64.dsc
index 2eadb51683c9..54c664382ef1 100644
--- a/OvmfPkg/OvmfPkgIa32X64.dsc
+++ b/OvmfPkg/OvmfPkgIa32X64.dsc
@@ -248,6 +248,9 @@
BaseCryptLib|CryptoPkg/Library/BaseCryptLib/BaseCryptLib.inf
VmgExitLib|UefiCpuPkg/Library/VmgExitLibNull/VmgExitLibNull.inf
VmTdExitLib|UefiCpuPkg/Library/VmTdExitLibNull/VmTdExitLibNull.inf
+ TdxLib|MdePkg/Library/TdxLib/TdxLib.inf
+ TdxProbeLib|MdePkg/Library/TdxProbeLib/TdxProbeLib.inf
+ TdxMailboxLib|OvmfPkg/Library/TdxMailboxLib/TdxMailboxLib.inf
[LibraryClasses.common.SEC]
TimerLib|OvmfPkg/Library/AcpiTimerLib/BaseRomAcpiTimerLib.inf
diff --git a/OvmfPkg/OvmfPkgX64.dsc b/OvmfPkg/OvmfPkgX64.dsc
index 3376c4f4c2f9..b2e07233ebd6 100644
--- a/OvmfPkg/OvmfPkgX64.dsc
+++ b/OvmfPkg/OvmfPkgX64.dsc
@@ -247,7 +247,10 @@
[LibraryClasses.common]
BaseCryptLib|CryptoPkg/Library/BaseCryptLib/BaseCryptLib.inf
VmgExitLib|OvmfPkg/Library/VmgExitLib/VmgExitLib.inf
- VmTdExitLib|UefiCpuPkg/Library/VmTdExitLibNull/VmTdExitLibNull.inf
+ VmTdExitLib|OvmfPkg/Library/VmTdExitLib/VmTdExitLib.inf
+ TdxLib|MdePkg/Library/TdxLib/TdxLib.inf
+ TdxProbeLib|MdePkg/Library/TdxProbeLib/TdxProbeLib.inf
+ TdxMailboxLib|OvmfPkg/Library/TdxMailboxLib/TdxMailboxLib.inf
[LibraryClasses.common.SEC]
TimerLib|OvmfPkg/Library/AcpiTimerLib/BaseRomAcpiTimerLib.inf
diff --git a/OvmfPkg/Sec/IntelTdx.c b/OvmfPkg/Sec/IntelTdx.c
new file mode 100644
index 000000000000..e4bbd0fdea4e
--- /dev/null
+++ b/OvmfPkg/Sec/IntelTdx.c
@@ -0,0 +1,608 @@
+/** @file
+
+ Copyright (c) 2008, Intel Corporation. All rights reserved.<BR>
+ (C) Copyright 2016 Hewlett Packard Enterprise Development LP<BR>
+
+ SPDX-License-Identifier: BSD-2-Clause-Patent
+
+**/
+
+#include <PiPei.h>
+#include <Uefi/UefiSpec.h>
+#include <Uefi/UefiBaseType.h>
+#include <Library/BaseLib.h>
+#include <Library/DebugLib.h>
+#include <Library/HobLib.h>
+#include <Library/BaseMemoryLib.h>
+#include <IndustryStandard/UefiTcgPlatform.h>
+#include <Library/MemoryAllocationLib.h>
+#include <Library/SynchronizationLib.h>
+#include <Library/TdxLib.h>
+#include <Library/TdxMailboxLib.h>
+#include <IndustryStandard/Tdx.h>
+#include <IndustryStandard/IntelTdx.h>
+#include "IntelTdx.h"
+
+#define ALIGNED_2MB_MASK 0x1fffff
+
+/**
+ BSP call this function to accept memory in a range.
+
+ @param[in] StartAddress Start address of the memory region
+ @param[in] Length Length of the memory region
+ @param[in] AcceptChunkSize Accept chunk size
+ @param[in] AcceptPageSize Accept page size
+ @retval EFI_SUCCESS Successfully accept the memory region
+ @retval Others Indicate the other errors
+**/
+EFI_STATUS
+EFIAPI
+BspAcceptMemoryResourceRange (
+ IN EFI_PHYSICAL_ADDRESS StartAddress,
+ IN UINT64 Length,
+ IN UINT64 AcceptChunkSize,
+ IN UINT64 AcceptPageSize
+ )
+{
+ EFI_STATUS Status;
+ UINT64 Pages;
+ UINT64 Stride;
+ EFI_PHYSICAL_ADDRESS PhysicalAddress;
+ volatile MP_WAKEUP_MAILBOX *MailBox;
+
+ Status = EFI_SUCCESS;
+ PhysicalAddress = StartAddress;
+ Stride = GetCpusNum () * AcceptChunkSize;
+ MailBox = (volatile MP_WAKEUP_MAILBOX *) GetTdxMailBox ();
+
+ while (!EFI_ERROR(Status) && PhysicalAddress < StartAddress + Length) {
+ //
+ // Decrease size of near end of resource if needed.
+ //
+ Pages = MIN (AcceptChunkSize, StartAddress + Length - PhysicalAddress) / AcceptPageSize;
+
+ MailBox->Tallies[0] += (UINT32)Pages;
+
+ Status = TdAcceptPages (PhysicalAddress, Pages, AcceptPageSize);
+ //
+ // Bump address to next chunk this cpu is responisble for
+ //
+ PhysicalAddress += Stride;
+ }
+
+ return Status;
+}
+
+/**
+ This function will be called to accept pages. BSP and APs are invokded
+ to do the task together.
+
+ TDCALL(ACCEPT_PAGE) supports the accept page size of 4k and 2M. To
+ simplify the implementation, the Memory to be accpeted is splitted
+ into 3 parts:
+ ----------------- <-- StartAddress1 (not 2M aligned)
+ | part 1 | Length1 < 2M
+ |---------------| <-- StartAddress2 (2M aligned)
+ | | Length2 = Integer multiples of 2M
+ | part 2 |
+ | |
+ |---------------| <-- StartAddress3
+ | part 3 | Length3 < 2M
+ |---------------|
+
+ part 1) will be accepted in 4k and by BSP.
+ Part 2) will be accepted in 2M and by BSP/AP.
+ Part 3) will be accepted in 4k and by BSP.
+
+ @param[in] PhysicalAddress Start physical adress
+ @param[in] PhysicalEnd End physical address
+
+ @retval EFI_SUCCESS Accept memory successfully
+ @retval Others Other errors as indicated
+**/
+EFI_STATUS
+EFIAPI
+MpAcceptMemoryResourceRange (
+ IN EFI_PHYSICAL_ADDRESS PhysicalAddress,
+ IN EFI_PHYSICAL_ADDRESS PhysicalEnd
+ )
+{
+ EFI_STATUS Status;
+ UINT64 AcceptChunkSize;
+ UINT64 AcceptPageSize;
+ UINT64 StartAddress1;
+ UINT64 StartAddress2;
+ UINT64 StartAddress3;
+ UINT64 TotalLength;
+ UINT64 Length1;
+ UINT64 Length2;
+ UINT64 Length3;
+ UINT32 Index;
+ UINT32 CpusNum;
+ volatile MP_WAKEUP_MAILBOX *MailBox;
+
+ AcceptChunkSize = FixedPcdGet64 (PcdTdxAcceptChunkSize);
+ AcceptPageSize = FixedPcdGet64 (PcdTdxAcceptPageSize);
+ TotalLength = PhysicalEnd - PhysicalAddress;
+ StartAddress1 = 0;
+ StartAddress2 = 0;
+ StartAddress3 = 0;
+ Length1 = 0;
+ Length2 = 0;
+ Length3 = 0;
+
+ if (AcceptPageSize == SIZE_4KB || TotalLength <= SIZE_2MB) {
+ //
+ // if total length is less than 2M, then we accept pages in 4k
+ //
+ StartAddress1 = 0;
+ Length1 = 0;
+ StartAddress2 = PhysicalAddress;
+ Length2 = PhysicalEnd - PhysicalAddress;
+ StartAddress3 = 0;
+ Length3 = 0;
+ AcceptPageSize = SIZE_4KB;
+ } else if (AcceptPageSize == SIZE_2MB) {
+ //
+ // Total length is bigger than 2M and Page Accept size 2M is supported.
+ //
+ if ((PhysicalAddress & ALIGNED_2MB_MASK) == 0) {
+ //
+ // Start address is 2M aligned
+ //
+ StartAddress1 = 0;
+ Length1 = 0;
+ StartAddress2 = PhysicalAddress;
+ Length2 = TotalLength & ~(UINT64)ALIGNED_2MB_MASK;
+
+ if (TotalLength > Length2) {
+ //
+ // There is remaining part 3)
+ //
+ StartAddress3 = StartAddress2 + Length2;
+ Length3 = TotalLength - Length2;
+ ASSERT (Length3 < SIZE_2MB);
+ }
+ } else {
+ //
+ // Start address is not 2M aligned and total length is bigger than 2M.
+ //
+ StartAddress1 = PhysicalAddress;
+ ASSERT (TotalLength > SIZE_2MB);
+ Length1 = SIZE_2MB - (PhysicalAddress & ALIGNED_2MB_MASK);
+ if (TotalLength - Length1 < SIZE_2MB) {
+ //
+ // The Part 2) length is less than 2MB, so let's accept all the
+ // memory in 4K
+ //
+ Length1 = TotalLength;
+
+ } else {
+ StartAddress2 = PhysicalAddress + Length1;
+ Length2 = (TotalLength - Length1) & ~(UINT64)ALIGNED_2MB_MASK;
+ Length3 = TotalLength - Length1 - Length2;
+ StartAddress3 = Length3 > 0 ? StartAddress2 + Length2 : 0;
+ ASSERT (Length3 < SIZE_2MB);
+ }
+ }
+ }
+
+ DEBUG ((DEBUG_INFO, "TdAccept: 0x%llx - 0x%llx\n", PhysicalAddress, TotalLength));
+ DEBUG ((DEBUG_INFO, " Part1: 0x%llx - 0x%llx\n", StartAddress1, Length1));
+ DEBUG ((DEBUG_INFO, " Part2: 0x%llx - 0x%llx\n", StartAddress2, Length2));
+ DEBUG ((DEBUG_INFO, " Part3: 0x%llx - 0x%llx\n", StartAddress3, Length3));
+ DEBUG ((DEBUG_INFO, " Chunk: 0x%llx, Page : 0x%llx\n", AcceptChunkSize, AcceptPageSize));
+
+ MpSerializeStart ();
+
+ if (Length2 > 0) {
+ MpSendWakeupCommand (
+ MpProtectedModeWakeupCommandAcceptPages,
+ 0,
+ StartAddress2,
+ StartAddress2 + Length2,
+ AcceptChunkSize,
+ AcceptPageSize);
+
+ Status = BspAcceptMemoryResourceRange (
+ StartAddress2,
+ Length2,
+ AcceptChunkSize,
+ AcceptPageSize);
+ ASSERT (!EFI_ERROR (Status));
+ }
+
+ if (Length1 > 0) {
+ Status = BspAcceptMemoryResourceRange (
+ StartAddress1,
+ Length1,
+ AcceptChunkSize,
+ SIZE_4KB);
+ ASSERT (!EFI_ERROR (Status));
+ }
+
+ if (Length3 > 0) {
+ Status = BspAcceptMemoryResourceRange (
+ StartAddress3,
+ Length3,
+ AcceptChunkSize,
+ SIZE_4KB);
+ ASSERT (!EFI_ERROR (Status));
+ }
+
+ MpSerializeEnd ();
+
+ CpusNum = GetCpusNum ();
+ MailBox = (volatile MP_WAKEUP_MAILBOX *) GetTdxMailBox ();
+
+ DEBUG ((DEBUG_INFO, "AcceptPage Tallies:\n"));
+ DEBUG ((DEBUG_INFO, " "));
+ for (Index = 0; Index < CpusNum; Index++) {
+ DEBUG ((DEBUG_INFO, "%8d", MailBox->Tallies[Index]));
+ if (Index % 8 == 7) {
+ DEBUG ((DEBUG_INFO, "\n"));
+ DEBUG ((DEBUG_INFO, " "));
+ }
+ }
+ DEBUG ((DEBUG_INFO, "\n"));
+
+ for (Index = 0; Index < CpusNum; Index++) {
+ if (MailBox->Errors[Index] > 0) {
+ Status = EFI_DEVICE_ERROR;
+ DEBUG ((DEBUG_ERROR, "Error(%d) of CPU-%d when accepting memory\n",
+ MailBox->Errors[Index], Index));
+ }
+ }
+
+ return Status;
+}
+
+/**
+ Dump out the hob list
+
+ @param[in] HobStart Start address of the hob list
+**/
+VOID
+EFIAPI
+DEBUG_HOBLIST (
+ IN CONST VOID *HobStart
+ )
+{
+ EFI_PEI_HOB_POINTERS Hob;
+ Hob.Raw = (UINT8 *) HobStart;
+ //
+ // Parse the HOB list until end of list or matching type is found.
+ //
+ while (!END_OF_HOB_LIST (Hob)) {
+ DEBUG ((DEBUG_INFO, "HOB(%p) : %x %x\n", Hob, Hob.Header->HobType, Hob.Header->HobLength));
+ switch (Hob.Header->HobType) {
+ case EFI_HOB_TYPE_RESOURCE_DESCRIPTOR:
+ DEBUG ((DEBUG_INFO, "\t: %x %x %llx %llx\n",
+ Hob.ResourceDescriptor->ResourceType,
+ Hob.ResourceDescriptor->ResourceAttribute,
+ Hob.ResourceDescriptor->PhysicalStart,
+ Hob.ResourceDescriptor->ResourceLength));
+
+ break;
+ case EFI_HOB_TYPE_MEMORY_ALLOCATION:
+ DEBUG ((DEBUG_INFO, "\t: %llx %llx %x\n",
+ Hob.MemoryAllocation->AllocDescriptor.MemoryBaseAddress,
+ Hob.MemoryAllocation->AllocDescriptor.MemoryLength,
+ Hob.MemoryAllocation->AllocDescriptor.MemoryType));
+ break;
+ default:
+ break;
+ }
+ Hob.Raw = GET_NEXT_HOB (Hob);
+ }
+}
+
+/**
+ Check the value whether in the valid list.
+
+ @param[in] Value A value
+ @param[in] ValidList A pointer to valid list
+ @param[in] ValidListLength Length of valid list
+
+ @retval TRUE The value is in valid list.
+ @retval FALSE The value is not in valid list.
+
+**/
+BOOLEAN
+EFIAPI
+IsInValidList (
+ IN UINT32 Value,
+ IN UINT32 *ValidList,
+ IN UINT32 ValidListLength
+) {
+ UINT32 index;
+
+ if (ValidList == NULL) {
+ return FALSE;
+ }
+
+ for (index = 0; index < ValidListLength; index ++) {
+ if (ValidList[index] == Value) {
+ return TRUE;
+ }
+ }
+
+ return FALSE;
+}
+
+/**
+ Check the integrity of VMM Hob List.
+
+ @param[in] VmmHobList A pointer to Hob List
+
+ @retval TRUE The Hob List is valid.
+ @retval FALSE The Hob List is invalid.
+
+**/
+BOOLEAN
+EFIAPI
+ValidateHobList (
+ IN CONST VOID *VmmHobList
+ )
+{
+ EFI_PEI_HOB_POINTERS Hob;
+ UINT32 EFI_BOOT_MODE_LIST[12] = { BOOT_WITH_FULL_CONFIGURATION,
+ BOOT_WITH_MINIMAL_CONFIGURATION,
+ BOOT_ASSUMING_NO_CONFIGURATION_CHANGES,
+ BOOT_WITH_FULL_CONFIGURATION_PLUS_DIAGNOSTICS,
+ BOOT_WITH_DEFAULT_SETTINGS,
+ BOOT_ON_S4_RESUME,
+ BOOT_ON_S5_RESUME,
+ BOOT_WITH_MFG_MODE_SETTINGS,
+ BOOT_ON_S2_RESUME,
+ BOOT_ON_S3_RESUME,
+ BOOT_ON_FLASH_UPDATE,
+ BOOT_IN_RECOVERY_MODE
+ };
+
+ UINT32 EFI_RESOURCE_TYPE_LIST[8] = { EFI_RESOURCE_SYSTEM_MEMORY,
+ EFI_RESOURCE_MEMORY_MAPPED_IO,
+ EFI_RESOURCE_IO,
+ EFI_RESOURCE_FIRMWARE_DEVICE,
+ EFI_RESOURCE_MEMORY_MAPPED_IO_PORT,
+ EFI_RESOURCE_MEMORY_RESERVED,
+ EFI_RESOURCE_IO_RESERVED,
+ EFI_RESOURCE_MAX_MEMORY_TYPE
+ };
+
+ if (VmmHobList == NULL) {
+ DEBUG ((DEBUG_ERROR, "HOB: HOB data pointer is NULL\n"));
+ return FALSE;
+ }
+
+ Hob.Raw = (UINT8 *) VmmHobList;
+
+ //
+ // Parse the HOB list until end of list or matching type is found.
+ //
+ while (!END_OF_HOB_LIST (Hob)) {
+ if (Hob.Header->Reserved != (UINT32) 0) {
+ DEBUG ((DEBUG_ERROR, "HOB: Hob header Reserved filed should be zero\n"));
+ return FALSE;
+ }
+
+ if (Hob.Header->HobLength == 0) {
+ DEBUG ((DEBUG_ERROR, "HOB: Hob header LEANGTH should not be zero\n"));
+ return FALSE;
+ }
+
+ switch (Hob.Header->HobType) {
+ case EFI_HOB_TYPE_HANDOFF:
+ if (Hob.Header->HobLength != sizeof(EFI_HOB_HANDOFF_INFO_TABLE)) {
+ DEBUG ((DEBUG_ERROR, "HOB: Hob length is not equal corresponding hob structure. Type: 0x%04x\n", EFI_HOB_TYPE_HANDOFF));
+ return FALSE;
+ }
+
+ if (IsInValidList (Hob.HandoffInformationTable->BootMode, EFI_BOOT_MODE_LIST, 12) == FALSE) {
+ DEBUG ((DEBUG_ERROR, "HOB: Unknow HandoffInformationTable BootMode type. Type: 0x%08x\n", Hob.HandoffInformationTable->BootMode));
+ return FALSE;
+ }
+
+ if ((Hob.HandoffInformationTable->EfiFreeMemoryTop % 4096) != 0) {
+ DEBUG ((DEBUG_ERROR, "HOB: HandoffInformationTable EfiFreeMemoryTop address must be 4-KB aligned to meet page restrictions of UEFI.\
+ Address: 0x%016lx\n", Hob.HandoffInformationTable->EfiFreeMemoryTop));
+ return FALSE;
+ }
+ break;
+
+ case EFI_HOB_TYPE_RESOURCE_DESCRIPTOR:
+ if (Hob.Header->HobLength != sizeof(EFI_HOB_RESOURCE_DESCRIPTOR)) {
+ DEBUG ((DEBUG_ERROR, "HOB: Hob length is not equal corresponding hob structure. Type: 0x%04x\n", EFI_HOB_TYPE_RESOURCE_DESCRIPTOR));
+ return FALSE;
+ }
+
+ if (IsInValidList (Hob.ResourceDescriptor->ResourceType, EFI_RESOURCE_TYPE_LIST, 8) == FALSE) {
+ DEBUG ((DEBUG_ERROR, "HOB: Unknow ResourceDescriptor ResourceType type. Type: 0x%08x\n", Hob.ResourceDescriptor->ResourceType));
+ return FALSE;
+ }
+
+ if ((Hob.ResourceDescriptor->ResourceAttribute & (~(EFI_RESOURCE_ATTRIBUTE_PRESENT |
+ EFI_RESOURCE_ATTRIBUTE_INITIALIZED |
+ EFI_RESOURCE_ATTRIBUTE_TESTED |
+ EFI_RESOURCE_ATTRIBUTE_READ_PROTECTED |
+ EFI_RESOURCE_ATTRIBUTE_WRITE_PROTECTED |
+ EFI_RESOURCE_ATTRIBUTE_EXECUTION_PROTECTED |
+ EFI_RESOURCE_ATTRIBUTE_PERSISTENT |
+ EFI_RESOURCE_ATTRIBUTE_SINGLE_BIT_ECC |
+ EFI_RESOURCE_ATTRIBUTE_MULTIPLE_BIT_ECC |
+ EFI_RESOURCE_ATTRIBUTE_ECC_RESERVED_1 |
+ EFI_RESOURCE_ATTRIBUTE_ECC_RESERVED_2 |
+ EFI_RESOURCE_ATTRIBUTE_UNCACHEABLE |
+ EFI_RESOURCE_ATTRIBUTE_WRITE_COMBINEABLE |
+ EFI_RESOURCE_ATTRIBUTE_WRITE_THROUGH_CACHEABLE |
+ EFI_RESOURCE_ATTRIBUTE_WRITE_BACK_CACHEABLE |
+ EFI_RESOURCE_ATTRIBUTE_16_BIT_IO |
+ EFI_RESOURCE_ATTRIBUTE_32_BIT_IO |
+ EFI_RESOURCE_ATTRIBUTE_64_BIT_IO |
+ EFI_RESOURCE_ATTRIBUTE_UNCACHED_EXPORTED |
+ EFI_RESOURCE_ATTRIBUTE_READ_PROTECTABLE |
+ EFI_RESOURCE_ATTRIBUTE_WRITE_PROTECTABLE |
+ EFI_RESOURCE_ATTRIBUTE_EXECUTION_PROTECTABLE |
+ EFI_RESOURCE_ATTRIBUTE_PERSISTABLE |
+ EFI_RESOURCE_ATTRIBUTE_READ_ONLY_PROTECTED |
+ EFI_RESOURCE_ATTRIBUTE_READ_ONLY_PROTECTABLE |
+ EFI_RESOURCE_ATTRIBUTE_MORE_RELIABLE |
+ EFI_RESOURCE_ATTRIBUTE_ENCRYPTED))) != 0) {
+ DEBUG ((DEBUG_ERROR, "HOB: Unknow ResourceDescriptor ResourceAttribute type. Type: 0x%08x\n", Hob.ResourceDescriptor->ResourceAttribute));
+ return FALSE;
+ }
+ break;
+
+ // EFI_HOB_GUID_TYPE is variable length data, so skip check
+ case EFI_HOB_TYPE_GUID_EXTENSION:
+ break;
+
+ case EFI_HOB_TYPE_FV:
+ if (Hob.Header->HobLength != sizeof (EFI_HOB_FIRMWARE_VOLUME)) {
+ DEBUG ((DEBUG_ERROR, "HOB: Hob length is not equal corresponding hob structure. Type: 0x%04x\n", EFI_HOB_TYPE_FV));
+ return FALSE;
+ }
+ break;
+
+ case EFI_HOB_TYPE_FV2:
+ if (Hob.Header->HobLength != sizeof(EFI_HOB_FIRMWARE_VOLUME2)) {
+ DEBUG ((DEBUG_ERROR, "HOB: Hob length is not equal corresponding hob structure. Type: 0x%04x\n", EFI_HOB_TYPE_FV2));
+ return FALSE;
+ }
+ break;
+
+ case EFI_HOB_TYPE_FV3:
+ if (Hob.Header->HobLength != sizeof(EFI_HOB_FIRMWARE_VOLUME3)) {
+ DEBUG ((DEBUG_ERROR, "HOB: Hob length is not equal corresponding hob structure. Type: 0x%04x\n", EFI_HOB_TYPE_FV3));
+ return FALSE;
+ }
+ break;
+
+ case EFI_HOB_TYPE_CPU:
+ if (Hob.Header->HobLength != sizeof(EFI_HOB_CPU)) {
+ DEBUG ((DEBUG_ERROR, "HOB: Hob length is not equal corresponding hob structure. Type: 0x%04x\n", EFI_HOB_TYPE_CPU));
+ return FALSE;
+ }
+
+ for (UINT32 index = 0; index < 6; index ++) {
+ if (Hob.Cpu->Reserved[index] != 0) {
+ DEBUG ((DEBUG_ERROR, "HOB: Cpu Reserved field will always be set to zero.\n"));
+ return FALSE;
+ }
+ }
+ break;
+
+ default:
+ DEBUG ((DEBUG_ERROR, "HOB: Hob type is not know. Type: 0x%04x\n", Hob.Header->HobType));
+ return FALSE;
+ }
+ // Get next HOB
+ Hob.Raw = (UINT8 *) (Hob.Raw + Hob.Header->HobLength);
+ }
+
+ return TRUE;
+}
+
+/**
+ Processing the incoming HobList for the TDX
+
+ Firmware must parse list, and accept the pages of memory before their can be
+ use by the guest.
+
+ @param[in] VmmHobList The Hoblist pass the firmware
+
+ @retval EFI_SUCCESS Process the HobList successfully
+ @retval Others Other errors as indicated
+
+**/
+EFI_STATUS
+EFIAPI
+ProcessHobList (
+ IN CONST VOID *VmmHobList
+ )
+{
+ EFI_STATUS Status;
+ EFI_PEI_HOB_POINTERS Hob;
+ EFI_PHYSICAL_ADDRESS PhysicalEnd;
+
+ Status = EFI_SUCCESS;
+ ASSERT (VmmHobList != NULL);
+ Hob.Raw = (UINT8 *) VmmHobList;
+
+ //
+ // Parse the HOB list until end of list or matching type is found.
+ //
+ while (!END_OF_HOB_LIST (Hob)) {
+
+ if (Hob.Header->HobType == EFI_HOB_TYPE_RESOURCE_DESCRIPTOR) {
+ DEBUG ((DEBUG_INFO, "\nResourceType: 0x%x\n", Hob.ResourceDescriptor->ResourceType));
+
+ if (Hob.ResourceDescriptor->ResourceType == EFI_RESOURCE_SYSTEM_MEMORY) {
+ DEBUG ((DEBUG_INFO, "ResourceAttribute: 0x%x\n", Hob.ResourceDescriptor->ResourceAttribute));
+ DEBUG ((DEBUG_INFO, "PhysicalStart: 0x%llx\n", Hob.ResourceDescriptor->PhysicalStart));
+ DEBUG ((DEBUG_INFO, "ResourceLength: 0x%llx\n", Hob.ResourceDescriptor->ResourceLength));
+ DEBUG ((DEBUG_INFO, "Owner: %g\n\n", &Hob.ResourceDescriptor->Owner));
+
+ PhysicalEnd = Hob.ResourceDescriptor->PhysicalStart + Hob.ResourceDescriptor->ResourceLength;
+
+ Status = MpAcceptMemoryResourceRange (
+ Hob.ResourceDescriptor->PhysicalStart,
+ PhysicalEnd);
+ if (EFI_ERROR (Status)) {
+ break;
+ }
+ }
+ }
+ Hob.Raw = GET_NEXT_HOB (Hob);
+ }
+
+ return Status;
+}
+
+/**
+ In Tdx guest, some information need to be passed from host VMM to guest
+ firmware. For example, the memory resource, etc. These information are
+ prepared by host VMM and put in HobList which is described in TdxMetadata.
+
+ Information in HobList is treated as external input. From the security
+ perspective before it is consumed, it should be validated.
+
+ @retval EFI_SUCCESS Successfully process the hoblist
+ @retval Others Other error as indicated
+**/
+EFI_STATUS
+EFIAPI
+ProcessTdxHobList (
+ VOID
+ )
+{
+ EFI_STATUS Status;
+ VOID *TdHob;
+ TD_RETURN_DATA TdReturnData;
+
+ TdHob = (VOID *) (UINTN) FixedPcdGet32 (PcdOvmfSecGhcbBase);
+ Status = TdCall (TDCALL_TDINFO, 0, 0, 0, &TdReturnData);
+ if (EFI_ERROR (Status)) {
+ return Status;
+ }
+
+ DEBUG ((DEBUG_INFO,
+ "Intel Tdx Started with (GPAW: %d, Cpus: %d)\n",
+ TdReturnData.TdInfo.Gpaw,
+ TdReturnData.TdInfo.NumVcpus
+ ));
+
+ //
+ // Validate HobList
+ //
+ if (ValidateHobList (TdHob) == FALSE) {
+ return EFI_INVALID_PARAMETER;
+ }
+
+ //
+ // Process Hoblist to accept memory
+ //
+ Status = ProcessHobList (TdHob);
+
+ return Status;
+}
diff --git a/OvmfPkg/Sec/IntelTdx.h b/OvmfPkg/Sec/IntelTdx.h
new file mode 100644
index 000000000000..9420f586b176
--- /dev/null
+++ b/OvmfPkg/Sec/IntelTdx.h
@@ -0,0 +1,33 @@
+/** @file
+
+ Copyright (c) 2021, Intel Corporation. All rights reserved.<BR>
+
+ SPDX-License-Identifier: BSD-2-Clause-Patent
+
+**/
+#ifndef __INTEL_TDX_H__
+#define __INTEL_TDX_H__
+
+#include <PiPei.h>
+#include <Library/BaseLib.h>
+#include <Uefi/UefiSpec.h>
+#include <Uefi/UefiBaseType.h>
+#include <IndustryStandard/UefiTcgPlatform.h>
+
+/**
+ In Tdx guest, some information need to be passed from host VMM to guest
+ firmware. For example, the memory resource, etc. These information are
+ prepared by host VMM and put in HobList which is described in TdxMetadata.
+
+ Information in HobList is treated as external input. From the security
+ perspective before it is consumed, it should be validated.
+
+ @retval EFI_SUCCESS Successfully process the hoblist
+ @retval Others Other error as indicated
+**/
+EFI_STATUS
+EFIAPI
+ProcessTdxHobList (
+ VOID
+ );
+#endif
diff --git a/OvmfPkg/Sec/SecMain.c b/OvmfPkg/Sec/SecMain.c
index e166a9389a1a..9792d7abe2d3 100644
--- a/OvmfPkg/Sec/SecMain.c
+++ b/OvmfPkg/Sec/SecMain.c
@@ -29,8 +29,11 @@
#include <Library/MemEncryptSevLib.h>
#include <Register/Amd/Ghcb.h>
#include <Register/Amd/Msr.h>
-
+#include <IndustryStandard/Tdx.h>
+#include <Library/TdxLib.h>
+#include <Library/TdxProbeLib.h>
#include <Ppi/TemporaryRamSupport.h>
+#include "IntelTdx.h"
#define SEC_IDT_ENTRY_COUNT 34
@@ -844,6 +847,19 @@ SecCoreStartupWithStack (
UINT32 Index;
volatile UINT8 *Table;
+#if defined (MDE_CPU_X64)
+ if (TdxIsEnabled ()) {
+ //
+ // For Td guests, the memory map info is in TdHobLib. It should be processed
+ // first so that the memory is accepted. Otherwise access to the unaccepted
+ // memory will trigger tripple fault.
+ //
+ if (ProcessTdxHobList () != EFI_SUCCESS) {
+ CpuDeadLoop ();
+ }
+ }
+#endif
+
//
// To ensure SMM can't be compromised on S3 resume, we must force re-init of
// the BaseExtractGuidedSectionLib. Since this is before library contructors
@@ -861,13 +877,20 @@ SecCoreStartupWithStack (
// we use a loop rather than CopyMem.
//
IdtTableInStack.PeiService = NULL;
+
for (Index = 0; Index < SEC_IDT_ENTRY_COUNT; Index ++) {
- UINT8 *Src;
- UINT8 *Dst;
- UINTN Byte;
+ //
+ // Declare the local variables that actually move the data elements as
+ // volatile to prevent the optimizer from replacing this function with
+ // the intrinsic memcpy()
+ //
+ CONST UINT8 *Src;
+ volatile UINT8 *Dst;
+ UINTN Byte;
+
+ Src = (CONST UINT8 *) &mIdtEntryTemplate;
+ Dst = (volatile UINT8 *) &IdtTableInStack.IdtTable[Index];
- Src = (UINT8 *) &mIdtEntryTemplate;
- Dst = (UINT8 *) &IdtTableInStack.IdtTable[Index];
for (Byte = 0; Byte < sizeof (mIdtEntryTemplate); Byte++) {
Dst[Byte] = Src[Byte];
}
@@ -913,6 +936,14 @@ SecCoreStartupWithStack (
AsmEnableCache ();
}
+ if (TdxIsEnabled ()) {
+ //
+ // InitializeCpuExceptionHandlers () should be called in Td guests so that
+ // #VE exceptions can be handled correctly.
+ //
+ InitializeCpuExceptionHandlers (NULL);
+ }
+
DEBUG ((DEBUG_INFO,
"SecCoreStartupWithStack(0x%x, 0x%x)\n",
(UINT32)(UINTN)BootFv,
diff --git a/OvmfPkg/Sec/SecMain.inf b/OvmfPkg/Sec/SecMain.inf
index 75cd5ee7264b..7d19ce1f4524 100644
--- a/OvmfPkg/Sec/SecMain.inf
+++ b/OvmfPkg/Sec/SecMain.inf
@@ -28,6 +28,7 @@
Ia32/SecEntry.nasm
[Sources.X64]
+ IntelTdx.c
X64/SecEntry.nasm
[Packages]
@@ -51,6 +52,9 @@
ExtractGuidedSectionLib
LocalApicLib
CpuExceptionHandlerLib
+ TdxLib
+ TdxMailboxLib
+ TdxProbeLib
[Ppis]
gEfiTemporaryRamSupportPpiGuid # PPI ALWAYS_PRODUCED
@@ -71,6 +75,8 @@
gUefiOvmfPkgTokenSpaceGuid.PcdOvmfDecompressionScratchEnd
gEfiMdeModulePkgTokenSpaceGuid.PcdInitValueInTempStack
gUefiOvmfPkgTokenSpaceGuid.PcdOvmfSecGhcbBackupBase
+ gUefiOvmfPkgTokenSpaceGuid.PcdTdxAcceptChunkSize
+ gUefiOvmfPkgTokenSpaceGuid.PcdTdxAcceptPageSize
[FeaturePcd]
gUefiOvmfPkgTokenSpaceGuid.PcdSmmSmramRequire
--
2.29.2.windows.2
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#79174): https://edk2.groups.io/g/devel/message/79174
Mute This Topic: https://groups.io/mt/84837914/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-
Hi, > +/** > + In Tdx guest, some information need to be passed from host VMM to guest > + firmware. For example, the memory resource, etc. These information are > + prepared by host VMM and put in HobList which is described in TdxMetadata. What kind of information is passed to the guest here? qemu has fw_cfg to pass information from the VMM to the guest firmware. What are the reasons to not use fw_cfg? thanks, Gerd -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79551): https://edk2.groups.io/g/devel/message/79551 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On August 19, 2021 2:50 PM, Gerd Hoffmann wrote: > > +/** > > + In Tdx guest, some information need to be passed from host VMM to > guest > > + firmware. For example, the memory resource, etc. These information are > > + prepared by host VMM and put in HobList which is described in > TdxMetadata. > > What kind of information is passed to the guest here? Please see https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.pdf Section 4.2 TD Hand-Off Block (HOB) > > qemu has fw_cfg to pass information from the VMM to the guest firmware. > What are the reasons to not use fw_cfg? Not all the VMM support fw_cfg. Cloud-Hypervisor is the example. https://github.com/cloud-hypervisor/cloud-hypervisor TD Hob list gives Cloud-Hypervisor a chance to pass information to guest firmware. For example, ACPI can be downloaded from QEMU via fw_cfg to firmware. But Cloud-Hypervisor cannot pass ACPI via fw_cfg. In this situation, TD Hob can resolve this problem. > > thanks, > Gerd -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79584): https://edk2.groups.io/g/devel/message/79584 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On Thu, Aug 19, 2021 at 02:27:16PM +0000, Min Xu wrote: > On August 19, 2021 2:50 PM, Gerd Hoffmann wrote: > > > +/** > > > + In Tdx guest, some information need to be passed from host VMM to > > guest > > > + firmware. For example, the memory resource, etc. These information are > > > + prepared by host VMM and put in HobList which is described in > > TdxMetadata. > > > > What kind of information is passed to the guest here? > Please see https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.pdf > Section 4.2 TD Hand-Off Block (HOB) So basically the physical memory map. qemu has etc/e820 for that. > > qemu has fw_cfg to pass information from the VMM to the guest firmware. > > What are the reasons to not use fw_cfg? > Not all the VMM support fw_cfg. Cloud-Hypervisor is the example. I can't see any support for Cloud-Hypervisor in OVMF. Also FreeBSD's bhyve doesn't support fw_cfg either and has its own ways to detect memory. Cloud-Hypervisor can surely do that too. So, why does this matter? > https://github.com/cloud-hypervisor/cloud-hypervisor > TD Hob list gives Cloud-Hypervisor a chance to pass information to guest firmware. > For example, ACPI can be downloaded from QEMU via fw_cfg to firmware. But > Cloud-Hypervisor cannot pass ACPI via fw_cfg. In this situation, TD Hob can resolve > this problem. Sure, but again, why does this matter? For qemu? I don't like the idea to have TDX take a completely different code paths. That increases the code complexity and makes testing harder for no good reason. take care, Gerd -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79634): https://edk2.groups.io/g/devel/message/79634 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On August 20, 2021 3:23 PM, Gerd Hoffmann wrote: > On Thu, Aug 19, 2021 at 02:27:16PM +0000, Min Xu wrote: > > On August 19, 2021 2:50 PM, Gerd Hoffmann wrote: > > > > +/** > > > > + In Tdx guest, some information need to be passed from host VMM > > > > +to > > > guest > > > > + firmware. For example, the memory resource, etc. These > > > > + information are prepared by host VMM and put in HobList which > > > > + is described in > > > TdxMetadata. > > > > > > What kind of information is passed to the guest here? > > Please see > > > https://software.intel.com/content/dam/develop/external/us/en/document > > s/tdx-virtual-firmware-design-guide-rev-1.pdf > > Section 4.2 TD Hand-Off Block (HOB) > > So basically the physical memory map. > qemu has etc/e820 for that. > > > > qemu has fw_cfg to pass information from the VMM to the guest > firmware. > > > What are the reasons to not use fw_cfg? > > Not all the VMM support fw_cfg. Cloud-Hypervisor is the example. > > I can't see any support for Cloud-Hypervisor in OVMF. Right that currently OVMF is not supported by Cloud-Hypervisor in Td guest. But we're planning to support Cloud-Hypervisor to launch OVMF in Td guest and have done some POC. > > Also FreeBSD's bhyve doesn't support fw_cfg either and has its own ways to > detect memory. Cloud-Hypervisor can surely do that too. > > So, why does this matter? Yes, Cloud-Hypervisor has some POC to launch OVMF in Non-Td guest. In that POC Cloud-Hypervisor leverage a 4k page in MEMFD and pass ACPI data to guest Firmware in that memory. https://github.com/cloud-hypervisor/edk2 "ch" branch https://github.com/cloud-hypervisor/edk2/commit/52cb72a748ef70833100ca664f6c2a704c28a93f > > > https://github.com/cloud-hypervisor/cloud-hypervisor > > TD Hob list gives Cloud-Hypervisor a chance to pass information to guest > firmware. > > For example, ACPI can be downloaded from QEMU via fw_cfg to firmware. > > But Cloud-Hypervisor cannot pass ACPI via fw_cfg. In this situation, > > TD Hob can resolve this problem. > > Sure, but again, why does this matter? For qemu? I don't quite understand the question here(For qumu?). What I mean in my last answer is that TD Hob can resolve the problem when the host VMM doesn't support fw_cfg communication mechanism. For the host VMMs which doesn't support fw_cfg, when ACPI data need to be passed to guest firmware, a 4k page (to hold ACPI data) is added in MEMFD. Then when SMBIOS is needed, shall we add another page in MEMFD? If the ACPI data is too big to be held in a 4k page, then the size of the reserved memory region in MEMFD is the restriction. > > I don't like the idea to have TDX take a completely different code paths. > That increases the code complexity and makes testing harder for no good > reason. TD Hob is not a completely different code path. This is a useful supplement to the fw_cfg which is not supported by some host VMM. From another perspective TD Hob can be treated as a set of launch parameter by host VMM. It provides the flexibility for the host VMM to bring up the guest firmware with more parameters. Another benefit is that TD Hob can be measured into some secure register (for example, in TD guest it is RTMR registers, like the TPM PCR) so that attestation can be done based on the measurement. Thanks Gerd for the comments. I am not sure if my explanation addressed your concern. Your comments is always welcomed. > Thanks! Min -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79764): https://edk2.groups.io/g/devel/message/79764 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Hi, > > I can't see any support for Cloud-Hypervisor in OVMF. > Right that currently OVMF is not supported by Cloud-Hypervisor in Td guest. But we're > planning to support Cloud-Hypervisor to launch OVMF in Td guest and have done > some POC. > > Also FreeBSD's bhyve doesn't support fw_cfg either and has its own ways to > > detect memory. Cloud-Hypervisor can surely do that too. > > > > So, why does this matter? > Yes, Cloud-Hypervisor has some POC to launch OVMF in Non-Td guest. In that POC > Cloud-Hypervisor leverage a 4k page in MEMFD and pass ACPI data to guest > Firmware in that memory. > https://github.com/cloud-hypervisor/edk2 "ch" branch > https://github.com/cloud-hypervisor/edk2/commit/52cb72a748ef70833100ca664f6c2a704c28a93f Post the Cloud-Hypervisor patches to the list for review if you want them included in OVMF. out-of-tree patches lingering in some random branch @ github do not matter. > > > https://github.com/cloud-hypervisor/cloud-hypervisor > > > TD Hob list gives Cloud-Hypervisor a chance to pass information to guest > > firmware. > > > For example, ACPI can be downloaded from QEMU via fw_cfg to firmware. > > > But Cloud-Hypervisor cannot pass ACPI via fw_cfg. In this situation, > > > TD Hob can resolve this problem. > > > > Sure, but again, why does this matter? For qemu? > I don't quite understand the question here(For qumu?). Well, each VMM has different interfaces for guest <-> host communication. qemu/kvm uses fw_cfg. Xen and Bhyve have something different, and Cloud-Hypervisor seems to have something different again. > What I mean in my last answer is that TD Hob can resolve the problem > when the host VMM doesn't support fw_cfg communication mechanism. Sure. If Cloud-Hypervisor wants use that (for both TDX and non-TDX probably), fine. Submit the patches and we can discuss details. But why do you want switch qemu/kvm from fw_cfg to TD Hob? > > I don't like the idea to have TDX take a completely different code > > paths. That increases the code complexity and makes testing harder > > for no good reason. > TD Hob is not a completely different code path. This is a useful > supplement to the fw_cfg which is not supported by some host VMM. The code uses that unconditionally though and not only in case fw_cfg is not available. > From another perspective TD Hob can be treated as a set of launch > parameter by host VMM. It provides the flexibility for the host VMM > to bring up the guest firmware with more parameters. I'm wondering what you are running on the host? I assumed we are discussing qemu/kvm as VMM, is that actually the case? Or do you use Cloud Hypervisor? qemu doesn't provide a TD Hob. So, when running qemu you probably have some out-of-tree patches adding that. Have you submitted them upstream? What is the status? I suspect you need to come up with some *very* good arguments to get that accepted. The need to bring parameters to the guest firmware is the reason why fw_cfg exists in the first place ... > Another benefit is that TD Hob can be measured into some secure > register (for example, in TD guest it is RTMR registers, like the TPM > PCR) so that attestation can be done based on the measurement. fw_cfg is measured too (according to one of the tdx pdfs, don't remember which one). take care, Gerd -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79791): https://edk2.groups.io/g/devel/message/79791 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On Tue, 24 Aug 2021 at 14:07, Xu, Min M <min.m.xu@intel.com> wrote: > > On August 20, 2021 3:23 PM, Gerd Hoffmann wrote: > > On Thu, Aug 19, 2021 at 02:27:16PM +0000, Min Xu wrote: > > > On August 19, 2021 2:50 PM, Gerd Hoffmann wrote: > > > > > +/** > > > > > + In Tdx guest, some information need to be passed from host VMM > > > > > +to > > > > guest > > > > > + firmware. For example, the memory resource, etc. These > > > > > + information are prepared by host VMM and put in HobList which > > > > > + is described in > > > > TdxMetadata. > > > > > > > > What kind of information is passed to the guest here? > > > Please see > > > > > https://software.intel.com/content/dam/develop/external/us/en/document > > > s/tdx-virtual-firmware-design-guide-rev-1.pdf > > > Section 4.2 TD Hand-Off Block (HOB) > > > > So basically the physical memory map. > > qemu has etc/e820 for that. > > > > > > qemu has fw_cfg to pass information from the VMM to the guest > > firmware. > > > > What are the reasons to not use fw_cfg? > > > Not all the VMM support fw_cfg. Cloud-Hypervisor is the example. > > > > I can't see any support for Cloud-Hypervisor in OVMF. > Right that currently OVMF is not supported by Cloud-Hypervisor in Td guest. But we're > planning to support Cloud-Hypervisor to launch OVMF in Td guest and have done > some POC. If cloud hypervisor support is coming to OVMF, please contribute those patches first, so they can be discussed in public. Adding special facilities here to accommodate out of tree functionality that may look completely differently after review is not the right way to approach this. -- Ard. > > > > Also FreeBSD's bhyve doesn't support fw_cfg either and has its own ways to > > detect memory. Cloud-Hypervisor can surely do that too. > > > > So, why does this matter? > Yes, Cloud-Hypervisor has some POC to launch OVMF in Non-Td guest. In that POC > Cloud-Hypervisor leverage a 4k page in MEMFD and pass ACPI data to guest > Firmware in that memory. > https://github.com/cloud-hypervisor/edk2 "ch" branch > https://github.com/cloud-hypervisor/edk2/commit/52cb72a748ef70833100ca664f6c2a704c28a93f > > > > > https://github.com/cloud-hypervisor/cloud-hypervisor > > > TD Hob list gives Cloud-Hypervisor a chance to pass information to guest > > firmware. > > > For example, ACPI can be downloaded from QEMU via fw_cfg to firmware. > > > But Cloud-Hypervisor cannot pass ACPI via fw_cfg. In this situation, > > > TD Hob can resolve this problem. > > > > Sure, but again, why does this matter? For qemu? > I don't quite understand the question here(For qumu?). > What I mean in my last answer is that TD Hob can resolve the problem when the host VMM > doesn't support fw_cfg communication mechanism. > For the host VMMs which doesn't support fw_cfg, when ACPI data need to be passed to guest > firmware, a 4k page (to hold ACPI data) is added in MEMFD. Then when SMBIOS is needed, > shall we add another page in MEMFD? If the ACPI data is too big to be held in a 4k page, then > the size of the reserved memory region in MEMFD is the restriction. > > > > I don't like the idea to have TDX take a completely different code paths. > > That increases the code complexity and makes testing harder for no good > > reason. > TD Hob is not a completely different code path. This is a useful supplement to the fw_cfg which > is not supported by some host VMM. > From another perspective TD Hob can be treated as a set of launch parameter by host VMM. > It provides the flexibility for the host VMM to bring up the guest firmware with more parameters. > Another benefit is that TD Hob can be measured into some secure register (for example, in TD guest > it is RTMR registers, like the TPM PCR) so that attestation can be done based on the measurement. > > Thanks Gerd for the comments. I am not sure if my explanation addressed your concern. Your comments > is always welcomed. > > > > Thanks! > Min -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79768): https://edk2.groups.io/g/devel/message/79768 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
HI Ard and Gerd I am not sure if I fully understand the argument here. In TDX architecture, the VMM provides a pointer to the TD guest as initial parameter. We define the detail information there to be TD Hob. This solution is generic to all hypervisor. fw_cfg is just a KVM/QEMU specific way to pass some parameter, but not all parameter. For example, OVMF today still get the memory size from CMOS. https://github.com/tianocore/edk2/blob/master/OvmfPkg/PlatformPei/MemDetect.c#L278 In TDVF design, we choose the use TDX defined initial pointer to pass the initial memory information - TD_HOB, instead of CMOS region. Please help me understand what is the real concern here. I understand the QEMU specific fw_cfg (https://github.com/qemu/qemu/blob/master/docs/specs/fw_cfg.txt). If you want to use fw_cfg to pass some QEMU specific parameter, it is still possible. For security reason, any input from the IO device must be measured by the TD guest. That means, if you get the same data twice from the fw_cfg, the TDVF must measure them twice. And TDVF may need handle the case that the data in first call is different with the data in second call. I can see potential attack surface there from architecture perspective. Using HOB in the initial pointer can be an alternative pattern to mitigate such risk. We just need measure them once then any component can use that. Also, it can help the people to evaluate the RTMR hash and TD event log data for the configuration in attestation flow, because the configuration is independent with the code execution flow. Please be aware that confidential computing (TDX) changes the threat model completely, any input from VMM is considered as malicious. Current solution might be OK for normal OVMF. But it does not mean the same solution is still the best one for confidential computing use case. Thank you Yao Jiewen > -----Original Message----- > From: Ard Biesheuvel <ardb@kernel.org> > Sent: Tuesday, August 24, 2021 8:56 PM > To: Xu, Min M <min.m.xu@intel.com> > Cc: Gerd Hoffmann <kraxel@redhat.com>; devel@edk2.groups.io; Ard > Biesheuvel <ardb+tianocore@kernel.org>; Justen, Jordan L > <jordan.l.justen@intel.com>; Brijesh Singh <brijesh.singh@amd.com>; Erdem > Aktas <erdemaktas@google.com>; James Bottomley <jejb@linux.ibm.com>; > Yao, Jiewen <jiewen.yao@intel.com>; Tom Lendacky > <thomas.lendacky@amd.com> > Subject: Re: [edk2-devel] [PATCH 18/23] OvmfPkg: Enable Tdx in SecMain.c > > On Tue, 24 Aug 2021 at 14:07, Xu, Min M <min.m.xu@intel.com> wrote: > > > > On August 20, 2021 3:23 PM, Gerd Hoffmann wrote: > > > On Thu, Aug 19, 2021 at 02:27:16PM +0000, Min Xu wrote: > > > > On August 19, 2021 2:50 PM, Gerd Hoffmann wrote: > > > > > > +/** > > > > > > + In Tdx guest, some information need to be passed from host VMM > > > > > > +to > > > > > guest > > > > > > + firmware. For example, the memory resource, etc. These > > > > > > + information are prepared by host VMM and put in HobList which > > > > > > + is described in > > > > > TdxMetadata. > > > > > > > > > > What kind of information is passed to the guest here? > > > > Please see > > > > > > > https://software.intel.com/content/dam/develop/external/us/en/document > > > > s/tdx-virtual-firmware-design-guide-rev-1.pdf > > > > Section 4.2 TD Hand-Off Block (HOB) > > > > > > So basically the physical memory map. > > > qemu has etc/e820 for that. > > > > > > > > qemu has fw_cfg to pass information from the VMM to the guest > > > firmware. > > > > > What are the reasons to not use fw_cfg? > > > > Not all the VMM support fw_cfg. Cloud-Hypervisor is the example. > > > > > > I can't see any support for Cloud-Hypervisor in OVMF. > > Right that currently OVMF is not supported by Cloud-Hypervisor in Td guest. > But we're > > planning to support Cloud-Hypervisor to launch OVMF in Td guest and have > done > > some POC. > > If cloud hypervisor support is coming to OVMF, please contribute those > patches first, so they can be discussed in public. Adding special > facilities here to accommodate out of tree functionality that may look > completely differently after review is not the right way to approach > this. > > -- > Ard. > > > > > > > > Also FreeBSD's bhyve doesn't support fw_cfg either and has its own ways to > > > detect memory. Cloud-Hypervisor can surely do that too. > > > > > > So, why does this matter? > > Yes, Cloud-Hypervisor has some POC to launch OVMF in Non-Td guest. In that > POC > > Cloud-Hypervisor leverage a 4k page in MEMFD and pass ACPI data to guest > > Firmware in that memory. > > https://github.com/cloud-hypervisor/edk2 "ch" branch > > https://github.com/cloud- > hypervisor/edk2/commit/52cb72a748ef70833100ca664f6c2a704c28a93f > > > > > > > https://github.com/cloud-hypervisor/cloud-hypervisor > > > > TD Hob list gives Cloud-Hypervisor a chance to pass information to guest > > > firmware. > > > > For example, ACPI can be downloaded from QEMU via fw_cfg to firmware. > > > > But Cloud-Hypervisor cannot pass ACPI via fw_cfg. In this situation, > > > > TD Hob can resolve this problem. > > > > > > Sure, but again, why does this matter? For qemu? > > I don't quite understand the question here(For qumu?). > > What I mean in my last answer is that TD Hob can resolve the problem when > the host VMM > > doesn't support fw_cfg communication mechanism. > > For the host VMMs which doesn't support fw_cfg, when ACPI data need to be > passed to guest > > firmware, a 4k page (to hold ACPI data) is added in MEMFD. Then when > SMBIOS is needed, > > shall we add another page in MEMFD? If the ACPI data is too big to be held in a > 4k page, then > > the size of the reserved memory region in MEMFD is the restriction. > > > > > > I don't like the idea to have TDX take a completely different code paths. > > > That increases the code complexity and makes testing harder for no good > > > reason. > > TD Hob is not a completely different code path. This is a useful supplement to > the fw_cfg which > > is not supported by some host VMM. > > From another perspective TD Hob can be treated as a set of launch parameter > by host VMM. > > It provides the flexibility for the host VMM to bring up the guest firmware with > more parameters. > > Another benefit is that TD Hob can be measured into some secure register (for > example, in TD guest > > it is RTMR registers, like the TPM PCR) so that attestation can be done based > on the measurement. > > > > Thanks Gerd for the comments. I am not sure if my explanation addressed your > concern. Your comments > > is always welcomed. > > > > > > > Thanks! > > Min -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79790): https://edk2.groups.io/g/devel/message/79790 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Hi, > fw_cfg is just a KVM/QEMU specific way to pass some parameter, but not > all parameter. For example, OVMF today still get the memory size from > CMOS. > https://github.com/tianocore/edk2/blob/master/OvmfPkg/PlatformPei/MemDetect.c#L278 Patches to fix that are on the list. > In TDVF design, we choose the use TDX defined initial pointer to pass > the initial memory information - TD_HOB, instead of CMOS region. > Please help me understand what is the real concern here. Well, qemu settled to the fw_cfg design or a number of reasons. It is pretty flexible for example. The firmware can ask for the information it needs at any time and can store it as it pleases. I'd suggest to not take it for granted that an additional alternative way to do basically the same thing will be accepted to upstream qemu. Submit your patches to qemu-devel to discuss that. > That means, if you get the same data twice from the fw_cfg, the TDVF > must measure them twice. And TDVF may need handle the case that the > data in first call is different with the data in second call. Most fw_cfg entries are constant anyway, so we can easily avoid a second call by caching the results of the first call if that helps TDVF. > Using HOB in the initial pointer can be an alternative pattern to > mitigate such risk. We just need measure them once then any component > can use that. Also, it can help the people to evaluate the RTMR hash > and TD event log data for the configuration in attestation flow, > because the configuration is independent with the code execution flow. Well, it covers only the memory map, correct? All other configuration is still loaded from fw_cfg. I can't see the improvement here. How do you pass the HOB to the guest? Copy data to guest ram? Map a ro page into guest address space? What happens on VM reset? > Please be aware that confidential computing (TDX) changes the threat > model completely, any input from VMM is considered as malicious. > Current solution might be OK for normal OVMF. But it does not mean the > same solution is still the best one for confidential computing use > case. Well, SEV seems to be happy with fw_cfg. Any input from AMD on the topic? take care, Gerd -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79793): https://edk2.groups.io/g/devel/message/79793 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Comment below: > -----Original Message----- > From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Gerd > Hoffmann > Sent: Wednesday, August 25, 2021 3:52 PM > To: Yao, Jiewen <jiewen.yao@intel.com> > Cc: Ard Biesheuvel <ardb@kernel.org>; Xu, Min M <min.m.xu@intel.com>; > devel@edk2.groups.io; Ard Biesheuvel <ardb+tianocore@kernel.org>; Justen, > Jordan L <jordan.l.justen@intel.com>; Brijesh Singh <brijesh.singh@amd.com>; > Erdem Aktas <erdemaktas@google.com>; James Bottomley > <jejb@linux.ibm.com>; Tom Lendacky <thomas.lendacky@amd.com> > Subject: Re: [edk2-devel] [PATCH 18/23] OvmfPkg: Enable Tdx in SecMain.c > > Hi, > > > fw_cfg is just a KVM/QEMU specific way to pass some parameter, but not > > all parameter. For example, OVMF today still get the memory size from > > CMOS. > > > https://github.com/tianocore/edk2/blob/master/OvmfPkg/PlatformPei/MemDe > tect.c#L278 > > Patches to fix that are on the list. [Jiewen] Surprisingly. It was sent one week ago. I obviously miss that email. Please file a Bugzilla and include me in CC list next time. > > > In TDVF design, we choose the use TDX defined initial pointer to pass > > the initial memory information - TD_HOB, instead of CMOS region. > > Please help me understand what is the real concern here. > > Well, qemu settled to the fw_cfg design or a number of reasons. It is > pretty flexible for example. The firmware can ask for the information > it needs at any time and can store it as it pleases. > > I'd suggest to not take it for granted that an additional alternative > way to do basically the same thing will be accepted to upstream qemu. > Submit your patches to qemu-devel to discuss that. [Jiewen] I think Intel Linux team is doing that separately. > > > That means, if you get the same data twice from the fw_cfg, the TDVF > > must measure them twice. And TDVF may need handle the case that the > > data in first call is different with the data in second call. > > Most fw_cfg entries are constant anyway, so we can easily avoid a second > call by caching the results of the first call if that helps TDVF. [Jiewen] It is possible. We can have multiple ways: 1) Per usage cache. However, that means every driver need use its own way to cache the data, either PCD or HOB in PEI phase. Also driver A need to know clearly that driver B will use the same data, then it will cache otherwise it will not cache. I treat it as a huge burden for the developer. 2) Always cache per driver. That means every driver need follow the same pattern: search cache, if miss the get it and cache it. But it still cannot guarantee the data order in different path architecturally. 3) Always cache in one common driver. One driver can get all data one time and cache them. That can resolve the data order problem. I am not sure if that is desired. But I cannot see too much difference between passing data at entry point. > > > Using HOB in the initial pointer can be an alternative pattern to > > mitigate such risk. We just need measure them once then any component > > can use that. Also, it can help the people to evaluate the RTMR hash > > and TD event log data for the configuration in attestation flow, > > because the configuration is independent with the code execution flow. > > Well, it covers only the memory map, correct? All other configuration > is still loaded from fw_cfg. I can't see the improvement here. [Jiewen] At this point of time, memory map is the most important parameter in the TD Hob, because we do need the memory information at the TD entrypoint. That is mandatory for any TD boot. The fw_cfg is still allowed in the TDVF design guide, just because we feel it is a burden to convert everything suddenly. I hope to limit the configuration from VMM. Most fw_cfg will NOT be used for TDVF, for example, "etc/smi", "etc/tpm", "etc/edk2/https/cacerts", "etc/msr_feature_control", "etc/system-states", especially in the container use case. The flexibility is a double-sward. You can treat the TD Hob as the boot parameter for the kernel, here is the kernel == TDVF. Having a static way to get configuration data in memory at one time is the simplest solution, from my perspective. > > How do you pass the HOB to the guest? Copy data to guest ram? Map a > ro page into guest address space? What happens on VM reset? [Jiewen] Yes, VMM will prepare the memory information based upon TDVF metadata. The VMM need copy TD HOB data to a predefined memory region according to TDVF metadata. I don't fully understand the VM reset question. I try to answer. But if that is not what you are asking, please clarify. This action that VMM initiates TD HOB happens when the VMM launches a TD guest. After that the region will becomes TD private memory and own by TD. VMM can no longer access it (no read/no write). If VM reset, then this memory is gone. If VMM need launch a new TD, the VMM need initiate the data again. > > > Please be aware that confidential computing (TDX) changes the threat > > model completely, any input from VMM is considered as malicious. > > Current solution might be OK for normal OVMF. But it does not mean the > > same solution is still the best one for confidential computing use > > case. > > Well, SEV seems to be happy with fw_cfg. > Any input from AMD on the topic? > > take care, > Gerd > > > > > -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79794): https://edk2.groups.io/g/devel/message/79794 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Hi, > > > In TDVF design, we choose the use TDX defined initial pointer to pass > > > the initial memory information - TD_HOB, instead of CMOS region. > > > Please help me understand what is the real concern here. > > > > Well, qemu settled to the fw_cfg design or a number of reasons. It is > > pretty flexible for example. The firmware can ask for the information > > it needs at any time and can store it as it pleases. > > > > I'd suggest to not take it for granted that an additional alternative > > way to do basically the same thing will be accepted to upstream qemu. > > Submit your patches to qemu-devel to discuss that. > > [Jiewen] I think Intel Linux team is doing that separately. Please ask them to send the patches. Changes like this obviously need coordination and agreement between qemu and edk2 projects, and ideally both guest and host code is reviewed in parallel. > > Most fw_cfg entries are constant anyway, so we can easily avoid a second > > call by caching the results of the first call if that helps TDVF. > > [Jiewen] It is possible. We can have multiple ways: > 1) Per usage cache. However, that means every driver need use its own way to cache the data, either PCD or HOB in PEI phase. Also driver A need to know clearly that driver B will use the same data, then it will cache otherwise it will not cache. I treat it as a huge burden for the developer. > 2) Always cache per driver. That means every driver need follow the same pattern: search cache, if miss the get it and cache it. But it still cannot guarantee the data order in different path architecturally. > 3) Always cache in one common driver. One driver can get all data one time and cache them. That can resolve the data order problem. I am not sure if that is desired. But I cannot see too much difference between passing data at entry point. Not investigated yet. seabios fw_cfg handling is close to (3) for small items (not kernel or initrd or other large data sets) so I think I would look into that first. > > > Using HOB in the initial pointer can be an alternative pattern to > > > mitigate such risk. We just need measure them once then any component > > > can use that. Also, it can help the people to evaluate the RTMR hash > > > and TD event log data for the configuration in attestation flow, > > > because the configuration is independent with the code execution flow. > > > > Well, it covers only the memory map, correct? All other configuration > > is still loaded from fw_cfg. I can't see the improvement here. > > [Jiewen] At this point of time, memory map is the most important > parameter in the TD Hob, because we do need the memory information at > the TD entrypoint. That is mandatory for any TD boot. Well, I can see that the memory map is kind of special here because you need that quite early in the firmware initialization workflow. > The fw_cfg is still allowed in the TDVF design guide, just because we > feel it is a burden to convert everything suddenly. What is the longer-term plan here? Does it make sense to special-case the memory map? If we want handle other fw_cfg items that way too later on, shouldn't we better check how we can improve the fw_cfg interface so it works better with confidential computing? > > How do you pass the HOB to the guest? Copy data to guest ram? Map a > > ro page into guest address space? What happens on VM reset? > [Jiewen] Yes, VMM will prepare the memory information based upon TDVF > metadata. The VMM need copy TD HOB data to a predefined memory region > according to TDVF metadata. Is all that documented somewhere? The TVDF design overview focuses on the guest/firmware side of things, so it isn't very helpful here. Did I mention posting the qemu patches would be a good idea? > I don't fully understand the VM reset question. I try to answer. But if that is not what you are asking, please clarify. What happens if you reboot the guest? On non-TDX guests the VM will be reset, the cpu will jump to the reset vector (executing from rom / flash), firmware will re-initialize everything and re-load any config information it needs from fw_cfg > This action that VMM initiates TD HOB happens when the VMM launches a TD guest. > After that the region will becomes TD private memory and own by TD. VMM can no longer access it (no read/no write). > If VM reset, then this memory is gone. > If VMM need launch a new TD, the VMM need initiate the data again. Sounds like reset is not supported, you need to stop and re-start the guest instead. Is that correct? take care, Gerd -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79800): https://edk2.groups.io/g/devel/message/79800 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Comment below: > -----Original Message----- > From: kraxel@redhat.com <kraxel@redhat.com> > Sent: Wednesday, August 25, 2021 10:52 PM > To: Yao, Jiewen <jiewen.yao@intel.com> > Cc: devel@edk2.groups.io; Ard Biesheuvel <ardb@kernel.org>; Xu, Min M > <min.m.xu@intel.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Justen, > Jordan L <jordan.l.justen@intel.com>; Brijesh Singh <brijesh.singh@amd.com>; > Erdem Aktas <erdemaktas@google.com>; James Bottomley > <jejb@linux.ibm.com>; Tom Lendacky <thomas.lendacky@amd.com> > Subject: Re: [edk2-devel] [PATCH 18/23] OvmfPkg: Enable Tdx in SecMain.c > > Hi, > > > > > In TDVF design, we choose the use TDX defined initial pointer to pass > > > > the initial memory information - TD_HOB, instead of CMOS region. > > > > Please help me understand what is the real concern here. > > > > > > Well, qemu settled to the fw_cfg design or a number of reasons. It is > > > pretty flexible for example. The firmware can ask for the information > > > it needs at any time and can store it as it pleases. > > > > > > I'd suggest to not take it for granted that an additional alternative > > > way to do basically the same thing will be accepted to upstream qemu. > > > Submit your patches to qemu-devel to discuss that. > > > > [Jiewen] I think Intel Linux team is doing that separately. > > Please ask them to send the patches. Changes like this obviously need > coordination and agreement between qemu and edk2 projects, and ideally > both guest and host code is reviewed in parallel. [Jiewen] Sure. I add Yamahata, Isaku <isaku.yamahata@intel.com> here. He can help answer the KVM/QEMU related question. Some reference for QEMU: https://lists.nongnu.org/archive/html/qemu-devel/2021-07/msg01682.html in patchwork, https://patchwork.kernel.org/project/qemu-devel/cover/cover.1625704980.git.isaku.yamahata@intel.com/ And I guess you probably need look at the KVM as well. > > > > Most fw_cfg entries are constant anyway, so we can easily avoid a second > > > call by caching the results of the first call if that helps TDVF. > > > > [Jiewen] It is possible. We can have multiple ways: > > 1) Per usage cache. However, that means every driver need use its own way to > cache the data, either PCD or HOB in PEI phase. Also driver A need to know > clearly that driver B will use the same data, then it will cache otherwise it will > not cache. I treat it as a huge burden for the developer. > > 2) Always cache per driver. That means every driver need follow the same > pattern: search cache, if miss the get it and cache it. But it still cannot guarantee > the data order in different path architecturally. > > 3) Always cache in one common driver. One driver can get all data one time > and cache them. That can resolve the data order problem. I am not sure if that is > desired. But I cannot see too much difference between passing data at entry > point. > > Not investigated yet. seabios fw_cfg handling is close to (3) for small > items (not kernel or initrd or other large data sets) so I think I would > look into that first. [Jiewen] I don't think it is urgent at this moment. > > > > > Using HOB in the initial pointer can be an alternative pattern to > > > > mitigate such risk. We just need measure them once then any component > > > > can use that. Also, it can help the people to evaluate the RTMR hash > > > > and TD event log data for the configuration in attestation flow, > > > > because the configuration is independent with the code execution flow. > > > > > > Well, it covers only the memory map, correct? All other configuration > > > is still loaded from fw_cfg. I can't see the improvement here. > > > > [Jiewen] At this point of time, memory map is the most important > > parameter in the TD Hob, because we do need the memory information at > > the TD entrypoint. That is mandatory for any TD boot. > > Well, I can see that the memory map is kind of special here because you > need that quite early in the firmware initialization workflow. [Jiewen] That is correct. > > > The fw_cfg is still allowed in the TDVF design guide, just because we > > feel it is a burden to convert everything suddenly. > > What is the longer-term plan here? > > Does it make sense to special-case the memory map? > > If we want handle other fw_cfg items that way too later on, shouldn't we > better check how we can improve the fw_cfg interface so it works better > with confidential computing? [Jiewen] So far, my hope is to limit the fw_cfg as much as possible. My worry is that we have to measure fw_cfg everywhere. If we miss one place, it will be a completeness vulnerability for trusted computing. I also think if we can add measurement code inside of fw_cfg get function. Then we need improve the FwCfg API - Current style: QemuFwCfgSelectItem() + QemuFwCfgReadxxx() is not friendly for measurement. For example, we can combine them and do QemuFwCfgSelectRead (). The QemuFwCfgWritexxx() interface may also bring inconsistency issue. If we use this API, we have 2 copy data. One is in TDVF (trusted), and the other is in VMM/QEMU (untrusted). What if the VMM modifies its untrusted copy? What I can see is many potential attack surfaces. :-( > > > > How do you pass the HOB to the guest? Copy data to guest ram? Map a > > > ro page into guest address space? What happens on VM reset? > > > [Jiewen] Yes, VMM will prepare the memory information based upon TDVF > > metadata. The VMM need copy TD HOB data to a predefined memory region > > according to TDVF metadata. > > Is all that documented somewhere? The TVDF design overview focuses on > the guest/firmware side of things, so it isn't very helpful here. [Jiewen] The TDX architecture define this architecture way, the RCX/R9 refer is the pointer. We have a couple of TDX document at https://software.intel.com/content/www/us/en/develop/articles/intel-trust-domain-extensions.html https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-module-1eas-v0.85.039.pdf Section 8.1 defines the VCPU init state. The RCX/R8 hold the launch parameter. https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1.pdf Section 4.1.2 describes the TD HOB usage for RCX/R8. Section 7 adds more detail on memory map reporting. Please let me know if you need any other information. > > Did I mention posting the qemu patches would be a good idea? > > > I don't fully understand the VM reset question. I try to answer. But if that is not > what you are asking, please clarify. > > What happens if you reboot the guest? > > On non-TDX guests the VM will be reset, the cpu will jump to the reset > vector (executing from rom / flash), firmware will re-initialize > everything and re-load any config information it needs from fw_cfg > > > This action that VMM initiates TD HOB happens when the VMM launches a TD > guest. > > After that the region will becomes TD private memory and own by TD. VMM > can no longer access it (no read/no write). > > If VM reset, then this memory is gone. > > If VMM need launch a new TD, the VMM need initiate the data again. > > Sounds like reset is not supported, you need to stop and re-start the > guest instead. Is that correct? [Jiewen] That is correct. In our definition, reset == stop + restart. > > take care, > Gerd -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79803): https://edk2.groups.io/g/devel/message/79803 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Hi, > Some reference for QEMU: > https://lists.nongnu.org/archive/html/qemu-devel/2021-07/msg01682.html Ah, good. /me adds an entry to the todo list. > > > The fw_cfg is still allowed in the TDVF design guide, just because we > > > feel it is a burden to convert everything suddenly. > > > > What is the longer-term plan here? > > > > Does it make sense to special-case the memory map? > > > > If we want handle other fw_cfg items that way too later on, shouldn't we > > better check how we can improve the fw_cfg interface so it works better > > with confidential computing? > > [Jiewen] So far, my hope is to limit the fw_cfg as much as possible. > My worry is that we have to measure fw_cfg everywhere. If we miss one place, it will be a completeness vulnerability for trusted computing. > > I also think if we can add measurement code inside of fw_cfg get function. > Then we need improve the FwCfg API - Current style: QemuFwCfgSelectItem() + QemuFwCfgReadxxx() is not friendly for measurement. For example, we can combine them and do QemuFwCfgSelectRead (). I was more thinking about a completely different way to pass (constant) fw_cfg data. Something like defining a fw_cfg hob and adding that to the td hob. QemuFwCfgLib could lookup the hob and use that when it finds the needed entry there. In case the entry is not there try use io instead. We'll continue to need that for the acpi tables for example, these entries are not constant. qemu will adapt them when the firmware maps hardware resources referenced in acpi tables (mmconfig region, power management registers, ...). > The QemuFwCfgWritexxx() interface may also bring inconsistency issue. > If we use this API, we have 2 copy data. Do you need any writable fw_cfg entries in TDX mode? 'git grep' shows the ramfb driver, smi feature negotiation and s3 support use QemuFwCfgWrite() > One is in TDVF (trusted), and > the other is in VMM/QEMU (untrusted). What if the VMM modifies its > untrusted copy? > What I can see is many potential attack surfaces. :-( Well, you have to trust VMM/QEMU to a certain degree. TDX can prevent data leaking, but it can't prevent VMM misbehaving. > Please let me know if you need any other information. Sure. For now I have to read more docs and patches ... take care, Gerd -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79836): https://edk2.groups.io/g/devel/message/79836 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Comment below: > -----Original Message----- > From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Gerd > Hoffmann > Sent: Thursday, August 26, 2021 4:32 PM > To: Yao, Jiewen <jiewen.yao@intel.com> > Cc: devel@edk2.groups.io; Ard Biesheuvel <ardb@kernel.org>; Xu, Min M > <min.m.xu@intel.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Justen, > Jordan L <jordan.l.justen@intel.com>; Brijesh Singh <brijesh.singh@amd.com>; > Erdem Aktas <erdemaktas@google.com>; James Bottomley > <jejb@linux.ibm.com>; Tom Lendacky <thomas.lendacky@amd.com>; > Yamahata, Isaku <isaku.yamahata@intel.com> > Subject: Re: [edk2-devel] [PATCH 18/23] OvmfPkg: Enable Tdx in SecMain.c > > Hi, > > > Some reference for QEMU: > > https://lists.nongnu.org/archive/html/qemu-devel/2021-07/msg01682.html > > Ah, good. /me adds an entry to the todo list. > > > > > The fw_cfg is still allowed in the TDVF design guide, just because we > > > > feel it is a burden to convert everything suddenly. > > > > > > What is the longer-term plan here? > > > > > > Does it make sense to special-case the memory map? > > > > > > If we want handle other fw_cfg items that way too later on, shouldn't we > > > better check how we can improve the fw_cfg interface so it works better > > > with confidential computing? > > > > [Jiewen] So far, my hope is to limit the fw_cfg as much as possible. > > My worry is that we have to measure fw_cfg everywhere. If we miss one place, > it will be a completeness vulnerability for trusted computing. > > > > I also think if we can add measurement code inside of fw_cfg get function. > > Then we need improve the FwCfg API - Current style: QemuFwCfgSelectItem() > + QemuFwCfgReadxxx() is not friendly for measurement. For example, we can > combine them and do QemuFwCfgSelectRead (). > > I was more thinking about a completely different way to pass (constant) > fw_cfg data. Something like defining a fw_cfg hob and adding that to the > td hob. QemuFwCfgLib could lookup the hob and use that when it finds > the needed entry there. > > In case the entry is not there try use io instead. We'll continue to > need that for the acpi tables for example, these entries are not > constant. qemu will adapt them when the firmware maps hardware > resources referenced in acpi tables (mmconfig region, power management > registers, ...). [Jiewen] That is great idea. I really like it. > > > The QemuFwCfgWritexxx() interface may also bring inconsistency issue. > > If we use this API, we have 2 copy data. > > Do you need any writable fw_cfg entries in TDX mode? [Jiewen] I hope NOT to support writable fw_cfg. In our TDX design, we even don't want to support SetVariable to NV Storage, just to reduce the risk. > > 'git grep' shows the ramfb driver, smi feature negotiation and s3 > support use QemuFwCfgWrite() [Jiewen] TDVF does not support SMM, and TDVF does not support S3. > > > One is in TDVF (trusted), and > > the other is in VMM/QEMU (untrusted). What if the VMM modifies its > > untrusted copy? > > > What I can see is many potential attack surfaces. :-( > > Well, you have to trust VMM/QEMU to a certain degree. TDX can prevent > data leaking, but it can't prevent VMM misbehaving. [Jiewen] Yes, you are right. It is "in certain degree". The threat model is : TD cannot resist the deny-of-service (DOS) attack from VMM/QEMU. TD need maintain the integrity and confidentiality, to avoid tamper and information disclosure. If VMM misbehaving causes the system hang or guest device error, it is OK. But if VMM misbehaving causes a TD secret leak to QEMU or TD tampered without being detected by measurement register (MRTD or RTMR), that is NOT acceptable. If we allow the misbehaving, then we have to do thorough analysis to understand the impact. If we can think of a way to avoid the possibility of misbehaving, then we know we are good. :-) That is our preference so far. > > > Please let me know if you need any other information. > > Sure. For now I have to read more docs and patches ... > > take care, > Gerd > > > > > -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#79854): https://edk2.groups.io/g/devel/message/79854 Mute This Topic: https://groups.io/mt/84837914/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
© 2016 - 2026 Red Hat, Inc.