[edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support

Brijesh Singh posted 19 patches 3 years ago
Failed in applying to current master (apply log)
There is a newer version of this series
MdePkg/Include/Library/BaseLib.h              |  37 +++
MdePkg/Include/Register/Amd/Fam17Msr.h        |  31 ++-
MdePkg/Include/Register/Amd/Ghcb.h            |  39 ++-
MdePkg/Library/BaseLib/BaseLib.inf            |   1 +
MdePkg/Library/BaseLib/X64/Pvalidate.nasm     |  43 +++
OvmfPkg/Include/Library/GhcbRegisterLib.h     |  27 ++
OvmfPkg/Include/Library/MemEncryptSevLib.h    |  30 +++
.../DxeMemEncryptSevLib.inf                   |   7 +
.../DxeMemEncryptSevLibInternal.c             |  27 ++
.../Ia32/SnpPageStateChange.c                 |  17 ++
.../PeiMemEncryptSevLib.inf                   |   9 +
.../PeiMemEncryptSevLibInternal.c             |  47 ++++
.../SecMemEncryptSevLib.inf                   |   4 +
.../SecMemEncryptSevLibInternal.c             |  39 +++
.../BaseMemEncryptSevLib/SnpPageStateChange.h |  37 +++
.../X64/PeiDxeSnpSetPageState.c               |  63 +++++
.../X64/PeiDxeVirtualMemory.c                 | 151 ++++++++++-
.../X64/PeiSnpSystemRamValidate.c             | 129 +++++++++
.../X64/SecSnpSystemRamValidate.c             |  23 ++
.../X64/SnpPageStateChangeInternal.c          | 254 ++++++++++++++++++
.../X64/SnpPageStateTrack.c                   | 119 ++++++++
.../X64/SnpPageStateTrack.h                   |  36 +++
.../X64/SnpSetPageState.h                     |  27 ++
.../BaseMemEncryptSevLib/X64/VirtualMemory.h  |  19 ++
.../Library/GhcbRegisterLib/GhcbRegisterLib.c |  97 +++++++
.../GhcbRegisterLib/GhcbRegisterLib.inf       |  33 +++
OvmfPkg/Library/VmgExitLib/SecVmgExitLib.inf  |   4 +
OvmfPkg/Library/VmgExitLib/VmgExitLib.inf     |   7 +
OvmfPkg/Library/VmgExitLib/VmgExitVcHandler.c |  45 ++++
OvmfPkg/OvmfPkg.dec                           |  12 +
OvmfPkg/OvmfPkgX64.dsc                        |   1 +
OvmfPkg/OvmfPkgX64.fdf                        |  33 ++-
OvmfPkg/PlatformPei/AmdSev.c                  |  52 ++++
OvmfPkg/PlatformPei/PlatformPei.inf           |   2 +
OvmfPkg/ResetVector/Ia16/ResetVectorVtf0.asm  |  24 ++
OvmfPkg/ResetVector/Ia32/PageTables64.asm     | 106 ++++++++
OvmfPkg/ResetVector/ResetVector.inf           |   5 +
OvmfPkg/ResetVector/ResetVector.nasmb         |   4 +
OvmfPkg/Sec/SecMain.c                         | 102 +++++++
OvmfPkg/Sec/SecMain.inf                       |   2 +
UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |   1 +
UefiCpuPkg/Library/MpInitLib/MpEqu.inc        |   1 +
UefiCpuPkg/Library/MpInitLib/MpLib.c          |   2 +
UefiCpuPkg/Library/MpInitLib/MpLib.h          |   2 +
UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |   1 +
UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm |  51 ++++
UefiCpuPkg/UefiCpuPkg.dec                     |   6 +
47 files changed, 1790 insertions(+), 19 deletions(-)
create mode 100644 MdePkg/Library/BaseLib/X64/Pvalidate.nasm
create mode 100644 OvmfPkg/Include/Library/GhcbRegisterLib.h
create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/Ia32/SnpPageStateChange.c
create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/SnpPageStateChange.h
create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiDxeSnpSetPageState.c
create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiSnpSystemRamValidate.c
create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SecSnpSystemRamValidate.c
create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateChangeInternal.c
create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateTrack.c
create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateTrack.h
create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpSetPageState.h
create mode 100644 OvmfPkg/Library/GhcbRegisterLib/GhcbRegisterLib.c
create mode 100644 OvmfPkg/Library/GhcbRegisterLib/GhcbRegisterLib.inf
[edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support
Posted by Brijesh Singh 3 years ago
BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=3275

SEV-SNP builds upon existing SEV and SEV-ES functionality while adding
new hardware-based memory protections. SEV-SNP adds strong memory integrity
protection to help prevent malicious hypervisor-based attacks like data
replay, memory re-mapping and more in order to create an isolated memory
encryption environment.
 
This series provides the basic building blocks to support booting the SEV-SNP
VMs, it does not cover all the security enhancement introduced by the SEV-SNP
such as interrupt protection.

Many of the integrity guarantees of SEV-SNP are enforced through a new
structure called the Reverse Map Table (RMP). Adding a new page to SEV-SNP
VM requires a 2-step process. First, the hypervisor assigns a page to the
guest using the new RMPUPDATE instruction. This transitions the page to
guest-invalid. Second, the guest validates the page using the new PVALIDATE
instruction. The SEV-SNP VMs can use the new "Page State Change Request NAE"
defined in the GHCB specification to ask hypervisor to add or remove page
from the RMP table.
 
Each page assigned to the SEV-SNP VM can either be validated or unvalidated,
as indicated by the Validated flag in the page's RMP entry. There are two
approaches that can be taken for the page validation: Pre-validation and
Lazy Validation.
  
Under pre-validation, the pages are validated prior to first use. And under
lazy validation, pages are validated when first accessed. An access to a
unvalidated page results in a #VC exception, at which time the exception
handler may validate the page. Lazy validation requires careful tracking of
the validated pages to avoid validating the same GPA more than once. The
recently introduced "Unaccepted" memory type can be used to communicate the
unvalidated memory ranges to the Guest OS.

At this time we only support the pre-validation. OVMF detects all the available
system RAM in the PEI phase. When SEV-SNP is enabled, the memory is validated
before it is made available to the EDK2 core.

This series does not implements the following SEV-SNP features yet:

* CPUID filtering
* AP bring up using the new SEV-SNP NAE
* Lazy validation
* Interrupt security

The series is based on commit:
e542e05d4f UefiCpuPkg/SmmCpuFeaturesLib: Abstract PcdCpuMaxLogicalProcessorNumber

Additional resources
---------------------
SEV-SNP whitepaper
https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf
 
APM 2: https://www.amd.com/system/files/TechDocs/24593.pdf (section 15.36)

The complete source is available at
https://github.com/AMDESE/ovmf/tree/sev-snp-rfc-1

GHCB spec v2:
  The draft specification is posted on AMD-SEV-SNP mailing list:
   https://lists.suse.com/mailman/private/amd-sev-snp/

  Copy of the spec is also available at 
  https://github.com/AMDESE/AMDSEV/blob/sev-snp-devel/docs/56421-Guest_Hypervisor_Communication_Block_Standardization.pdf

GHCB spec v1:
SEV-SNP firmware specification:
 https://developer.amd.com/sev/
  
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Min Xu <min.m.xu@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
Cc: Laszlo Ersek <lersek@redhat.com>

Brijesh Singh (19):
  OvmfPkg: Reserve the Secrets and Cpuid page for the SEV-SNP guest
  OvmfPkg: validate the data pages used in the SEC phase
  MdePkg: Expand the SEV MSR to include the SNP definition
  OvmfPkg/MemEncryptSevLib: add MemEncryptSevSnpEnabled()
  MdePkg: Define the GHCB GPA structure
  UefiCpuPkg/MpLib: add support to register GHCB GPA when SEV-SNP is
    enabled
  OvmfPkg: Add a library to support registering GHCB GPA
  OvmfPkg: register GHCB gpa for the SEV-SNP guest
  MdePkg: Add AsmPvalidate() support
  OvmfPkg: Define the Page State Change VMGEXIT structures
  OvmfPkg/ResetVector: Invalidate the GHCB page
  OvmfPkg/MemEncryptSevLib: Add support to validate system RAM
  OvmfPkg/SecMain: Validate the data/code pages used for the PEI phase
  OvmfPkg/MemEncryptSevLib: Add support to validate RAM in PEI phase
  OvmfPkg/PlatformPei: Validate the system RAM when SNP is active
  OvmfPkg/MemEncryptSevLib: Add support to validate > 4GB memory in PEI
    phase
  OvmfPkg/VmgExitLib: Allow PMBASE register access in Dxe phase
  OvmfPkg/MemEncryptSevLib: Validate the memory during set or clear enc
    attribute
  OvmfPkg/MemEncryptSevLib: Skip page state change for non RAM region

 MdePkg/Include/Library/BaseLib.h              |  37 +++
 MdePkg/Include/Register/Amd/Fam17Msr.h        |  31 ++-
 MdePkg/Include/Register/Amd/Ghcb.h            |  39 ++-
 MdePkg/Library/BaseLib/BaseLib.inf            |   1 +
 MdePkg/Library/BaseLib/X64/Pvalidate.nasm     |  43 +++
 OvmfPkg/Include/Library/GhcbRegisterLib.h     |  27 ++
 OvmfPkg/Include/Library/MemEncryptSevLib.h    |  30 +++
 .../DxeMemEncryptSevLib.inf                   |   7 +
 .../DxeMemEncryptSevLibInternal.c             |  27 ++
 .../Ia32/SnpPageStateChange.c                 |  17 ++
 .../PeiMemEncryptSevLib.inf                   |   9 +
 .../PeiMemEncryptSevLibInternal.c             |  47 ++++
 .../SecMemEncryptSevLib.inf                   |   4 +
 .../SecMemEncryptSevLibInternal.c             |  39 +++
 .../BaseMemEncryptSevLib/SnpPageStateChange.h |  37 +++
 .../X64/PeiDxeSnpSetPageState.c               |  63 +++++
 .../X64/PeiDxeVirtualMemory.c                 | 151 ++++++++++-
 .../X64/PeiSnpSystemRamValidate.c             | 129 +++++++++
 .../X64/SecSnpSystemRamValidate.c             |  23 ++
 .../X64/SnpPageStateChangeInternal.c          | 254 ++++++++++++++++++
 .../X64/SnpPageStateTrack.c                   | 119 ++++++++
 .../X64/SnpPageStateTrack.h                   |  36 +++
 .../X64/SnpSetPageState.h                     |  27 ++
 .../BaseMemEncryptSevLib/X64/VirtualMemory.h  |  19 ++
 .../Library/GhcbRegisterLib/GhcbRegisterLib.c |  97 +++++++
 .../GhcbRegisterLib/GhcbRegisterLib.inf       |  33 +++
 OvmfPkg/Library/VmgExitLib/SecVmgExitLib.inf  |   4 +
 OvmfPkg/Library/VmgExitLib/VmgExitLib.inf     |   7 +
 OvmfPkg/Library/VmgExitLib/VmgExitVcHandler.c |  45 ++++
 OvmfPkg/OvmfPkg.dec                           |  12 +
 OvmfPkg/OvmfPkgX64.dsc                        |   1 +
 OvmfPkg/OvmfPkgX64.fdf                        |  33 ++-
 OvmfPkg/PlatformPei/AmdSev.c                  |  52 ++++
 OvmfPkg/PlatformPei/PlatformPei.inf           |   2 +
 OvmfPkg/ResetVector/Ia16/ResetVectorVtf0.asm  |  24 ++
 OvmfPkg/ResetVector/Ia32/PageTables64.asm     | 106 ++++++++
 OvmfPkg/ResetVector/ResetVector.inf           |   5 +
 OvmfPkg/ResetVector/ResetVector.nasmb         |   4 +
 OvmfPkg/Sec/SecMain.c                         | 102 +++++++
 OvmfPkg/Sec/SecMain.inf                       |   2 +
 UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |   1 +
 UefiCpuPkg/Library/MpInitLib/MpEqu.inc        |   1 +
 UefiCpuPkg/Library/MpInitLib/MpLib.c          |   2 +
 UefiCpuPkg/Library/MpInitLib/MpLib.h          |   2 +
 UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |   1 +
 UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm |  51 ++++
 UefiCpuPkg/UefiCpuPkg.dec                     |   6 +
 47 files changed, 1790 insertions(+), 19 deletions(-)
 create mode 100644 MdePkg/Library/BaseLib/X64/Pvalidate.nasm
 create mode 100644 OvmfPkg/Include/Library/GhcbRegisterLib.h
 create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/Ia32/SnpPageStateChange.c
 create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/SnpPageStateChange.h
 create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiDxeSnpSetPageState.c
 create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiSnpSystemRamValidate.c
 create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SecSnpSystemRamValidate.c
 create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateChangeInternal.c
 create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateTrack.c
 create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateTrack.h
 create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpSetPageState.h
 create mode 100644 OvmfPkg/Library/GhcbRegisterLib/GhcbRegisterLib.c
 create mode 100644 OvmfPkg/Library/GhcbRegisterLib/GhcbRegisterLib.inf

-- 
2.17.1



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#73216): https://edk2.groups.io/g/devel/message/73216
Mute This Topic: https://groups.io/mt/81584576/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support
Posted by Laszlo Ersek 3 years ago
On 03/24/21 16:31, Brijesh Singh wrote:
> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=3275
> 
> SEV-SNP builds upon existing SEV and SEV-ES functionality while adding
> new hardware-based memory protections. SEV-SNP adds strong memory integrity
> protection to help prevent malicious hypervisor-based attacks like data
> replay, memory re-mapping and more in order to create an isolated memory
> encryption environment.
>  
> This series provides the basic building blocks to support booting the SEV-SNP
> VMs, it does not cover all the security enhancement introduced by the SEV-SNP
> such as interrupt protection.

Thanks, Brijesh. I'm adding the series to my review queue. Due to some
PTO coming up, I'll probably start reviewing this work only in April.
Other reviewers, please feel free to have at it.

Cheers
Laszlo

> 
> Many of the integrity guarantees of SEV-SNP are enforced through a new
> structure called the Reverse Map Table (RMP). Adding a new page to SEV-SNP
> VM requires a 2-step process. First, the hypervisor assigns a page to the
> guest using the new RMPUPDATE instruction. This transitions the page to
> guest-invalid. Second, the guest validates the page using the new PVALIDATE
> instruction. The SEV-SNP VMs can use the new "Page State Change Request NAE"
> defined in the GHCB specification to ask hypervisor to add or remove page
> from the RMP table.
>  
> Each page assigned to the SEV-SNP VM can either be validated or unvalidated,
> as indicated by the Validated flag in the page's RMP entry. There are two
> approaches that can be taken for the page validation: Pre-validation and
> Lazy Validation.
>   
> Under pre-validation, the pages are validated prior to first use. And under
> lazy validation, pages are validated when first accessed. An access to a
> unvalidated page results in a #VC exception, at which time the exception
> handler may validate the page. Lazy validation requires careful tracking of
> the validated pages to avoid validating the same GPA more than once. The
> recently introduced "Unaccepted" memory type can be used to communicate the
> unvalidated memory ranges to the Guest OS.
> 
> At this time we only support the pre-validation. OVMF detects all the available
> system RAM in the PEI phase. When SEV-SNP is enabled, the memory is validated
> before it is made available to the EDK2 core.
> 
> This series does not implements the following SEV-SNP features yet:
> 
> * CPUID filtering
> * AP bring up using the new SEV-SNP NAE
> * Lazy validation
> * Interrupt security
> 
> The series is based on commit:
> e542e05d4f UefiCpuPkg/SmmCpuFeaturesLib: Abstract PcdCpuMaxLogicalProcessorNumber
> 
> Additional resources
> ---------------------
> SEV-SNP whitepaper
> https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf
>  
> APM 2: https://www.amd.com/system/files/TechDocs/24593.pdf (section 15.36)
> 
> The complete source is available at
> https://github.com/AMDESE/ovmf/tree/sev-snp-rfc-1
> 
> GHCB spec v2:
>   The draft specification is posted on AMD-SEV-SNP mailing list:
>    https://lists.suse.com/mailman/private/amd-sev-snp/
> 
>   Copy of the spec is also available at 
>   https://github.com/AMDESE/AMDSEV/blob/sev-snp-devel/docs/56421-Guest_Hypervisor_Communication_Block_Standardization.pdf
> 
> GHCB spec v1:
> SEV-SNP firmware specification:
>  https://developer.amd.com/sev/
>   
> Cc: James Bottomley <jejb@linux.ibm.com>
> Cc: Min Xu <min.m.xu@intel.com>
> Cc: Jiewen Yao <jiewen.yao@intel.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
> Cc: Laszlo Ersek <lersek@redhat.com>
> 
> Brijesh Singh (19):
>   OvmfPkg: Reserve the Secrets and Cpuid page for the SEV-SNP guest
>   OvmfPkg: validate the data pages used in the SEC phase
>   MdePkg: Expand the SEV MSR to include the SNP definition
>   OvmfPkg/MemEncryptSevLib: add MemEncryptSevSnpEnabled()
>   MdePkg: Define the GHCB GPA structure
>   UefiCpuPkg/MpLib: add support to register GHCB GPA when SEV-SNP is
>     enabled
>   OvmfPkg: Add a library to support registering GHCB GPA
>   OvmfPkg: register GHCB gpa for the SEV-SNP guest
>   MdePkg: Add AsmPvalidate() support
>   OvmfPkg: Define the Page State Change VMGEXIT structures
>   OvmfPkg/ResetVector: Invalidate the GHCB page
>   OvmfPkg/MemEncryptSevLib: Add support to validate system RAM
>   OvmfPkg/SecMain: Validate the data/code pages used for the PEI phase
>   OvmfPkg/MemEncryptSevLib: Add support to validate RAM in PEI phase
>   OvmfPkg/PlatformPei: Validate the system RAM when SNP is active
>   OvmfPkg/MemEncryptSevLib: Add support to validate > 4GB memory in PEI
>     phase
>   OvmfPkg/VmgExitLib: Allow PMBASE register access in Dxe phase
>   OvmfPkg/MemEncryptSevLib: Validate the memory during set or clear enc
>     attribute
>   OvmfPkg/MemEncryptSevLib: Skip page state change for non RAM region
> 
>  MdePkg/Include/Library/BaseLib.h              |  37 +++
>  MdePkg/Include/Register/Amd/Fam17Msr.h        |  31 ++-
>  MdePkg/Include/Register/Amd/Ghcb.h            |  39 ++-
>  MdePkg/Library/BaseLib/BaseLib.inf            |   1 +
>  MdePkg/Library/BaseLib/X64/Pvalidate.nasm     |  43 +++
>  OvmfPkg/Include/Library/GhcbRegisterLib.h     |  27 ++
>  OvmfPkg/Include/Library/MemEncryptSevLib.h    |  30 +++
>  .../DxeMemEncryptSevLib.inf                   |   7 +
>  .../DxeMemEncryptSevLibInternal.c             |  27 ++
>  .../Ia32/SnpPageStateChange.c                 |  17 ++
>  .../PeiMemEncryptSevLib.inf                   |   9 +
>  .../PeiMemEncryptSevLibInternal.c             |  47 ++++
>  .../SecMemEncryptSevLib.inf                   |   4 +
>  .../SecMemEncryptSevLibInternal.c             |  39 +++
>  .../BaseMemEncryptSevLib/SnpPageStateChange.h |  37 +++
>  .../X64/PeiDxeSnpSetPageState.c               |  63 +++++
>  .../X64/PeiDxeVirtualMemory.c                 | 151 ++++++++++-
>  .../X64/PeiSnpSystemRamValidate.c             | 129 +++++++++
>  .../X64/SecSnpSystemRamValidate.c             |  23 ++
>  .../X64/SnpPageStateChangeInternal.c          | 254 ++++++++++++++++++
>  .../X64/SnpPageStateTrack.c                   | 119 ++++++++
>  .../X64/SnpPageStateTrack.h                   |  36 +++
>  .../X64/SnpSetPageState.h                     |  27 ++
>  .../BaseMemEncryptSevLib/X64/VirtualMemory.h  |  19 ++
>  .../Library/GhcbRegisterLib/GhcbRegisterLib.c |  97 +++++++
>  .../GhcbRegisterLib/GhcbRegisterLib.inf       |  33 +++
>  OvmfPkg/Library/VmgExitLib/SecVmgExitLib.inf  |   4 +
>  OvmfPkg/Library/VmgExitLib/VmgExitLib.inf     |   7 +
>  OvmfPkg/Library/VmgExitLib/VmgExitVcHandler.c |  45 ++++
>  OvmfPkg/OvmfPkg.dec                           |  12 +
>  OvmfPkg/OvmfPkgX64.dsc                        |   1 +
>  OvmfPkg/OvmfPkgX64.fdf                        |  33 ++-
>  OvmfPkg/PlatformPei/AmdSev.c                  |  52 ++++
>  OvmfPkg/PlatformPei/PlatformPei.inf           |   2 +
>  OvmfPkg/ResetVector/Ia16/ResetVectorVtf0.asm  |  24 ++
>  OvmfPkg/ResetVector/Ia32/PageTables64.asm     | 106 ++++++++
>  OvmfPkg/ResetVector/ResetVector.inf           |   5 +
>  OvmfPkg/ResetVector/ResetVector.nasmb         |   4 +
>  OvmfPkg/Sec/SecMain.c                         | 102 +++++++
>  OvmfPkg/Sec/SecMain.inf                       |   2 +
>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |   1 +
>  UefiCpuPkg/Library/MpInitLib/MpEqu.inc        |   1 +
>  UefiCpuPkg/Library/MpInitLib/MpLib.c          |   2 +
>  UefiCpuPkg/Library/MpInitLib/MpLib.h          |   2 +
>  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |   1 +
>  UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm |  51 ++++
>  UefiCpuPkg/UefiCpuPkg.dec                     |   6 +
>  47 files changed, 1790 insertions(+), 19 deletions(-)
>  create mode 100644 MdePkg/Library/BaseLib/X64/Pvalidate.nasm
>  create mode 100644 OvmfPkg/Include/Library/GhcbRegisterLib.h
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/Ia32/SnpPageStateChange.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/SnpPageStateChange.h
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiDxeSnpSetPageState.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiSnpSystemRamValidate.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SecSnpSystemRamValidate.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateChangeInternal.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateTrack.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateTrack.h
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpSetPageState.h
>  create mode 100644 OvmfPkg/Library/GhcbRegisterLib/GhcbRegisterLib.c
>  create mode 100644 OvmfPkg/Library/GhcbRegisterLib/GhcbRegisterLib.inf
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#73236): https://edk2.groups.io/g/devel/message/73236
Mute This Topic: https://groups.io/mt/81584576/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support
Posted by Laszlo Ersek 3 years ago
Hi Brijesh,

On 03/24/21 16:31, Brijesh Singh wrote:
> BZ: https://bugzilla.tianocore.org/show_bug.cgi?id=3275
> 
> SEV-SNP builds upon existing SEV and SEV-ES functionality while adding
> new hardware-based memory protections. SEV-SNP adds strong memory integrity
> protection to help prevent malicious hypervisor-based attacks like data
> replay, memory re-mapping and more in order to create an isolated memory
> encryption environment.
>  
> This series provides the basic building blocks to support booting the SEV-SNP
> VMs, it does not cover all the security enhancement introduced by the SEV-SNP
> such as interrupt protection.
> 
> Many of the integrity guarantees of SEV-SNP are enforced through a new
> structure called the Reverse Map Table (RMP). Adding a new page to SEV-SNP
> VM requires a 2-step process. First, the hypervisor assigns a page to the
> guest using the new RMPUPDATE instruction. This transitions the page to
> guest-invalid. Second, the guest validates the page using the new PVALIDATE
> instruction. The SEV-SNP VMs can use the new "Page State Change Request NAE"
> defined in the GHCB specification to ask hypervisor to add or remove page
> from the RMP table.
>  
> Each page assigned to the SEV-SNP VM can either be validated or unvalidated,
> as indicated by the Validated flag in the page's RMP entry. There are two
> approaches that can be taken for the page validation: Pre-validation and
> Lazy Validation.
>   
> Under pre-validation, the pages are validated prior to first use. And under
> lazy validation, pages are validated when first accessed. An access to a
> unvalidated page results in a #VC exception, at which time the exception
> handler may validate the page. Lazy validation requires careful tracking of
> the validated pages to avoid validating the same GPA more than once. The
> recently introduced "Unaccepted" memory type can be used to communicate the
> unvalidated memory ranges to the Guest OS.
> 
> At this time we only support the pre-validation. OVMF detects all the available
> system RAM in the PEI phase. When SEV-SNP is enabled, the memory is validated
> before it is made available to the EDK2 core.

Can you describe this in a bit more detail, before I look at the
individual patches? Specifically, what existing logic in the PEI phase
was taken, and extended, and how?

If there is a particular patch whose commit message is closely related
to my question, can you point it out? Patch#15 perhaps? (Doesn't seem
like a big patch; for some reason I'd expect something more complex, but
perhaps that's only because it builds upon the many earlier patches.)

Thanks,
Laszlo

> 
> This series does not implements the following SEV-SNP features yet:
> 
> * CPUID filtering
> * AP bring up using the new SEV-SNP NAE
> * Lazy validation
> * Interrupt security
> 
> The series is based on commit:
> e542e05d4f UefiCpuPkg/SmmCpuFeaturesLib: Abstract PcdCpuMaxLogicalProcessorNumber
> 
> Additional resources
> ---------------------
> SEV-SNP whitepaper
> https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf
>  
> APM 2: https://www.amd.com/system/files/TechDocs/24593.pdf (section 15.36)
> 
> The complete source is available at
> https://github.com/AMDESE/ovmf/tree/sev-snp-rfc-1
> 
> GHCB spec v2:
>   The draft specification is posted on AMD-SEV-SNP mailing list:
>    https://lists.suse.com/mailman/private/amd-sev-snp/
> 
>   Copy of the spec is also available at 
>   https://github.com/AMDESE/AMDSEV/blob/sev-snp-devel/docs/56421-Guest_Hypervisor_Communication_Block_Standardization.pdf
> 
> GHCB spec v1:
> SEV-SNP firmware specification:
>  https://developer.amd.com/sev/
>   
> Cc: James Bottomley <jejb@linux.ibm.com>
> Cc: Min Xu <min.m.xu@intel.com>
> Cc: Jiewen Yao <jiewen.yao@intel.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: Jordan Justen <jordan.l.justen@intel.com>
> Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
> Cc: Laszlo Ersek <lersek@redhat.com>
> 
> Brijesh Singh (19):
>   OvmfPkg: Reserve the Secrets and Cpuid page for the SEV-SNP guest
>   OvmfPkg: validate the data pages used in the SEC phase
>   MdePkg: Expand the SEV MSR to include the SNP definition
>   OvmfPkg/MemEncryptSevLib: add MemEncryptSevSnpEnabled()
>   MdePkg: Define the GHCB GPA structure
>   UefiCpuPkg/MpLib: add support to register GHCB GPA when SEV-SNP is
>     enabled
>   OvmfPkg: Add a library to support registering GHCB GPA
>   OvmfPkg: register GHCB gpa for the SEV-SNP guest
>   MdePkg: Add AsmPvalidate() support
>   OvmfPkg: Define the Page State Change VMGEXIT structures
>   OvmfPkg/ResetVector: Invalidate the GHCB page
>   OvmfPkg/MemEncryptSevLib: Add support to validate system RAM
>   OvmfPkg/SecMain: Validate the data/code pages used for the PEI phase
>   OvmfPkg/MemEncryptSevLib: Add support to validate RAM in PEI phase
>   OvmfPkg/PlatformPei: Validate the system RAM when SNP is active
>   OvmfPkg/MemEncryptSevLib: Add support to validate > 4GB memory in PEI
>     phase
>   OvmfPkg/VmgExitLib: Allow PMBASE register access in Dxe phase
>   OvmfPkg/MemEncryptSevLib: Validate the memory during set or clear enc
>     attribute
>   OvmfPkg/MemEncryptSevLib: Skip page state change for non RAM region
> 
>  MdePkg/Include/Library/BaseLib.h              |  37 +++
>  MdePkg/Include/Register/Amd/Fam17Msr.h        |  31 ++-
>  MdePkg/Include/Register/Amd/Ghcb.h            |  39 ++-
>  MdePkg/Library/BaseLib/BaseLib.inf            |   1 +
>  MdePkg/Library/BaseLib/X64/Pvalidate.nasm     |  43 +++
>  OvmfPkg/Include/Library/GhcbRegisterLib.h     |  27 ++
>  OvmfPkg/Include/Library/MemEncryptSevLib.h    |  30 +++
>  .../DxeMemEncryptSevLib.inf                   |   7 +
>  .../DxeMemEncryptSevLibInternal.c             |  27 ++
>  .../Ia32/SnpPageStateChange.c                 |  17 ++
>  .../PeiMemEncryptSevLib.inf                   |   9 +
>  .../PeiMemEncryptSevLibInternal.c             |  47 ++++
>  .../SecMemEncryptSevLib.inf                   |   4 +
>  .../SecMemEncryptSevLibInternal.c             |  39 +++
>  .../BaseMemEncryptSevLib/SnpPageStateChange.h |  37 +++
>  .../X64/PeiDxeSnpSetPageState.c               |  63 +++++
>  .../X64/PeiDxeVirtualMemory.c                 | 151 ++++++++++-
>  .../X64/PeiSnpSystemRamValidate.c             | 129 +++++++++
>  .../X64/SecSnpSystemRamValidate.c             |  23 ++
>  .../X64/SnpPageStateChangeInternal.c          | 254 ++++++++++++++++++
>  .../X64/SnpPageStateTrack.c                   | 119 ++++++++
>  .../X64/SnpPageStateTrack.h                   |  36 +++
>  .../X64/SnpSetPageState.h                     |  27 ++
>  .../BaseMemEncryptSevLib/X64/VirtualMemory.h  |  19 ++
>  .../Library/GhcbRegisterLib/GhcbRegisterLib.c |  97 +++++++
>  .../GhcbRegisterLib/GhcbRegisterLib.inf       |  33 +++
>  OvmfPkg/Library/VmgExitLib/SecVmgExitLib.inf  |   4 +
>  OvmfPkg/Library/VmgExitLib/VmgExitLib.inf     |   7 +
>  OvmfPkg/Library/VmgExitLib/VmgExitVcHandler.c |  45 ++++
>  OvmfPkg/OvmfPkg.dec                           |  12 +
>  OvmfPkg/OvmfPkgX64.dsc                        |   1 +
>  OvmfPkg/OvmfPkgX64.fdf                        |  33 ++-
>  OvmfPkg/PlatformPei/AmdSev.c                  |  52 ++++
>  OvmfPkg/PlatformPei/PlatformPei.inf           |   2 +
>  OvmfPkg/ResetVector/Ia16/ResetVectorVtf0.asm  |  24 ++
>  OvmfPkg/ResetVector/Ia32/PageTables64.asm     | 106 ++++++++
>  OvmfPkg/ResetVector/ResetVector.inf           |   5 +
>  OvmfPkg/ResetVector/ResetVector.nasmb         |   4 +
>  OvmfPkg/Sec/SecMain.c                         | 102 +++++++
>  OvmfPkg/Sec/SecMain.inf                       |   2 +
>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |   1 +
>  UefiCpuPkg/Library/MpInitLib/MpEqu.inc        |   1 +
>  UefiCpuPkg/Library/MpInitLib/MpLib.c          |   2 +
>  UefiCpuPkg/Library/MpInitLib/MpLib.h          |   2 +
>  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |   1 +
>  UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm |  51 ++++
>  UefiCpuPkg/UefiCpuPkg.dec                     |   6 +
>  47 files changed, 1790 insertions(+), 19 deletions(-)
>  create mode 100644 MdePkg/Library/BaseLib/X64/Pvalidate.nasm
>  create mode 100644 OvmfPkg/Include/Library/GhcbRegisterLib.h
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/Ia32/SnpPageStateChange.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/SnpPageStateChange.h
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiDxeSnpSetPageState.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiSnpSystemRamValidate.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SecSnpSystemRamValidate.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateChangeInternal.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateTrack.c
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateTrack.h
>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpSetPageState.h
>  create mode 100644 OvmfPkg/Library/GhcbRegisterLib/GhcbRegisterLib.c
>  create mode 100644 OvmfPkg/Library/GhcbRegisterLib/GhcbRegisterLib.inf
> 



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#73839): https://edk2.groups.io/g/devel/message/73839
Mute This Topic: https://groups.io/mt/81584576/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support
Posted by Brijesh Singh 3 years ago
Hi Laszlo,

On 4/8/21 4:58 AM, Laszlo Ersek wrote:
> Hi Brijesh,
>
> On 03/24/21 16:31, Brijesh Singh wrote:
>> BZ: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.tianocore.org%2Fshow_bug.cgi%3Fid%3D3275&amp;data=04%7C01%7Cbrijesh.singh%40amd.com%7Cb8ee3abf81aa4e5b1e6008d8fa74dbbc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637534728122143364%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=G2AQ%2FCks3%2BbczHJXMwqlqpWpoBJmb0pmxb1VNLw6t%2BA%3D&amp;reserved=0
>>
>> SEV-SNP builds upon existing SEV and SEV-ES functionality while adding
>> new hardware-based memory protections. SEV-SNP adds strong memory integrity
>> protection to help prevent malicious hypervisor-based attacks like data
>> replay, memory re-mapping and more in order to create an isolated memory
>> encryption environment.
>>  
>> This series provides the basic building blocks to support booting the SEV-SNP
>> VMs, it does not cover all the security enhancement introduced by the SEV-SNP
>> such as interrupt protection.
>>
>> Many of the integrity guarantees of SEV-SNP are enforced through a new
>> structure called the Reverse Map Table (RMP). Adding a new page to SEV-SNP
>> VM requires a 2-step process. First, the hypervisor assigns a page to the
>> guest using the new RMPUPDATE instruction. This transitions the page to
>> guest-invalid. Second, the guest validates the page using the new PVALIDATE
>> instruction. The SEV-SNP VMs can use the new "Page State Change Request NAE"
>> defined in the GHCB specification to ask hypervisor to add or remove page
>> from the RMP table.
>>  
>> Each page assigned to the SEV-SNP VM can either be validated or unvalidated,
>> as indicated by the Validated flag in the page's RMP entry. There are two
>> approaches that can be taken for the page validation: Pre-validation and
>> Lazy Validation.
>>   
>> Under pre-validation, the pages are validated prior to first use. And under
>> lazy validation, pages are validated when first accessed. An access to a
>> unvalidated page results in a #VC exception, at which time the exception
>> handler may validate the page. Lazy validation requires careful tracking of
>> the validated pages to avoid validating the same GPA more than once. The
>> recently introduced "Unaccepted" memory type can be used to communicate the
>> unvalidated memory ranges to the Guest OS.
>>
>> At this time we only support the pre-validation. OVMF detects all the available
>> system RAM in the PEI phase. When SEV-SNP is enabled, the memory is validated
>> before it is made available to the EDK2 core.
> Can you describe this in a bit more detail, before I look at the
> individual patches? Specifically, what existing logic in the PEI phase
> was taken, and extended, and how?

One of the key requirement is that the guest private pages much be
validated before the access. If guest tries to access the pages before
the validation then it will result in #VC (page-not-validated)
exception. To avoid the #VC, we propose the validating the memory before
the access. We will incrementally add the support to lazy validate (i.e
validate on access).

Let me try explaining a bit, the page validation process consist of two
steps:

1. Add the pages in the RMP table -- must be done by the hypervisor
using the RMPUPDATE instruction. The guest can use VMGEXIT NAEs to ask
hypervisor to add or remove pages from the RMP table.

2. Guest issue the PVALIDATE instruction -- this sets the validate bit
in the RMP table.

Similar to SEV, the OVMF_CODE.fd is encrypted through the SNP firmware
before the launch. The SNP firmware also validates the memory page after
encrypting. This allows us to boot the initial entry code without guest
going through the validation process.

The OVMF reset vector uses few data pages (e.g page table, early Sec
stack). Access to these data pages will result in #VC. There are two
approaches we can take to validate these data pages:

1. Ask SNP firmware to pre-validate it -- SNP firmware provides an
special command that can be used to pre-validate the pages without
affecting the measurement.

2. Enhance the reset vector code to validate the pages.

For now I choose #1.

The pre-validation performed by the SNP firmware is sufficient to boot
through the SEC phase. The SEC phase later decompress the Fv to a new
memory location. Now we need the OVMF to take over the validation
procedure.  The series extends the MemEncryptSevLib to add a new helper
MemEncryptSevSnpValidateRam(). The helper is used to validate the system
RAM. See patch #12. SEC phase calls the MemEncryptSevSnpValidateRam() to
validate the output buffer used for the decompression. This was
sufficient to boot into the PEI phase, see patch #13. The PEI detects
all the available system RAM. After the memory detection is completed
the PlatformPei calls the AmdSevSnpInitialize(). The initialization
routine iterate through the HOB and calls the
MemEncryptSevSnpValidateRam() to validate all the system RAM. Is it
possible the more system ram can be detected after the PlatformPei is
completed ?

One of the important thing is we should *never* validate the pages
twice. The MemEncryptSevSnpValidateRam() uses a interval search tree to
keep the record of what has been validated. Before validating the range,
it lookup in its tree and if it finds that range is already validated
then do nothing. If it detects an overlap then it will validate only non
overlapping regions -- see patch #14.

The patch #18 extend the MemEncrypt{Set,Clear}PageEncMask() to call the
SNP page state change during the C-bit toggle.

Please let me know if you have any questions. We can hash out the design
here before you taking a closure look at the code.

>
> If there is a particular patch whose commit message is closely related
> to my question, can you point it out? Patch#15 perhaps? (Doesn't seem
> like a big patch; for some reason I'd expect something more complex, but
> perhaps that's only because it builds upon the many earlier patches.)
>
> Thanks,
> Laszlo
>
>> This series does not implements the following SEV-SNP features yet:
>>
>> * CPUID filtering
>> * AP bring up using the new SEV-SNP NAE
>> * Lazy validation
>> * Interrupt security
>>
>> The series is based on commit:
>> e542e05d4f UefiCpuPkg/SmmCpuFeaturesLib: Abstract PcdCpuMaxLogicalProcessorNumber
>>
>> Additional resources
>> ---------------------
>> SEV-SNP whitepaper
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.amd.com%2Fsystem%2Ffiles%2FTechDocs%2FSEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf&amp;data=04%7C01%7Cbrijesh.singh%40amd.com%7Cb8ee3abf81aa4e5b1e6008d8fa74dbbc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637534728122143364%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=uUqlVHhWQ6geGDaNHwxGMpoSpIamB%2F1vHH69h%2FEGUro%3D&amp;reserved=0
>>  
>> APM 2: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.amd.com%2Fsystem%2Ffiles%2FTechDocs%2F24593.pdf&amp;data=04%7C01%7Cbrijesh.singh%40amd.com%7Cb8ee3abf81aa4e5b1e6008d8fa74dbbc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637534728122143364%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=R2kU42jvCDZat8kGZ5gDDz2nFXIawHfXdRW1aovhNK8%3D&amp;reserved=0 (section 15.36)
>>
>> The complete source is available at
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FAMDESE%2Fovmf%2Ftree%2Fsev-snp-rfc-1&amp;data=04%7C01%7Cbrijesh.singh%40amd.com%7Cb8ee3abf81aa4e5b1e6008d8fa74dbbc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637534728122143364%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=yreFg2hr%2F82WYEjxqCmb7pXUdtrRCJRYPrPHgfrWjM8%3D&amp;reserved=0
>>
>> GHCB spec v2:
>>   The draft specification is posted on AMD-SEV-SNP mailing list:
>>    https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.suse.com%2Fmailman%2Fprivate%2Famd-sev-snp%2F&amp;data=04%7C01%7Cbrijesh.singh%40amd.com%7Cb8ee3abf81aa4e5b1e6008d8fa74dbbc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637534728122143364%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=PFV2mA7T%2Fbl2zP5j52kNdT%2FavDMRLWDEDqz6JGusEFg%3D&amp;reserved=0
>>
>>   Copy of the spec is also available at 
>>   https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FAMDESE%2FAMDSEV%2Fblob%2Fsev-snp-devel%2Fdocs%2F56421-Guest_Hypervisor_Communication_Block_Standardization.pdf&amp;data=04%7C01%7Cbrijesh.singh%40amd.com%7Cb8ee3abf81aa4e5b1e6008d8fa74dbbc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637534728122143364%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=U3oKRe5m0NxI0xqv1gBoyh%2BEEX1LVeCWR42rvPh6XZ8%3D&amp;reserved=0
>>
>> GHCB spec v1:
>> SEV-SNP firmware specification:
>>  https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdeveloper.amd.com%2Fsev%2F&amp;data=04%7C01%7Cbrijesh.singh%40amd.com%7Cb8ee3abf81aa4e5b1e6008d8fa74dbbc%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637534728122143364%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=G%2BttkORsTJchJ1Fy1iNlD%2B%2BqiQZuwI8md5vhJjEb%2Fn4%3D&amp;reserved=0
>>   
>> Cc: James Bottomley <jejb@linux.ibm.com>
>> Cc: Min Xu <min.m.xu@intel.com>
>> Cc: Jiewen Yao <jiewen.yao@intel.com>
>> Cc: Tom Lendacky <thomas.lendacky@amd.com>
>> Cc: Jordan Justen <jordan.l.justen@intel.com>
>> Cc: Ard Biesheuvel <ardb+tianocore@kernel.org>
>> Cc: Laszlo Ersek <lersek@redhat.com>
>>
>> Brijesh Singh (19):
>>   OvmfPkg: Reserve the Secrets and Cpuid page for the SEV-SNP guest
>>   OvmfPkg: validate the data pages used in the SEC phase
>>   MdePkg: Expand the SEV MSR to include the SNP definition
>>   OvmfPkg/MemEncryptSevLib: add MemEncryptSevSnpEnabled()
>>   MdePkg: Define the GHCB GPA structure
>>   UefiCpuPkg/MpLib: add support to register GHCB GPA when SEV-SNP is
>>     enabled
>>   OvmfPkg: Add a library to support registering GHCB GPA
>>   OvmfPkg: register GHCB gpa for the SEV-SNP guest
>>   MdePkg: Add AsmPvalidate() support
>>   OvmfPkg: Define the Page State Change VMGEXIT structures
>>   OvmfPkg/ResetVector: Invalidate the GHCB page
>>   OvmfPkg/MemEncryptSevLib: Add support to validate system RAM
>>   OvmfPkg/SecMain: Validate the data/code pages used for the PEI phase
>>   OvmfPkg/MemEncryptSevLib: Add support to validate RAM in PEI phase
>>   OvmfPkg/PlatformPei: Validate the system RAM when SNP is active
>>   OvmfPkg/MemEncryptSevLib: Add support to validate > 4GB memory in PEI
>>     phase
>>   OvmfPkg/VmgExitLib: Allow PMBASE register access in Dxe phase
>>   OvmfPkg/MemEncryptSevLib: Validate the memory during set or clear enc
>>     attribute
>>   OvmfPkg/MemEncryptSevLib: Skip page state change for non RAM region
>>
>>  MdePkg/Include/Library/BaseLib.h              |  37 +++
>>  MdePkg/Include/Register/Amd/Fam17Msr.h        |  31 ++-
>>  MdePkg/Include/Register/Amd/Ghcb.h            |  39 ++-
>>  MdePkg/Library/BaseLib/BaseLib.inf            |   1 +
>>  MdePkg/Library/BaseLib/X64/Pvalidate.nasm     |  43 +++
>>  OvmfPkg/Include/Library/GhcbRegisterLib.h     |  27 ++
>>  OvmfPkg/Include/Library/MemEncryptSevLib.h    |  30 +++
>>  .../DxeMemEncryptSevLib.inf                   |   7 +
>>  .../DxeMemEncryptSevLibInternal.c             |  27 ++
>>  .../Ia32/SnpPageStateChange.c                 |  17 ++
>>  .../PeiMemEncryptSevLib.inf                   |   9 +
>>  .../PeiMemEncryptSevLibInternal.c             |  47 ++++
>>  .../SecMemEncryptSevLib.inf                   |   4 +
>>  .../SecMemEncryptSevLibInternal.c             |  39 +++
>>  .../BaseMemEncryptSevLib/SnpPageStateChange.h |  37 +++
>>  .../X64/PeiDxeSnpSetPageState.c               |  63 +++++
>>  .../X64/PeiDxeVirtualMemory.c                 | 151 ++++++++++-
>>  .../X64/PeiSnpSystemRamValidate.c             | 129 +++++++++
>>  .../X64/SecSnpSystemRamValidate.c             |  23 ++
>>  .../X64/SnpPageStateChangeInternal.c          | 254 ++++++++++++++++++
>>  .../X64/SnpPageStateTrack.c                   | 119 ++++++++
>>  .../X64/SnpPageStateTrack.h                   |  36 +++
>>  .../X64/SnpSetPageState.h                     |  27 ++
>>  .../BaseMemEncryptSevLib/X64/VirtualMemory.h  |  19 ++
>>  .../Library/GhcbRegisterLib/GhcbRegisterLib.c |  97 +++++++
>>  .../GhcbRegisterLib/GhcbRegisterLib.inf       |  33 +++
>>  OvmfPkg/Library/VmgExitLib/SecVmgExitLib.inf  |   4 +
>>  OvmfPkg/Library/VmgExitLib/VmgExitLib.inf     |   7 +
>>  OvmfPkg/Library/VmgExitLib/VmgExitVcHandler.c |  45 ++++
>>  OvmfPkg/OvmfPkg.dec                           |  12 +
>>  OvmfPkg/OvmfPkgX64.dsc                        |   1 +
>>  OvmfPkg/OvmfPkgX64.fdf                        |  33 ++-
>>  OvmfPkg/PlatformPei/AmdSev.c                  |  52 ++++
>>  OvmfPkg/PlatformPei/PlatformPei.inf           |   2 +
>>  OvmfPkg/ResetVector/Ia16/ResetVectorVtf0.asm  |  24 ++
>>  OvmfPkg/ResetVector/Ia32/PageTables64.asm     | 106 ++++++++
>>  OvmfPkg/ResetVector/ResetVector.inf           |   5 +
>>  OvmfPkg/ResetVector/ResetVector.nasmb         |   4 +
>>  OvmfPkg/Sec/SecMain.c                         | 102 +++++++
>>  OvmfPkg/Sec/SecMain.inf                       |   2 +
>>  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |   1 +
>>  UefiCpuPkg/Library/MpInitLib/MpEqu.inc        |   1 +
>>  UefiCpuPkg/Library/MpInitLib/MpLib.c          |   2 +
>>  UefiCpuPkg/Library/MpInitLib/MpLib.h          |   2 +
>>  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |   1 +
>>  UefiCpuPkg/Library/MpInitLib/X64/MpFuncs.nasm |  51 ++++
>>  UefiCpuPkg/UefiCpuPkg.dec                     |   6 +
>>  47 files changed, 1790 insertions(+), 19 deletions(-)
>>  create mode 100644 MdePkg/Library/BaseLib/X64/Pvalidate.nasm
>>  create mode 100644 OvmfPkg/Include/Library/GhcbRegisterLib.h
>>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/Ia32/SnpPageStateChange.c
>>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/SnpPageStateChange.h
>>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiDxeSnpSetPageState.c
>>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiSnpSystemRamValidate.c
>>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SecSnpSystemRamValidate.c
>>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateChangeInternal.c
>>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateTrack.c
>>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpPageStateTrack.h
>>  create mode 100644 OvmfPkg/Library/BaseMemEncryptSevLib/X64/SnpSetPageState.h
>>  create mode 100644 OvmfPkg/Library/GhcbRegisterLib/GhcbRegisterLib.c
>>  create mode 100644 OvmfPkg/Library/GhcbRegisterLib/GhcbRegisterLib.inf
>>


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#73846): https://edk2.groups.io/g/devel/message/73846
Mute This Topic: https://groups.io/mt/81584576/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support
Posted by Laszlo Ersek 3 years ago
On 04/08/21 13:59, Brijesh Singh wrote:
> On 4/8/21 4:58 AM, Laszlo Ersek wrote:
>> On 03/24/21 16:31, Brijesh Singh wrote:

>>> At this time we only support the pre-validation. OVMF detects all the available
>>> system RAM in the PEI phase. When SEV-SNP is enabled, the memory is validated
>>> before it is made available to the EDK2 core.
>> Can you describe this in a bit more detail, before I look at the
>> individual patches? Specifically, what existing logic in the PEI phase
>> was taken, and extended, and how?
> 
> One of the key requirement is that the guest private pages much be
> validated before the access. If guest tries to access the pages before
> the validation then it will result in #VC (page-not-validated)
> exception. To avoid the #VC, we propose the validating the memory before
> the access. We will incrementally add the support to lazy validate (i.e
> validate on access).

What is the primary threat (in simple terms please) that validation is
supposed to prevent?

And, against that particular threat, does pre-validation offer some
protection, or will that only come with lazy validation?

> 
> Let me try explaining a bit, the page validation process consist of two
> steps:
> 
> 1. Add the pages in the RMP table -- must be done by the hypervisor
> using the RMPUPDATE instruction. The guest can use VMGEXIT NAEs to ask
> hypervisor to add or remove pages from the RMP table.
> 
> 2. Guest issue the PVALIDATE instruction -- this sets the validate bit
> in the RMP table.
> 
> Similar to SEV, the OVMF_CODE.fd is encrypted through the SNP firmware
> before the launch. The SNP firmware also validates the memory page after
> encrypting. This allows us to boot the initial entry code without guest
> going through the validation process.
> 
> The OVMF reset vector uses few data pages (e.g page table, early Sec
> stack). Access to these data pages will result in #VC. There are two
> approaches we can take to validate these data pages:
> 
> 1. Ask SNP firmware to pre-validate it -- SNP firmware provides an
> special command that can be used to pre-validate the pages without
> affecting the measurement.

This means the two pflash chips, right?

> 
> 2. Enhance the reset vector code to validate the pages.
> 
> For now I choose #1.
> 
> The pre-validation performed by the SNP firmware is sufficient to boot
> through the SEC phase. The SEC phase later decompress the Fv to a new
> memory location. Now we need the OVMF to take over the validation
> procedure.  The series extends the MemEncryptSevLib to add a new helper
> MemEncryptSevSnpValidateRam(). The helper is used to validate the system
> RAM. See patch #12. SEC phase calls the MemEncryptSevSnpValidateRam() to
> validate the output buffer used for the decompression. This was
> sufficient to boot into the PEI phase, see patch #13.

Two questions here:

- Is ACPI S3 in scope for now?

- We need more areas made accessible in SEC than just the decompression
buffer; for example the temporary SEC/PEI heap and stack, and (IIRC)
various special SEV-ES areas laid out via MEMFD. Where are all of those
handled / enumerated?

> The PEI detects
> all the available system RAM. After the memory detection is completed
> the PlatformPei calls the AmdSevSnpInitialize(). The initialization
> routine iterate through the HOB and calls the
> MemEncryptSevSnpValidateRam() to validate all the system RAM. Is it
> possible the more system ram can be detected after the PlatformPei is
> completed ?

That would cause problems even without SEV-SNP (i.e., with plain SEV),
so I'm not worried about it.

> 
> One of the important thing is we should *never* validate the pages
> twice.

What are the symptoms / consequences of:

- the guest accessing an unvalidated page (I understand it causes a #VC,
but what is the direct result of that, when this series is applied?),

- doubly-validating a page?

The first question is relevant because we should crash as soon as we
touch a page we forgot to validate (we shouldn't let any corruption or
similar spread out until we finally realize there's a problem).

The second question is relevant for security I guess. What attacks
become possible, and/or what symptoms are produced, if we
doubly-validate a page?

Furthermore, IIRC, we have separate #VC handlers for the different
firmware phases; do they behave consistently / identicall when a
#VC(page-not-validated) occurs, when this patch set is applied?

My first question is basically asking whether we can *exclusively* rely
on #VC(page-not-validated) faults to tell us that we missed validating a
particular page. If we can do that, then the job is a bit easier,
because from the GPA, we can more or less also derive *when and where*
we should pre-validate the page (at least until validation is done
completely lazily).

> The MemEncryptSevSnpValidateRam() uses a interval search tree to
> keep the record of what has been validated. Before validating the range,
> it lookup in its tree and if it finds that range is already validated
> then do nothing. If it detects an overlap then it will validate only non
> overlapping regions -- see patch #14.

What data structure is used for implementing the interval tree?

I'm not necessarily looking for a data structure with "nice" asymptotic
time complexity. With pre-validation especially, I think simplicity
(ease of review) is more important for the data structure than
performance. If it's not an actual "tree", we shouldn't call it a
"tree". (An "interval tree" is usually an extension of a Red-Black Tree,
and that's not the simplest data structure in existence; although edk2
does offer an rbtree library.)

Furthermore, what you describe above is called idempotency. No matter
how many times we attempt to validate a range, it may (or may not even)
cause an actual change in the first action only. Is this property
(=idempotency) an inherent requirement of the technology, or is it a
simplification of the implementation? Put differently: if you called
CpuDeadLoop() in the validation function any time an overlapping
validation request were received, would that hugely complicate the call
sites?

I'm kind of "obsessing" about idempotency because you say we must
*never* doubly-validate a page, so the *difference* between:
- explicitly crashing on such an attempt,
- and silently ignoring such an attempt,
may be meaningful.

It's kind of the difference between "oops this is a call site *bug*, but
we patched it up", vs. "this is expected of call sites, we should just
handle it internally".

> The patch #18 extend the MemEncrypt{Set,Clear}PageEncMask() to call the
> SNP page state change during the C-bit toggle.

- When exactly do we invalidate?

- Does the time and place of invalidation depend on whether we perform
pre-validation, or lazy validation?

- Is page invalidation idempotent too?

- What is the consequence (security impact) if we forget invalidation?

- There are four page states I can imagine:
  - encrypted for host access, valid for guest access
  - encrypted for host access, invalid for guest access
  - decrypted for host access, valid for guest access
  - decrypted for host access, invalid for guest access

Does a state exist, from these four, that's never used? (Potentially
caught by the hardware?)

Do the patches highlight / explain the validity transitions, in
comments? My understanding is that the C-bit toggle and the guest access
valid/invalid toggle are separate actions, so "middle" states do exist,
but perhaps only temporarily.

I'm curious how it works, for example, with variousvirtio transfers (bus
master read, bus master write, bus master common buffer). In the
IoMmuDxe driver minimally, we access memory both through PTEs with the
C-bit set and through PTEs with the C-bit clear, meaning that
"encrypted, valid", and "decrypted, valid" are both required states. But
that seems to conflict with the notion that "C-bit toggle" be directly
coupled with a "validity toggle".

Put differently, I'm happy that modifying IoMmuDxe appears unnecessary,
but then that tells me I'm missing something about the state transitions
between the above *four* states.

> 
> Please let me know if you have any questions. We can hash out the design
> here before you taking a closure look at the code.

Sorry that I've been (and am being) slow to start reviewing this series.

Thanks,
Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#73901): https://edk2.groups.io/g/devel/message/73901
Mute This Topic: https://groups.io/mt/81584576/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support
Posted by Brijesh Singh 3 years ago
On 4/9/21 7:24 AM, Laszlo Ersek wrote:
> On 04/08/21 13:59, Brijesh Singh wrote:
>> On 4/8/21 4:58 AM, Laszlo Ersek wrote:
>>> On 03/24/21 16:31, Brijesh Singh wrote:
>>>> At this time we only support the pre-validation. OVMF detects all the available
>>>> system RAM in the PEI phase. When SEV-SNP is enabled, the memory is validated
>>>> before it is made available to the EDK2 core.
>>> Can you describe this in a bit more detail, before I look at the
>>> individual patches? Specifically, what existing logic in the PEI phase
>>> was taken, and extended, and how?
>> One of the key requirement is that the guest private pages much be
>> validated before the access. If guest tries to access the pages before
>> the validation then it will result in #VC (page-not-validated)
>> exception. To avoid the #VC, we propose the validating the memory before
>> the access. We will incrementally add the support to lazy validate (i.e
>> validate on access).
> What is the primary threat (in simple terms please) that validation is
> supposed to prevent?


To protect against the memory re-mapping attack the guest pages must be
validated. The idea is that every guest page can map only to a single
physical memory page at one time.


> And, against that particular threat, does pre-validation offer some
> protection, or will that only come with lazy validation?

For the hardware it does not matter how the memory was validated -- lazy
vs prevalidate. Both approaches use the PVALIDATE instruction to
validate the page.

In the case of pre-validation, the memory is validated before the
access. Whereas in the lazy validation, the access will cause a fault
and fault handler should validate the page and retry the access. Its
similar to a page fault handler, OS can populate the page table or back
the pages on demand.

The only downside of pre-validation is, we will take a hit on the boot
time. The GHCB spec provides method by which we can batch multiple
requests at once to minimize the context switches.


>
>> Let me try explaining a bit, the page validation process consist of two
>> steps:
>>
>> 1. Add the pages in the RMP table -- must be done by the hypervisor
>> using the RMPUPDATE instruction. The guest can use VMGEXIT NAEs to ask
>> hypervisor to add or remove pages from the RMP table.
>>
>> 2. Guest issue the PVALIDATE instruction -- this sets the validate bit
>> in the RMP table.
>>
>> Similar to SEV, the OVMF_CODE.fd is encrypted through the SNP firmware
>> before the launch. The SNP firmware also validates the memory page after
>> encrypting. This allows us to boot the initial entry code without guest
>> going through the validation process.
>>
>> The OVMF reset vector uses few data pages (e.g page table, early Sec
>> stack). Access to these data pages will result in #VC. There are two
>> approaches we can take to validate these data pages:
>>
>> 1. Ask SNP firmware to pre-validate it -- SNP firmware provides an
>> special command that can be used to pre-validate the pages without
>> affecting the measurement.
> This means the two pflash chips, right?


This does not need to be two pflash chips. The SNP firmware command does
not know anything about the ROM or pflash chip. The command accepts the
system physical address that need to be validated by the firmware. In
patch #2, OVMF provides a range of data pages that need to be validated
by the SNP firmware before booting the guest.

>
>> 2. Enhance the reset vector code to validate the pages.
>>
>> For now I choose #1.
>>
>> The pre-validation performed by the SNP firmware is sufficient to boot
>> through the SEC phase. The SEC phase later decompress the Fv to a new
>> memory location. Now we need the OVMF to take over the validation
>> procedure.  The series extends the MemEncryptSevLib to add a new helper
>> MemEncryptSevSnpValidateRam(). The helper is used to validate the system
>> RAM. See patch #12. SEC phase calls the MemEncryptSevSnpValidateRam() to
>> validate the output buffer used for the decompression. This was
>> sufficient to boot into the PEI phase, see patch #13.
> Two questions here:
>
> - Is ACPI S3 in scope for now?


Its not in the scope yet. I have not looked at it.


>
> - We need more areas made accessible in SEC than just the decompression
> buffer; for example the temporary SEC/PEI heap and stack, and (IIRC)
> various special SEV-ES areas laid out via MEMFD. Where are all of those
> handled / enumerated?


Sorry, I simplified my response by saying just decompression. You are
right that its more than the decompression buffer. In my current patch,
the SEC phase validates all the memory up to
PcdOvmfDecompressionScratchEnd (which includes more than just output
buffer). See patch #13.

>
>> The PEI detects
>> all the available system RAM. After the memory detection is completed
>> the PlatformPei calls the AmdSevSnpInitialize(). The initialization
>> routine iterate through the HOB and calls the
>> MemEncryptSevSnpValidateRam() to validate all the system RAM. Is it
>> possible the more system ram can be detected after the PlatformPei is
>> completed ?
> That would cause problems even without SEV-SNP (i.e., with plain SEV),
> so I'm not worried about it.
>
>> One of the important thing is we should *never* validate the pages
>> twice.
> What are the symptoms / consequences of:
>
> - the guest accessing an unvalidated page (I understand it causes a #VC,
> but what is the direct result of that, when this series is applied?),


After the series it applied, an access to an unvalidated page will
result in #VC. The series does not extend the VC handler to handle the
page-not-validated error code. The current VC handler
[InternalVmgExitHandleVc()] will inject a #GP to terminate the guest if
#VC is raised for unvalidated page.


>
> - doubly-validating a page?
>
> The first question is relevant because we should crash as soon as we
> touch a page we forgot to validate (we shouldn't let any corruption or
> similar spread out until we finally realize there's a problem).
>
> The second question is relevant for security I guess. What attacks
> become possible, and/or what symptoms are produced, if we
> doubly-validate a page?

Assuming that guest validates its memory, this guarantees the injective
mapping between GPAs and SPAs.

As I noted that the page validation process consist of two steps #1
RMPUPDATE and 2#PVALIDATE. The RMPUPDATE is a hypervisor instruction and
PVALIDATE is a guest instruction. A malicious hypervisor can remap gpa
to different physical address in RMP table using the RMPUPDATE just
before the second PVALIDATE instruction is executed. The guest need to
ensure that it does not leave the security vulnerability that can be
exploited by a malicious hypervisor. The SNP white paper recommends that
a guest OS should keep track of what is has validate so that it can
avoid getting into these situation.


>
> Furthermore, IIRC, we have separate #VC handlers for the different
> firmware phases; do they behave consistently / identicall when a
> #VC(page-not-validated) occurs, when this patch set is applied?
>
> My first question is basically asking whether we can *exclusively* rely
> on #VC(page-not-validated) faults to tell us that we missed validating a
> particular page. If we can do that, then the job is a bit easier,
> because from the GPA, we can more or less also derive *when and where*
> we should pre-validate the page (at least until validation is done
> completely lazily).

Yes, we should be able to rely #VC to tell us that we missed validating
the page. In my this series so far I have not seen a #VC due to
page-not-validated because we have carefully validated the pages before
they are accessed. I am thinking that #VC(page-not-validated) should be
treated as an error in the first phase. Incrementally we will add the
lazy validation, in which the #VC(page-not-validated) will provide a
hint as to what has not been validated.


>
>> The MemEncryptSevSnpValidateRam() uses a interval search tree to
>> keep the record of what has been validated. Before validating the range,
>> it lookup in its tree and if it finds that range is already validated
>> then do nothing. If it detects an overlap then it will validate only non
>> overlapping regions -- see patch #14.
> What data structure is used for implementing the interval tree?
>
> I'm not necessarily looking for a data structure with "nice" asymptotic
> time complexity. With pre-validation especially, I think simplicity
> (ease of review) is more important for the data structure than
> performance. If it's not an actual "tree", we shouldn't call it a
> "tree". (An "interval tree" is usually an extension of a Red-Black Tree,
> and that's not the simplest data structure in existence; although edk2
> does offer an rbtree library.)


I am using binary search tree. Currently, we don't have many nodes to
track. Most of the guest memory is private and are validated before we
exit from the PEI phase.

> Furthermore, what you describe above is called idempotency. No matter
> how many times we attempt to validate a range, it may (or may not even)
> cause an actual change in the first action only. Is this property
> (=idempotency) an inherent requirement of the technology, or is it a
> simplification of the implementation? Put differently: if you called
> CpuDeadLoop() in the validation function any time an overlapping
> validation request were received, would that hugely complicate the call
> sites?

From the technology point of view you can validate the same page again
and again. Hardware does not force any restriction. To protect against
time-based remap attack a guest also need to ensure that it does not
blindly validates the pages otherwise guest is taking risk. Avoiding a
double validation is strong recommendation. In current implementation,
the validation burden is not put on the caller. Caller calls standard
APIs to toggle the encryption attribute. Under the hood, if we see that
page is already validated then no need to issue the PVALIDATE. If the
page is getting changed from C=1 -> 0 then unvalidate it.


> I'm kind of "obsessing" about idempotency because you say we must
> *never* doubly-validate a page, so the *difference* between:
> - explicitly crashing on such an attempt,
> - and silently ignoring such an attempt,
> may be meaningful.


I said we should never doubly validate to avoid creating a vulnerability
that can later be exploited by the malicious hypervisor. Currently we
lookup address in our tree to ensure that we don't do a double
validation. The PVALIDATE instruction also can be used to determine if
we are attempting to validate an already validated page. In current
implementation after the PVALIDATE completion, I check the rflags.CF to
ensure that its not a double validation condition. If its double
validation then abort the guest. Its bug in the guest.

> It's kind of the difference between "oops this is a call site *bug*, but
> we patched it up", vs. "this is expected of call sites, we should just
> handle it internally".

Our attempt should be to patch to avoid the double validation. I would
still check the status of PVALIDATE, if instruction detects it as a
double validation then we have a bug in our tracking and should abort
the guest.

>
>> The patch #18 extend the MemEncrypt{Set,Clear}PageEncMask() to call the
>> SNP page state change during the C-bit toggle.
> - When exactly do we invalidate?

Basically before the page is transition from C=1 -> 0 in the page table.

>
> - Does the time and place of invalidation depend on whether we perform
> pre-validation, or lazy validation?
It does not matter. Both type of validation update the Validated bit in
the RMP table. The important thing is that invalidation can happen only
for a already validated private page. If the page is shared then
invalidation will result in #GP.
>
> - Is page invalidation idempotent too?
We should treat this as idempotent too. As I said, tacking the page
state is strongly recommended for the security reason. The SNP white
paper has a section dedicated for the validation and may able to clarify
things which is not covered in my response.
>
> - What is the consequence (security impact) if we forget invalidation?
I can't think of a security implication of it but if we forget to
invalidate then it can lead issues in the tracking data structure when
it comes to validation next time. e.g a page was mapped as C=1 and we
changed it to C=0 without invalidating it. Now if the same page is being
changed from C=0 -> 1 then tracking data structure may think that page
is already validated and will skip the validation process and thus
access will result in #VC (page-not-validated).
>
> - There are four page states I can imagine:
>   - encrypted for host access, valid for guest access
>   - encrypted for host access, invalid for guest access
>   - decrypted for host access, valid for guest access
>   - decrypted for host access, invalid for guest access
>
> Does a state exist, from these four, that's never used? (Potentially
> caught by the hardware?)

I would not mix the page state with the host access. Basically there is
two page state exist

- Guest-Valid

- Guest-Invalid

By default all the guest memory is in Guest-Invalid state on boot. The
PVALIDATE instruction is used to transition from Valid->Invalid. Guest
can use the PVALIDATE to change the state from Valid to Invalid.

A guest private page (i.e page mapped C=1) must be in Guest Valid state.
If hardware see that a private access page is not in the Guest-Valid
state then it will trigger #VC.


>
> Do the patches highlight / explain the validity transitions, in
> comments? My understanding is that the C-bit toggle and the guest access
> valid/invalid toggle are separate actions, so "middle" states do exist,
> but perhaps only temporarily.

I have tried to document the steps in the patch description and I can
add more comments as required. When SEV-SNP is active then we need to
invalidate the page before toggling the C-bit from C=1 -> 0 and
similarly after the page is toggled from C=0 -> 1 we must validate it.


> I'm curious how it works, for example, with variousvirtio transfers (bus
> master read, bus master write, bus master common buffer). In the
> IoMmuDxe driver minimally, we access memory both through PTEs with the
> C-bit set and through PTEs with the C-bit clear, meaning that
> "encrypted, valid", and "decrypted, valid" are both required states. But
> that seems to conflict with the notion that "C-bit toggle" be directly
> coupled with a "validity toggle".

That should not be an issue. Each page have entry in the RMP table. So,
you can have one page as guest invalid and another page as a guest valid.

> Put differently, I'm happy that modifying IoMmuDxe appears unnecessary,
> but then that tells me I'm missing something about the state transitions
> between the above *four* states.


Only thing which I would propose changing in IoMmuDxe driver is to use a
pre-allocated buffer and avoid toggling the C-bit on every read/write.
In past the C-bit toggle was all contained inside the guest. But when
SNP is enabled, each C-bit toggle causes two additional steps #1
RMPUPDATE and #2 PVALIDATE. The RMPUPDATE is requested using the VMEXIT
so it will cause a VM context switch. I would prefer to avoid it.

My current patch set does not have this optimization yet.


>
>> Please let me know if you have any questions. We can hash out the design
>> here before you taking a closure look at the code.
> Sorry that I've been (and am being) slow to start reviewing this series.


No worries, take your time. Thank you for all your feedback.

-Brijesh

>
> Thanks,
> Laszlo
>


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#73918): https://edk2.groups.io/g/devel/message/73918
Mute This Topic: https://groups.io/mt/81584576/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support
Posted by Laszlo Ersek 3 years ago
On 04/10/21 00:43, Brijesh Singh wrote:
>
> On 4/9/21 7:24 AM, Laszlo Ersek wrote:
>> On 04/08/21 13:59, Brijesh Singh wrote:
>>> On 4/8/21 4:58 AM, Laszlo Ersek wrote:
>>>> On 03/24/21 16:31, Brijesh Singh wrote:
>>>>> At this time we only support the pre-validation. OVMF detects all
>>>>> the available system RAM in the PEI phase. When SEV-SNP is
>>>>> enabled, the memory is validated before it is made available to
>>>>> the EDK2 core.
>>>> Can you describe this in a bit more detail, before I look at the
>>>> individual patches? Specifically, what existing logic in the PEI
>>>> phase was taken, and extended, and how?
>>> One of the key requirement is that the guest private pages much be
>>> validated before the access. If guest tries to access the pages
>>> before the validation then it will result in #VC
>>> (page-not-validated) exception. To avoid the #VC, we propose the
>>> validating the memory before the access. We will incrementally add
>>> the support to lazy validate (i.e validate on access).
>> What is the primary threat (in simple terms please) that validation
>> is supposed to prevent?
>
>
> To protect against the memory re-mapping attack the guest pages must
> be validated. The idea is that every guest page can map only to a
> single physical memory page at one time.

I don't understand. How can a given guest page frame map to multiple
host page frames?

Do you mean that the situation to prevent is when multiple guest page
frames (GPAs) map to the same host page frame, as set up by the
hypervisor?


>> And, against that particular threat, does pre-validation offer some
>> protection, or will that only come with lazy validation?
>
> For the hardware it does not matter how the memory was validated --
> lazy vs prevalidate. Both approaches use the PVALIDATE instruction to
> validate the page.
>
> In the case of pre-validation, the memory is validated before the
> access. Whereas in the lazy validation, the access will cause a fault
> and fault handler should validate the page and retry the access. Its
> similar to a page fault handler, OS can populate the page table or
> back the pages on demand.
>
> The only downside of pre-validation is, we will take a hit on the boot
> time. The GHCB spec provides method by which we can batch multiple
> requests at once to minimize the context switches.

If pre-validation is simpler, and the only drawback is a one-time hit
during boot, then wouldn't it be better to stick with pre-validation
forever? I expect that would be a lot simpler for (a) the #VC handlers,
(b) the tracking of the "already validated" information.


>>> Similar to SEV, the OVMF_CODE.fd is encrypted through the SNP
>>> firmware before the launch. The SNP firmware also validates the
>>> memory page after encrypting. This allows us to boot the initial
>>> entry code without guest going through the validation process.
>>>
>>> The OVMF reset vector uses few data pages (e.g page table, early Sec
>>> stack). Access to these data pages will result in #VC. There are two
>>> approaches we can take to validate these data pages:
>>>
>>> 1. Ask SNP firmware to pre-validate it -- SNP firmware provides an
>>> special command that can be used to pre-validate the pages without
>>> affecting the measurement.
>> This means the two pflash chips, right?
>
>
> This does not need to be two pflash chips. The SNP firmware command
> does not know anything about the ROM or pflash chip. The command
> accepts the system physical address that need to be validated by the
> firmware. In patch #2, OVMF provides a range of data pages that need
> to be validated by the SNP firmware before booting the guest.

I guess my question related to the guest code that executes from pflash,
and/or accesses data from pflash, before the guest itself could ask for
any validation. Such as, the reset vector (which runs from pflash).
Then, some non-volatile UEFI variable data that resides in the other
pflash chip, and is accessed (read-only) during the PEI phase (the
memory type information variable namely). I understood your comments as
QEMU pre-validating those areas before guest launch. Is that correct?

Anyway, I guess at this point I should just start reviewing the series.

Some more comments below.


>> - We need more areas made accessible in SEC than just the
>> decompression buffer; for example the temporary SEC/PEI heap and
>> stack, and (IIRC) various special SEV-ES areas laid out via MEMFD.
>> Where are all of those handled / enumerated?
>
>
> Sorry, I simplified my response by saying just decompression. You are
> right that its more than the decompression buffer. In my current
> patch, the SEC phase validates all the memory up to
> PcdOvmfDecompressionScratchEnd (which includes more than just output
> buffer). See patch #13.

That sounds good (will check the patch out of course).


>>> One of the important thing is we should *never* validate the pages
>>> twice.
>> What are the symptoms / consequences of:
>>
>> - the guest accessing an unvalidated page (I understand it causes a
>> #VC, but what is the direct result of that, when this series is
>> applied?),
>
>
> After the series it applied, an access to an unvalidated page will
> result in #VC. The series does not extend the VC handler to handle the
> page-not-validated error code. The current VC handler
> [InternalVmgExitHandleVc()] will inject a #GP to terminate the guest
> if #VC is raised for unvalidated page.

OK.


>> - doubly-validating a page?
>>
>> The first question is relevant because we should crash as soon as we
>> touch a page we forgot to validate (we shouldn't let any corruption
>> or similar spread out until we finally realize there's a problem).
>>
>> The second question is relevant for security I guess. What attacks
>> become possible, and/or what symptoms are produced, if we
>> doubly-validate a page?
>
> Assuming that guest validates its memory, this guarantees the
> injective mapping between GPAs and SPAs.
>
> As I noted that the page validation process consist of two steps #1
> RMPUPDATE and 2#PVALIDATE. The RMPUPDATE is a hypervisor instruction
> and PVALIDATE is a guest instruction. A malicious hypervisor can remap
> gpa to different physical address in RMP table using the RMPUPDATE
> just before the second PVALIDATE instruction is executed. The guest
> need to ensure that it does not leave the security vulnerability that
> can be exploited by a malicious hypervisor. The SNP white paper
> recommends that a guest OS should keep track of what is has validate
> so that it can avoid getting into these situation.

Hm, OK. I think this answers my top-most question.


>> Furthermore, IIRC, we have separate #VC handlers for the different
>> firmware phases; do they behave consistently / identicall when a
>> #VC(page-not-validated) occurs, when this patch set is applied?
>>
>> My first question is basically asking whether we can *exclusively*
>> rely on #VC(page-not-validated) faults to tell us that we missed
>> validating a particular page. If we can do that, then the job is a
>> bit easier, because from the GPA, we can more or less also derive
>> *when and where* we should pre-validate the page (at least until
>> validation is done completely lazily).
>
> Yes, we should be able to rely #VC to tell us that we missed
> validating the page. In my this series so far I have not seen a #VC
> due to page-not-validated because we have carefully validated the
> pages before they are accessed. I am thinking that
> #VC(page-not-validated) should be treated as an error in the first
> phase. Incrementally we will add the lazy validation, in which the
> #VC(page-not-validated) will provide a hint as to what has not been
> validated.

I'm OK with the current handling of #VC(page-not-validated) -- i.e.,
fatal error.

I'm not yet convinced that lazy validation will be useful at all, but
that could be just my lack of understanding.


>>> The MemEncryptSevSnpValidateRam() uses a interval search tree to
>>> keep the record of what has been validated. Before validating the
>>> range, it lookup in its tree and if it finds that range is already
>>> validated then do nothing. If it detects an overlap then it will
>>> validate only non overlapping regions -- see patch #14. hat data
>>> structure is used for implementing the interval tree?
>>
>> I'm not necessarily looking for a data structure with "nice"
>> asymptotic time complexity. With pre-validation especially, I think
>> simplicity (ease of review) is more important for the data structure
>> than performance. If it's not an actual "tree", we shouldn't call it
>> a "tree". (An "interval tree" is usually an extension of a Red-Black
>> Tree, and that's not the simplest data structure in existence;
>> although edk2 does offer an rbtree library.)
>
>
> I am using binary search tree. Currently, we don't have many nodes to
> track. Most of the guest memory is private and are validated before we
> exit from the PEI phase.

My preliminary comments about MemEncryptSevSnpValidateSystemRam():

(1) The function prototype (patch#12) promises that the function can
never fail. In patch#14 however, I see some AddRangeToIntervalTree()
calls whose return values are not checked. I think error checking should
be strict, and if an error is detected (such as memory allocation
failure), a plain ASSERT (FALSE) is not sufficient. Both ASSERT() and
CpuDeadLoop() should be used explicitly, for the sake of RELEASE builds.
And the function prototype (the leading comment) should highlight that
the function either succeeds, or doesn't return.


(2) I don't like the recursive nature of the functions. That's always
convenient first, and almost always a mess to get out of later. However,
this is not a request to complicate the tree's implementation,
because...


(3) ... you mention we don't have many nodes to track. Is a tree
structure, with dynamically allocated nodes, justified at all?


(4) Regarding repeating dynamic allocation -- that's never a great thing
to do in the SEC / PEI phases. Can we use a fixed size, once-allocated,
or even static, array somewhere?

In particular, patch#14 uses AllocatePool() for SNP_VALIDATED_RANGE, in
the PEI phase, that results in EFI_HOB_TYPE_MEMORY_POOL HOBs; see:

  AllocatePool()              [MdePkg/Library/PeiMemoryAllocationLib/MemoryAllocationLib.c]
    PeiServicesAllocatePool() [MdePkg/Library/PeiServicesLib/PeiServicesLib.c]
      PeiAllocatePool()       [MdeModulePkg/Core/Pei/Memory/MemoryServices.c]

Now, dependent on where you call this AllocatePool() from, it could
occur still on the temporary SEC/PEI RAM -- i.e., before permanent PEI
RAM is installed. That means that such HOBs would be subject to the HOB
list migration, after permanent PEI RAM is installed. The PI spec writes
about EFI_HOB_MEMORY_POOL: "The HOB consumer phase should be able to
ignore these HOBs". The PI spec writes about AllocatePool(): "The early
allocations from temporary memory will be migrated to permanent memory
when permanent main memory is installed; this migration shall occur when
the HOB list is migrated to permanent memory".

So, whenever we use AllocatePool() in the PEI phase, we need an "address
stability proof", namely that the AllocatePool() call, and any later
access to the allocated area (the HOB), are never separated by the
installation of "gEfiPeiMemoryDiscoveredPpiGuid" in the PPI database, by
the PEI Core. If we can't guarantee that, then an access later on could
occur to the original allocation address, which has by then been
invalidated by the temporary RAM migration (= the allocation HOB has
been moved).

AllocatePages() is "stable" however, in this sense; no proof like the
one described above is necessary.


(5) The "mRootNode" variable is defined in the file
"OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiSnpSystemRamValidate.c"
(patch #14). That means that every single module (i.e., PEIM) which
consumes this library instance will own a separate "mRootNode" variable,
with a validation-tracking tree that's correspondingly private to that
PEIM.

But the validation state is global to the entire firmware. I think we
should somehow clarify, or even enforce, that the tracking data
structure is firmware-global.

Perhaps this can be solved by (a) moving the ownership of the data
structure to a particular firmware module, with sole responsibility for
pre-validation, and (b) updating the validation APIs to take this data
structure explicitly.

Alternatively, if multiple modules are expected to inter-operate on the
tracker data structure (by which I mean inter-operation *in sequence*,
not concurrently in a multiprocessing sense), then we might need a
dynamic PCD for inter-driver understanding.


(NB: both points (4) and (5) seem to conflict with *existent* code in
BaseMemEncryptSevLib. However:

- concerning (4), the current allocations are all performed with
  AllocateAlignedPages(), which are unaffected by temporary-to-permanent
  RAM migration,

- and concerning (5), namely the existent "mPageTablePool" global
  variable, is a *known bug* -- see commit b721aa749b86a and
  TianoCore#847. This issue (namely, multiple modules messing with the
  page tables independently) remains unsolved, and I'd like to avoid
  introducing *more* paging-related allocations that are module-owned,
  and not firmware-level.)


All in all, here's what I'd prefer (and please correct me if it's
unrealistic):

- make the SNP validation functions take a simple, *unresizeable*
  tracker structure for input/output,

- allocate that data structure with a single AllocatePages() call in the
  proper PEIM,

- if the SNP validation functions run out of space in the structure,
  hang hard,

- avoid recursion, and use a simple linear search through the array,

- evaluate whether graceful handling of overlaps (= range splitting,
  merging) is actually a requirement. See more on this below.


(6) Can you please confirm that the order in which
MemEncryptSevSnpValidateSystemRam()'s declaration, call sites, and
implementations are introduced, is sensible? I can't tell immediately,
but I'd like to be sure that the tree at least builds at every stage (no
linkage errors at any stage).


>> Furthermore, what you describe above is called idempotency. No matter
>> how many times we attempt to validate a range, it may (or may not
>> even) cause an actual change in the first action only. Is this
>> property (=idempotency) an inherent requirement of the technology, or
>> is it a simplification of the implementation? Put differently: if you
>> called CpuDeadLoop() in the validation function any time an
>> overlapping validation request were received, would that hugely
>> complicate the call sites?
>
> From the technology point of view you can validate the same page again
> and again. Hardware does not force any restriction. To protect against
> time-based remap attack a guest also need to ensure that it does not
> blindly validates the pages otherwise guest is taking risk. Avoiding a
> double validation is strong recommendation. In current implementation,
> the validation burden is not put on the caller. Caller calls standard
> APIs to toggle the encryption attribute. Under the hood, if we see
> that page is already validated then no need to issue the PVALIDATE. If
> the page is getting changed from C=1 -> 0 then unvalidate it.

There remains a huge gap in my understanding here. (The rest of my
earlier questions did concern the relationship between the C bit and
validation, but I might as well comment right at this spot.)

"Invalidation in connection to C-bit toggling" means that there's going
to be significant "validation churn" in the DXE phase. For example when
fw_cfg is used (which relies on the edk2 IOMMU protocol), or when virtio
transfers are performed (which also relies on the edk2 IOMMU protocol,
although less directly -- through PciIo, namely).

That, however, brings up several problems for me:

- Is the tracker data structure scalable? I don't see any balancing in
  it. (And I wouldn't want a new, complex data structure for it anyway.)

- In fact, I don't see any code related to invalidation -- more
  precisely, I don't see where and how the C-bit changes consult the
  tracker structure *at all*.

- Given issue (5), I don't see how the tracker structure would be
  inherited from the PEI phase to the DXE phase.

In particular, in patch#18, the flipping of the C-bit results in calls
to SetPageStateInternal(). But SetPageStateInternal() does not go
*through* the tracker structure. Even in patch#14,
SevSnpValidateSystemRam() calls SetPageStateInternal() *in addition to*
AddRangeToIntervalTree().

In other words, I don't see where validity is tracked in the DXE phase,
in response to the C-bit toggling.


>> I'm kind of "obsessing" about idempotency because you say we must
>> *never* doubly-validate a page, so the *difference* between:
>> - explicitly crashing on such an attempt,
>> - and silently ignoring such an attempt,
>> may be meaningful.
>
>
> I said we should never doubly validate to avoid creating a
> vulnerability that can later be exploited by the malicious hypervisor.
> Currently we lookup address in our tree to ensure that we don't do a
> double validation. The PVALIDATE instruction also can be used to
> determine if we are attempting to validate an already validated page.
> In current implementation after the PVALIDATE completion, I check the
> rflags.CF to ensure that its not a double validation condition. If its
> double validation then abort the guest. Its bug in the guest.
>
>> It's kind of the difference between "oops this is a call site *bug*,
>> but we patched it up", vs. "this is expected of call sites, we should
>> just handle it internally".
>
> Our attempt should be to patch to avoid the double validation. I would
> still check the status of PVALIDATE, if instruction detects it as a
> double validation then we have a bug in our tracking and should abort
> the guest.

Let me rephrase. My question concerns the *abstraction level* at which
we should catch and handle double-validation attempts.

If we silently split and merge ranges in the tracker structure, then the
guest (as a whole) will not double-validate, which is good. However,
double-validation attempts will potentially still *exist*, in
higher-level modules in the firmware. My question is whether it is
*right* to paper over such overlaps, coming from the high-level code, in
the tracker structure. I suspect that it's not right.

In other words, the spot where we should abort, due to
"double-validation detected", is not when the PVALIDATE instruction
reports a problem in the tracker structure. Instead, the spot where we
should abort is when we detect an overlapping request *before* issuing
PVALIDATE. I'd like the tracker to be as simple and dumb as possible,
and I'd like the actual sites that request validation to blow up
immediately, if they issue an overlapping request. Is my expectation
wrong?


>>> The patch #18 extend the MemEncrypt{Set,Clear}PageEncMask() to call
>>> the SNP page state change during the C-bit toggle.
>> - When exactly do we invalidate?
>
> Basically before the page is transition from C=1 -> 0 in the page
> table.
>
>>
>> - Does the time and place of invalidation depend on whether we
>> perform pre-validation, or lazy validation?
> It does not matter. Both type of validation update the Validated bit
> in the RMP table. The important thing is that invalidation can happen
> only for a already validated private page. If the page is shared then
> invalidation will result in #GP.
>>
>> - Is page invalidation idempotent too?
> We should treat this as idempotent too. As I said, tacking the page
> state is strongly recommended for the security reason. The SNP white
> paper has a section dedicated for the validation and may able to
> clarify things which is not covered in my response.
>>
>> - What is the consequence (security impact) if we forget
>> invalidation?
> I can't think of a security implication of it but if we forget to
> invalidate then it can lead issues in the tracking data structure when
> it comes to validation next time. e.g a page was mapped as C=1 and we
> changed it to C=0 without invalidating it. Now if the same page is
> being changed from C=0 -> 1 then tracking data structure may think
> that page is already validated and will skip the validation process
> and thus access will result in #VC (page-not-validated).

As I said above, I don't see where the validation tracking and the C-bit
flipping "meet". Especially across the PEI and DXE phases.

Basically I'm missing the core concept for maintaining a firmware-global
structure, for tracking the validation status of GPA intervals.

Up to and *excluding* the DXE phase, the validation tracking should be
trivial, naive, and unforgiving. This is because, in the PEI phase,
validation is one-shot. I would expect the tracker structure to live in
the PlatformPei module.

Starting with the DXE phase, where we need (in)validation to closely
accompany C-bit toggling, we need a different, performant data structure
for tracking validation status. This data structure should be primed
from the status last set up in the PEI phase. The tracking should still
be unforgiving. I believe *this* tracker structure may have to live in
the IOMMU driver.

BaseMemEncryptSevLib may offer helper functions, but even if it does, it
should never attempt to *own* tracking information, via an internal
global variable. Furthermore, those helper functions that only make
sense for a particular firmware phase, should have such implementations
that hang hard, in library instances that match the *other* firmware
phases. For example, SevSnpValidateSystemRam() should have a
DXE-specific implementation that hangs hard.

It's entirely possible that my points above are complete garbage. I
don't have a grasp on the core concept, for the firmware-wide tracking
of the validation status.

... Ugh, wait. I've just noticed something. In patch#19,
"SEC_SEV_ES_WORK_AREA.SnpSystemRamValidatedRootAddress" is introduced,
which seems (?) to be implementing the kind of global (inter-driver,
inter-phase) tracking that I've been missing. However, even if my guess
(?) is correct about that, this idea definitely doesn't belong in the
last patch, presented as a "bugfix". Even if it were a bugfix, it should
be incorporated into one of the earlier patches (so that we preferably
not introduce the bug in the first place). However, my argument is that
it's not a bugfix -- to me that seems to be the core concept. That's
what the whole series needs to be built upon. (In all of my thinking,
before this particular paragraph, I *never* looked at patch#19. There
was no reason to.)


>> - There are four page states I can imagine:
>>   - encrypted for host access, valid for guest access
>>   - encrypted for host access, invalid for guest access
>>   - decrypted for host access, valid for guest access
>>   - decrypted for host access, invalid for guest access
>>
>> Does a state exist, from these four, that's never used? (Potentially
>> caught by the hardware?)
>
> I would not mix the page state with the host access. Basically there
> is two page state exist
>
> - Guest-Valid
>
> - Guest-Invalid
>
> By default all the guest memory is in Guest-Invalid state on boot. The
> PVALIDATE instruction is used to transition from Valid->Invalid.

Do you mean "Invalid->Valid" transition?


> Guest can use the PVALIDATE to change the state from Valid to Invalid.
>
> A guest private page (i.e page mapped C=1) must be in Guest Valid
> state. If hardware see that a private access page is not in the
> Guest-Valid state then it will trigger #VC.
>
>
>>
>> Do the patches highlight / explain the validity transitions, in
>> comments? My understanding is that the C-bit toggle and the guest
>> access valid/invalid toggle are separate actions, so "middle" states
>> do exist, but perhaps only temporarily.
>
> I have tried to document the steps in the patch description and I can
> add more comments as required. When SEV-SNP is active then we need to
> invalidate the page before toggling the C-bit from C=1 -> 0 and
> similarly after the page is toggled from C=0 -> 1 we must validate it.
>
>
>> I'm curious how it works, for example, with variousvirtio transfers
>> (bus master read, bus master write, bus master common buffer). In the
>> IoMmuDxe driver minimally, we access memory both through PTEs with
>> the C-bit set and through PTEs with the C-bit clear, meaning that
>> "encrypted, valid", and "decrypted, valid" are both required states.
>> But that seems to conflict with the notion that "C-bit toggle" be
>> directly coupled with a "validity toggle".
>
> That should not be an issue. Each page have entry in the RMP table.
> So, you can have one page as guest invalid and another page as a guest
> valid.
>
>> Put differently, I'm happy that modifying IoMmuDxe appears
>> unnecessary, but then that tells me I'm missing something about the
>> state transitions between the above *four* states.
>
>
> Only thing which I would propose changing in IoMmuDxe driver is to use
> a pre-allocated buffer and avoid toggling the C-bit on every
> read/write. In past the C-bit toggle was all contained inside the
> guest. But when SNP is enabled, each C-bit toggle causes two
> additional steps #1 RMPUPDATE and #2 PVALIDATE. The RMPUPDATE is
> requested using the VMEXIT so it will cause a VM context switch. I
> would prefer to avoid it.
>
> My current patch set does not have this optimization yet.

Please let's stick with the absolute minimum in this series, yes.

I realize my email is complete chaos. That's because my understanding of
this work is complete chaos.

Can we approach this feature as follows:

- Make "validation tracking" the core tenet (central principle) of this
  feature.

- Design the validation *patterns* that are required in each firmware
  phase.

- Identify the *sole* owner module, for validation tracking, in each
  firmware phase. If we can't readily do this, we might have to
  introduce new drivers (PEIMs, DXE drivers?) and new interfaces (PPIs,
  protocols?)

- Design the hand-off between the owner modules (on firmware phase
  boundaries).

- Add only helper functions to BaseMemEncryptSevLib, without any
  ownership of tracking info. The helpers should be as restricted as
  possible.

I'd like to avoid "smearing" page validation responsibility over a bunch
of modules. (See again: TianoCore#847.) I'd like responsibilities
strictly centralized.

If you feel that my points make no sense, I can start going over the
individual patches. However, even that way, I'd have found patch#19 only
in the end, and that doesn't seem right. And I wouldn't like to go
"numb" on the actual code, being distracted by style issues and so on,
before we have some agreement on the core approach.

Another tip: if you can add content to MdePkg that is governed by AMD
specifications, such as patches #3, #5, #9, I would recommend splitting
that off to a separate (pre-requisite) series. That content seems
unaffected by whatever we do in OvmfPkg, and it would help decrease the
size of the OvmfPkg series. It's not like we're going to abandon the
SEV-SNP feature, so I don't think there's a risk in merging the MdePkg
series first, while the OvmfPkg series is not ready. The MdePkg stuff
will not remain unused indefinitely.

Thanks,
Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#73966): https://edk2.groups.io/g/devel/message/73966
Mute This Topic: https://groups.io/mt/81584576/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support
Posted by Brijesh Singh 3 years ago
On 4/12/21 11:23 AM, Laszlo Ersek wrote:
> On 04/10/21 00:43, Brijesh Singh wrote:
>> On 4/9/21 7:24 AM, Laszlo Ersek wrote:
>>> On 04/08/21 13:59, Brijesh Singh wrote:
>>>> On 4/8/21 4:58 AM, Laszlo Ersek wrote:
>>>>> On 03/24/21 16:31, Brijesh Singh wrote:
>>>>>> At this time we only support the pre-validation. OVMF detects all
>>>>>> the available system RAM in the PEI phase. When SEV-SNP is
>>>>>> enabled, the memory is validated before it is made available to
>>>>>> the EDK2 core.
>>>>> Can you describe this in a bit more detail, before I look at the
>>>>> individual patches? Specifically, what existing logic in the PEI
>>>>> phase was taken, and extended, and how?
>>>> One of the key requirement is that the guest private pages much be
>>>> validated before the access. If guest tries to access the pages
>>>> before the validation then it will result in #VC
>>>> (page-not-validated) exception. To avoid the #VC, we propose the
>>>> validating the memory before the access. We will incrementally add
>>>> the support to lazy validate (i.e validate on access).
>>> What is the primary threat (in simple terms please) that validation
>>> is supposed to prevent?
>>
>> To protect against the memory re-mapping attack the guest pages must
>> be validated. The idea is that every guest page can map only to a
>> single physical memory page at one time.
> I don't understand. How can a given guest page frame map to multiple
> host page frames?
>
> Do you mean that the situation to prevent is when multiple guest page
> frames (GPAs) map to the same host page frame, as set up by the
> hypervisor?
>
A malicious hypervisor may attempt to remap validated gpa to a different
host frame. The PVALIDATE is designed to protect against those type of
attacks. See
https://static.sched.com/hosted_files/lsseu2019/65/SEV-SNP%20Slides%20Nov%201%202019.pdf
(slide 13). You can also find more information about it in the SEV-SNP
whitepaper.


>>> And, against that particular threat, does pre-validation offer some
>>> protection, or will that only come with lazy validation?
>> For the hardware it does not matter how the memory was validated --
>> lazy vs prevalidate. Both approaches use the PVALIDATE instruction to
>> validate the page.
>>
>> In the case of pre-validation, the memory is validated before the
>> access. Whereas in the lazy validation, the access will cause a fault
>> and fault handler should validate the page and retry the access. Its
>> similar to a page fault handler, OS can populate the page table or
>> back the pages on demand.
>>
>> The only downside of pre-validation is, we will take a hit on the boot
>> time. The GHCB spec provides method by which we can batch multiple
>> requests at once to minimize the context switches.
> If pre-validation is simpler, and the only drawback is a one-time hit
> during boot, then wouldn't it be better to stick with pre-validation
> forever? I expect that would be a lot simpler for (a) the #VC handlers,
> (b) the tracking of the "already validated" information.

This could be an issue for the larger VMs. The boot delay will vary
based on the VM size. In addition to the boot delay,  the PVALIDATE
instruction requires that there is a valid entry for the page in the
nested page table. If the entry does not exist in the NPT then HV will
get #NPF and will fill the NPT with the backing host page. By using the
lazy-validation we can push allocation of the backing pages to only when
required and thus release the memory pressure on the VMM.


>
>
>>>> Similar to SEV, the OVMF_CODE.fd is encrypted through the SNP
>>>> firmware before the launch. The SNP firmware also validates the
>>>> memory page after encrypting. This allows us to boot the initial
>>>> entry code without guest going through the validation process.
>>>>
>>>> The OVMF reset vector uses few data pages (e.g page table, early Sec
>>>> stack). Access to these data pages will result in #VC. There are two
>>>> approaches we can take to validate these data pages:
>>>>
>>>> 1. Ask SNP firmware to pre-validate it -- SNP firmware provides an
>>>> special command that can be used to pre-validate the pages without
>>>> affecting the measurement.
>>> This means the two pflash chips, right?
>>
>> This does not need to be two pflash chips. The SNP firmware command
>> does not know anything about the ROM or pflash chip. The command
>> accepts the system physical address that need to be validated by the
>> firmware. In patch #2, OVMF provides a range of data pages that need
>> to be validated by the SNP firmware before booting the guest.
> I guess my question related to the guest code that executes from pflash,
> and/or accesses data from pflash, before the guest itself could ask for
> any validation. Such as, the reset vector (which runs from pflash).
> Then, some non-volatile UEFI variable data that resides in the other
> pflash chip, and is accessed (read-only) during the PEI phase (the
> memory type information variable namely). I understood your comments as
> QEMU pre-validating those areas before guest launch. Is that correct?


Yes that is correct. All the OVMF pflash0 memory will be pre-validated
before the guest is launched. IIRC, the variables which resides on the
other pflash chip are accessed unencrypted in the guest. Only the
encrypted memory access (aka C=1) need to be validated. The host MMIO
regions does not need to be validated.


>
> Anyway, I guess at this point I should just start reviewing the series.
>
> Some more comments below.
>
>
>>> - We need more areas made accessible in SEC than just the
>>> decompression buffer; for example the temporary SEC/PEI heap and
>>> stack, and (IIRC) various special SEV-ES areas laid out via MEMFD.
>>> Where are all of those handled / enumerated?
>>
>> Sorry, I simplified my response by saying just decompression. You are
>> right that its more than the decompression buffer. In my current
>> patch, the SEC phase validates all the memory up to
>> PcdOvmfDecompressionScratchEnd (which includes more than just output
>> buffer). See patch #13.
> That sounds good (will check the patch out of course).
>
>
>>>> One of the important thing is we should *never* validate the pages
>>>> twice.
>>> What are the symptoms / consequences of:
>>>
>>> - the guest accessing an unvalidated page (I understand it causes a
>>> #VC, but what is the direct result of that, when this series is
>>> applied?),
>>
>> After the series it applied, an access to an unvalidated page will
>> result in #VC. The series does not extend the VC handler to handle the
>> page-not-validated error code. The current VC handler
>> [InternalVmgExitHandleVc()] will inject a #GP to terminate the guest
>> if #VC is raised for unvalidated page.
> OK.
>
>
>>> - doubly-validating a page?
>>>
>>> The first question is relevant because we should crash as soon as we
>>> touch a page we forgot to validate (we shouldn't let any corruption
>>> or similar spread out until we finally realize there's a problem).
>>>
>>> The second question is relevant for security I guess. What attacks
>>> become possible, and/or what symptoms are produced, if we
>>> doubly-validate a page?
>> Assuming that guest validates its memory, this guarantees the
>> injective mapping between GPAs and SPAs.
>>
>> As I noted that the page validation process consist of two steps #1
>> RMPUPDATE and 2#PVALIDATE. The RMPUPDATE is a hypervisor instruction
>> and PVALIDATE is a guest instruction. A malicious hypervisor can remap
>> gpa to different physical address in RMP table using the RMPUPDATE
>> just before the second PVALIDATE instruction is executed. The guest
>> need to ensure that it does not leave the security vulnerability that
>> can be exploited by a malicious hypervisor. The SNP white paper
>> recommends that a guest OS should keep track of what is has validate
>> so that it can avoid getting into these situation.
> Hm, OK. I think this answers my top-most question.
>
>
>>> Furthermore, IIRC, we have separate #VC handlers for the different
>>> firmware phases; do they behave consistently / identicall when a
>>> #VC(page-not-validated) occurs, when this patch set is applied?
>>>
>>> My first question is basically asking whether we can *exclusively*
>>> rely on #VC(page-not-validated) faults to tell us that we missed
>>> validating a particular page. If we can do that, then the job is a
>>> bit easier, because from the GPA, we can more or less also derive
>>> *when and where* we should pre-validate the page (at least until
>>> validation is done completely lazily).
>> Yes, we should be able to rely #VC to tell us that we missed
>> validating the page. In my this series so far I have not seen a #VC
>> due to page-not-validated because we have carefully validated the
>> pages before they are accessed. I am thinking that
>> #VC(page-not-validated) should be treated as an error in the first
>> phase. Incrementally we will add the lazy validation, in which the
>> #VC(page-not-validated) will provide a hint as to what has not been
>> validated.
> I'm OK with the current handling of #VC(page-not-validated) -- i.e.,
> fatal error.
>
> I'm not yet convinced that lazy validation will be useful at all, but
> that could be just my lack of understanding.


See if my above comment makes sense on why we may need the lazy
validation. IMO, we should not be going with the lazy validation in
OVMF. To minimize the boot delay and avoid the complexity of the code in
the best case OVMF validated lower 4GB memory and mark rest of the
memory as "unaccepted" in the UEFI map.

David from google has started a thread about lazy validation in kernel
https://www.spinics.net/lists/linux-mm/msg243954.html. We would like to
find a method which works for both Intel TDX and AMD SEV. We will build
these support incrementally both in the kernel and OVMF.

>
>
>>>> The MemEncryptSevSnpValidateRam() uses a interval search tree to
>>>> keep the record of what has been validated. Before validating the
>>>> range, it lookup in its tree and if it finds that range is already
>>>> validated then do nothing. If it detects an overlap then it will
>>>> validate only non overlapping regions -- see patch #14. hat data
>>>> structure is used for implementing the interval tree?
>>> I'm not necessarily looking for a data structure with "nice"
>>> asymptotic time complexity. With pre-validation especially, I think
>>> simplicity (ease of review) is more important for the data structure
>>> than performance. If it's not an actual "tree", we shouldn't call it
>>> a "tree". (An "interval tree" is usually an extension of a Red-Black
>>> Tree, and that's not the simplest data structure in existence;
>>> although edk2 does offer an rbtree library.)
>>
>> I am using binary search tree. Currently, we don't have many nodes to
>> track. Most of the guest memory is private and are validated before we
>> exit from the PEI phase.
> My preliminary comments about MemEncryptSevSnpValidateSystemRam():
>
> (1) The function prototype (patch#12) promises that the function can
> never fail. In patch#14 however, I see some AddRangeToIntervalTree()
> calls whose return values are not checked. I think error checking should
> be strict, and if an error is detected (such as memory allocation
> failure), a plain ASSERT (FALSE) is not sufficient. Both ASSERT() and
> CpuDeadLoop() should be used explicitly, for the sake of RELEASE builds.
> And the function prototype (the leading comment) should highlight that
> the function either succeeds, or doesn't return.
>
>
> (2) I don't like the recursive nature of the functions. That's always
> convenient first, and almost always a mess to get out of later. However,
> this is not a request to complicate the tree's implementation,
> because...
>
>
> (3) ... you mention we don't have many nodes to track. Is a tree
> structure, with dynamically allocated nodes, justified at all?
>
>
> (4) Regarding repeating dynamic allocation -- that's never a great thing
> to do in the SEC / PEI phases. Can we use a fixed size, once-allocated,
> or even static, array somewhere?
>
> In particular, patch#14 uses AllocatePool() for SNP_VALIDATED_RANGE, in
> the PEI phase, that results in EFI_HOB_TYPE_MEMORY_POOL HOBs; see:
>
>   AllocatePool()              [MdePkg/Library/PeiMemoryAllocationLib/MemoryAllocationLib.c]
>     PeiServicesAllocatePool() [MdePkg/Library/PeiServicesLib/PeiServicesLib.c]
>       PeiAllocatePool()       [MdeModulePkg/Core/Pei/Memory/MemoryServices.c]
>
> Now, dependent on where you call this AllocatePool() from, it could
> occur still on the temporary SEC/PEI RAM -- i.e., before permanent PEI
> RAM is installed. That means that such HOBs would be subject to the HOB
> list migration, after permanent PEI RAM is installed. The PI spec writes
> about EFI_HOB_MEMORY_POOL: "The HOB consumer phase should be able to
> ignore these HOBs". The PI spec writes about AllocatePool(): "The early
> allocations from temporary memory will be migrated to permanent memory
> when permanent main memory is installed; this migration shall occur when
> the HOB list is migrated to permanent memory".
>
> So, whenever we use AllocatePool() in the PEI phase, we need an "address
> stability proof", namely that the AllocatePool() call, and any later
> access to the allocated area (the HOB), are never separated by the
> installation of "gEfiPeiMemoryDiscoveredPpiGuid" in the PPI database, by
> the PEI Core. If we can't guarantee that, then an access later on could
> occur to the original allocation address, which has by then been
> invalidated by the temporary RAM migration (= the allocation HOB has
> been moved).
>
> AllocatePages() is "stable" however, in this sense; no proof like the
> one described above is necessary.
>
>
> (5) The "mRootNode" variable is defined in the file
> "OvmfPkg/Library/BaseMemEncryptSevLib/X64/PeiSnpSystemRamValidate.c"
> (patch #14). That means that every single module (i.e., PEIM) which
> consumes this library instance will own a separate "mRootNode" variable,
> with a validation-tracking tree that's correspondingly private to that
> PEIM.
>
> But the validation state is global to the entire firmware. I think we
> should somehow clarify, or even enforce, that the tracking data
> structure is firmware-global.
>
> Perhaps this can be solved by (a) moving the ownership of the data
> structure to a particular firmware module, with sole responsibility for
> pre-validation, and (b) updating the validation APIs to take this data
> structure explicitly.
>
> Alternatively, if multiple modules are expected to inter-operate on the
> tracker data structure (by which I mean inter-operation *in sequence*,
> not concurrently in a multiprocessing sense), then we might need a
> dynamic PCD for inter-driver understanding.
>
>
> (NB: both points (4) and (5) seem to conflict with *existent* code in
> BaseMemEncryptSevLib. However:
>
> - concerning (4), the current allocations are all performed with
>   AllocateAlignedPages(), which are unaffected by temporary-to-permanent
>   RAM migration,
>
> - and concerning (5), namely the existent "mPageTablePool" global
>   variable, is a *known bug* -- see commit b721aa749b86a and
>   TianoCore#847. This issue (namely, multiple modules messing with the
>   page tables independently) remains unsolved, and I'd like to avoid
>   introducing *more* paging-related allocations that are module-owned,
>   and not firmware-level.)
>
>
> All in all, here's what I'd prefer (and please correct me if it's
> unrealistic):
>
> - make the SNP validation functions take a simple, *unresizeable*
>   tracker structure for input/output,
>
> - allocate that data structure with a single AllocatePages() call in the
>   proper PEIM,
>
> - if the SNP validation functions run out of space in the structure,
>   hang hard,
>
> - avoid recursion, and use a simple linear search through the array,
>
> - evaluate whether graceful handling of overlaps (= range splitting,
>   merging) is actually a requirement. See more on this below.


Yes this will work fine. The primary reason for me adding the interval
tree was to detect the overlap condition before we finalize the
pre-validation in PEI. The validation flow works like this:

- Qemu validated few pages

- SEC validates few more page

- When we detect the memory in the PEI we need to exclude the pages
validated by the Qemu and SEC.

The page ranges validated using the Qemu and SEC are accessible through
a fixed PCDs. I can do simple overlap check with those fixed PCDs and
skip the pre-validated ranges. I don't have any strong reason to use the
interval tree to detect the overlap condition. There is no call to
toggle the C-bit before we finalize the validation. 

IMO, there is no need for tracking the page state after we finish the
pre-validation. After the pre-validtion is completed, the caller will
use either clear the C-bit or set the C-bit. If it attempts to set the
C-bit without clearing it first then its a bug -- which will be caught
by the page state change internal function. We can use the output of CF
flag from the PVALIDATE to determine where caller is trying to do doubly
validation or invalidation and abort it.


>
> (6) Can you please confirm that the order in which
> MemEncryptSevSnpValidateSystemRam()'s declaration, call sites, and
> implementations are introduced, is sensible? I can't tell immediately,
> but I'd like to be sure that the tree at least builds at every stage (no
> linkage errors at any stage).

I can certainly say the order in which they are introduced and the call
sites are a sensible. I don't expect any linkage failure but since it
was an RFC so I didn't go through making sure that it builds in every
stage. I wanted to get the overall direction on where community would
like to go before making the final changes. You have been asking some
very good question and that is certainly helping us to reduce some of
the complexity in the patch.


>
>>> Furthermore, what you describe above is called idempotency. No matter
>>> how many times we attempt to validate a range, it may (or may not
>>> even) cause an actual change in the first action only. Is this
>>> property (=idempotency) an inherent requirement of the technology, or
>>> is it a simplification of the implementation? Put differently: if you
>>> called CpuDeadLoop() in the validation function any time an
>>> overlapping validation request were received, would that hugely
>>> complicate the call sites?
>> From the technology point of view you can validate the same page again
>> and again. Hardware does not force any restriction. To protect against
>> time-based remap attack a guest also need to ensure that it does not
>> blindly validates the pages otherwise guest is taking risk. Avoiding a
>> double validation is strong recommendation. In current implementation,
>> the validation burden is not put on the caller. Caller calls standard
>> APIs to toggle the encryption attribute. Under the hood, if we see
>> that page is already validated then no need to issue the PVALIDATE. If
>> the page is getting changed from C=1 -> 0 then unvalidate it.
> There remains a huge gap in my understanding here. (The rest of my
> earlier questions did concern the relationship between the C bit and
> validation, but I might as well comment right at this spot.)
>
> "Invalidation in connection to C-bit toggling" means that there's going
> to be significant "validation churn" in the DXE phase. For example when
> fw_cfg is used (which relies on the edk2 IOMMU protocol), or when virtio
> transfers are performed (which also relies on the edk2 IOMMU protocol,
> although less directly -- through PciIo, namely).
>
> That, however, brings up several problems for me:
>
> - Is the tracker data structure scalable? I don't see any balancing in
>   it. (And I wouldn't want a new, complex data structure for it anyway.)
>
> - In fact, I don't see any code related to invalidation -- more
>   precisely, I don't see where and how the C-bit changes consult the
>   tracker structure *at all*.
>
> - Given issue (5), I don't see how the tracker structure would be
>   inherited from the PEI phase to the DXE phase.
>
> In particular, in patch#18, the flipping of the C-bit results in calls
> to SetPageStateInternal(). But SetPageStateInternal() does not go
> *through* the tracker structure. Even in patch#14,
> SevSnpValidateSystemRam() calls SetPageStateInternal() *in addition to*
> AddRangeToIntervalTree().
>
> In other words, I don't see where validity is tracked in the DXE phase,
> in response to the C-bit toggling.

As I highlighted above there is actually no need to track the page state
after we successfully complete the pre-validation. All these guest page
validation or invalidation are applicable to the system RAM. But
AmdSevDxe driver calls to clear the C-bit of the MMIO range. These range
are not a system and have never been validated so we invalidating it
will cause an issue. Since I had the data structures available in PEI
phase for the tracking the page state hence made those available to DXE
to verify that we are called to invalidate the SYSTEM_RAM and not MMIO.
IMO, we should either extend the MemEncryptSev{Set,Clear}PageEncMask()
to pass either a new flag to hint where its system RAM or non-RAM.
Thoughts ?



>
>>> I'm kind of "obsessing" about idempotency because you say we must
>>> *never* doubly-validate a page, so the *difference* between:
>>> - explicitly crashing on such an attempt,
>>> - and silently ignoring such an attempt,
>>> may be meaningful.
>>
>> I said we should never doubly validate to avoid creating a
>> vulnerability that can later be exploited by the malicious hypervisor.
>> Currently we lookup address in our tree to ensure that we don't do a
>> double validation. The PVALIDATE instruction also can be used to
>> determine if we are attempting to validate an already validated page.
>> In current implementation after the PVALIDATE completion, I check the
>> rflags.CF to ensure that its not a double validation condition. If its
>> double validation then abort the guest. Its bug in the guest.
>>
>>> It's kind of the difference between "oops this is a call site *bug*,
>>> but we patched it up", vs. "this is expected of call sites, we should
>>> just handle it internally".
>> Our attempt should be to patch to avoid the double validation. I would
>> still check the status of PVALIDATE, if instruction detects it as a
>> double validation then we have a bug in our tracking and should abort
>> the guest.
> Let me rephrase. My question concerns the *abstraction level* at which
> we should catch and handle double-validation attempts.
>
> If we silently split and merge ranges in the tracker structure, then the
> guest (as a whole) will not double-validate, which is good. However,
> double-validation attempts will potentially still *exist*, in
> higher-level modules in the firmware. My question is whether it is
> *right* to paper over such overlaps, coming from the high-level code, in
> the tracker structure. I suspect that it's not right.

Yes we should bail out if such a request comes from the high level
module. This is why I was actually not checking the tracking structure
when toggling the C-bit. The complete memory much have been validated
before we are asked to toggle the C-bit. If a module is performing a
doubly validation or invalidation then it should be aborted to avoid any
future exploits.


>
> In other words, the spot where we should abort, due to
> "double-validation detected", is not when the PVALIDATE instruction
> reports a problem in the tracker structure. Instead, the spot where we
> should abort is when we detect an overlapping request *before* issuing
> PVALIDATE. I'd like the tracker to be as simple and dumb as possible,
> and I'd like the actual sites that request validation to blow up
> immediately, if they issue an overlapping request. Is my expectation
> wrong?
>

If we go with the tracking only until we complete the pre-validation
then we can use the hardware PVALIDATE help to detect the doubly
validation and abort it. There is no need to check the overlap conditions.


>>>> The patch #18 extend the MemEncrypt{Set,Clear}PageEncMask() to call
>>>> the SNP page state change during the C-bit toggle.
>>> - When exactly do we invalidate?
>> Basically before the page is transition from C=1 -> 0 in the page
>> table.
>>
>>> - Does the time and place of invalidation depend on whether we
>>> perform pre-validation, or lazy validation?
>> It does not matter. Both type of validation update the Validated bit
>> in the RMP table. The important thing is that invalidation can happen
>> only for a already validated private page. If the page is shared then
>> invalidation will result in #GP.
>>> - Is page invalidation idempotent too?
>> We should treat this as idempotent too. As I said, tacking the page
>> state is strongly recommended for the security reason. The SNP white
>> paper has a section dedicated for the validation and may able to
>> clarify things which is not covered in my response.
>>> - What is the consequence (security impact) if we forget
>>> invalidation?
>> I can't think of a security implication of it but if we forget to
>> invalidate then it can lead issues in the tracking data structure when
>> it comes to validation next time. e.g a page was mapped as C=1 and we
>> changed it to C=0 without invalidating it. Now if the same page is
>> being changed from C=0 -> 1 then tracking data structure may think
>> that page is already validated and will skip the validation process
>> and thus access will result in #VC (page-not-validated).
> As I said above, I don't see where the validation tracking and the C-bit
> flipping "meet". Especially across the PEI and DXE phases.
>
> Basically I'm missing the core concept for maintaining a firmware-global
> structure, for tracking the validation status of GPA intervals.
>
> Up to and *excluding* the DXE phase, the validation tracking should be
> trivial, naive, and unforgiving. This is because, in the PEI phase,
> validation is one-shot. I would expect the tracker structure to live in
> the PlatformPei module.
>
> Starting with the DXE phase, where we need (in)validation to closely
> accompany C-bit toggling, we need a different, performant data structure
> for tracking validation status. This data structure should be primed
> from the status last set up in the PEI phase. The tracking should still
> be unforgiving. I believe *this* tracker structure may have to live in
> the IOMMU driver.
>
> BaseMemEncryptSevLib may offer helper functions, but even if it does, it
> should never attempt to *own* tracking information, via an internal
> global variable. Furthermore, those helper functions that only make
> sense for a particular firmware phase, should have such implementations
> that hang hard, in library instances that match the *other* firmware
> phases. For example, SevSnpValidateSystemRam() should have a
> DXE-specific implementation that hangs hard.
>
> It's entirely possible that my points above are complete garbage. I
> don't have a grasp on the core concept, for the firmware-wide tracking
> of the validation status.
>
> ... Ugh, wait. I've just noticed something. In patch#19,
> "SEC_SEV_ES_WORK_AREA.SnpSystemRamValidatedRootAddress" is introduced,
> which seems (?) to be implementing the kind of global (inter-driver,
> inter-phase) tracking that I've been missing. However, even if my guess
> (?) is correct about that, this idea definitely doesn't belong in the
> last patch, presented as a "bugfix". Even if it were a bugfix, it should
> be incorporated into one of the earlier patches (so that we preferably
> not introduce the bug in the first place). However, my argument is that
> it's not a bugfix -- to me that seems to be the core concept. That's
> what the whole series needs to be built upon. (In all of my thinking,
> before this particular paragraph, I *never* looked at patch#19. There
> was no reason to.)

It was my bad that I should have highlighted this before and not left it
to the last patch. As you can see the only reason pass the data
structure from PEI to DXE is to catch those non SYSTEM invalidation
request coming from the AmdSev Driver. I am too happy about the solution
I have in this RFC. I want to go with option to change the call sites to
tell us whether its a system RAM or non existence memory type. 


>
>>> - There are four page states I can imagine:
>>>   - encrypted for host access, valid for guest access
>>>   - encrypted for host access, invalid for guest access
>>>   - decrypted for host access, valid for guest access
>>>   - decrypted for host access, invalid for guest access
>>>
>>> Does a state exist, from these four, that's never used? (Potentially
>>> caught by the hardware?)
>> I would not mix the page state with the host access. Basically there
>> is two page state exist
>>
>> - Guest-Valid
>>
>> - Guest-Invalid
>>
>> By default all the guest memory is in Guest-Invalid state on boot. The
>> PVALIDATE instruction is used to transition from Valid->Invalid.
> Do you mean "Invalid->Valid" transition?


Ah yes that is correct.


>
>
>> Guest can use the PVALIDATE to change the state from Valid to Invalid.
>>
>> A guest private page (i.e page mapped C=1) must be in Guest Valid
>> state. If hardware see that a private access page is not in the
>> Guest-Valid state then it will trigger #VC.
>>
>>
>>> Do the patches highlight / explain the validity transitions, in
>>> comments? My understanding is that the C-bit toggle and the guest
>>> access valid/invalid toggle are separate actions, so "middle" states
>>> do exist, but perhaps only temporarily.
>> I have tried to document the steps in the patch description and I can
>> add more comments as required. When SEV-SNP is active then we need to
>> invalidate the page before toggling the C-bit from C=1 -> 0 and
>> similarly after the page is toggled from C=0 -> 1 we must validate it.
>>
>>
>>> I'm curious how it works, for example, with variousvirtio transfers
>>> (bus master read, bus master write, bus master common buffer). In the
>>> IoMmuDxe driver minimally, we access memory both through PTEs with
>>> the C-bit set and through PTEs with the C-bit clear, meaning that
>>> "encrypted, valid", and "decrypted, valid" are both required states.
>>> But that seems to conflict with the notion that "C-bit toggle" be
>>> directly coupled with a "validity toggle".
>> That should not be an issue. Each page have entry in the RMP table.
>> So, you can have one page as guest invalid and another page as a guest
>> valid.
>>
>>> Put differently, I'm happy that modifying IoMmuDxe appears
>>> unnecessary, but then that tells me I'm missing something about the
>>> state transitions between the above *four* states.
>>
>> Only thing which I would propose changing in IoMmuDxe driver is to use
>> a pre-allocated buffer and avoid toggling the C-bit on every
>> read/write. In past the C-bit toggle was all contained inside the
>> guest. But when SNP is enabled, each C-bit toggle causes two
>> additional steps #1 RMPUPDATE and #2 PVALIDATE. The RMPUPDATE is
>> requested using the VMEXIT so it will cause a VM context switch. I
>> would prefer to avoid it.
>>
>> My current patch set does not have this optimization yet.
> Please let's stick with the absolute minimum in this series, yes.
>
> I realize my email is complete chaos. That's because my understanding of
> this work is complete chaos.
>
> Can we approach this feature as follows:
>
> - Make "validation tracking" the core tenet (central principle) of this
>   feature.
>
> - Design the validation *patterns* that are required in each firmware
>   phase.
>
> - Identify the *sole* owner module, for validation tracking, in each
>   firmware phase. If we can't readily do this, we might have to
>   introduce new drivers (PEIMs, DXE drivers?) and new interfaces (PPIs,
>   protocols?)
>
> - Design the hand-off between the owner modules (on firmware phase
>   boundaries).
>
> - Add only helper functions to BaseMemEncryptSevLib, without any
>   ownership of tracking info. The helpers should be as restricted as
>   possible.
>
> I'd like to avoid "smearing" page validation responsibility over a bunch
> of modules. (See again: TianoCore#847.) I'd like responsibilities
> strictly centralized.
>
> If you feel that my points make no sense, I can start going over the
> individual patches. However, even that way, I'd have found patch#19 only
> in the end, and that doesn't seem right. And I wouldn't like to go
> "numb" on the actual code, being distracted by style issues and so on,
> before we have some agreement on the core approach.


Certainly your points makes sense. Let me know what you think about
removing the tracking data structure and using a liner array of
pre-valiated range (build with a Fixed PCD) works ? I was hoping you to
just glance over it and provide me high level feedback. I will go
through styles and comments in great details on non RFC patches. I still
have some TODO items (e.g Secrets, CPUID etc) before we take off the RFC
tag. Thank you so much for your detail feedback.


>
> Another tip: if you can add content to MdePkg that is governed by AMD
> specifications, such as patches #3, #5, #9, I would recommend splitting
> that off to a separate (pre-requisite) series. That content seems
> unaffected by whatever we do in OvmfPkg, and it would help decrease the
> size of the OvmfPkg series. It's not like we're going to abandon the
> SEV-SNP feature, so I don't think there's a risk in merging the MdePkg
> series first, while the OvmfPkg series is not ready. The MdePkg stuff
> will not remain unused indefinitely.


Ah, sure I can do that. 


-Brijsh



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#73985): https://edk2.groups.io/g/devel/message/73985
Mute This Topic: https://groups.io/mt/81584576/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support
Posted by Laszlo Ersek 3 years ago
On 04/12/21 22:14, Brijesh Singh wrote:
>
> On 4/12/21 11:23 AM, Laszlo Ersek wrote:
>> On 04/10/21 00:43, Brijesh Singh wrote:
>>> On 4/9/21 7:24 AM, Laszlo Ersek wrote:

>>>> What is the primary threat (in simple terms please) that validation
>>>> is supposed to prevent?
>>>
>>> To protect against the memory re-mapping attack the guest pages must
>>> be validated. The idea is that every guest page can map only to a
>>> single physical memory page at one time.
>> I don't understand. How can a given guest page frame map to multiple
>> host page frames?
>>
>> Do you mean that the situation to prevent is when multiple guest page
>> frames (GPAs) map to the same host page frame, as set up by the
>> hypervisor?
>>
> A malicious hypervisor may attempt to remap validated gpa to a
> different host frame. The PVALIDATE is designed to protect against
> those type of attacks. See
> https://static.sched.com/hosted_files/lsseu2019/65/SEV-SNP%20Slides%20Nov%201%202019.pdf
> (slide 13). You can also find more information about it in the SEV-SNP
> whitepaper.

Thanks for the link. Page 7 confirms my understanding:

"""
Memory Aliasing
Example attack: Map two guest pages to same DRAM page
"""

(On the other hand, slide 12 seems to be buggy, "Page validation
guarantees that each GPA only maps to one valid SPA at any given time"
-- the GPA:SPA relationship is n:1, so any given GPA could never ever
map to multiple SPAs, regardless of SEV-SNP. The multiplicity is
possible in the other direction (a single SPA is exposed as multiple
GPAs, maybe in a single guest, maybe in multiple guests), and that's
what SEV-SNP is supposed to prevent. The GPA->SPA mapping must be
injective, but that's not what the words on the slide actually express.)

The page remapping slide (13) does not necessarily break injectivity,
and even if the hypervisor does that, the same GPA *still* corresponds
to a single SPA, at this new ("one") time. It's just that the
correspondence is not what the guest has approved. Expressing this
correctly would require a temporal logic formula of sorts.

Anyway... at least now I (believe I) understand that these are two
separate attacks (aliasing vs. remapping).

>> If pre-validation is simpler, and the only drawback is a one-time hit
>> during boot, then wouldn't it be better to stick with pre-validation
>> forever? I expect that would be a lot simpler for (a) the #VC
>> handlers, (b) the tracking of the "already validated" information.
>
> This could be an issue for the larger VMs. The boot delay will vary
> based on the VM size. In addition to the boot delay,  the PVALIDATE
> instruction requires that there is a valid entry for the page in the
> nested page table. If the entry does not exist in the NPT then HV will
> get #NPF and will fill the NPT with the backing host page. By using
> the lazy-validation we can push allocation of the backing pages to
> only when required and thus release the memory pressure on the VMM.

Ugh... OK, I guess. I'd need to digest this a lot more, for a fully
informed agreement, but I'm happy with this answer (and my rudimentary
understanding thereof) for now.

>> I guess my question related to the guest code that executes from
>> pflash, and/or accesses data from pflash, before the guest itself
>> could ask for any validation. Such as, the reset vector (which runs
>> from pflash). Then, some non-volatile UEFI variable data that resides
>> in the other pflash chip, and is accessed (read-only) during the PEI
>> phase (the memory type information variable namely). I understood
>> your comments as QEMU pre-validating those areas before guest launch.
>> Is that correct?
>
>
> Yes that is correct. All the OVMF pflash0 memory will be pre-validated
> before the guest is launched. IIRC, the variables which resides on the
> other pflash chip are accessed unencrypted in the guest. Only the
> encrypted memory access (aka C=1) need to be validated. The host MMIO
> regions does not need to be validated.

Good point. Thanks for the explanation / reminder!

> See if my above comment makes sense on why we may need the lazy
> validation. IMO, we should not be going with the lazy validation in
> OVMF. To minimize the boot delay and avoid the complexity of the code
> in the best case OVMF validated lower 4GB memory and mark rest of the
> memory as "unaccepted" in the UEFI map.

I'm going to violently support any idea that keeps things simple for
OVMF -- is this "lower 4GB" what the current patch set does already? Or
is it the proposed next step?

> David from google has started a thread about lazy validation in kernel
> https://www.spinics.net/lists/linux-mm/msg243954.html. We would like
> to find a method which works for both Intel TDX and AMD SEV. We will
> build these support incrementally both in the kernel and OVMF.

Ah, OK. That explains it. Next step then. Hmm, yes, patch#15 seems to
pre-validate everything now. OK.

> Yes this will work fine. The primary reason for me adding the interval
> tree was to detect the overlap condition before we finalize the
> pre-validation in PEI. The validation flow works like this:
>
> - Qemu validated few pages
>
> - SEC validates few more page
>
> - When we detect the memory in the PEI we need to exclude the pages
> validated by the Qemu and SEC.
>
> The page ranges validated using the Qemu and SEC are accessible
> through a fixed PCDs. I can do simple overlap check with those fixed
> PCDs and skip the pre-validated ranges. I don't have any strong reason
> to use the interval tree to detect the overlap condition. There is no
> call to toggle the C-bit before we finalize the validation.

Thank you. This is a *huge* relief.

> IMO, there is no need for tracking the page state after we finish the
> pre-validation. After the pre-validtion is completed, the caller will
> use either clear the C-bit or set the C-bit. If it attempts to set the
> C-bit without clearing it first then its a bug -- which will be caught
> by the page state change internal function. We can use the output of
> CF flag from the PVALIDATE to determine where caller is trying to do
> doubly validation or invalidation and abort it.

Sounds good. Even in case the final PVALIDATE action proves incorrect,
if we catch that immediately via CF, and refuse to do anything else
(i.e. we won't go ahead and use the page(s) in question), things should
be OK.

>> (6) Can you please confirm that the order in which
>> MemEncryptSevSnpValidateSystemRam()'s declaration, call sites, and
>> implementations are introduced, is sensible? I can't tell
>> immediately, but I'd like to be sure that the tree at least builds at
>> every stage (no linkage errors at any stage).
>
> I can certainly say the order in which they are introduced and the
> call sites are a sensible. I don't expect any linkage failure but
> since it was an RFC so I didn't go through making sure that it builds
> in every stage. I wanted to get the overall direction on where
> community would like to go before making the final changes. You have
> been asking some very good question and that is certainly helping us
> to reduce some of the complexity in the patch.

Thanks. I've been very stressed as to whether my questions or
"proposals" make any sense.

And yes, it's a desirable trait for a patch set to build at every stage
(bisectability).

> As I highlighted above there is actually no need to track the page
> state after we successfully complete the pre-validation.

Again, I welcome this very much.

> All these guest page
> validation or invalidation are applicable to the system RAM. But
> AmdSevDxe driver calls to clear the C-bit of the MMIO range. These
> range are not a system and have never been validated so we
> invalidating it will cause an issue.

Understood.

> Since I had the data structures available in PEI
> phase for the tracking the page state hence made those available to
> DXE to verify that we are called to invalidate the SYSTEM_RAM and not
> MMIO. IMO, we should either extend the
> MemEncryptSev{Set,Clear}PageEncMask() to pass either a new flag to
> hint where its system RAM or non-RAM. Thoughts ?

I think I'd prefer a new function in the lib class, one that's dedicated
to clearing the C bit on MMIO without attempting invalidation. It's a
special-case function, and the dedicated name helps with two things:

- no need to mess with existing call sites except in AmdSevDxe,
- the new function is easier to grep for.

BTW, there would be at least one other approach for this, I think -- the
PEI instance of the library could consult the HOB list, and the DXE
instance of the library could consult the GCD memory space map, to
decide whether the page being C-bit-flipped is MMIO or not. But that's a
lot of complexity (and likely some performance hit too), when in fact we
know at the call sites whether the area being encrypted/decrypted is
MMIO or RAM.

Note also that (IIUC) in AmdSevDxe, we only *decrypt*, so we only need
*one* new function, "MemEncryptSevClearMmioPageEncMask" or similar.

BTW, in the "MemEncryptSevLib.h" header, the documentation of the
MemEncryptSev{Set,Clear}PageEncMask function declarations should be
updated -- the leading comments should explain that, in case SEV-SNP is
found active, then page (in)validation will occur too (as appropriate).

> Yes we should bail out if such a request comes from the high level
> module. This is why I was actually not checking the tracking structure
> when toggling the C-bit. The complete memory much have been validated
> before we are asked to toggle the C-bit. If a module is performing a
> doubly validation or invalidation then it should be aborted to avoid
> any future exploits.

Great, thank you.

> If we go with the tracking only until we complete the pre-validation
> then we can use the hardware PVALIDATE help to detect the doubly
> validation and abort it. There is no need to check the overlap
> conditions.

Again, great. In the pre-validation step, where you explicitly carve out
the PCD-described ranges that have been handled already by QEMU and
earlier phases of the firmware, you can still check the PVALIDATE result
(CF), just to be sure.

> Certainly your points makes sense. Let me know what you think about
> removing the tracking data structure and using a liner array of
> pre-valiated range (build with a Fixed PCD) works ?

I couldn't have hoped for such a great resolution to this conundrum.
Yes, please do that.

> I was hoping you to just glance over it and provide me high level
> feedback. I will go through styles and comments in great details on
> non RFC patches. I still have some TODO items (e.g Secrets, CPUID etc)
> before we take off the RFC tag. Thank you so much for your detail
> feedback.

Well, this is the first email on the SEV-SNP topic where what I feel can
be described as "relief" :)

Thank you!
Laszlo



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#74041): https://edk2.groups.io/g/devel/message/74041
Mute This Topic: https://groups.io/mt/81584576/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [RFC PATCH 00/19] Add AMD Secure Nested Paging (SEV-SNP) support
Posted by Brijesh Singh 3 years ago
>> Since I had the data structures available in PEI
>> phase for the tracking the page state hence made those available to
>> DXE to verify that we are called to invalidate the SYSTEM_RAM and not
>> MMIO. IMO, we should either extend the
>> MemEncryptSev{Set,Clear}PageEncMask() to pass either a new flag to
>> hint where its system RAM or non-RAM. Thoughts ?
> I think I'd prefer a new function in the lib class, one that's dedicated
> to clearing the C bit on MMIO without attempting invalidation. It's a
> special-case function, and the dedicated name helps with two things:
>
> - no need to mess with existing call sites except in AmdSevDxe,
> - the new function is easier to grep for.
>
> BTW, there would be at least one other approach for this, I think -- the
> PEI instance of the library could consult the HOB list, and the DXE
> instance of the library could consult the GCD memory space map, to
> decide whether the page being C-bit-flipped is MMIO or not. But that's a
> lot of complexity (and likely some performance hit too), when in fact we
> know at the call sites whether the area being encrypted/decrypted is
> MMIO or RAM.


Sound good to me, will introduce new function for the MMIO in v2.


> Note also that (IIUC) in AmdSevDxe, we only *decrypt*, so we only need
> *one* new function, "MemEncryptSevClearMmioPageEncMask" or similar.
>
> BTW, in the "MemEncryptSevLib.h" header, the documentation of the
> MemEncryptSev{Set,Clear}PageEncMask function declarations should be
> updated -- the leading comments should explain that, in case SEV-SNP is
> found active, then page (in)validation will occur too (as appropriate).

Noted.




-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#74094): https://edk2.groups.io/g/devel/message/74094
Mute This Topic: https://groups.io/mt/81584576/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-