[RFC 0/3] Expose Confidential Computing capabilities on sysfs

Alejandro Jimenez posted 3 patches 4 years, 3 months ago
[RFC 0/3] Expose Confidential Computing capabilities on sysfs
Posted by Alejandro Jimenez 4 years, 3 months ago
Given the growing number of Confidential Computing features (AMD SME/SEV, Intel
TDX), I believe it is useful to expose relevant state/parameters in sysfs.
For example, for AMD memory encryption features, the distinction between possible states
(supported/enabled/active) is explained in the documentation at:

https://www.kernel.org/doc/Documentation/x86/amd-memory-encryption.txt

but there are currently no standard interfaces to determine state and other
relevant info (e.g. nr of SEV ASIDs) besides searching dmesg or manually reading
various CPUID leaves and MSRs.

This patchset implements a sysfs interface where only relevant attributes are
displayed depending on context (e.g. no SME entry or ASID attributes are created
when running in a guest).

On EPYC Milan host:

$ grep -r . /sys/kernel/mm/mem_encrypt/*
/sys/kernel/mm/mem_encrypt/c_bit_position:51
/sys/kernel/mm/mem_encrypt/sev/nr_sev_asid:509
/sys/kernel/mm/mem_encrypt/sev/status:enabled
/sys/kernel/mm/mem_encrypt/sev/nr_asid_available:509
/sys/kernel/mm/mem_encrypt/sev_es/nr_sev_es_asid:0
/sys/kernel/mm/mem_encrypt/sev_es/status:enabled
/sys/kernel/mm/mem_encrypt/sev_es/nr_asid_available:509
/sys/kernel/mm/mem_encrypt/sme/status:active

On SEV guest running on EPYC Milan host (displays only relevant entries):

$ grep -r . /sys/kernel/mm/mem_encrypt/*
/sys/kernel/mm/mem_encrypt/c_bit_position:51
/sys/kernel/mm/mem_encrypt/sev/status:active
/sys/kernel/mm/mem_encrypt/sev_es/status:unsupported

The full directory tree looks like:

/sys/kernel/mm/mem_encrypt/
├── c_bit_position
├── sev
│   ├── nr_asid_available
│   ├── nr_sev_asid
│   └── status
├── sev_es
│   ├── nr_asid_available
│   ├── nr_sev_es_asid
│   └── status
└── sme
    └── status
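
For illustration only, a userspace consumer could collect the same
information as the "grep -r ." commands above with a few lines of Python.
This is a minimal sketch, assuming the tree proposed by this RFC exists at
/sys/kernel/mm/mem_encrypt; the helper name read_sysfs_tree is hypothetical
and not part of the patchset.

```python
import os


def read_sysfs_tree(root):
    """Return {relative_path: value} for every readable file under root."""
    attrs = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path) as f:
                    value = f.read().strip()
            except OSError:
                continue  # attribute not readable; skip it
            attrs[os.path.relpath(path, root)] = value
    return attrs


if __name__ == "__main__":
    # Path proposed by this RFC; it only exists with the patches applied.
    tree = read_sysfs_tree("/sys/kernel/mm/mem_encrypt")
    for key, value in sorted(tree.items()):
        print(f"{key}:{value}")
```

On a system without the patches the directory is absent and the sketch
simply prints nothing.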

The goal is to be able to easily add new entries as new features (TDX, SEV-SNP)
are merged.

I'd appreciate any suggestions/comments.

Thank you,
Alejandro

Alejandro Jimenez (3):
  x86: Expose Secure Memory Encryption capabilities in sysfs
  x86: Expose SEV capabilities in sysfs
  x86: Expose SEV-ES capabilities in sysfs

 .../ABI/testing/sysfs-kernel-mm-mem-encrypt   |  88 +++++
 arch/x86/include/asm/mem_encrypt.h            |   6 +
 arch/x86/mm/mem_encrypt.c                     |  27 ++
 arch/x86/mm/mem_encrypt_amd.c                 | 320 ++++++++++++++++++
 4 files changed, 441 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-mem-encrypt

-- 
2.34.1

Re: [RFC 0/3] Expose Confidential Computing capabilities on sysfs
Posted by Dave Hansen 4 years, 3 months ago
On 3/9/22 14:06, Alejandro Jimenez wrote:
> On EPYC Milan host:
> 
> $ grep -r . /sys/kernel/mm/mem_encrypt/*
> /sys/kernel/mm/mem_encrypt/c_bit_position:51

Why on earth would we want to expose this to userspace?

> /sys/kernel/mm/mem_encrypt/sev/nr_sev_asid:509
> /sys/kernel/mm/mem_encrypt/sev/status:enabled
> /sys/kernel/mm/mem_encrypt/sev/nr_asid_available:509
> /sys/kernel/mm/mem_encrypt/sev_es/nr_sev_es_asid:0
> /sys/kernel/mm/mem_encrypt/sev_es/status:enabled
> /sys/kernel/mm/mem_encrypt/sev_es/nr_asid_available:509
> /sys/kernel/mm/mem_encrypt/sme/status:active

For all of this...  What will userspace *do* with it?

For nr_asid_available, I get it.  It tells you how many guests you can
still run.  But, TDX will need the same logical thing.  Should TDX hosts
go looking for this in:

	/sys/kernel/mm/mem_encrypt/tdx/available_guest_key_ids

?

If it's something that's common, it needs to be somewhere common.
Re: [RFC 0/3] Expose Confidential Computing capabilities on sysfs
Posted by Alejandro Jimenez 4 years, 3 months ago
On 3/9/2022 5:40 PM, Dave Hansen wrote:
> On 3/9/22 14:06, Alejandro Jimenez wrote:>
>> On EPYC Milan host:
>>
>> $ grep -r . /sys/kernel/mm/mem_encrypt/*
>> /sys/kernel/mm/mem_encrypt/c_bit_position:51
> Why on earth would we want to expose this to userspace?
>
>> /sys/kernel/mm/mem_encrypt/sev/nr_sev_asid:509
>> /sys/kernel/mm/mem_encrypt/sev/status:enabled
>> /sys/kernel/mm/mem_encrypt/sev/nr_asid_available:509
>> /sys/kernel/mm/mem_encrypt/sev_es/nr_sev_es_asid:0
>> /sys/kernel/mm/mem_encrypt/sev_es/status:enabled
>> /sys/kernel/mm/mem_encrypt/sev_es/nr_asid_available:509
>> /sys/kernel/mm/mem_encrypt/sme/status:active
> For all of this...  What will userspace *do* with it?

In my case, this information was useful to know for debugging failures 
when testing the various features (e.g. need to specify cbitpos property 
on QEMU sev-guest object).
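
As a rough sketch of that use case, the C-bit position could be read from
the proposed attribute and turned into the argument for a QEMU sev-guest
object. The c_bit_position file only exists with this RFC applied, and the
helper name, the object id "sev0" and the reduced-phys-bits value of 1 are
illustrative assumptions, not values taken from the patches.

```python
def sev_guest_object(cbitpos_text, reduced_phys_bits=1, obj_id="sev0"):
    """Build a QEMU '-object sev-guest,...' argument from c_bit_position text.

    reduced_phys_bits and obj_id are placeholder assumptions for this sketch.
    """
    cbitpos = int(cbitpos_text.strip())
    return (
        f"sev-guest,id={obj_id},cbitpos={cbitpos},"
        f"reduced-phys-bits={reduced_phys_bits}"
    )


if __name__ == "__main__":
    # Attribute proposed by this RFC; absent on kernels without the patches.
    path = "/sys/kernel/mm/mem_encrypt/c_bit_position"
    try:
        with open(path) as f:
            print("-object", sev_guest_object(f.read()))
    except OSError as err:
        print(f"cannot read {path}: {err}")
```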

It helps get an account of what is currently supported/enabled/active on 
the host/guest, given that some of these capabilities will interact with 
other components and cause boot hangs or errors (e.g. AVIC+SME or 
AVIC+SEV hangs at boot, SEV guests with some configurations need to 
increase SWIOTLB limit).

The sysfs entry basically answers the questions in 
https://github.com/AMDESE/AMDSEV#faq without needing to run 
virsh/qmp-shell/rdmsr.

I am aware that having a new sysfs entry mostly to facilitate debugging 
might not be warranted, so I have tagged this as an RFC to ask if others 
working in this space have found additional use cases, or just want the 
convenience of having the data for current and future CoCo features in a 
single location.

>
> For nr_asid_available, I get it.  It tells you how many guests you can
> still run.  But, TDX will need the same logical thing.  Should TDX hosts
> go looking for this in:
>
> 	/sys/kernel/mm/mem_encrypt/tdx/available_guest_key_ids
>
> ?
>
> If it's something that's common, it needs to be somewhere common.
I think it makes sense to have common attributes for all CoCo providers 
under /sys/kernel/mm/mem_encrypt/. The various CoCo providers can create 
entries under mem_encrypt/<feature> exposing the information relevant to 
their specific features like these patches implement for the AMD case, 
and populate or link the <common_attr> attribute with the appropriate value.

Then we can have:

/sys/kernel/mm/mem_encrypt/
-- common_attr
-- sme/
-- sev/
-- sev_es/

or:

/sys/kernel/mm/mem_encrypt/
-- common_attr
-- tdx/

Note that at any single time, we are only creating entries that are 
applicable to the hardware we are running on, so there is not a mix of 
tdx and sme/sev subdirs.

I suspect it will be difficult to agree on what is "common" or even a 
descriptive name. Let's say this common attribute will be:

         /sys/kernel/mm/mem_encrypt/common_key

Where common_key can represent AMD SEV ASIDs/AMD SEV-{ES,SNP} ASIDs, or 
Intel TDX KeyIDs (private/shared), or s390x SEID (Secure Execution IDs), 
or <insert relevant ARM CCA attribute>.

We can have a (probably long) discussion to agree on the above; this 
patchset just attempts to provide a framework for registering different 
providers, and implements the AMD current capabilities.

Thank you,
Alejandro

Re: [RFC 0/3] Expose Confidential Computing capabilities on sysfs
Posted by Isaku Yamahata 4 years, 3 months ago
Added kvm@vger.kernel.org.

On Thu, Mar 10, 2022 at 01:07:33PM -0500,
Alejandro Jimenez <alejandro.j.jimenez@oracle.com> wrote:

> 
> On 3/9/2022 5:40 PM, Dave Hansen wrote:
> > On 3/9/22 14:06, Alejandro Jimenez wrote:>
> > > On EPYC Milan host:
> > > 
> > > $ grep -r . /sys/kernel/mm/mem_encrypt/*
> > > /sys/kernel/mm/mem_encrypt/c_bit_position:51
> > Why on earth would we want to expose this to userspace?
> > 
> > > /sys/kernel/mm/mem_encrypt/sev/nr_sev_asid:509
> > > /sys/kernel/mm/mem_encrypt/sev/status:enabled
> > > /sys/kernel/mm/mem_encrypt/sev/nr_asid_available:509
> > > /sys/kernel/mm/mem_encrypt/sev_es/nr_sev_es_asid:0
> > > /sys/kernel/mm/mem_encrypt/sev_es/status:enabled
> > > /sys/kernel/mm/mem_encrypt/sev_es/nr_asid_available:509
> > > /sys/kernel/mm/mem_encrypt/sme/status:active
> > For all of this...  What will userspace *do* with it?
> 
> In my case, this information was useful to know for debugging failures when
> testing the various features (e.g. need to specify cbitpos property on QEMU
> sev-guest object).
> 
> It helps get an account of what is currently supported/enabled/active on the
> host/guest, given that some of these capabilities will interact with other
> components and cause boot hangs or errors (e.g. AVIC+SME or AVIC+SEV hangs
> at boot, SEV guests with some configurations need to increase SWIOTLB
> limit).
> 
> The sysfs entry basically answers the questions in
> https://github.com/AMDESE/AMDSEV#faq without needing to run
> virsh/qmp-shell/rdmsr.
> 
> I am aware than having a new sysfs entry mostly to facilitate debugging
> might not be warranted, so I have tagged this as an RFC to ask if others
> working in this space have found additional use cases, or just want the
> convenience of having the data for current and future CoCo features in a
> single location.
> > 
> > For nr_asid_available, I get it.  It tells you how many guests you can
> > still run.  But, TDX will need the same logical thing.  Should TDX hosts
> > go looking for this in:
> > 
> > 	/sys/kernel/mm/mem_encrypt/tdx/available_guest_key_ids
> > 
> > ?
> > 
> > If it's something that's common, it needs to be somewhere common.
> I think it makes sense to have common attributes for all CoCo providers
> under /sys/kernel/mm/mem_encrypt/. The various CoCo providers can create
> entries under mem_encrypt/<feature> exposing the information relevant to
> their specific features like these patches implement for the AMD case, and
> populate or link the <common_attr> attribute with the appropriate value.
> 
> Then we can have:
> 
> /sys/kernel/mm/mem_encrypt/
> -- common_attr
> -- sme/
> -- sev/
> -- sev_es/
> 
> or:
> 
> /sys/kernel/mm/mem_encrypt/
> -- common_attr
> -- tdx/
> 
> Note that at any single time, we are only creating entries that are
> applicable to the hardware we are running on, so there is not a mix of tdx
> and sme/sev subdirs.
> 
> I suspect it will be difficult to agree on what is "common" or even a
> descriptive name. Lets say this common attribute will be:
> 
>         /sys/kernel/mm/mem_encrypt/common_key
> 
> Where common_key can represent AMD SEV ASIDs/AMD SEV-{ES,SNP} ASIDs, or
> Intel TDX KeyIDs (private/shared), or s390x SEID (Secure Execution IDs), or
> <insert relevant ARM CCA attribute>.
> 
> We can have a (probably long) discussion to agree on the above; this
> patchset just attempts to provide a framework for registering different
> providers, and implements the AMD current capabilities.

The number of available key IDs (TDX KeyIDs or whatever they are called) can be
common.  The common misc cgroup is probably the desirable place for it.  I don't
see any other common attribute, though, and I have no requirement to expose the
bit position etc.

TDX requires firmware that provides information about itself.  Because it is
firmware, I'm going to use /sys/firmware/tdx.

More concretely:
- CPU feature (Secure Arbitration Mode: SEAM) as a "seam" flag in /proc/cpuinfo
- TDX firmware (P-SEAMLDR and TDX module) information in /sys/firmware/tdx/

What:           /sys/firmware/tdx/
Description:
                Intel's Trust Domain Extensions (TDX) protect guest VMs from
                malicious hosts and some physical attacks.  This directory
                represents the entry point directory for the TDX.

                TDX requires the TDX firmware to be loaded into an isolated
                memory region.  Loading is a two-step process: the first-phase
                firmware loader (a.k.a. NP-SEAMLDR) loads the second-phase
                firmware loader (a.k.a. P-SEAMLDR), which in turn loads the TDX
                firmware (a.k.a. the "TDX module").
                =============== ================================================
                keyid_num       the number of SEAM keyid as a hexadecimal
                                number with the "0x" prefix.
                =============== ================================================
Users:          libvirt

What:           /sys/firmware/tdx/p_seamldr/
Description:
                The P-SEAMLDR is the TDX module loader. The P-SEAMLDR comes
                with its attributes, vendor_id, build_date, build_num, minor
                version, major version to identify itself.

                Provides the information about the P-SEAMLDR loaded on the
                platform.  This directory exists if the P-SEAMLDR is
                successfully loaded.  It contains the following read-only files.
                The information corresponds to the data structure, SEAMLDR_INFO.
                The admins or VMM management software like libvirt can refer to
                that information, determine if P-SEAMLDR is supported, and
                identify the loaded P-SEAMLDR.

                =============== ================================================
                version         structure version of SEAMLDR_INFO as a
                                hexadecimal number with the "0x" prefix
                                "0x0".
                attributes      32bit flags as a hexadecimal number with the
                                "0x" prefix.
                                Bit 31 - Production-worthy (0) or
                                         debug (1).
                                Bits 30:0 - Reserved 0.
                vendor_id       Vendor ID as a hexadecimal number with the "0x"
                                prefix.
                                "0x0806" (Intel P-SEAMLDR module).
                build_date      Build date in yyyy.mm.dd BCD format.
                build_num       Build number as a hexadecimal number with the
                                "0x" prefix.
                minor           Minor version number as a hexadecimal number
                                with the "0x" prefix.
                major           Major version number as a hexadecimal number
                                with the "0x" prefix.
                seaminfo        The SEAM information of the TDX module currently
                                loaded as binary file.
                seam_ready      A boolean flag that indicates that a debuggable
                                TDX module can be loaded as a hexadecimal number
                                with the "0x" prefix.
                p_seamldr_ready A boolean flag that indicates that the P-SEAMLDR
                                module is ready for SEAMCALLs as a hexadecimal
                                number with the "0x" prefix.
                =============== ================================================
Users:          libvirt
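
A minimal sketch of how management software might consume these files,
assuming each one holds a single hexadecimal number with the "0x" prefix as
described above. The helper names are hypothetical; only the bit layout
(bit 31 set means debug) is taken from the description.

```python
def parse_hex_attr(text):
    """Parse the contents of a '0x'-prefixed hexadecimal sysfs attribute."""
    return int(text.strip(), 16)


def is_debug_module(attributes_text):
    """Return True if bit 31 (debug) is set in the 32-bit 'attributes' flags."""
    return bool(parse_hex_attr(attributes_text) & (1 << 31))


if __name__ == "__main__":
    # Proposed location; the directory exists only if P-SEAMLDR was loaded.
    path = "/sys/firmware/tdx/p_seamldr/attributes"
    try:
        with open(path) as f:
            kind = "debug" if is_debug_module(f.read()) else "production"
            print(f"P-SEAMLDR is a {kind} build")
    except OSError as err:
        print(f"cannot read {path}: {err}")
```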

What:           /sys/firmware/tdx/tdx_module/
Description:
                TDX requires firmware known as the TDX module.  It comes
                with its attributes, vendor_id, build_date, build_num,
                minor_version, major_version, etc.

                Provides the information about the TDX module loaded on the
                platform.  It contains the following read-only files.  The
                information corresponds to the data structure, TDSYSINFO_STRUCT.
                The admins or VMM management software like libvirt can refer to
                that information, determine if TDX is supported, and identify
                the loaded TDX module.

                ================== ============================================
                status             string of the TDX module status.
                                   "unknown"
                                   "none": the TDX module is not loaded
                                   "loaded": The TDX module is loaded, but not
                                             initialized
                                   "initialized": the TDX module is fully
                                                  initialized
                                   "shutdown": the TDX module is shutdown due to
                                               error during initialization.
                attributes         32bit flags of the TDX module attributes as
                                   a hexadecimal number with the "0x" prefix.
                                   Bit 31 - a production module (0) or
                                            a debug module (1).
                                   Bits 30:0 - Reserved, set to 0.
                vendor_id          vendor ID as a hexadecimal number with the
                                   "0x" prefix.
                build_date         build date in yyyymmdd BCD format.
                build_num          build number as a hexadecimal number with
                                   the "0x" prefix.
                minor_version      minor version as a hexadecimal number with
                                   the "0x" prefix.
                major_version      major version as a hexadecimal number with
                                   the "0x" prefix.
                attributes_fixed0  fixed-0 value for TD's attributes as a
                                   hexadecimal number with the "0x" prefix.
                attributes_fixed1  fixed-1 value for TD's attributes as a
                                   hexadecimal number with the "0x" prefix.
                xfam_fixed0        fixed-0 value for TD xfam value as a
                                   hexadecimal number with the "0x" prefix.
                xfam_fixed1        fixed-1 value for TD xfam value as a
                                   hexadecimal number with the "0x" prefix.
                ================== ============================================
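
To make the BCD build_date format concrete, here is a small sketch of how
userspace could decode it. It assumes the file is printed like the other
attributes, i.e. as a hexadecimal number with the "0x" prefix whose eight
nibbles are the decimal digits yyyymmdd (for example "0x20220314"); that
output format and the helper name are assumptions, not something the
description above spells out.

```python
import datetime


def decode_bcd_date(text):
    """Decode a yyyymmdd BCD value such as '0x20220314' into a date.

    In BCD every nibble is one decimal digit, so the hexadecimal rendering
    of the value can be read directly as the decimal digits of the date.
    """
    value = int(text.strip(), 16)
    digits = f"{value:08x}"
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"not a yyyymmdd BCD date: {text!r}")
    return datetime.date(int(digits[:4]), int(digits[4:6]), int(digits[6:8]))


if __name__ == "__main__":
    # Proposed location; present only on a kernel carrying the TDX patches.
    path = "/sys/firmware/tdx/tdx_module/build_date"
    try:
        with open(path) as f:
            print(decode_bcd_date(f.read()).isoformat())
    except (OSError, ValueError) as err:
        print(f"cannot decode {path}: {err}")
```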

-- 
Isaku Yamahata <isaku.yamahata@gmail.com>
Re: [RFC 0/3] Expose Confidential Computing capabilities on sysfs
Posted by Kai Huang 4 years, 3 months ago
> 
> More concretely
> - CPU feature (Secure Arbitration Mode: SEAM) as "seam" flag in /proc/cpuinfo

In my current patchset we don't have a "seam" flag in /proc/cpuinfo.

https://lore.kernel.org/kvm/cover.1647167475.git.kai.huang@intel.com/T/#m02542eb723394a81c35b9542b2763c783222d594

The TDX architecture doesn't have a CPUID bit to report SEAM, so we would need a
synthetic flag if we wanted to add one.  If userspace has a requirement to use
it, then it makes sense to add it and expose it in /proc/cpuinfo.  But so far I
don't know of any.

Thanks
-Kai


Re: [RFC 0/3] Expose Confidential Computing capabilities on sysfs
Posted by Dave Hansen 4 years, 3 months ago
On 3/14/22 15:43, Isaku Yamahata wrote:
>                 xfam_fixed0        fixed-0 value for TD xfam value as a
>                                    hexadecimal number with the "0x" prefix.
>                 xfam_fixed1        fixed-1 value for TD xfam value as a
>                                    hexadecimal number with the "0x" prefix.

I don't think we should be exporting things and creating ABI just for
the heck of it.  These are a prime example.  XFAM is reported to the
guest in CPUID.  Yes, these may be used to help *build* XFAM, but
userspace doesn't need to know how XFAM was built.  It just needs to
know what features it can use.