[PATCH v3 26/26] coco/tdx-host: Set and document TDX Module update expectations

Posted by Chao Gao 2 weeks ago
In rare cases, TDX Module updates may cause TD management operations to
fail if they occur during phases of the TD lifecycle that are sensitive
to update compatibility.

But not all combinations of P-SEAMLDR, kernel, and TDX Module have the
capability to detect and prevent said incompatibilities. Completely
disabling TDX Module updates on platforms without the capability would
be overkill, as these incompatibility cases are rare and can be
addressed by userspace through coordinated scheduling of updates and TD
management operations.

To set clear expectations for TDX Module updates, expose the capability
to detect and prevent these incompatibility cases via sysfs and
document the compatibility criteria and indications when those criteria
are violated.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
v3:
 - new, based on a reference patch from Dan Williams
---
 .../ABI/testing/sysfs-devices-faux-tdx-host   | 45 +++++++++++++++++++
 drivers/virt/coco/tdx-host/tdx-host.c         | 13 ++++++
 2 files changed, 58 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-faux-tdx-host b/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
index a3f155977016..81cb13e91f2a 100644
--- a/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
+++ b/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
@@ -29,3 +29,48 @@ Description:	(RO) Report the number of remaining updates that can be performed.
 		4.2 "SEAMLDR.INSTALL" for more information. The documentation is
 		available at:
 		https://cdrdv2-public.intel.com/739045/intel-tdx-seamldr-interface-specification.pdf
+
+What:		/sys/devices/faux/tdx_host/firmware/seamldr_upload
+Contact:	linux-coco@lists.linux.dev
+Description:	(Directory) The seamldr_upload directory implements the
+		fw_upload sysfs ABI, see
+		Documentation/ABI/testing/sysfs-class-firmware for the general
+		description of the attributes @data, @cancel, @error, @loading,
+		@remaining_size, and @status. This ABI facilitates "Compatible
+		TDX Module Updates". A compatible update is one that meets the
+		following criteria:
+
+		   Does not interrupt or interfere with any current TDX
+		   operation or TD VM.
+
+		   Does not invalidate any previously consumed Module metadata
+		   values outside of the TEE_TCB_SVN_2 field (updated Security
+		   Version Number) in TD Quotes.
+
+		   Does not require validation of new Module metadata fields. By
+		   implication, new Module features and capabilities are only
+		   available by installing the Module at reboot (BIOS or EFI
+		   helper loaded).
+
+		See tdx_host/compat_capable and
+		tdx_host/firmware/seamldr_upload/error for details.
+
+What:		/sys/devices/faux/tdx_host/compat_capable
+Contact:	linux-coco@lists.linux.dev
+Description:	(RO) When present this attribute returns "1" to indicate that
+		the current seamldr, kernel, and TDX Module combination can
+		detect when an update conforms with the "Compatible TDX Module
+		Updates" criteria in the tdx_host/firmware/seamldr_upload description.
+		When this attribute is missing it is indeterminate whether an
+		update will violate the criteria.
+
+What:		/sys/devices/faux/tdx_host/firmware/seamldr_upload/error
+Contact:	linux-coco@lists.linux.dev
+Description:	(RO) See Documentation/ABI/testing/sysfs-class-firmware for
+		baseline expectations for this file. Updates that fail
+		compatibility checks end with the "device-busy" error in the
+		<STATUS>:<ERROR> format of this attribute. When this is
+		signalled current TDs and the current TDX Module stay running.
+		Other failures may result in all TDs being lost and further
+		TDX operations becoming impossible. This occurs when
+		/sys/devices/faux/tdx_host/version becomes unreadable.
diff --git a/drivers/virt/coco/tdx-host/tdx-host.c b/drivers/virt/coco/tdx-host/tdx-host.c
index 06487de2ebfe..8cc48e276533 100644
--- a/drivers/virt/coco/tdx-host/tdx-host.c
+++ b/drivers/virt/coco/tdx-host/tdx-host.c
@@ -45,8 +45,21 @@ static ssize_t version_show(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RO(version);
 
+static ssize_t compat_capable_show(struct device *dev, struct device_attribute *attr,
+				   char *buf)
+{
+	const struct tdx_sys_info *tdx_sysinfo = tdx_get_sysinfo();
+
+	if (!tdx_sysinfo)
+		return -ENXIO;
+
+	return sysfs_emit(buf, "%i\n", tdx_supports_update_compatibility(tdx_sysinfo));
+}
+static DEVICE_ATTR_RO(compat_capable);
+
 static struct attribute *tdx_host_attrs[] = {
 	&dev_attr_version.attr,
+	&dev_attr_compat_capable.attr,
 	NULL,
 };
 
-- 
2.47.3
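
For illustration only, a minimal userspace sketch of how the compat_capable
attribute proposed in this version of the patch might be consumed. The helper
name is hypothetical. Mapping a missing or unreadable attribute to False is an
assumption of this sketch; the ABI text above only calls that case
indeterminate.

```python
from pathlib import Path

# Attribute path taken from the ABI text in this patch.
COMPAT_CAPABLE = Path("/sys/devices/faux/tdx_host/compat_capable")


def update_compat_detectable(attr: Path = COMPAT_CAPABLE) -> bool:
    """Return True only if the attribute exists and reads "1".

    A missing or unreadable attribute is "indeterminate" per the ABI text;
    this sketch conservatively reports that case as False (assumption).
    """
    try:
        return attr.read_text().strip() == "1"
    except OSError:
        return False
```

A caller could use the result to decide whether updates need to be scheduled
around TD management operations.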
Re: [PATCH v3 26/26] coco/tdx-host: Set and document TDX Module update expectations
Posted by dan.j.williams@intel.com 1 week, 4 days ago
Chao Gao wrote:
> In rare cases, TDX Module updates may cause TD management operations to
> fail if they occur during phases of the TD lifecycle that are sensitive
> to update compatibility.

No. The TDX Module wants to be able to claim that some updates are
compatible when they are not. If Linux takes on additional exclusions it
modestly increases the scope of changes that can be included in an
update. It is not possible to claim "rare" if module updates routinely
include that problematic scope.

> But not all combinations of P-SEAMLDR, kernel, and TDX Module have the
> capability to detect and prevent said incompatibilities. Completely
> disabling TDX Module updates on platforms without the capability would
> be overkill, as these incompatibility cases are rare and can be
> addressed by userspace through coordinated scheduling of updates and TD
> management operations.

"Completely disabling" is not the tradeoff. The tradeoff is whether the
TDX Module meets Linux compatible update requirements.

> To set clear expectations for TDX Module updates, expose the capability
> to detect and prevent these incompatibility cases via sysfs and
> document the compatibility criteria and indications when those criteria
> are violated.

Linux derives no benefit from a "compat_capable" kernel ABI. Yes, the
internals must export the error condition on collision. I am not
debating that nor revisiting the decision of pre-update-fail, vs
post-collision-notify. However, if the module violates the Linux
expectations that is the module's issue to document or preclude. The
fact that the compatibility contract is ambiguous to the kernel is a
feature. It puts the onus squarely on module updates to be documented
(or tools updated to understand) as meeting or violating Linux
compatibility expectations.

> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> v3:
>  - new, based on a reference patch from Dan Williams

One of the details that is missing is the protocol (module documentation
or tooling) to determine ahead of time if an update is compatible. That
obviates the need for "compat_capable" ABI which serves no long term
purpose. Specifically, the expectation is "run non-compatible updates at
your own operational risk".

So, remove "compat_capable" ABI. Amend the "error" ABI documentation
with the details for avoiding failures and the risk of running updates
on configurations that support update but not collision avoidance.

> ---
>  .../ABI/testing/sysfs-devices-faux-tdx-host   | 45 +++++++++++++++++++
>  drivers/virt/coco/tdx-host/tdx-host.c         | 13 ++++++
>  2 files changed, 58 insertions(+)
[..]
> 
> +What:		/sys/devices/faux/tdx_host/firmware/seamldr_upload/error
> +Contact:	linux-coco@lists.linux.dev
> +Description:	(RO) See Documentation/ABI/testing/sysfs-class-firmware for
> +		baseline expectations for this file. Updates that fail
> +		compatibility checks end with the "device-busy" error in the
> +		<STATUS>:<ERROR> format of this attribute. When this is
> +		signalled current TDs and the current TDX Module stay running.

This wants something like
---
See version_select_and_load.py [1] documentation for how to detect
compatible updates and whether the current platform components catch
errors or let them leak and cause potential TD attestation failures.

[1]: https://github.com/intel/confidential-computing.tdx.tdx-module.binaries/blob/main/version_select_and_load.py
---

...although I do not immediately see any help text or Documentation for
that tool.
Re: [PATCH v3 26/26] coco/tdx-host: Set and document TDX Module update expectations
Posted by Chao Gao 1 week, 3 days ago
On Mon, Jan 26, 2026 at 02:14:18PM -0800, dan.j.williams@intel.com wrote:
>Chao Gao wrote:
>> In rare cases, TDX Module updates may cause TD management operations to
>> fail if they occur during phases of the TD lifecycle that are sensitive
>> to update compatibility.
>
>No. The TDX Module wants to be able to claim that some updates are
>compatible when they are not. If Linux takes on additional exclusions it
>modestly increases the scope of changes that can be included in an
>update. It is not possible to claim "rare" if module updates routinely
>include that problematic scope.
>
>> But not all combinations of P-SEAMLDR, kernel, and TDX Module have the
>> capability to detect and prevent said incompatibilities. Completely
>> disabling TDX Module updates on platforms without the capability would
>> be overkill, as these incompatibility cases are rare and can be
>> addressed by userspace through coordinated scheduling of updates and TD
>> management operations.
>
>"Completely disabling" is not the tradeoff. The tradeoff is whether the
>TDX Module meets Linux compatible update requirements.
>
>> To set clear expectations for TDX Module updates, expose the capability
>> to detect and prevent these incompatibility cases via sysfs and
>> document the compatibility criteria and indications when those criteria
>> are violated.
>
>Linux derives no benefit from a "compat_capable" kernel ABI. Yes, the
>internals must export the error condition on collision. I am not
>debating that nor revisiting the decision of pre-update-fail, vs
>post-collision-notify. However, if the module violates the Linux
>expectations that is the module's issue to document or preclude. The
>fact that the compatibility contract is ambiguous to the kernel is a
>feature. It puts the onus squarely on module updates to be documented
>(or tools updated to understand) as meeting or violating Linux
>compatibility expectations.
>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> v3:
>>  - new, based on a reference patch from Dan Williams
>
>One of the details that is missing is the protocol (module documentation
>or tooling) to determine ahead of time if an update is compatible. That
>obviates the need for "compat_capable" ABI which serves no long term
>purpose. Specifically, the expectation is "run non-compatible updates at
>your own operational risk".

Agreed. We need to add metadata like crypto library version or equivalent
abstraction to the mapping file. This enables userspace to determine whether
module updates meet Linux compatibility requirements. I'll submit a request
for this metadata.

And actually, userspace can already determine if the TDX module supports
"collision avoidance" by reading the "tdx_features0" field from the mapping
file [1].

[1]: https://github.com/intel/confidential-computing.tdx.tdx-module.binaries/blob/main/mapping_file.json

>
>So, remove "compat_capable" ABI. Amend the "error" ABI documentation
>with the details for avoiding failures and the risk of running updates
>on configurations that support update but not collision avoidance.

Got it. I will modify this patch as follows:

diff --git a/Documentation/ABI/testing/sysfs-devices-faux-tdx-host b/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
index a3f155977016..0a68e68375fa 100644
--- a/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
+++ b/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
@@ -29,3 +29,57 @@ Description:	(RO) Report the number of remaining updates that can be performed.
		4.2 "SEAMLDR.INSTALL" for more information. The documentation is
		available at:
		https://cdrdv2-public.intel.com/739045/intel-tdx-seamldr-interface-specification.pdf
+
+What:		/sys/devices/faux/tdx_host/firmware/seamldr_upload
+Contact:	linux-coco@lists.linux.dev
+Description:	(Directory) The seamldr_upload directory implements the
+		fw_upload sysfs ABI, see
+		Documentation/ABI/testing/sysfs-class-firmware for the general
+		description of the attributes @data, @cancel, @error, @loading,
+		@remaining_size, and @status. This ABI facilitates "Compatible
+		TDX Module Updates". A compatible update is one that meets the
+		following criteria:
+
+		   Does not interrupt or interfere with any current TDX
+		   operation or TD VM.
+
+		   Does not invalidate any previously consumed Module metadata
+		   values outside of the TEE_TCB_SVN_2 field (updated Security
+		   Version Number) in TD Quotes.
+
+		   Does not require validation of new Module metadata fields. By
+		   implication, new Module features and capabilities are only
+		   available by installing the Module at reboot (BIOS or EFI
+		   helper loaded).
+
+		See tdx_host/firmware/seamldr_upload/error for more details.
+
+What:		/sys/devices/faux/tdx_host/firmware/seamldr_upload/error
+Contact:	linux-coco@lists.linux.dev
+Description:	(RO) See Documentation/ABI/testing/sysfs-class-firmware for
+		baseline expectations for this file. The <ERROR> part in the
+		<STATUS>:<ERROR> format can be:
+
+		   "device-busy": Compatibility checks failed or not all CPUs
+		                  are online.
+		   "flash-wearout": The number of updates reached the limit.
+		   "read-write-error": Memory allocation failed.
+		   "hw-error": Cannot communicate with P-SEAMLDR or TDX Module.
+		   "firmware-invalid": The TDX Module to be installed is invalid
+		                       or other unexpected errors occurred.
+
+		"hw-error" or "firmware-invalid" may be fatal, causing all TDs
+		and the TDX Module to be lost and preventing further TDX
+		operations. This occurs when /sys/devices/faux/tdx_host/version
+		becomes unreadable after update failures. For other errors, TDs
+		and the (previous) TDX Module stay running.
+
+		On certain earlier TDX Module versions, incompatible updates may
+		not trigger "device-busy" errors but instead cause TD
+		attestation failures.
+
+		See version_select_and_load.py [1] documentation for how to
+		detect compatible updates and whether the current platform
+		components catch errors or let them leak and cause potential TD
+		attestation failures.
+		[1]: https://github.com/intel/confidential-computing.tdx.tdx-module.binaries/blob/main/version_select_and_load.py
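
As a rough sketch of how userspace might consume the error attribute described
in the proposed text above: split the <STATUS>:<ERROR> string and flag the two
codes the text calls potentially fatal. The function and type names are
hypothetical, and treating an empty read as "no error recorded" is an
assumption about the generic fw_upload behaviour.

```python
from typing import NamedTuple, Optional

# Per the proposed ABI text, these two may leave the platform without a
# running TDX Module; for the other codes the TDs and the previous Module
# stay running.
POSSIBLY_FATAL = {"hw-error", "firmware-invalid"}


class UploadError(NamedTuple):
    status: str
    error: str
    possibly_fatal: bool


def parse_upload_error(raw: str) -> Optional[UploadError]:
    """Split a "<STATUS>:<ERROR>" string; return None for empty input."""
    text = raw.strip()
    if not text:
        return None  # assumption: empty means no error was recorded
    status, sep, error = text.partition(":")
    if not sep:
        raise ValueError(f"unexpected error format: {text!r}")
    return UploadError(status, error, error in POSSIBLY_FATAL)
```

A "device-busy" result would then tell the caller to retry later, while a
possibly fatal result would prompt a check of tdx_host/version.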
Re: [PATCH v3 26/26] coco/tdx-host: Set and document TDX Module update expectations
Posted by dan.j.williams@intel.com 1 week, 3 days ago
Chao Gao wrote:
[..]
> >So, remove "compat_capable" ABI. Amend the "error" ABI documentation
> >with the details for avoiding failures and the risk of running updates
> >on configurations that support update but not collision avoidance.
> 
> Got it. I will modify this patch as follows:

Overall, looks good to me. You can add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

...after a few additional fixups below:

> diff --git a/Documentation/ABI/testing/sysfs-devices-faux-tdx-host b/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
> index a3f155977016..0a68e68375fa 100644
> --- a/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
> +++ b/Documentation/ABI/testing/sysfs-devices-faux-tdx-host
> @@ -29,3 +29,57 @@ Description:	(RO) Report the number of remaining updates that can be performed.
> 		4.2 "SEAMLDR.INSTALL" for more information. The documentation is
> 		available at:
> 		https://cdrdv2-public.intel.com/739045/intel-tdx-seamldr-interface-specification.pdf
> +
> +What:		/sys/devices/faux/tdx_host/firmware/seamldr_upload
> +Contact:	linux-coco@lists.linux.dev
> +Description:	(Directory) The seamldr_upload directory implements the
> +		fw_upload sysfs ABI, see
> +		Documentation/ABI/testing/sysfs-class-firmware for the general
> +		description of the attributes @data, @cancel, @error, @loading,
> +		@remaining_size, and @status. This ABI facilitates "Compatible
> +		TDX Module Updates". A compatible update is one that meets the
> +		following criteria:
> +
> +		   Does not interrupt or interfere with any current TDX
> +		   operation or TD VM.
> +
> +		   Does not invalidate any previously consumed Module metadata
> +		   values outside of the TEE_TCB_SVN_2 field (updated Security
> +		   Version Number) in TD Quotes.
> +
> +		   Does not require validation of new Module metadata fields. By
> +		   implication, new Module features and capabilities are only
> +		   available by installing the Module at reboot (BIOS or EFI
> +		   helper loaded).
> +
> +		See tdx_host/firmware/seamldr_upload/error for more details.
> +
> +What:		/sys/devices/faux/tdx_host/firmware/seamldr_upload/error
> +Contact:	linux-coco@lists.linux.dev
> +Description:	(RO) See Documentation/ABI/testing/sysfs-class-firmware for
> +		baseline expectations for this file. The <ERROR> part in the
> +		<STATUS>:<ERROR> format can be:
> +
> +		   "device-busy": Compatibility checks failed or not all CPUs
> +		                  are online.
> +		   "flash-wearout": The number of updates reached the limit.
> +		   "read-write-error": Memory allocation failed.
> +		   "hw-error": Cannot communicate with P-SEAMLDR or TDX Module.
> +		   "firmware-invalid": The TDX Module to be installed is invalid
> +		                       or other unexpected errors occurred.
> +
> +		"hw-error" or "firmware-invalid" may be fatal, causing all TDs
> +		and the TDX Module to be lost and preventing further TDX
> +		operations. This occurs when /sys/devices/faux/tdx_host/version
> +		becomes unreadable after update failures.

I would specify the exact unambiguous errno value that gets returned on
read when the version becomes indeterminate, like ENXIO.

> +		and the (previous) TDX Module stay running.
> +
> +		On certain earlier TDX Module versions, incompatible updates may
> +		not trigger "device-busy" errors but instead cause TD
> +		attestation failures.

I would just leave this out. It bitrots quickly and does not provide
any actionable information. This is not the kernel's responsibility...

> +
> +		See version_select_and_load.py [1] documentation for how to
> +		detect compatible updates and whether the current platform
> +		components catch errors or let them leak and cause potential TD
> +		attestation failures.
> +		[1]: https://github.com/intel/confidential-computing.tdx.tdx-module.binaries/blob/main/version_select_and_load.py

...that detail about what happens when compat detection is missing
belongs in the tooling documentation. That documentation does not exist
yet, so this link needs to be replaced with a pointer to documentation
before this goes upstream. I am assuming that we want to create an
actual package that distributions can pick up as a project? It might be
worth going through the exercise of packaging the binaries and the tool
as an rpm or deb to get that work bootstrapped.
"version_select_and_load" probably wants a better name like "tdxctl" or
similar.

Note that a tdxctl project would also attract features related to TDX
Connect to wrap common flows around the tdx_host device sysfs ABIs.
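
To make the errno suggestion above concrete, here is a small sketch of the
check such a tool might perform after a failed update. It assumes that a read
of tdx_host/version fails with ENXIO once the Module is lost; that value is
only the example floated in this review, not a settled ABI, and the helper
name is hypothetical. The reader is passed in as a callable so the logic can
be exercised without the sysfs file.

```python
import errno
from pathlib import Path
from typing import Callable

# Attribute path taken from the ABI text under review.
VERSION = Path("/sys/devices/faux/tdx_host/version")


def module_lost(read_version: Callable[[], str] = VERSION.read_text) -> bool:
    """Return True if reading the version attribute fails with ENXIO.

    ENXIO is an assumption based on the review suggestion; any other
    OSError is re-raised rather than interpreted.
    """
    try:
        read_version()
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            return True
        raise
    return False
```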
Re: [PATCH v3 26/26] coco/tdx-host: Set and document TDX Module update expectations
Posted by Tony Lindgren 1 week, 4 days ago
On Fri, Jan 23, 2026 at 06:55:34AM -0800, Chao Gao wrote:
> In rare cases, TDX Module updates may cause TD management operations to
> fail if they occur during phases of the TD lifecycle that are sensitive
> to update compatibility.
> 
> But not all combinations of P-SEAMLDR, kernel, and TDX Module have the
> capability to detect and prevent said incompatibilities. Completely
> disabling TDX Module updates on platforms without the capability would
> be overkill, as these incompatibility cases are rare and can be
> addressed by userspace through coordinated scheduling of updates and TD
> management operations.
> 
> To set clear expectations for TDX Module updates, expose the capability
> to detect and prevent these incompatibility cases via sysfs and
> document the compatibility criteria and indications when those criteria
> are violated.

Reviewed-by: Tony Lindgren <tony.lindgren@linux.intel.com>