The PCIe 6.1 specification, section 11, introduces the Trusted Execution
Environment (TEE) Device Interface Security Protocol (TDISP). This
protocol definition builds upon Component Measurement and Authentication
(CMA), and link Integrity and Data Encryption (IDE). It adds support for
assigning devices (PCI physical or virtual function) to a confidential
VM such that the assigned device is enabled to access guest private
memory protected by technologies like Intel TDX, AMD SEV-SNP, RISCV
COVE, or ARM CCA.
The "TSM" (TEE Security Manager) is a concept in the TDISP specification
of an agent that mediates between a "DSM" (Device Security Manager) and
system software in both a VMM and a confidential VM. A VMM uses TSM ABIs
to setup link security and assign devices. A confidential VM uses TSM
ABIs to transition an assigned device into the TDISP "RUN" state and
validate its configuration. From a Linux perspective the TSM abstracts
many of the details of TDISP, IDE, and CMA. Some of those details leak
through at times, but for the most part TDISP is an internal
implementation detail of the TSM.
CONFIG_PCI_TSM adds an "authenticated" attribute and "tsm/" subdirectory
to pci-sysfs. Consider that the TSM driver may itself be a PCI driver.
Userspace can watch for the arrival of a "TSM" device,
/sys/class/tsm/tsm0/uevent KOBJ_CHANGE, to know when the PCI core has
initialized TSM services.
The operations that can be executed against a PCI device are split into
2 mutually exclusive operation sets, "Link" and "Security" (struct
pci_tsm_{link,security}_ops). The "Link" operations manage physical link
security properties and communication with the device's Device Security
Manager firmware. These are the host side operations in TDISP. The
"Security" operations coordinate the security state of the assigned
virtual device (TDI). These are the guest side operations in TDISP. Only
link management operations are defined at this stage and placeholders
provided for the security operations.
The locking allows for multiple devices to be executing commands
simultaneously, one outstanding command per-device and an rwsem
synchronizes the implementation relative to TSM
registration/unregistration events.
Thanks to Wu Hao for his work on an early draft of this support.
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Samuel Ortiz <sameo@rivosinc.com>
Cc: Alexey Kardashevskiy <aik@amd.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Documentation/ABI/testing/sysfs-bus-pci | 51 +++
Documentation/driver-api/pci/index.rst | 1 +
Documentation/driver-api/pci/tsm.rst | 12 +
MAINTAINERS | 4 +-
drivers/pci/Kconfig | 14 +
drivers/pci/Makefile | 1 +
drivers/pci/pci-sysfs.c | 4 +
drivers/pci/pci.h | 8 +
drivers/pci/remove.c | 3 +
drivers/pci/tsm.c | 554 ++++++++++++++++++++++++
drivers/virt/coco/tsm-core.c | 61 ++-
include/linux/pci-tsm.h | 158 +++++++
include/linux/pci.h | 3 +
include/linux/tsm.h | 8 +-
include/uapi/linux/pci_regs.h | 1 +
15 files changed, 879 insertions(+), 4 deletions(-)
create mode 100644 Documentation/driver-api/pci/tsm.rst
create mode 100644 drivers/pci/tsm.c
create mode 100644 include/linux/pci-tsm.h
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 69f952fffec7..99315fbfbe10 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -612,3 +612,54 @@ Description:
# ls doe_features
0001:01 0001:02 doe_discovery
+
+What: /sys/bus/pci/devices/.../tsm/
+Contact: linux-coco@lists.linux.dev
+Description:
+ This directory only appears if a physical device function
+ supports authentication (PCIe CMA-SPDM), interface security
+ (PCIe TDISP), and is accepted for secure operation by the
+ platform TSM driver. This attribute directory appears
+ dynamically after the platform TSM driver loads. So, only after
+ the /sys/class/tsm/tsm0 device arrives can tools assume that
+ devices without a tsm/ attribute directory will never have one,
+ before that, the security capabilities of the device relative to
+ the platform TSM are unknown. See
+ Documentation/ABI/testing/sysfs-class-tsm.
+
+What: /sys/bus/pci/devices/.../tsm/connect
+Contact: linux-coco@lists.linux.dev
+Description:
+ (RW) Write the name of a TSM (TEE Security Manager) device from
+ /sys/class/tsm to this file to establish a connection with the
+ device. This typically includes an SPDM (DMTF Security
+ Protocols and Data Models) session over PCIe DOE (Data Object
+ Exchange) and may also include PCIe IDE (Integrity and Data
+ Encryption) establishment. Reads from this attribute return the
+ name of the connected TSM or the empty string if not
+ connected. A TSM device signals its readiness to accept PCI
+ connection via a KOBJ_CHANGE event.
+
+What: /sys/bus/pci/devices/.../tsm/disconnect
+Contact: linux-coco@lists.linux.dev
+Description:
+ (WO) Write '1' or 'true' to this attribute to disconnect it from
+ a previous TSM connection.
+
+What: /sys/bus/pci/devices/.../authenticated
+Contact: linux-pci@vger.kernel.org
+Description:
+ When the device's tsm/ directory is present device
+ authentication (PCIe CMA-SPDM) and link encryption (PCIe IDE)
+ are handled by the platform TSM (TEE Security Manager). When the
+ tsm/ directory is not present this attribute reflects only the
+ native CMA-SPDM authentication state with the kernel's
+ certificate store.
+
+ If the attribute is not present, it indicates that
+ authentication is unsupported by the device, or the TSM has no
+ available authentication methods for the device.
+
+ When present and the tsm/ attribute directory is present, the
+ authenticated attribute is an alias for the device 'connect'
+ state. See the 'tsm/connect' attribute for more details.
diff --git a/Documentation/driver-api/pci/index.rst b/Documentation/driver-api/pci/index.rst
index a38e475cdbe3..9e1b801d0f74 100644
--- a/Documentation/driver-api/pci/index.rst
+++ b/Documentation/driver-api/pci/index.rst
@@ -10,6 +10,7 @@ The Linux PCI driver implementer's API guide
pci
p2pdma
+ tsm
.. only:: subproject and html
diff --git a/Documentation/driver-api/pci/tsm.rst b/Documentation/driver-api/pci/tsm.rst
new file mode 100644
index 000000000000..59b94d79a4f2
--- /dev/null
+++ b/Documentation/driver-api/pci/tsm.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. include:: <isonum.txt>
+
+========================================================
+PCI Trusted Execution Environment Security Manager (TSM)
+========================================================
+
+.. kernel-doc:: include/linux/pci-tsm.h
+ :internal:
+
+.. kernel-doc:: drivers/pci/tsm.c
+ :export:
diff --git a/MAINTAINERS b/MAINTAINERS
index cfa3fb8772d2..8cb7ee9270d2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -25247,8 +25247,10 @@ L: linux-coco@lists.linux.dev
S: Maintained
F: Documentation/ABI/testing/configfs-tsm-report
F: Documentation/driver-api/coco/
+F: Documentation/driver-api/pci/tsm.rst
+F: drivers/pci/tsm.c
F: drivers/virt/coco/guest/
-F: include/linux/tsm*.h
+F: include/linux/*tsm*.h
F: samples/tsm-mr/
TRUSTED SERVICES TEE DRIVER
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 4bd75d8b9b86..700addee8f62 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -136,6 +136,20 @@ config PCI_IDE_STREAM_MAX
platform capability for the foreseeable future is 4 to 8 streams. Bump
this value up if you have an expert testing need.
+config PCI_TSM
+ bool "PCI TSM: Device security protocol support"
+ select PCI_IDE
+ select PCI_DOE
+ help
+ The TEE (Trusted Execution Environment) Device Interface
+ Security Protocol (TDISP) defines a "TSM" as a platform agent
+ that manages device authentication, link encryption, link
+ integrity protection, and assignment of PCI device functions
+ (virtual or physical) to confidential computing VMs that can
+ access (DMA) guest private memory.
+
+ Enable a platform TSM driver to use this capability.
+
config PCI_DOE
bool "Enable PCI Data Object Exchange (DOE) support"
help
diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 6612256fd37d..2c545f877062 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_XEN_PCIDEV_FRONTEND) += xen-pcifront.o
obj-$(CONFIG_VGA_ARB) += vgaarb.o
obj-$(CONFIG_PCI_DOE) += doe.o
obj-$(CONFIG_PCI_IDE) += ide.o
+obj-$(CONFIG_PCI_TSM) += tsm.o
obj-$(CONFIG_PCI_DYNAMIC_OF_NODES) += of_property.o
obj-$(CONFIG_PCI_NPEM) += npem.o
obj-$(CONFIG_PCIE_TPH) += tph.o
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 268c69daa4d5..23cbf6c8796a 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1815,6 +1815,10 @@ const struct attribute_group *pci_dev_attr_groups[] = {
#endif
#ifdef CONFIG_PCI_DOE
&pci_doe_sysfs_group,
+#endif
+#ifdef CONFIG_PCI_TSM
+ &pci_tsm_auth_attr_group,
+ &pci_tsm_pf0_attr_group,
#endif
NULL,
};
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1c223c79634f..3b282c24dde8 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -521,6 +521,14 @@ void pci_ide_init(struct pci_dev *dev);
static inline void pci_ide_init(struct pci_dev *dev) { }
#endif
+#ifdef CONFIG_PCI_TSM
+void pci_tsm_destroy(struct pci_dev *pdev);
+extern const struct attribute_group pci_tsm_pf0_attr_group;
+extern const struct attribute_group pci_tsm_auth_attr_group;
+#else
+static inline void pci_tsm_destroy(struct pci_dev *pdev) { }
+#endif
+
/**
* pci_dev_set_io_state - Set the new error state if possible.
*
diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c
index 445afdfa6498..21851c13becd 100644
--- a/drivers/pci/remove.c
+++ b/drivers/pci/remove.c
@@ -55,6 +55,9 @@ static void pci_destroy_dev(struct pci_dev *dev)
pci_doe_sysfs_teardown(dev);
pci_npem_remove(dev);
+ /* before device_del() to keep config cycle access */
+ pci_tsm_destroy(dev);
+
device_del(&dev->dev);
down_write(&pci_bus_sem);
diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c
new file mode 100644
index 000000000000..0784cc436dd3
--- /dev/null
+++ b/drivers/pci/tsm.c
@@ -0,0 +1,554 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * TEE Security Manager for the TEE Device Interface Security Protocol
+ * (TDISP, PCIe r6.1 sec 11)
+ *
+ * Copyright(c) 2024 Intel Corporation. All rights reserved.
+ */
+
+#define dev_fmt(fmt) "TSM: " fmt
+
+#include <linux/bitfield.h>
+#include <linux/xarray.h>
+#include <linux/sysfs.h>
+
+#include <linux/tsm.h>
+#include <linux/pci.h>
+#include <linux/pci-doe.h>
+#include <linux/pci-tsm.h>
+#include "pci.h"
+
+/*
+ * Provide a read/write lock against the init / exit of pdev tsm
+ * capabilities and arrival/departure of a tsm instance
+ */
+static DECLARE_RWSEM(pci_tsm_rwsem);
+static int pci_tsm_count;
+
+static inline bool is_dsm(struct pci_dev *pdev)
+{
+ return pdev->tsm && pdev->tsm->dsm == pdev;
+}
+
+static struct pci_tsm_pf0 *to_pci_tsm_pf0(struct pci_tsm *pci_tsm)
+{
+ struct pci_dev *pdev = pci_tsm->pdev;
+
+ if (!is_pci_tsm_pf0(pdev) || !is_dsm(pdev)) {
+ dev_WARN_ONCE(&pdev->dev, 1, "invalid context object\n");
+ return NULL;
+ }
+
+ return container_of(pci_tsm, struct pci_tsm_pf0, tsm);
+}
+
+static void tsm_remove(struct pci_tsm *tsm)
+{
+ struct pci_dev *pdev;
+
+ if (!tsm)
+ return;
+
+ pdev = tsm->pdev;
+ tsm->ops->remove(tsm);
+ pdev->tsm = NULL;
+}
+DEFINE_FREE(tsm_remove, struct pci_tsm *, if (_T) tsm_remove(_T))
+
+static int call_cb_put(struct pci_dev *pdev, void *data,
+ int (*cb)(struct pci_dev *pdev, void *data))
+{
+ int rc;
+
+ if (!pdev)
+ return 0;
+ rc = cb(pdev, data);
+ pci_dev_put(pdev);
+ return rc;
+}
+
+static void pci_tsm_walk_fns(struct pci_dev *pdev,
+ int (*cb)(struct pci_dev *pdev, void *data),
+ void *data)
+{
+ struct pci_dev *fn;
+ int i;
+
+ /* walk virtual functions */
+ for (i = 0; i < pci_num_vf(pdev); i++) {
+ fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus),
+ pci_iov_virtfn_bus(pdev, i),
+ pci_iov_virtfn_devfn(pdev, i));
+ if (call_cb_put(fn, data, cb))
+ return;
+ }
+
+ /* walk subordinate physical functions */
+ for (i = 1; i < 8; i++) {
+ fn = pci_get_slot(pdev->bus,
+ PCI_DEVFN(PCI_SLOT(pdev->devfn), i));
+ if (call_cb_put(fn, data, cb))
+ return;
+ }
+
+ /* walk downstream devices */
+ if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM)
+ return;
+
+ if (!is_dsm(pdev))
+ return;
+
+ pci_walk_bus(pdev->subordinate, cb, data);
+}
+
+static void pci_tsm_walk_fns_reverse(struct pci_dev *pdev,
+ int (*cb)(struct pci_dev *pdev,
+ void *data),
+ void *data)
+{
+ struct pci_dev *fn;
+ int i;
+
+ /* reverse walk virtual functions */
+ for (i = pci_num_vf(pdev) - 1; i >= 0; i--) {
+ fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus),
+ pci_iov_virtfn_bus(pdev, i),
+ pci_iov_virtfn_devfn(pdev, i));
+ if (call_cb_put(fn, data, cb))
+ return;
+ }
+
+ /* reverse walk subordinate physical functions */
+ for (i = 7; i >= 1; i--) {
+ fn = pci_get_slot(pdev->bus,
+ PCI_DEVFN(PCI_SLOT(pdev->devfn), i));
+ if (call_cb_put(fn, data, cb))
+ return;
+ }
+
+ /* reverse walk downstream devices */
+ if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM)
+ return;
+
+ if (!is_dsm(pdev))
+ return;
+
+ pci_walk_bus_reverse(pdev->subordinate, cb, data);
+}
+
+static int probe_fn(struct pci_dev *pdev, void *dsm)
+{
+ struct pci_dev *dsm_dev = dsm;
+ const struct pci_tsm_ops *ops = dsm_dev->tsm->ops;
+
+ pdev->tsm = ops->probe(pdev);
+ pci_dbg(pdev, "setup tsm context: dsm: %s status: %s\n",
+ pci_name(dsm_dev), pdev->tsm ? "success" : "failed");
+ return 0;
+}
+
+static void pci_tsm_probe_fns(struct pci_dev *dsm)
+{
+ pci_tsm_walk_fns(dsm, probe_fn, dsm);
+}
+
+static int pci_tsm_connect(struct pci_dev *pdev, struct tsm_dev *tsm_dev)
+{
+ int rc;
+ struct pci_tsm_pf0 *tsm_pf0;
+ const struct pci_tsm_ops *ops = tsm_pci_ops(tsm_dev);
+ struct pci_tsm *pci_tsm __free(tsm_remove) = ops->probe(pdev);
+
+ if (!pci_tsm)
+ return -ENXIO;
+
+ pdev->tsm = pci_tsm;
+ tsm_pf0 = to_pci_tsm_pf0(pdev->tsm);
+
+ ACQUIRE(mutex_intr, lock)(&tsm_pf0->lock);
+ if ((rc = ACQUIRE_ERR(mutex_intr, &lock)))
+ return rc;
+
+ rc = ops->connect(pdev);
+ if (rc)
+ return rc;
+
+ pdev->tsm = no_free_ptr(pci_tsm);
+
+ /*
+ * Now that the DSM is established, probe() all the potential
+ * dependent functions. Failure to probe a function is not fatal
+ * to connect(), it just disables subsequent security operations
+ * for that function.
+ */
+ pci_tsm_probe_fns(pdev);
+ return 0;
+}
+
+static ssize_t connect_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ int rc;
+
+ ACQUIRE(rwsem_read_intr, lock)(&pci_tsm_rwsem);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &lock)))
+ return rc;
+
+ if (!pdev->tsm)
+ return sysfs_emit(buf, "\n");
+
+ return sysfs_emit(buf, "%s\n", tsm_name(pdev->tsm->ops->owner));
+}
+
+static ssize_t connect_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ const struct pci_tsm_ops *ops;
+ struct tsm_dev *tsm_dev;
+ int rc, id;
+
+ rc = sscanf(buf, "tsm%d\n", &id);
+ if (rc != 1)
+ return -EINVAL;
+
+ ACQUIRE(rwsem_read_intr, lock)(&pci_tsm_rwsem);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &lock)))
+ return rc;
+
+ if (pdev->tsm)
+ return -EBUSY;
+
+ tsm_dev = find_tsm_dev(id);
+ if (!tsm_dev)
+ return -ENXIO;
+
+ ops = tsm_pci_ops(tsm_dev);
+ if (!ops || !ops->connect || !ops->probe)
+ return -ENXIO;
+
+ rc = pci_tsm_connect(pdev, tsm_dev);
+ if (rc)
+ return rc;
+ return len;
+}
+static DEVICE_ATTR_RW(connect);
+
+static int remove_fn(struct pci_dev *pdev, void *data)
+{
+ tsm_remove(pdev->tsm);
+ return 0;
+}
+
+static void pci_tsm_remove_fns(struct pci_dev *dsm)
+{
+ pci_tsm_walk_fns_reverse(dsm, remove_fn, NULL);
+}
+
+static void __pci_tsm_disconnect(struct pci_dev *pdev)
+{
+ struct pci_tsm_pf0 *tsm_pf0 = to_pci_tsm_pf0(pdev->tsm);
+ const struct pci_tsm_ops *ops = pdev->tsm->ops;
+
+ /* disconnect is not interruptible */
+ guard(mutex)(&tsm_pf0->lock);
+ pci_tsm_remove_fns(pdev);
+ ops->disconnect(pdev);
+}
+
+static void pci_tsm_disconnect(struct pci_dev *pdev)
+{
+ __pci_tsm_disconnect(pdev);
+ tsm_remove(pdev->tsm);
+}
+
+static ssize_t disconnect_store(struct device *dev,
+ struct device_attribute *attr, const char *buf,
+ size_t len)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ bool disconnect;
+ int rc;
+
+ rc = kstrtobool(buf, &disconnect);
+ if (rc)
+ return rc;
+ if (!disconnect)
+ return -EINVAL;
+
+ ACQUIRE(rwsem_read_intr, lock)(&pci_tsm_rwsem);
+ if ((rc = ACQUIRE_ERR(rwsem_read_intr, &lock)))
+ return rc;
+
+ if (!pdev->tsm)
+ return -ENXIO;
+
+ pci_tsm_disconnect(pdev);
+ return len;
+}
+static DEVICE_ATTR_WO(disconnect);
+
+static bool pci_tsm_pf0_group_visible(struct kobject *kobj)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct pci_dev *pdev = to_pci_dev(dev);
+
+ return pci_tsm_count && is_pci_tsm_pf0(pdev);
+}
+DEFINE_SIMPLE_SYSFS_GROUP_VISIBLE(pci_tsm_pf0);
+
+static struct attribute *pci_tsm_pf0_attrs[] = {
+ &dev_attr_connect.attr,
+ &dev_attr_disconnect.attr,
+ NULL
+};
+
+const struct attribute_group pci_tsm_pf0_attr_group = {
+ .name = "tsm",
+ .attrs = pci_tsm_pf0_attrs,
+ .is_visible = SYSFS_GROUP_VISIBLE(pci_tsm_pf0),
+};
+
+static ssize_t authenticated_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ /*
+ * When device authentication is TSM owned, 'authenticated' is
+ * identical to the connect state.
+ */
+ return connect_show(dev, attr, buf);
+}
+static DEVICE_ATTR_RO(authenticated);
+
+static struct attribute *pci_tsm_auth_attrs[] = {
+ &dev_attr_authenticated.attr,
+ NULL
+};
+
+const struct attribute_group pci_tsm_auth_attr_group = {
+ .attrs = pci_tsm_auth_attrs,
+ .is_visible = SYSFS_GROUP_VISIBLE(pci_tsm_pf0),
+};
+
+/*
+ * Retrieve physical function0 device whether it has TEE capability or not
+ */
+static struct pci_dev *pf0_dev_get(struct pci_dev *pdev)
+{
+ struct pci_dev *pf_dev = pci_physfn(pdev);
+
+ if (PCI_FUNC(pf_dev->devfn) == 0)
+ return pci_dev_get(pf_dev);
+
+ return pci_get_slot(pf_dev->bus,
+ pf_dev->devfn - PCI_FUNC(pf_dev->devfn));
+}
+
+/*
+ * Find the PCI Device instance that serves as the Device Security
+ * Manger (DSM) for @pdev. Note that no additional reference is held for
+ * the resulting device because @pdev always has a longer registered
+ * lifetime than its DSM by virtue of being a child of or identical to
+ * its DSM.
+ */
+static struct pci_dev *find_dsm_dev(struct pci_dev *pdev)
+{
+ struct pci_dev *uport_pf0;
+
+ if (is_pci_tsm_pf0(pdev))
+ return pdev;
+
+ struct pci_dev *pf0 __free(pci_dev_put) = pf0_dev_get(pdev);
+ if (!pf0)
+ return NULL;
+
+ if (is_dsm(pf0))
+ return pf0;
+
+ /*
+ * For cases where a switch may be hosting TDISP services on
+ * behalf of downstream devices, check the first usptream port
+ * relative to this endpoint.
+ */
+ if (!pdev->dev.parent || !pdev->dev.parent->parent)
+ return NULL;
+
+ uport_pf0 = to_pci_dev(pdev->dev.parent->parent);
+ if (is_dsm(uport_pf0))
+ return uport_pf0;
+ return NULL;
+}
+
+/**
+ * pci_tsm_constructor() - base 'struct pci_tsm' initialization
+ * @pdev: The PCI device
+ * @tsm: context to initialize
+ * @ops: PCI operations provided by the TSM
+ */
+int pci_tsm_constructor(struct pci_dev *pdev, struct pci_tsm *tsm,
+ const struct pci_tsm_ops *ops)
+{
+ tsm->pdev = pdev;
+ tsm->ops = ops;
+ tsm->dsm = find_dsm_dev(pdev);
+ if (!tsm->dsm) {
+ pci_warn(pdev, "failed to find Device Security Manager\n");
+ return -ENXIO;
+ }
+ return 0;
+}
+EXPORT_SYMBOL_GPL(pci_tsm_constructor);
+
+/**
+ * pci_tsm_pf0_constructor() - common 'struct pci_tsm_pf0' initialization
+ * @pdev: Physical Function 0 PCI device (as indicated by is_pci_tsm_pf0())
+ * @tsm: context to initialize
+ */
+int pci_tsm_pf0_constructor(struct pci_dev *pdev, struct pci_tsm_pf0 *tsm,
+ const struct pci_tsm_ops *ops)
+{
+ struct tsm_dev *tsm_dev = ops->owner;
+
+ mutex_init(&tsm->lock);
+ tsm->doe_mb = pci_find_doe_mailbox(pdev, PCI_VENDOR_ID_PCI_SIG,
+ PCI_DOE_PROTO_CMA);
+ if (!tsm->doe_mb) {
+ pci_warn(pdev, "TSM init failure, no CMA mailbox\n");
+ return -ENODEV;
+ }
+
+ if (tsm_pci_group(tsm_dev))
+ sysfs_merge_group(&pdev->dev.kobj, tsm_pci_group(tsm_dev));
+
+ return pci_tsm_constructor(pdev, &tsm->tsm, ops);
+}
+EXPORT_SYMBOL_GPL(pci_tsm_pf0_constructor);
+
+void pci_tsm_pf0_destructor(struct pci_tsm_pf0 *pf0_tsm)
+{
+ struct pci_tsm *tsm = &pf0_tsm->tsm;
+ struct pci_dev *pdev = tsm->pdev;
+ struct tsm_dev *tsm_dev = tsm->ops->owner;
+
+ if (tsm_pci_group(tsm_dev))
+ sysfs_unmerge_group(&pdev->dev.kobj, tsm_pci_group(tsm_dev));
+ mutex_destroy(&pf0_tsm->lock);
+}
+EXPORT_SYMBOL_GPL(pci_tsm_pf0_destructor);
+
+static void pf0_sysfs_enable(struct pci_dev *pdev)
+{
+ pci_dbg(pdev, "Device Security Manager detected (%s%s )\n",
+ pdev->ide_cap ? " ide" : "",
+ pdev->devcap & PCI_EXP_DEVCAP_TEE ? " tee" : "");
+
+ sysfs_update_group(&pdev->dev.kobj, &pci_tsm_auth_attr_group);
+ sysfs_update_group(&pdev->dev.kobj, &pci_tsm_pf0_attr_group);
+}
+
+int pci_tsm_register(struct tsm_dev *tsm_dev)
+{
+ const struct pci_tsm_ops *ops;
+ struct pci_dev *pdev = NULL;
+
+ if (!tsm_dev)
+ return -EINVAL;
+
+ /*
+ * The TSM device must have pci_ops, and only implement one of link_ops
+ * or sec_ops.
+ */
+ ops = tsm_pci_ops(tsm_dev);
+ if (!ops)
+ return -EINVAL;
+
+ if (!ops->probe && !ops->sec_probe)
+ return -EINVAL;
+
+ if (ops->probe && ops->sec_probe)
+ return -EINVAL;
+
+ guard(rwsem_write)(&pci_tsm_rwsem);
+
+ pci_tsm_count++;
+
+ /* PCI/TSM sysfs already enabled? */
+ if (pci_tsm_count > 1)
+ return 0;
+
+ for_each_pci_dev(pdev)
+ if (is_pci_tsm_pf0(pdev))
+ pf0_sysfs_enable(pdev);
+ return 0;
+}
+EXPORT_SYMBOL_GPL(pci_tsm_register);
+
+/**
+ * __pci_tsm_destroy() - destroy the TSM context for @pdev
+ * @pdev: device to cleanup
+ * @tsm_dev: TSM context if a TSM device is being removed, NULL if
+ * @pdev is being removed.
+ *
+ * At device removal or TSM unregistration all established context
+ * with the TSM is torn down. Additionally, if there are no more TSMs
+ * registered, the PCI tsm/ sysfs attributes are hidden.
+ */
+static void __pci_tsm_destroy(struct pci_dev *pdev, struct tsm_dev *tsm_dev)
+{
+ struct pci_tsm *tsm = pdev->tsm;
+
+ lockdep_assert_held_write(&pci_tsm_rwsem);
+
+ if (tsm_dev && is_pci_tsm_pf0(pdev) && !pci_tsm_count) {
+ sysfs_update_group(&pdev->dev.kobj, &pci_tsm_auth_attr_group);
+ sysfs_update_group(&pdev->dev.kobj, &pci_tsm_pf0_attr_group);
+ }
+
+ if (!tsm)
+ return;
+
+ if (!tsm_dev)
+ tsm_dev = tsm->ops->owner;
+ else if (tsm_dev != tsm->ops->owner)
+ return;
+
+ if (is_pci_tsm_pf0(pdev))
+ pci_tsm_disconnect(pdev);
+ else
+ tsm_remove(pdev->tsm);
+}
+
+void pci_tsm_destroy(struct pci_dev *pdev)
+{
+ guard(rwsem_write)(&pci_tsm_rwsem);
+ __pci_tsm_destroy(pdev, NULL);
+}
+
+void pci_tsm_unregister(struct tsm_dev *tsm_dev)
+{
+ struct pci_dev *pdev = NULL;
+
+ guard(rwsem_write)(&pci_tsm_rwsem);
+ pci_tsm_count--;
+ for_each_pci_dev_reverse(pdev)
+ __pci_tsm_destroy(pdev, tsm_dev);
+}
+
+int pci_tsm_doe_transfer(struct pci_dev *pdev, enum pci_doe_proto type,
+ const void *req, size_t req_sz, void *resp,
+ size_t resp_sz)
+{
+ struct pci_tsm_pf0 *tsm;
+
+ if (!pdev->tsm || !is_pci_tsm_pf0(pdev))
+ return -ENXIO;
+
+ tsm = to_pci_tsm_pf0(pdev->tsm);
+ if (!tsm->doe_mb)
+ return -ENXIO;
+
+ return pci_doe(tsm->doe_mb, PCI_VENDOR_ID_PCI_SIG, type, req, req_sz,
+ resp, resp_sz);
+}
+EXPORT_SYMBOL_GPL(pci_tsm_doe_transfer);
diff --git a/drivers/virt/coco/tsm-core.c b/drivers/virt/coco/tsm-core.c
index 1f53b9049e2d..093824dc68dd 100644
--- a/drivers/virt/coco/tsm-core.c
+++ b/drivers/virt/coco/tsm-core.c
@@ -9,6 +9,7 @@
#include <linux/device.h>
#include <linux/module.h>
#include <linux/cleanup.h>
+#include <linux/pci-tsm.h>
static struct class *tsm_class;
static DECLARE_RWSEM(tsm_rwsem);
@@ -17,8 +18,39 @@ static DEFINE_IDR(tsm_idr);
struct tsm_dev {
struct device dev;
int id;
+ const struct pci_tsm_ops *pci_ops;
+ const struct attribute_group *group;
};
+const char *tsm_name(const struct tsm_dev *tsm_dev)
+{
+ return dev_name(&tsm_dev->dev);
+}
+
+/*
+ * Caller responsible for ensuring it does not race tsm_dev
+ * unregistration.
+ */
+struct tsm_dev *find_tsm_dev(int id)
+{
+ guard(rcu)();
+ return idr_find(&tsm_idr, id);
+}
+
+const struct pci_tsm_ops *tsm_pci_ops(const struct tsm_dev *tsm_dev)
+{
+ if (!tsm_dev)
+ return NULL;
+ return tsm_dev->pci_ops;
+}
+
+const struct attribute_group *tsm_pci_group(const struct tsm_dev *tsm_dev)
+{
+ if (!tsm_dev)
+ return NULL;
+ return tsm_dev->group;
+}
+
static struct tsm_dev *alloc_tsm_dev(struct device *parent,
const struct attribute_group **groups)
{
@@ -44,6 +76,29 @@ static struct tsm_dev *alloc_tsm_dev(struct device *parent,
return no_free_ptr(tsm_dev);
}
+static struct tsm_dev *tsm_register_pci_or_reset(struct tsm_dev *tsm_dev,
+ struct pci_tsm_ops *pci_ops)
+{
+ int rc;
+
+ if (!pci_ops)
+ return tsm_dev;
+
+ pci_ops->owner = tsm_dev;
+ tsm_dev->pci_ops = pci_ops;
+ rc = pci_tsm_register(tsm_dev);
+ if (rc) {
+ dev_err(tsm_dev->dev.parent,
+ "PCI/TSM registration failure: %d\n", rc);
+ device_unregister(&tsm_dev->dev);
+ return ERR_PTR(rc);
+ }
+
+ /* Notify TSM userspace that PCI/TSM operations are now possible */
+ kobject_uevent(&tsm_dev->dev.kobj, KOBJ_CHANGE);
+ return tsm_dev;
+}
+
static void put_tsm_dev(struct tsm_dev *tsm_dev)
{
if (!IS_ERR_OR_NULL(tsm_dev))
@@ -54,7 +109,8 @@ DEFINE_FREE(put_tsm_dev, struct tsm_dev *,
if (!IS_ERR_OR_NULL(_T)) put_tsm_dev(_T))
struct tsm_dev *tsm_register(struct device *parent,
- const struct attribute_group **groups)
+ const struct attribute_group **groups,
+ struct pci_tsm_ops *pci_ops)
{
struct tsm_dev *tsm_dev __free(put_tsm_dev) =
alloc_tsm_dev(parent, groups);
@@ -73,12 +129,13 @@ struct tsm_dev *tsm_register(struct device *parent,
if (rc)
return ERR_PTR(rc);
- return no_free_ptr(tsm_dev);
+ return tsm_register_pci_or_reset(no_free_ptr(tsm_dev), pci_ops);
}
EXPORT_SYMBOL_GPL(tsm_register);
void tsm_unregister(struct tsm_dev *tsm_dev)
{
+ pci_tsm_unregister(tsm_dev);
device_unregister(&tsm_dev->dev);
}
EXPORT_SYMBOL_GPL(tsm_unregister);
diff --git a/include/linux/pci-tsm.h b/include/linux/pci-tsm.h
new file mode 100644
index 000000000000..f370c022fac4
--- /dev/null
+++ b/include/linux/pci-tsm.h
@@ -0,0 +1,158 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PCI_TSM_H
+#define __PCI_TSM_H
+#include <linux/mutex.h>
+#include <linux/pci.h>
+
+struct pci_tsm;
+
+/*
+ * struct pci_tsm_ops - manage confidential links and security state
+ * @link_ops: Coordinate PCIe SPDM and IDE establishment via a platform TSM.
+ * Provide a secure session transport for TDISP state management
+ * (typically bare metal physical function operations).
+ * @sec_ops: Lock, unlock, and interrogate the security state of the
+ * function via the platform TSM (typically virtual function
+ * operations).
+ * @owner: Back reference to the TSM device that owns this instance.
+ *
+ * This operations are mutually exclusive either a tsm_dev instance
+ * manages phyiscal link properties or it manages function security
+ * states like TDISP lock/unlock.
+ */
+struct pci_tsm_ops {
+ /*
+ * struct pci_tsm_link_ops - Manage physical link and the TSM/DSM session
+ * @probe: probe device for tsm link operation readiness, setup
+ * DSM context
+ * @remove: destroy DSM context
+ * @connect: establish / validate a secure connection (e.g. IDE)
+ * with the device
+ * @disconnect: teardown the secure link
+ *
+ * @probe and @remove run in pci_tsm_rwsem held for write context. All
+ * other ops run under the @pdev->tsm->lock mutex and pci_tsm_rwsem held
+ * for read.
+ */
+ struct_group_tagged(pci_tsm_link_ops, link_ops,
+ struct pci_tsm *(*probe)(struct pci_dev *pdev);
+ void (*remove)(struct pci_tsm *tsm);
+ int (*connect)(struct pci_dev *pdev);
+ void (*disconnect)(struct pci_dev *pdev);
+ );
+
+ /*
+ * struct pci_tsm_security_ops - Manage the security state of the function
+ * @sec_probe: probe device for tsm security operation
+ * readiness, setup security context
+ * @sec_remove: destroy security context
+ *
+ * @sec_probe and @sec_remove run in pci_tsm_rwsem held for
+ * write context. All other ops run under the @pdev->tsm->lock
+ * mutex and pci_tsm_rwsem held for read.
+ */
+ struct_group_tagged(pci_tsm_security_ops, ops,
+ struct pci_tsm *(*sec_probe)(struct pci_dev *pdev);
+ void (*sec_remove)(struct pci_tsm *tsm);
+ );
+ struct tsm_dev *owner;
+};
+
+/**
+ * struct pci_tsm - Core TSM context for a given PCIe endpoint
+ * @pdev: Back ref to device function, distinguishes type of pci_tsm context
+ * @dsm: PCI Device Security Manager for link operations on @pdev.
+ * @ops: Link Confidentiality or Device Function Security operations
+ *
+ * This structure is wrapped by low level TSM driver data and returned
+ * by probe()/sec_probe(), it is freed by the corresponding
+ * remove()/sec_remove().
+ *
+ * For link operations it serves to cache the association between a
+ * Device Security Manager (DSM) and the functions that manager can
+ * assign to a TVM. That can be "self", for assigning function0 of a
+ * TEE I/O device, a sub-function (SR-IOV virtual function, or
+ * non-function0 multifunction-device), or a downstream endpoint (PCIe
+ * upstream switch-port as DSM).
+ */
+struct pci_tsm {
+ struct pci_dev *pdev;
+ struct pci_dev *dsm;
+ const struct pci_tsm_ops *ops;
+};
+
+/**
+ * struct pci_tsm_pf0 - Physical Function 0 TDISP link context
+ * @tsm: generic core "tsm" context
+ * @lock: protect @state vs pci_tsm_ops invocation
+ * @doe_mb: PCIe Data Object Exchange mailbox
+ */
+struct pci_tsm_pf0 {
+ struct pci_tsm tsm;
+ struct mutex lock;
+ struct pci_doe_mb *doe_mb;
+};
+
+/* physical function0 and capable of 'connect' */
+static inline bool is_pci_tsm_pf0(struct pci_dev *pdev)
+{
+ if (!pci_is_pcie(pdev))
+ return false;
+
+ if (pdev->is_virtfn)
+ return false;
+
+ /*
+ * Allow for a Device Security Manager (DSM) associated with function0
+ * of an Endpoint to coordinate TDISP requests for other functions
+ * (physical or virtual) of the device, or allow for an Upstream Port
+ * DSM to accept TDISP requests for switch Downstream Endpoints.
+ */
+ switch (pci_pcie_type(pdev)) {
+ case PCI_EXP_TYPE_ENDPOINT:
+ case PCI_EXP_TYPE_UPSTREAM:
+ case PCI_EXP_TYPE_RC_END:
+ if (pdev->ide_cap || (pdev->devcap & PCI_EXP_DEVCAP_TEE))
+ break;
+ fallthrough;
+ default:
+ return false;
+ }
+
+ return PCI_FUNC(pdev->devfn) == 0;
+}
+
+enum pci_doe_proto {
+ PCI_DOE_PROTO_CMA = 1,
+ PCI_DOE_PROTO_SSESSION = 2,
+};
+
+#ifdef CONFIG_PCI_TSM
+struct tsm_dev;
+int pci_tsm_register(struct tsm_dev *tsm_dev);
+void pci_tsm_unregister(struct tsm_dev *tsm_dev);
+int pci_tsm_doe_transfer(struct pci_dev *pdev, enum pci_doe_proto type,
+ const void *req, size_t req_sz, void *resp,
+ size_t resp_sz);
+int pci_tsm_constructor(struct pci_dev *pdev, struct pci_tsm *tsm,
+ const struct pci_tsm_ops *ops);
+int pci_tsm_pf0_constructor(struct pci_dev *pdev, struct pci_tsm_pf0 *tsm,
+ const struct pci_tsm_ops *ops);
+void pci_tsm_pf0_destructor(struct pci_tsm_pf0 *tsm);
+#else
+static inline int pci_tsm_register(struct tsm_dev *tsm_dev)
+{
+ return 0;
+}
+static inline void pci_tsm_unregister(struct tsm_dev *tsm_dev)
+{
+}
+static inline int pci_tsm_doe_transfer(struct pci_dev *pdev,
+ enum pci_doe_proto type, const void *req,
+ size_t req_sz, void *resp,
+ size_t resp_sz)
+{
+ return -ENOENT;
+}
+#endif
+#endif /*__PCI_TSM_H */
diff --git a/include/linux/pci.h b/include/linux/pci.h
index b8bca0711967..0e5703fad0f6 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -539,6 +539,9 @@ struct pci_dev {
u8 nr_link_ide; /* Link Stream count (Selective Stream offset) */
unsigned int ide_cfg:1; /* Config cycles over IDE */
unsigned int ide_tee_limit:1; /* Disallow T=0 traffic over IDE */
+#endif
+#ifdef CONFIG_PCI_TSM
+ struct pci_tsm *tsm; /* TSM operation state */
#endif
u16 acs_cap; /* ACS Capability offset */
u8 supported_speeds; /* Supported Link Speeds Vector */
diff --git a/include/linux/tsm.h b/include/linux/tsm.h
index a90b40b1b13c..ce95589a5d5b 100644
--- a/include/linux/tsm.h
+++ b/include/linux/tsm.h
@@ -111,7 +111,13 @@ struct tsm_report_ops {
int tsm_report_register(const struct tsm_report_ops *ops, void *priv);
int tsm_report_unregister(const struct tsm_report_ops *ops);
struct tsm_dev;
+struct pci_tsm_ops;
struct tsm_dev *tsm_register(struct device *parent,
- const struct attribute_group **groups);
+ const struct attribute_group **groups,
+ struct pci_tsm_ops *ops);
void tsm_unregister(struct tsm_dev *tsm_dev);
+const char *tsm_name(const struct tsm_dev *tsm_dev);
+struct tsm_dev *find_tsm_dev(int id);
+const struct pci_tsm_ops *tsm_pci_ops(const struct tsm_dev *tsm_dev);
+const struct attribute_group *tsm_pci_group(const struct tsm_dev *tsm_dev);
#endif /* __TSM_H */
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index ab4ebf0f8a46..1b991a88c19c 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -500,6 +500,7 @@
#define PCI_EXP_DEVCAP_PWR_VAL 0x03fc0000 /* Slot Power Limit Value */
#define PCI_EXP_DEVCAP_PWR_SCL 0x0c000000 /* Slot Power Limit Scale */
#define PCI_EXP_DEVCAP_FLR 0x10000000 /* Function Level Reset */
+#define PCI_EXP_DEVCAP_TEE 0x40000000 /* TEE I/O (TDISP) Support */
#define PCI_EXP_DEVCTL 0x08 /* Device Control */
#define PCI_EXP_DEVCTL_CERE 0x0001 /* Correctable Error Reporting En. */
#define PCI_EXP_DEVCTL_NFERE 0x0002 /* Non-Fatal Error Reporting Enable */
--
2.50.1
> +static int pci_tsm_connect(struct pci_dev *pdev, struct tsm_dev *tsm_dev) > +{ > + int rc; > + struct pci_tsm_pf0 *tsm_pf0; > + const struct pci_tsm_ops *ops = tsm_pci_ops(tsm_dev); > + struct pci_tsm *pci_tsm __free(tsm_remove) = ops->probe(pdev); > + > + if (!pci_tsm) > + return -ENXIO; > + > + pdev->tsm = pci_tsm; > + tsm_pf0 = to_pci_tsm_pf0(pdev->tsm); > + > + ACQUIRE(mutex_intr, lock)(&tsm_pf0->lock); > + if ((rc = ACQUIRE_ERR(mutex_intr, &lock))) > + return rc; > + > + rc = ops->connect(pdev); > + if (rc) > + return rc; > + > + pdev->tsm = no_free_ptr(pci_tsm); > + > + /* > + * Now that the DSM is established, probe() all the potential > + * dependent functions. Failure to probe a function is not fatal > + * to connect(), it just disables subsequent security operations > + * for that function. > + */ > + pci_tsm_probe_fns(pdev); > + return 0; > +} > +} > + [...] > +static void pf0_sysfs_enable(struct pci_dev *pdev) > +{ > + pci_dbg(pdev, "Device Security Manager detected (%s%s )\n", > + pdev->ide_cap ? " ide" : "", > + pdev->devcap & PCI_EXP_DEVCAP_TEE ? " tee" : ""); > + > + sysfs_update_group(&pdev->dev.kobj, &pci_tsm_auth_attr_group); > + sysfs_update_group(&pdev->dev.kobj, &pci_tsm_pf0_attr_group); > +} > + > +int pci_tsm_register(struct tsm_dev *tsm_dev) > +{ [...] > + for_each_pci_dev(pdev) > + if (is_pci_tsm_pf0(pdev)) > + pf0_sysfs_enable(pdev); Now the tsm attributes are exposed to user before ops->probe(), from user's POV, tsm link operation for this device is already ready ... > + return 0; > +} > +EXPORT_SYMBOL_GPL(pci_tsm_register); [...] > +struct pci_tsm_ops { > + /* > + * struct pci_tsm_link_ops - Manage physical link and the TSM/DSM session > + * @probe: probe device for tsm link operation readiness, setup So I think the probe callback is losing the meaning of readiness check. Users see the 'connect/disconnect', they write 'connect' and found errors no matter ->probe() fails or ->connect() fails. Maybe just remove the responsibility of readiness check from ->probe(), I found it simplifies code when implementing tdx-tsm driver. Thanks, Yilun > + * DSM context > + * @remove: destroy DSM context > + * @connect: establish / validate a secure connection (e.g. IDE) > + * with the device > + * @disconnect: teardown the secure link > + * > + * @probe and @remove run in pci_tsm_rwsem held for write context. All > + * other ops run under the @pdev->tsm->lock mutex and pci_tsm_rwsem held > + * for read. > + */ > + struct_group_tagged(pci_tsm_link_ops, link_ops, > + struct pci_tsm *(*probe)(struct pci_dev *pdev); > + void (*remove)(struct pci_tsm *tsm); > + int (*connect)(struct pci_dev *pdev); > + void (*disconnect)(struct pci_dev *pdev); > + );
Xu Yilun wrote: [..] > > + for_each_pci_dev(pdev) > > + if (is_pci_tsm_pf0(pdev)) > > + pf0_sysfs_enable(pdev); > > Now the tsm attributes are exposed to user before ops->probe(), from > user's POV, tsm link operation for this device is already ready ... > > > + return 0; > > +} > > +EXPORT_SYMBOL_GPL(pci_tsm_register); > > [...] > > > +struct pci_tsm_ops { > > + /* > > + * struct pci_tsm_link_ops - Manage physical link and the TSM/DSM session > > + * @probe: probe device for tsm link operation readiness, setup > > So I think the probe callback is losing the meaning of readiness check. > Users see the 'connect/disconnect', they write 'connect' and found > errors no matter ->probe() fails or ->connect() fails. > > Maybe just remove the responsibility of readiness check from ->probe(), > I found it simplifies code when implementing tdx-tsm driver. Oh true, that comment is now stale with this new organization as probe is only about setting up any context to allow future operations. Any "readiness" is determined in those follow-on operations, not probe. Updated the comment to: /* * struct pci_tsm_link_ops - Manage physical link and the TSM/DSM session * @probe: allocate context (wrap 'struct pci_tsm') for follow-on link * operations * @remove: destroy link operations context
On 18/7/25 04:33, Dan Williams wrote: > The PCIe 6.1 specification, section 11, introduces the Trusted Execution > Environment (TEE) Device Interface Security Protocol (TDISP). This > protocol definition builds upon Component Measurement and Authentication > (CMA), and link Integrity and Data Encryption (IDE). It adds support for > assigning devices (PCI physical or virtual function) to a confidential > VM such that the assigned device is enabled to access guest private > memory protected by technologies like Intel TDX, AMD SEV-SNP, RISCV > COVE, or ARM CCA. > > The "TSM" (TEE Security Manager) is a concept in the TDISP specification > of an agent that mediates between a "DSM" (Device Security Manager) and > system software in both a VMM and a confidential VM. A VMM uses TSM ABIs > to setup link security and assign devices. A confidential VM uses TSM > ABIs to transition an assigned device into the TDISP "RUN" state and > validate its configuration. From a Linux perspective the TSM abstracts > many of the details of TDISP, IDE, and CMA. Some of those details leak > through at times, but for the most part TDISP is an internal > implementation detail of the TSM. > > CONFIG_PCI_TSM adds an "authenticated" attribute and "tsm/" subdirectory > to pci-sysfs. Consider that the TSM driver may itself be a PCI driver. > Userspace can watch for the arrival of a "TSM" device, > /sys/class/tsm/tsm0/uevent KOBJ_CHANGE, to know when the PCI core has > initialized TSM services. > > The operations that can be executed against a PCI device are split into > 2 mutually exclusive operation sets, "Link" and "Security" (struct > pci_tsm_{link,security}_ops). The "Link" operations manage physical link > security properties and communication with the device's Device Security > Manager firmware. These are the host side operations in TDISP. The > "Security" operations coordinate the security state of the assigned > virtual device (TDI). These are the guest side operations in TDISP. Only > link management operations are defined at this stage and placeholders > provided for the security operations. > > The locking allows for multiple devices to be executing commands > simultaneously, one outstanding command per-device and an rwsem > synchronizes the implementation relative to TSM > registration/unregistration events. > > Thanks to Wu Hao for his work on an early draft of this support. > > Cc: Lukas Wunner <lukas@wunner.de> > Cc: Samuel Ortiz <sameo@rivosinc.com> > Cc: Alexey Kardashevskiy <aik@amd.com> > Acked-by: Bjorn Helgaas <bhelgaas@google.com> > Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com> > Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > --- > Documentation/ABI/testing/sysfs-bus-pci | 51 +++ > Documentation/driver-api/pci/index.rst | 1 + > Documentation/driver-api/pci/tsm.rst | 12 + > MAINTAINERS | 4 +- > drivers/pci/Kconfig | 14 + > drivers/pci/Makefile | 1 + > drivers/pci/pci-sysfs.c | 4 + > drivers/pci/pci.h | 8 + > drivers/pci/remove.c | 3 + > drivers/pci/tsm.c | 554 ++++++++++++++++++++++++ > drivers/virt/coco/tsm-core.c | 61 ++- > include/linux/pci-tsm.h | 158 +++++++ > include/linux/pci.h | 3 + > include/linux/tsm.h | 8 +- > include/uapi/linux/pci_regs.h | 1 + > 15 files changed, 879 insertions(+), 4 deletions(-) > create mode 100644 Documentation/driver-api/pci/tsm.rst > create mode 100644 drivers/pci/tsm.c > create mode 100644 include/linux/pci-tsm.h > > diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci > index 69f952fffec7..99315fbfbe10 100644 > --- a/Documentation/ABI/testing/sysfs-bus-pci > +++ b/Documentation/ABI/testing/sysfs-bus-pci > @@ -612,3 +612,54 @@ Description: > > # ls doe_features > 0001:01 0001:02 doe_discovery > + > +What: /sys/bus/pci/devices/.../tsm/ > +Contact: linux-coco@lists.linux.dev > +Description: > + This directory only appears if a physical device function > + supports authentication (PCIe CMA-SPDM), interface security > + (PCIe TDISP), and is accepted for secure operation by the > + platform TSM driver. This attribute directory appears > + dynamically after the platform TSM driver loads. So, only after > + the /sys/class/tsm/tsm0 device arrives can tools assume that > + devices without a tsm/ attribute directory will never have one, > + before that, the security capabilities of the device relative to > + the platform TSM are unknown. See > + Documentation/ABI/testing/sysfs-class-tsm. > + > +What: /sys/bus/pci/devices/.../tsm/connect > +Contact: linux-coco@lists.linux.dev > +Description: > + (RW) Write the name of a TSM (TEE Security Manager) device from > + /sys/class/tsm to this file to establish a connection with the > + device. This typically includes an SPDM (DMTF Security > + Protocols and Data Models) session over PCIe DOE (Data Object > + Exchange) and may also include PCIe IDE (Integrity and Data > + Encryption) establishment. Reads from this attribute return the > + name of the connected TSM or the empty string if not > + connected. A TSM device signals its readiness to accept PCI > + connection via a KOBJ_CHANGE event. > + > +What: /sys/bus/pci/devices/.../tsm/disconnect > +Contact: linux-coco@lists.linux.dev > +Description: > + (WO) Write '1' or 'true' to this attribute to disconnect it from > + a previous TSM connection. > + > +What: /sys/bus/pci/devices/.../authenticated > +Contact: linux-pci@vger.kernel.org > +Description: > + When the device's tsm/ directory is present device > + authentication (PCIe CMA-SPDM) and link encryption (PCIe IDE) > + are handled by the platform TSM (TEE Security Manager). When the > + tsm/ directory is not present this attribute reflects only the > + native CMA-SPDM authentication state with the kernel's > + certificate store. > + > + If the attribute is not present, it indicates that > + authentication is unsupported by the device, or the TSM has no > + available authentication methods for the device. > + > + When present and the tsm/ attribute directory is present, the > + authenticated attribute is an alias for the device 'connect' > + state. See the 'tsm/connect' attribute for more details. > diff --git a/Documentation/driver-api/pci/index.rst b/Documentation/driver-api/pci/index.rst > index a38e475cdbe3..9e1b801d0f74 100644 > --- a/Documentation/driver-api/pci/index.rst > +++ b/Documentation/driver-api/pci/index.rst > @@ -10,6 +10,7 @@ The Linux PCI driver implementer's API guide > > pci > p2pdma > + tsm > > .. only:: subproject and html > > diff --git a/Documentation/driver-api/pci/tsm.rst b/Documentation/driver-api/pci/tsm.rst > new file mode 100644 > index 000000000000..59b94d79a4f2 > --- /dev/null > +++ b/Documentation/driver-api/pci/tsm.rst > @@ -0,0 +1,12 @@ > +.. SPDX-License-Identifier: GPL-2.0 > +.. include:: <isonum.txt> > + > +======================================================== > +PCI Trusted Execution Environment Security Manager (TSM) > +======================================================== > + > +.. kernel-doc:: include/linux/pci-tsm.h > + :internal: > + > +.. kernel-doc:: drivers/pci/tsm.c > + :export: > diff --git a/MAINTAINERS b/MAINTAINERS > index cfa3fb8772d2..8cb7ee9270d2 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -25247,8 +25247,10 @@ L: linux-coco@lists.linux.dev > S: Maintained > F: Documentation/ABI/testing/configfs-tsm-report > F: Documentation/driver-api/coco/ > +F: Documentation/driver-api/pci/tsm.rst > +F: drivers/pci/tsm.c > F: drivers/virt/coco/guest/ > -F: include/linux/tsm*.h > +F: include/linux/*tsm*.h > F: samples/tsm-mr/ > > TRUSTED SERVICES TEE DRIVER > diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig > index 4bd75d8b9b86..700addee8f62 100644 > --- a/drivers/pci/Kconfig > +++ b/drivers/pci/Kconfig > @@ -136,6 +136,20 @@ config PCI_IDE_STREAM_MAX > platform capability for the foreseeable future is 4 to 8 streams. Bump > this value up if you have an expert testing need. > > +config PCI_TSM > + bool "PCI TSM: Device security protocol support" > + select PCI_IDE > + select PCI_DOE > + help > + The TEE (Trusted Execution Environment) Device Interface > + Security Protocol (TDISP) defines a "TSM" as a platform agent > + that manages device authentication, link encryption, link > + integrity protection, and assignment of PCI device functions > + (virtual or physical) to confidential computing VMs that can > + access (DMA) guest private memory. > + > + Enable a platform TSM driver to use this capability. > + > config PCI_DOE > bool "Enable PCI Data Object Exchange (DOE) support" > help > diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile > index 6612256fd37d..2c545f877062 100644 > --- a/drivers/pci/Makefile > +++ b/drivers/pci/Makefile > @@ -35,6 +35,7 @@ obj-$(CONFIG_XEN_PCIDEV_FRONTEND) += xen-pcifront.o > obj-$(CONFIG_VGA_ARB) += vgaarb.o > obj-$(CONFIG_PCI_DOE) += doe.o > obj-$(CONFIG_PCI_IDE) += ide.o > +obj-$(CONFIG_PCI_TSM) += tsm.o > obj-$(CONFIG_PCI_DYNAMIC_OF_NODES) += of_property.o > obj-$(CONFIG_PCI_NPEM) += npem.o > obj-$(CONFIG_PCIE_TPH) += tph.o > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index 268c69daa4d5..23cbf6c8796a 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -1815,6 +1815,10 @@ const struct attribute_group *pci_dev_attr_groups[] = { > #endif > #ifdef CONFIG_PCI_DOE > &pci_doe_sysfs_group, > +#endif > +#ifdef CONFIG_PCI_TSM > + &pci_tsm_auth_attr_group, > + &pci_tsm_pf0_attr_group, > #endif > NULL, > }; > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > index 1c223c79634f..3b282c24dde8 100644 > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -521,6 +521,14 @@ void pci_ide_init(struct pci_dev *dev); > static inline void pci_ide_init(struct pci_dev *dev) { } > #endif > > +#ifdef CONFIG_PCI_TSM > +void pci_tsm_destroy(struct pci_dev *pdev); > +extern const struct attribute_group pci_tsm_pf0_attr_group; > +extern const struct attribute_group pci_tsm_auth_attr_group; > +#else > +static inline void pci_tsm_destroy(struct pci_dev *pdev) { } > +#endif > + > /** > * pci_dev_set_io_state - Set the new error state if possible. > * > diff --git a/drivers/pci/remove.c b/drivers/pci/remove.c > index 445afdfa6498..21851c13becd 100644 > --- a/drivers/pci/remove.c > +++ b/drivers/pci/remove.c > @@ -55,6 +55,9 @@ static void pci_destroy_dev(struct pci_dev *dev) > pci_doe_sysfs_teardown(dev); > pci_npem_remove(dev); > > + /* before device_del() to keep config cycle access */ > + pci_tsm_destroy(dev); > + > device_del(&dev->dev); > > down_write(&pci_bus_sem); > diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c > new file mode 100644 > index 000000000000..0784cc436dd3 > --- /dev/null > +++ b/drivers/pci/tsm.c > @@ -0,0 +1,554 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * TEE Security Manager for the TEE Device Interface Security Protocol > + * (TDISP, PCIe r6.1 sec 11) > + * > + * Copyright(c) 2024 Intel Corporation. All rights reserved. > + */ > + > +#define dev_fmt(fmt) "TSM: " fmt > + > +#include <linux/bitfield.h> > +#include <linux/xarray.h> > +#include <linux/sysfs.h> > + > +#include <linux/tsm.h> > +#include <linux/pci.h> > +#include <linux/pci-doe.h> > +#include <linux/pci-tsm.h> > +#include "pci.h" > + > +/* > + * Provide a read/write lock against the init / exit of pdev tsm > + * capabilities and arrival/departure of a tsm instance > + */ > +static DECLARE_RWSEM(pci_tsm_rwsem); > +static int pci_tsm_count; > + > +static inline bool is_dsm(struct pci_dev *pdev) > +{ > + return pdev->tsm && pdev->tsm->dsm == pdev; > +} > + > +static struct pci_tsm_pf0 *to_pci_tsm_pf0(struct pci_tsm *pci_tsm) > +{ > + struct pci_dev *pdev = pci_tsm->pdev; > + > + if (!is_pci_tsm_pf0(pdev) || !is_dsm(pdev)) { > + dev_WARN_ONCE(&pdev->dev, 1, "invalid context object\n"); > + return NULL; > + } > + > + return container_of(pci_tsm, struct pci_tsm_pf0, tsm); > +} > + > +static void tsm_remove(struct pci_tsm *tsm) > +{ > + struct pci_dev *pdev; > + > + if (!tsm) > + return; > + > + pdev = tsm->pdev; > + tsm->ops->remove(tsm); > + pdev->tsm = NULL; > +} > +DEFINE_FREE(tsm_remove, struct pci_tsm *, if (_T) tsm_remove(_T)) > + > +static int call_cb_put(struct pci_dev *pdev, void *data, > + int (*cb)(struct pci_dev *pdev, void *data)) > +{ > + int rc; > + > + if (!pdev) > + return 0; > + rc = cb(pdev, data); > + pci_dev_put(pdev); > + return rc; > +} > + > +static void pci_tsm_walk_fns(struct pci_dev *pdev, > + int (*cb)(struct pci_dev *pdev, void *data), > + void *data) > +{ > + struct pci_dev *fn; > + int i; > + > + /* walk virtual functions */ > + for (i = 0; i < pci_num_vf(pdev); i++) { > + fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), > + pci_iov_virtfn_bus(pdev, i), > + pci_iov_virtfn_devfn(pdev, i)); > + if (call_cb_put(fn, data, cb)) > + return; > + } > + > + /* walk subordinate physical functions */ > + for (i = 1; i < 8; i++) { > + fn = pci_get_slot(pdev->bus, > + PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); > + if (call_cb_put(fn, data, cb)) > + return; > + } > + > + /* walk downstream devices */ > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM) > + return; > + > + if (!is_dsm(pdev)) > + return; > + > + pci_walk_bus(pdev->subordinate, cb, data); > +} > + > +static void pci_tsm_walk_fns_reverse(struct pci_dev *pdev, > + int (*cb)(struct pci_dev *pdev, > + void *data), > + void *data) > +{ > + struct pci_dev *fn; > + int i; > + > + /* reverse walk virtual functions */ > + for (i = pci_num_vf(pdev) - 1; i >= 0; i--) { > + fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), > + pci_iov_virtfn_bus(pdev, i), > + pci_iov_virtfn_devfn(pdev, i)); > + if (call_cb_put(fn, data, cb)) > + return; > + } > + > + /* reverse walk subordinate physical functions */ > + for (i = 7; i >= 1; i--) { > + fn = pci_get_slot(pdev->bus, > + PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); > + if (call_cb_put(fn, data, cb)) > + return; > + } > + > + /* reverse walk downstream devices */ > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM) > + return; > + > + if (!is_dsm(pdev)) > + return; > + > + pci_walk_bus_reverse(pdev->subordinate, cb, data); > +} > + > +static int probe_fn(struct pci_dev *pdev, void *dsm) > +{ > + struct pci_dev *dsm_dev = dsm; > + const struct pci_tsm_ops *ops = dsm_dev->tsm->ops; > + > + pdev->tsm = ops->probe(pdev); > + pci_dbg(pdev, "setup tsm context: dsm: %s status: %s\n", > + pci_name(dsm_dev), pdev->tsm ? "success" : "failed"); > + return 0; > +} > + > +static void pci_tsm_probe_fns(struct pci_dev *dsm) > +{ > + pci_tsm_walk_fns(dsm, probe_fn, dsm); > +} > + > +static int pci_tsm_connect(struct pci_dev *pdev, struct tsm_dev *tsm_dev) > +{ > + int rc; > + struct pci_tsm_pf0 *tsm_pf0; > + const struct pci_tsm_ops *ops = tsm_pci_ops(tsm_dev); > + struct pci_tsm *pci_tsm __free(tsm_remove) = ops->probe(pdev); > + > + if (!pci_tsm) > + return -ENXIO; > + > + pdev->tsm = pci_tsm; > + tsm_pf0 = to_pci_tsm_pf0(pdev->tsm); > + > + ACQUIRE(mutex_intr, lock)(&tsm_pf0->lock); > + if ((rc = ACQUIRE_ERR(mutex_intr, &lock))) > + return rc; > + > + rc = ops->connect(pdev); > + if (rc) > + return rc; > + > + pdev->tsm = no_free_ptr(pci_tsm); > + > + /* > + * Now that the DSM is established, probe() all the potential > + * dependent functions. Failure to probe a function is not fatal > + * to connect(), it just disables subsequent security operations > + * for that function. > + */ > + pci_tsm_probe_fns(pdev); > + return 0; > +} > + > +static ssize_t connect_show(struct device *dev, struct device_attribute *attr, > + char *buf) > +{ > + struct pci_dev *pdev = to_pci_dev(dev); > + int rc; > + > + ACQUIRE(rwsem_read_intr, lock)(&pci_tsm_rwsem); > + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &lock))) > + return rc; > + > + if (!pdev->tsm) > + return sysfs_emit(buf, "\n"); > + > + return sysfs_emit(buf, "%s\n", tsm_name(pdev->tsm->ops->owner)); > +} > + > +static ssize_t connect_store(struct device *dev, struct device_attribute *attr, > + const char *buf, size_t len) > +{ > + struct pci_dev *pdev = to_pci_dev(dev); > + const struct pci_tsm_ops *ops; > + struct tsm_dev *tsm_dev; > + int rc, id; > + > + rc = sscanf(buf, "tsm%d\n", &id); Why is id needed here? Are there going to be multiple DSMs per a PCI device? I am missing the point of tsm_dev. It does not have sysfs nodes (the pci_dev parent does), tsm_register() takes attribute_group but what would posibbly go there? certificates/meas/report blobs? The pci_dev struct itself has *tsm now so this child device is not that. Hm. > + if (rc != 1) > + return -EINVAL; > + > + ACQUIRE(rwsem_read_intr, lock)(&pci_tsm_rwsem); > + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &lock))) > + return rc; > + > + if (pdev->tsm) > + return -EBUSY; > + > + tsm_dev = find_tsm_dev(id); When PCI TSM loads, all it does is add "connect" nodes. And when write to "connect" happens, this find_tsm_dev() is expected to find a tsm_dev but what is going to add those in the real PCI? devsec_tsm_probe() does not really explain. > + if (!tsm_dev) > + return -ENXIO; > + > + ops = tsm_pci_ops(tsm_dev); > + if (!ops || !ops->connect || !ops->probe) > + return -ENXIO; > + > + rc = pci_tsm_connect(pdev, tsm_dev); > + if (rc) > + return rc; > + return len; > +} > +static DEVICE_ATTR_RW(connect); > + > +static int remove_fn(struct pci_dev *pdev, void *data) > +{ > + tsm_remove(pdev->tsm); > + return 0; > +} > + > +static void pci_tsm_remove_fns(struct pci_dev *dsm) > +{ > + pci_tsm_walk_fns_reverse(dsm, remove_fn, NULL); > +} > + > +static void __pci_tsm_disconnect(struct pci_dev *pdev) > +{ > + struct pci_tsm_pf0 *tsm_pf0 = to_pci_tsm_pf0(pdev->tsm); > + const struct pci_tsm_ops *ops = pdev->tsm->ops; > + > + /* disconnect is not interruptible */ > + guard(mutex)(&tsm_pf0->lock); > + pci_tsm_remove_fns(pdev); > + ops->disconnect(pdev); > +} > + > +static void pci_tsm_disconnect(struct pci_dev *pdev) > +{ > + __pci_tsm_disconnect(pdev); > + tsm_remove(pdev->tsm); > +} > + > +static ssize_t disconnect_store(struct device *dev, > + struct device_attribute *attr, const char *buf, > + size_t len) > +{ > + struct pci_dev *pdev = to_pci_dev(dev); > + bool disconnect; > + int rc; > + > + rc = kstrtobool(buf, &disconnect); > + if (rc) > + return rc; > + if (!disconnect) > + return -EINVAL; > + > + ACQUIRE(rwsem_read_intr, lock)(&pci_tsm_rwsem); > + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &lock))) > + return rc; > + > + if (!pdev->tsm) > + return -ENXIO; > + > + pci_tsm_disconnect(pdev); > + return len; > +} > +static DEVICE_ATTR_WO(disconnect); imho "echo 0 > connect" is more descriptive than "echo 1 > disconnect", and one less sysfs node. > + > +static bool pci_tsm_pf0_group_visible(struct kobject *kobj) > +{ > + struct device *dev = kobj_to_dev(kobj); > + struct pci_dev *pdev = to_pci_dev(dev); > + > + return pci_tsm_count && is_pci_tsm_pf0(pdev); > +} > +DEFINE_SIMPLE_SYSFS_GROUP_VISIBLE(pci_tsm_pf0); > + > +static struct attribute *pci_tsm_pf0_attrs[] = { > + &dev_attr_connect.attr, > + &dev_attr_disconnect.attr, > + NULL > +}; > + > +const struct attribute_group pci_tsm_pf0_attr_group = { > + .name = "tsm", > + .attrs = pci_tsm_pf0_attrs, > + .is_visible = SYSFS_GROUP_VISIBLE(pci_tsm_pf0), > +}; > + > +static ssize_t authenticated_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + /* > + * When device authentication is TSM owned, 'authenticated' is > + * identical to the connect state. > + */ > + return connect_show(dev, attr, buf); > +} > +static DEVICE_ATTR_RO(authenticated); > + > +static struct attribute *pci_tsm_auth_attrs[] = { > + &dev_attr_authenticated.attr, > + NULL > +}; > + > +const struct attribute_group pci_tsm_auth_attr_group = { > + .attrs = pci_tsm_auth_attrs, > + .is_visible = SYSFS_GROUP_VISIBLE(pci_tsm_pf0), > +}; > + > +/* > + * Retrieve physical function0 device whether it has TEE capability or not > + */ > +static struct pci_dev *pf0_dev_get(struct pci_dev *pdev) > +{ > + struct pci_dev *pf_dev = pci_physfn(pdev); > + > + if (PCI_FUNC(pf_dev->devfn) == 0) > + return pci_dev_get(pf_dev); > + > + return pci_get_slot(pf_dev->bus, > + pf_dev->devfn - PCI_FUNC(pf_dev->devfn)); > +} > + > +/* > + * Find the PCI Device instance that serves as the Device Security > + * Manger (DSM) for @pdev. Note that no additional reference is held for > + * the resulting device because @pdev always has a longer registered > + * lifetime than its DSM by virtue of being a child of or identical to > + * its DSM. > + */ > +static struct pci_dev *find_dsm_dev(struct pci_dev *pdev) > +{ > + struct pci_dev *uport_pf0; > + > + if (is_pci_tsm_pf0(pdev)) > + return pdev; > + > + struct pci_dev *pf0 __free(pci_dev_put) = pf0_dev_get(pdev); > + if (!pf0) > + return NULL; > + > + if (is_dsm(pf0)) > + return pf0; > + > + /* > + * For cases where a switch may be hosting TDISP services on > + * behalf of downstream devices, check the first usptream port > + * relative to this endpoint. > + */ > + if (!pdev->dev.parent || !pdev->dev.parent->parent) > + return NULL; > + > + uport_pf0 = to_pci_dev(pdev->dev.parent->parent); > + if (is_dsm(uport_pf0)) > + return uport_pf0; > + return NULL; > +} > + > +/** > + * pci_tsm_constructor() - base 'struct pci_tsm' initialization > + * @pdev: The PCI device > + * @tsm: context to initialize > + * @ops: PCI operations provided by the TSM > + */ > +int pci_tsm_constructor(struct pci_dev *pdev, struct pci_tsm *tsm, > + const struct pci_tsm_ops *ops) > +{ > + tsm->pdev = pdev; > + tsm->ops = ops; These should go down, right before "return 0". Thanks, > + tsm->dsm = find_dsm_dev(pdev); > + if (!tsm->dsm) { > + pci_warn(pdev, "failed to find Device Security Manager\n"); > + return -ENXIO; > + } > + return 0; > +} > +EXPORT_SYMBOL_GPL(pci_tsm_constructor); > + > +/** > + * pci_tsm_pf0_constructor() - common 'struct pci_tsm_pf0' initialization > + * @pdev: Physical Function 0 PCI device (as indicated by is_pci_tsm_pf0()) > + * @tsm: context to initialize > + */ > +int pci_tsm_pf0_constructor(struct pci_dev *pdev, struct pci_tsm_pf0 *tsm, > + const struct pci_tsm_ops *ops) > +{ > + struct tsm_dev *tsm_dev = ops->owner; > + > + mutex_init(&tsm->lock); > + tsm->doe_mb = pci_find_doe_mailbox(pdev, PCI_VENDOR_ID_PCI_SIG, > + PCI_DOE_PROTO_CMA); > + if (!tsm->doe_mb) { > + pci_warn(pdev, "TSM init failure, no CMA mailbox\n"); > + return -ENODEV; > + } > + > + if (tsm_pci_group(tsm_dev)) > + sysfs_merge_group(&pdev->dev.kobj, tsm_pci_group(tsm_dev)); > + > + return pci_tsm_constructor(pdev, &tsm->tsm, ops); > +} > +EXPORT_SYMBOL_GPL(pci_tsm_pf0_constructor); > + > +void pci_tsm_pf0_destructor(struct pci_tsm_pf0 *pf0_tsm) > +{ > + struct pci_tsm *tsm = &pf0_tsm->tsm; > + struct pci_dev *pdev = tsm->pdev; > + struct tsm_dev *tsm_dev = tsm->ops->owner; > + > + if (tsm_pci_group(tsm_dev)) > + sysfs_unmerge_group(&pdev->dev.kobj, tsm_pci_group(tsm_dev)); > + mutex_destroy(&pf0_tsm->lock); > +} > +EXPORT_SYMBOL_GPL(pci_tsm_pf0_destructor); > + > +static void pf0_sysfs_enable(struct pci_dev *pdev) > +{ > + pci_dbg(pdev, "Device Security Manager detected (%s%s )\n", > + pdev->ide_cap ? " ide" : "", > + pdev->devcap & PCI_EXP_DEVCAP_TEE ? " tee" : ""); > + > + sysfs_update_group(&pdev->dev.kobj, &pci_tsm_auth_attr_group); > + sysfs_update_group(&pdev->dev.kobj, &pci_tsm_pf0_attr_group); > +} > + > +int pci_tsm_register(struct tsm_dev *tsm_dev) > +{ > + const struct pci_tsm_ops *ops; > + struct pci_dev *pdev = NULL; > + > + if (!tsm_dev) > + return -EINVAL; > + > + /* > + * The TSM device must have pci_ops, and only implement one of link_ops > + * or sec_ops. > + */ > + ops = tsm_pci_ops(tsm_dev); > + if (!ops) > + return -EINVAL; > + > + if (!ops->probe && !ops->sec_probe) > + return -EINVAL; > + > + if (ops->probe && ops->sec_probe) > + return -EINVAL; > + > + guard(rwsem_write)(&pci_tsm_rwsem); > + > + pci_tsm_count++; > + > + /* PCI/TSM sysfs already enabled? */ > + if (pci_tsm_count > 1) > + return 0; > + > + for_each_pci_dev(pdev) > + if (is_pci_tsm_pf0(pdev)) > + pf0_sysfs_enable(pdev); > + return 0; > +} > +EXPORT_SYMBOL_GPL(pci_tsm_register); > + > +/** > + * __pci_tsm_destroy() - destroy the TSM context for @pdev > + * @pdev: device to cleanup > + * @tsm_dev: TSM context if a TSM device is being removed, NULL if > + * @pdev is being removed. > + * > + * At device removal or TSM unregistration all established context > + * with the TSM is torn down. Additionally, if there are no more TSMs > + * registered, the PCI tsm/ sysfs attributes are hidden. > + */ > +static void __pci_tsm_destroy(struct pci_dev *pdev, struct tsm_dev *tsm_dev) > +{ > + struct pci_tsm *tsm = pdev->tsm; > + > + lockdep_assert_held_write(&pci_tsm_rwsem); > + > + if (tsm_dev && is_pci_tsm_pf0(pdev) && !pci_tsm_count) { > + sysfs_update_group(&pdev->dev.kobj, &pci_tsm_auth_attr_group); > + sysfs_update_group(&pdev->dev.kobj, &pci_tsm_pf0_attr_group); > + } > + > + if (!tsm) > + return; > + > + if (!tsm_dev) > + tsm_dev = tsm->ops->owner; > + else if (tsm_dev != tsm->ops->owner) > + return; > + > + if (is_pci_tsm_pf0(pdev)) > + pci_tsm_disconnect(pdev); > + else > + tsm_remove(pdev->tsm); > +} > + > +void pci_tsm_destroy(struct pci_dev *pdev) > +{ > + guard(rwsem_write)(&pci_tsm_rwsem); > + __pci_tsm_destroy(pdev, NULL); > +} > + > +void pci_tsm_unregister(struct tsm_dev *tsm_dev) > +{ > + struct pci_dev *pdev = NULL; > + > + guard(rwsem_write)(&pci_tsm_rwsem); > + pci_tsm_count--; > + for_each_pci_dev_reverse(pdev) > + __pci_tsm_destroy(pdev, tsm_dev); > +} > + > +int pci_tsm_doe_transfer(struct pci_dev *pdev, enum pci_doe_proto type, > + const void *req, size_t req_sz, void *resp, > + size_t resp_sz) > +{ > + struct pci_tsm_pf0 *tsm; > + > + if (!pdev->tsm || !is_pci_tsm_pf0(pdev)) > + return -ENXIO; > + > + tsm = to_pci_tsm_pf0(pdev->tsm); > + if (!tsm->doe_mb) > + return -ENXIO; > + > + return pci_doe(tsm->doe_mb, PCI_VENDOR_ID_PCI_SIG, type, req, req_sz, > + resp, resp_sz); > +} > +EXPORT_SYMBOL_GPL(pci_tsm_doe_transfer); > diff --git a/drivers/virt/coco/tsm-core.c b/drivers/virt/coco/tsm-core.c > index 1f53b9049e2d..093824dc68dd 100644 > --- a/drivers/virt/coco/tsm-core.c > +++ b/drivers/virt/coco/tsm-core.c > @@ -9,6 +9,7 @@ > #include <linux/device.h> > #include <linux/module.h> > #include <linux/cleanup.h> > +#include <linux/pci-tsm.h> > > static struct class *tsm_class; > static DECLARE_RWSEM(tsm_rwsem); > @@ -17,8 +18,39 @@ static DEFINE_IDR(tsm_idr); > struct tsm_dev { > struct device dev; > int id; > + const struct pci_tsm_ops *pci_ops; > + const struct attribute_group *group; > }; > > +const char *tsm_name(const struct tsm_dev *tsm_dev) > +{ > + return dev_name(&tsm_dev->dev); > +} > + > +/* > + * Caller responsible for ensuring it does not race tsm_dev > + * unregistration. > + */ > +struct tsm_dev *find_tsm_dev(int id) > +{ > + guard(rcu)(); > + return idr_find(&tsm_idr, id); > +} > + > +const struct pci_tsm_ops *tsm_pci_ops(const struct tsm_dev *tsm_dev) > +{ > + if (!tsm_dev) > + return NULL; > + return tsm_dev->pci_ops; > +} > + > +const struct attribute_group *tsm_pci_group(const struct tsm_dev *tsm_dev) > +{ > + if (!tsm_dev) > + return NULL; > + return tsm_dev->group; > +} > + > static struct tsm_dev *alloc_tsm_dev(struct device *parent, > const struct attribute_group **groups) > { > @@ -44,6 +76,29 @@ static struct tsm_dev *alloc_tsm_dev(struct device *parent, > return no_free_ptr(tsm_dev); > } > > +static struct tsm_dev *tsm_register_pci_or_reset(struct tsm_dev *tsm_dev, > + struct pci_tsm_ops *pci_ops) > +{ > + int rc; > + > + if (!pci_ops) > + return tsm_dev; > + > + pci_ops->owner = tsm_dev; > + tsm_dev->pci_ops = pci_ops; > + rc = pci_tsm_register(tsm_dev); > + if (rc) { > + dev_err(tsm_dev->dev.parent, > + "PCI/TSM registration failure: %d\n", rc); > + device_unregister(&tsm_dev->dev); > + return ERR_PTR(rc); > + } > + > + /* Notify TSM userspace that PCI/TSM operations are now possible */ > + kobject_uevent(&tsm_dev->dev.kobj, KOBJ_CHANGE); > + return tsm_dev; > +} > + > static void put_tsm_dev(struct tsm_dev *tsm_dev) > { > if (!IS_ERR_OR_NULL(tsm_dev)) > @@ -54,7 +109,8 @@ DEFINE_FREE(put_tsm_dev, struct tsm_dev *, > if (!IS_ERR_OR_NULL(_T)) put_tsm_dev(_T)) > > struct tsm_dev *tsm_register(struct device *parent, > - const struct attribute_group **groups) > + const struct attribute_group **groups, > + struct pci_tsm_ops *pci_ops) > { > struct tsm_dev *tsm_dev __free(put_tsm_dev) = > alloc_tsm_dev(parent, groups); > @@ -73,12 +129,13 @@ struct tsm_dev *tsm_register(struct device *parent, > if (rc) > return ERR_PTR(rc); > > - return no_free_ptr(tsm_dev); > + return tsm_register_pci_or_reset(no_free_ptr(tsm_dev), pci_ops); > } > EXPORT_SYMBOL_GPL(tsm_register); > > void tsm_unregister(struct tsm_dev *tsm_dev) > { > + pci_tsm_unregister(tsm_dev); > device_unregister(&tsm_dev->dev); > } > EXPORT_SYMBOL_GPL(tsm_unregister); > diff --git a/include/linux/pci-tsm.h b/include/linux/pci-tsm.h > new file mode 100644 > index 000000000000..f370c022fac4 > --- /dev/null > +++ b/include/linux/pci-tsm.h > @@ -0,0 +1,158 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef __PCI_TSM_H > +#define __PCI_TSM_H > +#include <linux/mutex.h> > +#include <linux/pci.h> > + > +struct pci_tsm; > + > +/* > + * struct pci_tsm_ops - manage confidential links and security state > + * @link_ops: Coordinate PCIe SPDM and IDE establishment via a platform TSM. > + * Provide a secure session transport for TDISP state management > + * (typically bare metal physical function operations). > + * @sec_ops: Lock, unlock, and interrogate the security state of the > + * function via the platform TSM (typically virtual function > + * operations). > + * @owner: Back reference to the TSM device that owns this instance. > + * > + * This operations are mutually exclusive either a tsm_dev instance > + * manages phyiscal link properties or it manages function security > + * states like TDISP lock/unlock. > + */ > +struct pci_tsm_ops { > + /* > + * struct pci_tsm_link_ops - Manage physical link and the TSM/DSM session > + * @probe: probe device for tsm link operation readiness, setup > + * DSM context > + * @remove: destroy DSM context > + * @connect: establish / validate a secure connection (e.g. IDE) > + * with the device > + * @disconnect: teardown the secure link > + * > + * @probe and @remove run in pci_tsm_rwsem held for write context. All > + * other ops run under the @pdev->tsm->lock mutex and pci_tsm_rwsem held > + * for read. > + */ > + struct_group_tagged(pci_tsm_link_ops, link_ops, > + struct pci_tsm *(*probe)(struct pci_dev *pdev); > + void (*remove)(struct pci_tsm *tsm); > + int (*connect)(struct pci_dev *pdev); > + void (*disconnect)(struct pci_dev *pdev); > + ); > + > + /* > + * struct pci_tsm_security_ops - Manage the security state of the function > + * @sec_probe: probe device for tsm security operation > + * readiness, setup security context > + * @sec_remove: destroy security context > + * > + * @sec_probe and @sec_remove run in pci_tsm_rwsem held for > + * write context. All other ops run under the @pdev->tsm->lock > + * mutex and pci_tsm_rwsem held for read. > + */ > + struct_group_tagged(pci_tsm_security_ops, ops, > + struct pci_tsm *(*sec_probe)(struct pci_dev *pdev); > + void (*sec_remove)(struct pci_tsm *tsm); > + ); > + struct tsm_dev *owner; > +}; > + > +/** > + * struct pci_tsm - Core TSM context for a given PCIe endpoint > + * @pdev: Back ref to device function, distinguishes type of pci_tsm context > + * @dsm: PCI Device Security Manager for link operations on @pdev. > + * @ops: Link Confidentiality or Device Function Security operations > + * > + * This structure is wrapped by low level TSM driver data and returned > + * by probe()/sec_probe(), it is freed by the corresponding > + * remove()/sec_remove(). > + * > + * For link operations it serves to cache the association between a > + * Device Security Manager (DSM) and the functions that manager can > + * assign to a TVM. That can be "self", for assigning function0 of a > + * TEE I/O device, a sub-function (SR-IOV virtual function, or > + * non-function0 multifunction-device), or a downstream endpoint (PCIe > + * upstream switch-port as DSM). > + */ > +struct pci_tsm { > + struct pci_dev *pdev; > + struct pci_dev *dsm; > + const struct pci_tsm_ops *ops; > +}; > + > +/** > + * struct pci_tsm_pf0 - Physical Function 0 TDISP link context > + * @tsm: generic core "tsm" context > + * @lock: protect @state vs pci_tsm_ops invocation > + * @doe_mb: PCIe Data Object Exchange mailbox > + */ > +struct pci_tsm_pf0 { > + struct pci_tsm tsm; > + struct mutex lock; > + struct pci_doe_mb *doe_mb; > +}; > + > +/* physical function0 and capable of 'connect' */ > +static inline bool is_pci_tsm_pf0(struct pci_dev *pdev) > +{ > + if (!pci_is_pcie(pdev)) > + return false; > + > + if (pdev->is_virtfn) > + return false; > + > + /* > + * Allow for a Device Security Manager (DSM) associated with function0 > + * of an Endpoint to coordinate TDISP requests for other functions > + * (physical or virtual) of the device, or allow for an Upstream Port > + * DSM to accept TDISP requests for switch Downstream Endpoints. > + */ > + switch (pci_pcie_type(pdev)) { > + case PCI_EXP_TYPE_ENDPOINT: > + case PCI_EXP_TYPE_UPSTREAM: > + case PCI_EXP_TYPE_RC_END: > + if (pdev->ide_cap || (pdev->devcap & PCI_EXP_DEVCAP_TEE)) > + break; > + fallthrough; > + default: > + return false; > + } > + > + return PCI_FUNC(pdev->devfn) == 0; > +} > + > +enum pci_doe_proto { > + PCI_DOE_PROTO_CMA = 1, > + PCI_DOE_PROTO_SSESSION = 2, > +}; > + > +#ifdef CONFIG_PCI_TSM > +struct tsm_dev; > +int pci_tsm_register(struct tsm_dev *tsm_dev); > +void pci_tsm_unregister(struct tsm_dev *tsm_dev); > +int pci_tsm_doe_transfer(struct pci_dev *pdev, enum pci_doe_proto type, > + const void *req, size_t req_sz, void *resp, > + size_t resp_sz); > +int pci_tsm_constructor(struct pci_dev *pdev, struct pci_tsm *tsm, > + const struct pci_tsm_ops *ops); > +int pci_tsm_pf0_constructor(struct pci_dev *pdev, struct pci_tsm_pf0 *tsm, > + const struct pci_tsm_ops *ops); > +void pci_tsm_pf0_destructor(struct pci_tsm_pf0 *tsm); > +#else > +static inline int pci_tsm_register(struct tsm_dev *tsm_dev) > +{ > + return 0; > +} > +static inline void pci_tsm_unregister(struct tsm_dev *tsm_dev) > +{ > +} > +static inline int pci_tsm_doe_transfer(struct pci_dev *pdev, > + enum pci_doe_proto type, const void *req, > + size_t req_sz, void *resp, > + size_t resp_sz) > +{ > + return -ENOENT; > +} > +#endif > +#endif /*__PCI_TSM_H */ > diff --git a/include/linux/pci.h b/include/linux/pci.h > index b8bca0711967..0e5703fad0f6 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -539,6 +539,9 @@ struct pci_dev { > u8 nr_link_ide; /* Link Stream count (Selective Stream offset) */ > unsigned int ide_cfg:1; /* Config cycles over IDE */ > unsigned int ide_tee_limit:1; /* Disallow T=0 traffic over IDE */ > +#endif > +#ifdef CONFIG_PCI_TSM > + struct pci_tsm *tsm; /* TSM operation state */ > #endif > u16 acs_cap; /* ACS Capability offset */ > u8 supported_speeds; /* Supported Link Speeds Vector */ > diff --git a/include/linux/tsm.h b/include/linux/tsm.h > index a90b40b1b13c..ce95589a5d5b 100644 > --- a/include/linux/tsm.h > +++ b/include/linux/tsm.h > @@ -111,7 +111,13 @@ struct tsm_report_ops { > int tsm_report_register(const struct tsm_report_ops *ops, void *priv); > int tsm_report_unregister(const struct tsm_report_ops *ops); > struct tsm_dev; > +struct pci_tsm_ops; > struct tsm_dev *tsm_register(struct device *parent, > - const struct attribute_group **groups); > + const struct attribute_group **groups, > + struct pci_tsm_ops *ops); > void tsm_unregister(struct tsm_dev *tsm_dev); > +const char *tsm_name(const struct tsm_dev *tsm_dev); > +struct tsm_dev *find_tsm_dev(int id); > +const struct pci_tsm_ops *tsm_pci_ops(const struct tsm_dev *tsm_dev); > +const struct attribute_group *tsm_pci_group(const struct tsm_dev *tsm_dev); > #endif /* __TSM_H */ > diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h > index ab4ebf0f8a46..1b991a88c19c 100644 > --- a/include/uapi/linux/pci_regs.h > +++ b/include/uapi/linux/pci_regs.h > @@ -500,6 +500,7 @@ > #define PCI_EXP_DEVCAP_PWR_VAL 0x03fc0000 /* Slot Power Limit Value */ > #define PCI_EXP_DEVCAP_PWR_SCL 0x0c000000 /* Slot Power Limit Scale */ > #define PCI_EXP_DEVCAP_FLR 0x10000000 /* Function Level Reset */ > +#define PCI_EXP_DEVCAP_TEE 0x40000000 /* TEE I/O (TDISP) Support */ > #define PCI_EXP_DEVCTL 0x08 /* Device Control */ > #define PCI_EXP_DEVCTL_CERE 0x0001 /* Correctable Error Reporting En. */ > #define PCI_EXP_DEVCTL_NFERE 0x0002 /* Non-Fatal Error Reporting Enable */ -- Alexey
Alexey Kardashevskiy wrote: > On 18/7/25 04:33, Dan Williams wrote: [..] > > diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c > > new file mode 100644 > > index 000000000000..0784cc436dd3 > > --- /dev/null > > +++ b/drivers/pci/tsm.c [..] > > +static ssize_t connect_store(struct device *dev, struct device_attribute *attr, > > + const char *buf, size_t len) > > +{ > > + struct pci_dev *pdev = to_pci_dev(dev); > > + const struct pci_tsm_ops *ops; > > + struct tsm_dev *tsm_dev; > > + int rc, id; > > + > > + rc = sscanf(buf, "tsm%d\n", &id); > > Why is id needed here? Are there going to be multiple DSMs per a PCI > device? The implementation allows for multiple TSMs per platform [1], and you acknowledged this earlier [2] (at least the "no globals" bit). [1]: http://lore.kernel.org/683f9e141f1b1_1626e1009@dwillia2-xfh.jf.intel.com.notmuch [2]: http://lore.kernel.org/b281b714-5097-4b3a-9809-7bdcb9e004dc@amd.com One of the nice properties of multiple tsm_devs is the ability to unit test host and guest side TSM flows in the same kernel image. > I am missing the point of tsm_dev. It does not have sysfs nodes (the > pci_dev parent does) The resource accounting symlinks for each each IDE stream point to the tsm_dev, see tsm_ide_stream_register(). > tsm_register() takes attribute_group but what would posibbly go there? Any vendor specific implementation of commonly named attributes. Contrast that with vendor specific attributes with vendor specific names that the vendor specific device publishes. > certificates/meas/report blobs? Perhaps. For now, I want to just focus on the mechanics of the getting a TDI into the run state. The attestation flow is a separate design debate one there is consensus on getting the TDI up and running. > The pci_dev struct itself has *tsm now so this child device is not > that. Hm. This tsm_dev is not a child device it is the common class representation of a platform capability that can establish SPDM and optionally IDE. > > + if (rc != 1) > > + return -EINVAL; > > + > > + ACQUIRE(rwsem_read_intr, lock)(&pci_tsm_rwsem); > > + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &lock))) > > + return rc; > > + > > + if (pdev->tsm) > > + return -EBUSY; > > + > > + tsm_dev = find_tsm_dev(id); > > When PCI TSM loads, all it does is add "connect" nodes. And when write > to "connect" happens, this find_tsm_dev() is expected to find a > tsm_dev but what is going to add those in the real PCI? sev_tsm_init() calls tsm_register(). Userspace catches the tsm_dev KOBJECT_ADD event to run: echo $TSM_DEV > /sys/bus/pci/devices/$PDEV/tsm/connect [..] > imho "echo 0 > connect" is more descriptive than "echo 1 > disconnect", and one less sysfs node. That makes it a bit too ambiguous for my taste as connect is "connect to a tsm of the following identifier", so, for example, "is '0' a shorthand for 'tsm0'?" ...and as I say that I realize disconnect as the same problem. I will update disconnect to take the tsm device name just like connect for symmetry, this ambiguity concern, and in case multiple TSM connections per device might ever happen way down the road. [..] > > +/** > > + * pci_tsm_constructor() - base 'struct pci_tsm' initialization > > + * @pdev: The PCI device > > + * @tsm: context to initialize > > + * @ops: PCI operations provided by the TSM > > + */ > > +int pci_tsm_constructor(struct pci_dev *pdev, struct pci_tsm *tsm, > > + const struct pci_tsm_ops *ops) > > +{ > > + tsm->pdev = pdev; > > + tsm->ops = ops; > > These should go down, right before "return 0". Thanks, Sure, makes sense. In practice @tsm will be unwound, but might as well not make it a valid object while it is awaiting to be freed.
On 14/8/25 11:40, dan.j.williams@intel.com wrote: > Alexey Kardashevskiy wrote: >> On 18/7/25 04:33, Dan Williams wrote: > [..] >>> diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c >>> new file mode 100644 >>> index 000000000000..0784cc436dd3 >>> --- /dev/null >>> +++ b/drivers/pci/tsm.c > [..] >>> +static ssize_t connect_store(struct device *dev, struct device_attribute *attr, >>> + const char *buf, size_t len) >>> +{ >>> + struct pci_dev *pdev = to_pci_dev(dev); >>> + const struct pci_tsm_ops *ops; >>> + struct tsm_dev *tsm_dev; >>> + int rc, id; >>> + >>> + rc = sscanf(buf, "tsm%d\n", &id); >> >> Why is id needed here? Are there going to be multiple DSMs per a PCI >> device? > > The implementation allows for multiple TSMs per platform [1], and you > acknowledged this earlier [2] (at least the "no globals" bit). > > [1]: http://lore.kernel.org/683f9e141f1b1_1626e1009@dwillia2-xfh.jf.intel.com.notmuch Right but I'd think that devices (or, more precisely, PCIe slots) are statically assigned to TSMs. A bit hard to imagine 2 TSMs in a system and ability to connect some PCI device to either of those. It is not impossible but not exactly "painfully simple". > [2]: http://lore.kernel.org/b281b714-5097-4b3a-9809-7bdcb9e004dc@amd.com > > One of the nice properties of multiple tsm_devs is the ability to unit test > host and guest side TSM flows in the same kernel image. > >> I am missing the point of tsm_dev. It does not have sysfs nodes (the >> pci_dev parent does) > > The resource accounting symlinks for each each IDE stream point to the > tsm_dev, see tsm_ide_stream_register(). > >> tsm_register() takes attribute_group but what would posibbly go there? > > Any vendor specific implementation of commonly named attributes. > Contrast that with vendor specific attributes with vendor specific names > that the vendor specific device publishes. > >> certificates/meas/report blobs? > > Perhaps. Hm. Those groups are per a TSM so no device's certificates/meas/report blobs there, right? > For now, I want to just focus on the mechanics of the getting a > TDI into the run state. The attestation flow is a separate design debate > one there is consensus on getting the TDI up and running. >>> The pci_dev struct itself has *tsm now so this child device is not >> that. Hm. > > This tsm_dev is not a child device it is the common class representation > of a platform capability that can establish SPDM and optionally IDE. Yeah, I realized that soon after I hit "send". >>> + if (rc != 1) >>> + return -EINVAL; >>> + >>> + ACQUIRE(rwsem_read_intr, lock)(&pci_tsm_rwsem); >>> + if ((rc = ACQUIRE_ERR(rwsem_read_intr, &lock))) >>> + return rc; >>> + >>> + if (pdev->tsm) >>> + return -EBUSY; >>> + >>> + tsm_dev = find_tsm_dev(id); >> >> When PCI TSM loads, all it does is add "connect" nodes. And when write >> to "connect" happens, this find_tsm_dev() is expected to find a >> tsm_dev but what is going to add those in the real PCI? > > sev_tsm_init() calls tsm_register(). Userspace catches the tsm_dev > KOBJECT_ADD event to run: > > echo $TSM_DEV > /sys/bus/pci/devices/$PDEV/tsm/connect > > [..] >> imho "echo 0 > connect" is more descriptive than "echo 1 > disconnect", and one less sysfs node. > > That makes it a bit too ambiguous for my taste as connect is "connect to > a tsm of the following identifier", so, for example, "is '0' a shorthand > for 'tsm0'?" Nah, ignore my "imho" then. Thanks, > ...and as I say that I realize disconnect as the same problem. I will > update disconnect to take the tsm device name just like connect for > symmetry, this ambiguity concern, and in case multiple TSM connections > per device might ever happen way down the road. > > [..] >>> +/** >>> + * pci_tsm_constructor() - base 'struct pci_tsm' initialization >>> + * @pdev: The PCI device >>> + * @tsm: context to initialize >>> + * @ops: PCI operations provided by the TSM >>> + */ >>> +int pci_tsm_constructor(struct pci_dev *pdev, struct pci_tsm *tsm, >>> + const struct pci_tsm_ops *ops) >>> +{ >>> + tsm->pdev = pdev; >>> + tsm->ops = ops; >> >> These should go down, right before "return 0". Thanks, > > Sure, makes sense. > > In practice @tsm will be unwound, but might as well not make it a > valid object while it is awaiting to be freed. -- Alexey
Alexey Kardashevskiy wrote: > > > On 14/8/25 11:40, dan.j.williams@intel.com wrote: > > Alexey Kardashevskiy wrote: > >> On 18/7/25 04:33, Dan Williams wrote: > > [..] > >>> diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c > >>> new file mode 100644 > >>> index 000000000000..0784cc436dd3 > >>> --- /dev/null > >>> +++ b/drivers/pci/tsm.c > > [..] > >>> +static ssize_t connect_store(struct device *dev, struct device_attribute *attr, > >>> + const char *buf, size_t len) > >>> +{ > >>> + struct pci_dev *pdev = to_pci_dev(dev); > >>> + const struct pci_tsm_ops *ops; > >>> + struct tsm_dev *tsm_dev; > >>> + int rc, id; > >>> + > >>> + rc = sscanf(buf, "tsm%d\n", &id); > >> > >> Why is id needed here? Are there going to be multiple DSMs per a PCI > >> device? > > > > The implementation allows for multiple TSMs per platform [1], and you > > acknowledged this earlier [2] (at least the "no globals" bit). > > > > [1]: http://lore.kernel.org/683f9e141f1b1_1626e1009@dwillia2-xfh.jf.intel.com.notmuch > > Right but I'd think that devices (or, more precisely, PCIe slots) are > statically assigned to TSMs. A bit hard to imagine 2 TSMs in a system > and ability to connect some PCI device to either of those. It is not > impossible but not exactly "painfully simple". The simple case is the typical case, single TSM. If a platform invents multiple TSMs then it needs to define a protocol for userspace to figure out the rules, like "match TSMs to devices by PCIe Segment Number", or something similar. "Painfully simple" also means not pre-constraining the ABI just to mitigate future userspace complexity. In the end the kernel is allowed to not need / have an opinion about this detail. > > [2]: http://lore.kernel.org/b281b714-5097-4b3a-9809-7bdcb9e004dc@amd.com > > > > One of the nice properties of multiple tsm_devs is the ability to unit test > > host and guest side TSM flows in the same kernel image. > > > >> I am missing the point of tsm_dev. It does not have sysfs nodes (the > >> pci_dev parent does) > > > > The resource accounting symlinks for each each IDE stream point to the > > tsm_dev, see tsm_ide_stream_register(). > > > >> tsm_register() takes attribute_group but what would posibbly go there? > > > > Any vendor specific implementation of commonly named attributes. > > Contrast that with vendor specific attributes with vendor specific names > > that the vendor specific device publishes. > > > >> certificates/meas/report blobs? > > > > Perhaps. > > Hm. Those groups are per a TSM so no device's certificates/meas/report blobs there, right? True, I was thinking of a per-device TSM driver supplied attributes similar to 'struct device_driver'::dev_groups. Both that, and this @groups parameter to tsm_register() can wait until a solid use case arrives.
On Thu, Jul 17, 2025 at 11:33:52AM -0700, Dan Williams wrote: > The PCIe 6.1 specification, section 11, introduces the Trusted Execution > Environment (TEE) Device Interface Security Protocol (TDISP). This > protocol definition builds upon Component Measurement and Authentication > (CMA), and link Integrity and Data Encryption (IDE). It adds support for > assigning devices (PCI physical or virtual function) to a confidential > VM such that the assigned device is enabled to access guest private > memory protected by technologies like Intel TDX, AMD SEV-SNP, RISCV > COVE, or ARM CCA. Previous patches reference PCIe r6.2. Personally I would change them all the citations to r7.0, since that's out now and (I assume) includes everything. I guess you said "introduced in r6.1," which is not the same as "introduced in r7.0," but I'm not sure how relevant it is to know that very first revision. > The operations that can be executed against a PCI device are split into > 2 mutually exclusive operation sets, "Link" and "Security" (struct s/2/two/ Old skool, but you obviously pay attention to details like that :) > +++ b/Documentation/ABI/testing/sysfs-bus-pci > +What: /sys/bus/pci/devices/.../tsm/ > +Contact: linux-coco@lists.linux.dev > +Description: > + This directory only appears if a physical device function > + supports authentication (PCIe CMA-SPDM), interface security > + (PCIe TDISP), and is accepted for secure operation by the > + platform TSM driver. This attribute directory appears > + dynamically after the platform TSM driver loads. So, only after > + the /sys/class/tsm/tsm0 device arrives can tools assume that > + devices without a tsm/ attribute directory will never have one, > + before that, the security capabilities of the device relative to > + the platform TSM are unknown. See > + Documentation/ABI/testing/sysfs-class-tsm. s/never have one,/never have one;/ > +++ b/drivers/pci/tsm.c > +#define dev_fmt(fmt) "TSM: " fmt Include "PCI" for context? > + * Provide a read/write lock against the init / exit of pdev tsm > + * capabilities and arrival/departure of a tsm instance s/tsm/TSM/ in comments. > +static void pci_tsm_walk_fns(struct pci_dev *pdev, > + int (*cb)(struct pci_dev *pdev, void *data), > + void *data) > +{ > + struct pci_dev *fn; > + int i; > + > + /* walk virtual functions */ > + for (i = 0; i < pci_num_vf(pdev); i++) { > + fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), > + pci_iov_virtfn_bus(pdev, i), > + pci_iov_virtfn_devfn(pdev, i)); > + if (call_cb_put(fn, data, cb)) > + return; > + } > + > + /* walk subordinate physical functions */ > + for (i = 1; i < 8; i++) { > + fn = pci_get_slot(pdev->bus, > + PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); > + if (call_cb_put(fn, data, cb)) > + return; > + } > + > + /* walk downstream devices */ > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM) > + return; > + > + if (!is_dsm(pdev)) > + return; > + > + pci_walk_bus(pdev->subordinate, cb, data); What's the difference between all this and just pci_walk_bus() of pdev->subordinate? Are VFs not included in that walk? Maybe a hint here would be useful. > +static int pci_tsm_connect(struct pci_dev *pdev, struct tsm_dev *tsm_dev) > +{ > + int rc; > + struct pci_tsm_pf0 *tsm_pf0; > + const struct pci_tsm_ops *ops = tsm_pci_ops(tsm_dev); > + struct pci_tsm *pci_tsm __free(tsm_remove) = ops->probe(pdev); > + > + if (!pci_tsm) > + return -ENXIO; > + > + pdev->tsm = pci_tsm; > + tsm_pf0 = to_pci_tsm_pf0(pdev->tsm); > + > + ACQUIRE(mutex_intr, lock)(&tsm_pf0->lock); > + if ((rc = ACQUIRE_ERR(mutex_intr, &lock))) > + return rc; > + > + rc = ops->connect(pdev); > + if (rc) > + return rc; > + > + pdev->tsm = no_free_ptr(pci_tsm); > + > + /* > + * Now that the DSM is established, probe() all the potential > + * dependent functions. Failure to probe a function is not fatal > + * to connect(), it just disables subsequent security operations > + * for that function. > + */ > + pci_tsm_probe_fns(pdev); Makes me wonder what happens if a device is hot-added in the hierarchy. I guess nothing. Is that what we want? What would be the flow if we *did* want to do something? I guess disconnect and connect again? > + * Find the PCI Device instance that serves as the Device Security > + * Manger (DSM) for @pdev. Note that no additional reference is held for s/Manger/Manager/ > + * For cases where a switch may be hosting TDISP services on > + * behalf of downstream devices, check the first usptream port > + * relative to this endpoint. s/usptream/upstream/ > +++ b/include/linux/pci-tsm.h > + * struct pci_tsm_ops - manage confidential links and security state > + * @link_ops: Coordinate PCIe SPDM and IDE establishment via a platform TSM. > + * Provide a secure session transport for TDISP state management > + * (typically bare metal physical function operations). > + * @sec_ops: Lock, unlock, and interrogate the security state of the > + * function via the platform TSM (typically virtual function > + * operations). > + * @owner: Back reference to the TSM device that owns this instance. > + * > + * This operations are mutually exclusive either a tsm_dev instance > + * manages phyiscal link properties or it manages function security > + * states like TDISP lock/unlock. s/phyiscal/physical/ > +struct pci_tsm_ops { > + /* > + * struct pci_tsm_link_ops - Manage physical link and the TSM/DSM session > + * @probe: probe device for tsm link operation readiness, setup > + * DSM context s/tsm link/TSM link/ > + * struct pci_tsm_security_ops - Manage the security state of the function > + * @sec_probe: probe device for tsm security operation > + * readiness, setup security context s/for tsm/for TSM/ > + * struct pci_tsm - Core TSM context for a given PCIe endpoint > + * @pdev: Back ref to device function, distinguishes type of pci_tsm context > + * @dsm: PCI Device Security Manager for link operations on @pdev. Extra period at end, unlike others. > + * @ops: Link Confidentiality or Device Function Security operations > +static inline bool is_pci_tsm_pf0(struct pci_dev *pdev) > +{ > + if (!pci_is_pcie(pdev)) > + return false; > + > + if (pdev->is_virtfn) > + return false; > + > + /* > + * Allow for a Device Security Manager (DSM) associated with function0 > + * of an Endpoint to coordinate TDISP requests for other functions > + * (physical or virtual) of the device, or allow for an Upstream Port > + * DSM to accept TDISP requests for switch Downstream Endpoints. What exactly is a "switch Downstream Endpoint"? Do you mean a "Switch Downstream Port"? Or an Endpoint that is downstream of a Switch? Bjorn
Bjorn Helgaas wrote: > On Thu, Jul 17, 2025 at 11:33:52AM -0700, Dan Williams wrote: > > The PCIe 6.1 specification, section 11, introduces the Trusted Execution > > Environment (TEE) Device Interface Security Protocol (TDISP). This > > protocol definition builds upon Component Measurement and Authentication > > (CMA), and link Integrity and Data Encryption (IDE). It adds support for > > assigning devices (PCI physical or virtual function) to a confidential > > VM such that the assigned device is enabled to access guest private > > memory protected by technologies like Intel TDX, AMD SEV-SNP, RISCV > > COVE, or ARM CCA. > > Previous patches reference PCIe r6.2. Personally I would change them > all the citations to r7.0, since that's out now and (I assume) > includes everything. I guess you said "introduced in r6.1," which is > not the same as "introduced in r7.0," but I'm not sure how relevant it > is to know that very first revision. Ack, looks like the section numbers have not changed which makes it easier. > > The operations that can be executed against a PCI device are split into > > 2 mutually exclusive operation sets, "Link" and "Security" (struct > > s/2/two/ Old skool, but you obviously pay attention to details like > that :) I only recently gave up the fight against 2^H^H two spaces after a period, fixed. > > +++ b/Documentation/ABI/testing/sysfs-bus-pci > > +What: /sys/bus/pci/devices/.../tsm/ > > +Contact: linux-coco@lists.linux.dev > > +Description: > > + This directory only appears if a physical device function > > + supports authentication (PCIe CMA-SPDM), interface security > > + (PCIe TDISP), and is accepted for secure operation by the > > + platform TSM driver. This attribute directory appears > > + dynamically after the platform TSM driver loads. So, only after > > + the /sys/class/tsm/tsm0 device arrives can tools assume that > > + devices without a tsm/ attribute directory will never have one, > > + before that, the security capabilities of the device relative to > > + the platform TSM are unknown. See > > + Documentation/ABI/testing/sysfs-class-tsm. > > s/never have one,/never have one;/ yes. > > > +++ b/drivers/pci/tsm.c > > +#define dev_fmt(fmt) "TSM: " fmt > > Include "PCI" for context? Sure. > > > + * Provide a read/write lock against the init / exit of pdev tsm > > + * capabilities and arrival/departure of a tsm instance > > s/tsm/TSM/ in comments. Got it. > > +static void pci_tsm_walk_fns(struct pci_dev *pdev, > > + int (*cb)(struct pci_dev *pdev, void *data), > > + void *data) > > +{ > > + struct pci_dev *fn; > > + int i; > > + > > + /* walk virtual functions */ > > + for (i = 0; i < pci_num_vf(pdev); i++) { > > + fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), > > + pci_iov_virtfn_bus(pdev, i), > > + pci_iov_virtfn_devfn(pdev, i)); > > + if (call_cb_put(fn, data, cb)) > > + return; > > + } > > + > > + /* walk subordinate physical functions */ > > + for (i = 1; i < 8; i++) { > > + fn = pci_get_slot(pdev->bus, > > + PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); > > + if (call_cb_put(fn, data, cb)) > > + return; > > + } > > + > > + /* walk downstream devices */ > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM) > > + return; > > + > > + if (!is_dsm(pdev)) > > + return; > > + > > + pci_walk_bus(pdev->subordinate, cb, data); > > What's the difference between all this and just pci_walk_bus() of > pdev->subordinate? Are VFs not included in that walk? Maybe a > hint here would be useful. Right, ->subordinate is only managed for actual bridge devices. PFs do use one or more 'struct pci_bus *' instances for their VFs, but do not set ->subordinate I assume becuase of that "or more" case. See the NULL @bridge parameter to pci_add_new_bus() in virtfn_add_bus(). With that there is no clean way I see to walk all the virtfn buses of a PF, so fall back to a pci_get_domain_bus_and_slot() walk. I will add a note to this effect as I had to do some digging here to be sure. > > +static int pci_tsm_connect(struct pci_dev *pdev, struct tsm_dev *tsm_dev) > > +{ > > + int rc; > > + struct pci_tsm_pf0 *tsm_pf0; > > + const struct pci_tsm_ops *ops = tsm_pci_ops(tsm_dev); > > + struct pci_tsm *pci_tsm __free(tsm_remove) = ops->probe(pdev); > > + > > + if (!pci_tsm) > > + return -ENXIO; > > + > > + pdev->tsm = pci_tsm; > > + tsm_pf0 = to_pci_tsm_pf0(pdev->tsm); > > + > > + ACQUIRE(mutex_intr, lock)(&tsm_pf0->lock); > > + if ((rc = ACQUIRE_ERR(mutex_intr, &lock))) > > + return rc; > > + > > + rc = ops->connect(pdev); > > + if (rc) > > + return rc; > > + > > + pdev->tsm = no_free_ptr(pci_tsm); > > + > > + /* > > + * Now that the DSM is established, probe() all the potential > > + * dependent functions. Failure to probe a function is not fatal > > + * to connect(), it just disables subsequent security operations > > + * for that function. > > + */ > > + pci_tsm_probe_fns(pdev); > > Makes me wonder what happens if a device is hot-added in the > hierarchy. I guess nothing. Is that what we want? What would be the > flow if we *did* want to do something? I guess disconnect and connect > again? If a subfunction is found after the 'connect' event, like late enable of SR-IOV capability, then the resulting pci_device_add() for that should lookup and perform the ->probe() at that time. > > + * Find the PCI Device instance that serves as the Device Security > > + * Manger (DSM) for @pdev. Note that no additional reference is held for > > s/Manger/Manager/ ...could have swore I ran checkpatch, but indeed it flags this. Fixed, along with the others. > > +struct pci_tsm_ops { > > + /* > > + * struct pci_tsm_link_ops - Manage physical link and the TSM/DSM session > > + * @probe: probe device for tsm link operation readiness, setup > > + * DSM context > > s/tsm link/TSM link/ > > > + * struct pci_tsm_security_ops - Manage the security state of the function > > + * @sec_probe: probe device for tsm security operation > > + * readiness, setup security context > > s/for tsm/for TSM/ > > > + * struct pci_tsm - Core TSM context for a given PCIe endpoint > > + * @pdev: Back ref to device function, distinguishes type of pci_tsm context > > + * @dsm: PCI Device Security Manager for link operations on @pdev. > > Extra period at end, unlike others. Fixed the above. > > > + * @ops: Link Confidentiality or Device Function Security operations > > > +static inline bool is_pci_tsm_pf0(struct pci_dev *pdev) > > +{ > > + if (!pci_is_pcie(pdev)) > > + return false; > > + > > + if (pdev->is_virtfn) > > + return false; > > + > > + /* > > + * Allow for a Device Security Manager (DSM) associated with function0 > > + * of an Endpoint to coordinate TDISP requests for other functions > > + * (physical or virtual) of the device, or allow for an Upstream Port > > + * DSM to accept TDISP requests for switch Downstream Endpoints. > > What exactly is a "switch Downstream Endpoint"? Do you mean a "Switch > Downstream Port"? Or an Endpoint that is downstream of a Switch? Endpoint that is downstream of a Switch. I will clarify the comment.
On Thu, 17 Jul 2025 11:33:52 -0700 Dan Williams <dan.j.williams@intel.com> wrote: > The PCIe 6.1 specification, section 11, introduces the Trusted Execution > Environment (TEE) Device Interface Security Protocol (TDISP). This > protocol definition builds upon Component Measurement and Authentication > (CMA), and link Integrity and Data Encryption (IDE). It adds support for > assigning devices (PCI physical or virtual function) to a confidential > VM such that the assigned device is enabled to access guest private > memory protected by technologies like Intel TDX, AMD SEV-SNP, RISCV > COVE, or ARM CCA. > > The "TSM" (TEE Security Manager) is a concept in the TDISP specification > of an agent that mediates between a "DSM" (Device Security Manager) and > system software in both a VMM and a confidential VM. A VMM uses TSM ABIs > to setup link security and assign devices. A confidential VM uses TSM > ABIs to transition an assigned device into the TDISP "RUN" state and > validate its configuration. From a Linux perspective the TSM abstracts > many of the details of TDISP, IDE, and CMA. Some of those details leak > through at times, but for the most part TDISP is an internal > implementation detail of the TSM. > > CONFIG_PCI_TSM adds an "authenticated" attribute and "tsm/" subdirectory > to pci-sysfs. Consider that the TSM driver may itself be a PCI driver. > Userspace can watch for the arrival of a "TSM" device, > /sys/class/tsm/tsm0/uevent KOBJ_CHANGE, to know when the PCI core has > initialized TSM services. > > The operations that can be executed against a PCI device are split into > 2 mutually exclusive operation sets, "Link" and "Security" (struct > pci_tsm_{link,security}_ops). The "Link" operations manage physical link > security properties and communication with the device's Device Security > Manager firmware. These are the host side operations in TDISP. The > "Security" operations coordinate the security state of the assigned > virtual device (TDI). These are the guest side operations in TDISP. Only > link management operations are defined at this stage and placeholders > provided for the security operations. > > The locking allows for multiple devices to be executing commands > simultaneously, one outstanding command per-device and an rwsem > synchronizes the implementation relative to TSM > registration/unregistration events. > > Thanks to Wu Hao for his work on an early draft of this support. > > Cc: Lukas Wunner <lukas@wunner.de> > Cc: Samuel Ortiz <sameo@rivosinc.com> > Cc: Alexey Kardashevskiy <aik@amd.com> > Acked-by: Bjorn Helgaas <bhelgaas@google.com> > Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com> > Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> > Signed-off-by: Dan Williams <dan.j.williams@intel.com> Various things inline. > diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c > new file mode 100644 > index 000000000000..0784cc436dd3 > --- /dev/null > +++ b/drivers/pci/tsm.c > @@ -0,0 +1,554 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * TEE Security Manager for the TEE Device Interface Security Protocol > + * (TDISP, PCIe r6.1 sec 11) > + * > + * Copyright(c) 2024 Intel Corporation. All rights reserved. > + */ > +static void tsm_remove(struct pci_tsm *tsm) > +{ > + struct pci_dev *pdev; > + > + if (!tsm) You protect against this in the DEFINE_FREE() so probably safe to assume it is always set if we get here. > + return; > + > + pdev = tsm->pdev; > + tsm->ops->remove(tsm); > + pdev->tsm = NULL; > +} > +DEFINE_FREE(tsm_remove, struct pci_tsm *, if (_T) tsm_remove(_T)) > + > +static int call_cb_put(struct pci_dev *pdev, void *data, Is this combination worth while? I don't like the 'and' aspect of it and it only saves a few lines... vs if (pdev) { rc = cb(pdev, data); pci_dev_put(pdev); if (rc) return; } > + int (*cb)(struct pci_dev *pdev, void *data)) > +{ > + int rc; > + > + if (!pdev) > + return 0; > + rc = cb(pdev, data); > + pci_dev_put(pdev); > + return rc; > +} > + > +static void pci_tsm_walk_fns(struct pci_dev *pdev, > + int (*cb)(struct pci_dev *pdev, void *data), > + void *data) > +{ > + struct pci_dev *fn; > + int i; > + > + /* walk virtual functions */ > + for (i = 0; i < pci_num_vf(pdev); i++) { > + fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), > + pci_iov_virtfn_bus(pdev, i), > + pci_iov_virtfn_devfn(pdev, i)); > + if (call_cb_put(fn, data, cb)) > + return; > + } > + > + /* walk subordinate physical functions */ > + for (i = 1; i < 8; i++) { > + fn = pci_get_slot(pdev->bus, > + PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); > + if (call_cb_put(fn, data, cb)) > + return; > + } > + > + /* walk downstream devices */ > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM) spaces rather than tabs... > + return; > + > + if (!is_dsm(pdev)) > + return; > + > + pci_walk_bus(pdev->subordinate, cb, data); > +} > + > +static void pci_tsm_walk_fns_reverse(struct pci_dev *pdev, > + int (*cb)(struct pci_dev *pdev, > + void *data), > + void *data) > +{ > + struct pci_dev *fn; > + int i; > + > + /* reverse walk virtual functions */ > + for (i = pci_num_vf(pdev) - 1; i >= 0; i--) { > + fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), > + pci_iov_virtfn_bus(pdev, i), > + pci_iov_virtfn_devfn(pdev, i)); > + if (call_cb_put(fn, data, cb)) > + return; > + } > + While it probably doesn't matter can we make this strict reverse by doing the physical functions first? I prefer not to think about whether it matters. > + /* reverse walk subordinate physical functions */ > + for (i = 7; i >= 1; i--) { > + fn = pci_get_slot(pdev->bus, > + PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); > + if (call_cb_put(fn, data, cb)) > + return; > + } > + > + /* reverse walk downstream devices */ > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM) > + return; > + > + if (!is_dsm(pdev)) > + return; > + > + pci_walk_bus_reverse(pdev->subordinate, cb, data); Likewise, can we do this before the rest. > +} > +/* > + * Find the PCI Device instance that serves as the Device Security > + * Manger (DSM) for @pdev. Note that no additional reference is held for > + * the resulting device because @pdev always has a longer registered > + * lifetime than its DSM by virtue of being a child of or identical to > + * its DSM. > + */ > +static struct pci_dev *find_dsm_dev(struct pci_dev *pdev) > +{ > + struct pci_dev *uport_pf0; > + > + if (is_pci_tsm_pf0(pdev)) > + return pdev; > + > + struct pci_dev *pf0 __free(pci_dev_put) = pf0_dev_get(pdev); > + if (!pf0) > + return NULL; > + > + if (is_dsm(pf0)) > + return pf0; Unusual for a find command to not hold the device reference on the device it returns. Maybe just call that out in the comment. > + > + /* > + * For cases where a switch may be hosting TDISP services on > + * behalf of downstream devices, check the first usptream port > + * relative to this endpoint. > + */ Odd alignment. Space rather than tab. > + if (!pdev->dev.parent || !pdev->dev.parent->parent) > + return NULL; > + > + uport_pf0 = to_pci_dev(pdev->dev.parent->parent); > + if (is_dsm(uport_pf0)) > + return uport_pf0; > + return NULL; > +} > +/** > + * pci_tsm_pf0_constructor() - common 'struct pci_tsm_pf0' initialization > + * @pdev: Physical Function 0 PCI device (as indicated by is_pci_tsm_pf0()) > + * @tsm: context to initialize ops missing. Run kernel-doc or do W=1 build to catch these. > + */ > +int pci_tsm_pf0_constructor(struct pci_dev *pdev, struct pci_tsm_pf0 *tsm, > + const struct pci_tsm_ops *ops) > +{ > + struct tsm_dev *tsm_dev = ops->owner; > + > + mutex_init(&tsm->lock); Might as well do devm_mutex_init() > + tsm->doe_mb = pci_find_doe_mailbox(pdev, PCI_VENDOR_ID_PCI_SIG, > + PCI_DOE_PROTO_CMA); > + if (!tsm->doe_mb) { > + pci_warn(pdev, "TSM init failure, no CMA mailbox\n"); > + return -ENODEV; > + } > + > + if (tsm_pci_group(tsm_dev)) > + sysfs_merge_group(&pdev->dev.kobj, tsm_pci_group(tsm_dev)); > + > + return pci_tsm_constructor(pdev, &tsm->tsm, ops); > +} > +EXPORT_SYMBOL_GPL(pci_tsm_pf0_constructor); > diff --git a/drivers/virt/coco/tsm-core.c b/drivers/virt/coco/tsm-core.c > index 1f53b9049e2d..093824dc68dd 100644 > --- a/drivers/virt/coco/tsm-core.c > +++ b/drivers/virt/coco/tsm-core.c > +/* > + * Caller responsible for ensuring it does not race tsm_dev > + * unregistration. Wrap is a bit early. unregistration fits on the line above. > + */ > +struct tsm_dev *find_tsm_dev(int id) > +{ > + guard(rcu)(); > + return idr_find(&tsm_idr, id); > +} > @@ -44,6 +76,29 @@ static struct tsm_dev *alloc_tsm_dev(struct device *parent, > return no_free_ptr(tsm_dev); > } > > +static struct tsm_dev *tsm_register_pci_or_reset(struct tsm_dev *tsm_dev, > + struct pci_tsm_ops *pci_ops) > +{ > + int rc; > + > + if (!pci_ops) > + return tsm_dev; > + > + pci_ops->owner = tsm_dev; > + tsm_dev->pci_ops = pci_ops; > + rc = pci_tsm_register(tsm_dev); > + if (rc) { > + dev_err(tsm_dev->dev.parent, > + "PCI/TSM registration failure: %d\n", rc); > + device_unregister(&tsm_dev->dev); As below. I'm fairly sure this device_unregister is nothing to do with what this function is doing, so having it buried in here is less easy to follow than pushing it up a layer. > + return ERR_PTR(rc); > + } > + > + /* Notify TSM userspace that PCI/TSM operations are now possible */ > + kobject_uevent(&tsm_dev->dev.kobj, KOBJ_CHANGE); > + return tsm_dev; > +} > + > static void put_tsm_dev(struct tsm_dev *tsm_dev) > { > if (!IS_ERR_OR_NULL(tsm_dev)) > @@ -54,7 +109,8 @@ DEFINE_FREE(put_tsm_dev, struct tsm_dev *, > if (!IS_ERR_OR_NULL(_T)) put_tsm_dev(_T)) > > struct tsm_dev *tsm_register(struct device *parent, > - const struct attribute_group **groups) > + const struct attribute_group **groups, > + struct pci_tsm_ops *pci_ops) > { > struct tsm_dev *tsm_dev __free(put_tsm_dev) = > alloc_tsm_dev(parent, groups); > @@ -73,12 +129,13 @@ struct tsm_dev *tsm_register(struct device *parent, > if (rc) > return ERR_PTR(rc); > > - return no_free_ptr(tsm_dev); > + return tsm_register_pci_or_reset(no_free_ptr(tsm_dev), pci_ops); Having a function call that either succeeds or cleans up something it never did on error is odd. The or_reset hints at that oddity but to me is not enough. If you want to use __free magic in here maybe hand off the tsm_dev on succesful device registration. struct tsm_dev *registered_tsm_dev __free(unregister_tsm_dev) = no_free_ptr(tsm_dev); rc = tsm_register_pci(registered_tsm_dev, pci_ops); //change return type as no need for another tsm_dev if (rc) return ERR_PTR(rc); return no_free_ptr(registered_tsm_dev); > } > EXPORT_SYMBOL_GPL(tsm_register); > > void tsm_unregister(struct tsm_dev *tsm_dev) > { > + pci_tsm_unregister(tsm_dev); > device_unregister(&tsm_dev->dev); > } > EXPORT_SYMBOL_GPL(tsm_unregister); > diff --git a/include/linux/pci-tsm.h b/include/linux/pci-tsm.h > new file mode 100644 > index 000000000000..f370c022fac4 > --- /dev/null > +++ b/include/linux/pci-tsm.h > @@ -0,0 +1,158 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > +#ifndef __PCI_TSM_H > +#define __PCI_TSM_H > +#include <linux/mutex.h> > +#include <linux/pci.h> > + > +struct pci_tsm; > + > +/* /** Or was this intentional? Feels like it should be kernel-doc. > + * struct pci_tsm_ops - manage confidential links and security state > + * @link_ops: Coordinate PCIe SPDM and IDE establishment via a platform TSM. > + * Provide a secure session transport for TDISP state management > + * (typically bare metal physical function operations). > + * @sec_ops: Lock, unlock, and interrogate the security state of the > + * function via the platform TSM (typically virtual function > + * operations). > + * @owner: Back reference to the TSM device that owns this instance. > + * > + * This operations are mutually exclusive either a tsm_dev instance > + * manages phyiscal link properties or it manages function security > + * states like TDISP lock/unlock. > + */ > +struct pci_tsm_ops { > + /* Likewise though I'm not sure if kernel-doc deals with struct groups. > + * struct pci_tsm_link_ops - Manage physical link and the TSM/DSM session > + * @probe: probe device for tsm link operation readiness, setup > + * DSM context > + * @remove: destroy DSM context > + * @connect: establish / validate a secure connection (e.g. IDE) > + * with the device > + * @disconnect: teardown the secure link > + * > + * @probe and @remove run in pci_tsm_rwsem held for write context. All > + * other ops run under the @pdev->tsm->lock mutex and pci_tsm_rwsem held > + * for read. > + */ > + struct_group_tagged(pci_tsm_link_ops, link_ops, > + struct pci_tsm *(*probe)(struct pci_dev *pdev); > + void (*remove)(struct pci_tsm *tsm); > + int (*connect)(struct pci_dev *pdev); > + void (*disconnect)(struct pci_dev *pdev); > + ); > + > + /* > + * struct pci_tsm_security_ops - Manage the security state of the function > + * @sec_probe: probe device for tsm security operation > + * readiness, setup security context > + * @sec_remove: destroy security context > + * > + * @sec_probe and @sec_remove run in pci_tsm_rwsem held for > + * write context. All other ops run under the @pdev->tsm->lock > + * mutex and pci_tsm_rwsem held for read. > + */ > + struct_group_tagged(pci_tsm_security_ops, ops, > + struct pci_tsm *(*sec_probe)(struct pci_dev *pdev); > + void (*sec_remove)(struct pci_tsm *tsm); > + ); > + struct tsm_dev *owner; > +}; > + > +/** > + * struct pci_tsm_pf0 - Physical Function 0 TDISP link context > + * @tsm: generic core "tsm" context > + * @lock: protect @state vs pci_tsm_ops invocation What is @state referring to? > + * @doe_mb: PCIe Data Object Exchange mailbox > + */ > +struct pci_tsm_pf0 { > + struct pci_tsm tsm; > + struct mutex lock; > + struct pci_doe_mb *doe_mb; > +};
Jonathan Cameron wrote: > On Thu, 17 Jul 2025 11:33:52 -0700 > Dan Williams <dan.j.williams@intel.com> wrote: > > > The PCIe 6.1 specification, section 11, introduces the Trusted Execution > > Environment (TEE) Device Interface Security Protocol (TDISP). This > > protocol definition builds upon Component Measurement and Authentication > > (CMA), and link Integrity and Data Encryption (IDE). It adds support for > > assigning devices (PCI physical or virtual function) to a confidential > > VM such that the assigned device is enabled to access guest private > > memory protected by technologies like Intel TDX, AMD SEV-SNP, RISCV > > COVE, or ARM CCA. > > > > The "TSM" (TEE Security Manager) is a concept in the TDISP specification > > of an agent that mediates between a "DSM" (Device Security Manager) and > > system software in both a VMM and a confidential VM. A VMM uses TSM ABIs > > to setup link security and assign devices. A confidential VM uses TSM > > ABIs to transition an assigned device into the TDISP "RUN" state and > > validate its configuration. From a Linux perspective the TSM abstracts > > many of the details of TDISP, IDE, and CMA. Some of those details leak > > through at times, but for the most part TDISP is an internal > > implementation detail of the TSM. > > > > CONFIG_PCI_TSM adds an "authenticated" attribute and "tsm/" subdirectory > > to pci-sysfs. Consider that the TSM driver may itself be a PCI driver. > > Userspace can watch for the arrival of a "TSM" device, > > /sys/class/tsm/tsm0/uevent KOBJ_CHANGE, to know when the PCI core has > > initialized TSM services. > > > > The operations that can be executed against a PCI device are split into > > 2 mutually exclusive operation sets, "Link" and "Security" (struct > > pci_tsm_{link,security}_ops). The "Link" operations manage physical link > > security properties and communication with the device's Device Security > > Manager firmware. These are the host side operations in TDISP. The > > "Security" operations coordinate the security state of the assigned > > virtual device (TDI). These are the guest side operations in TDISP. Only > > link management operations are defined at this stage and placeholders > > provided for the security operations. > > > > The locking allows for multiple devices to be executing commands > > simultaneously, one outstanding command per-device and an rwsem > > synchronizes the implementation relative to TSM > > registration/unregistration events. > > > > Thanks to Wu Hao for his work on an early draft of this support. > > > > Cc: Lukas Wunner <lukas@wunner.de> > > Cc: Samuel Ortiz <sameo@rivosinc.com> > > Cc: Alexey Kardashevskiy <aik@amd.com> > > Acked-by: Bjorn Helgaas <bhelgaas@google.com> > > Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com> > > Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> > > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > Various things inline. > > > diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c > > new file mode 100644 > > index 000000000000..0784cc436dd3 > > --- /dev/null > > +++ b/drivers/pci/tsm.c > > @@ -0,0 +1,554 @@ > > +// SPDX-License-Identifier: GPL-2.0 > > +/* > > + * TEE Security Manager for the TEE Device Interface Security Protocol > > + * (TDISP, PCIe r6.1 sec 11) > > + * > > + * Copyright(c) 2024 Intel Corporation. All rights reserved. > > + */ > > > +static void tsm_remove(struct pci_tsm *tsm) > > +{ > > + struct pci_dev *pdev; > > + > > + if (!tsm) > > You protect against this in the DEFINE_FREE() so probably safe > to assume it is always set if we get here. It is safe, but I would rather not require reading other code to understand the expectation that some callers may unconditionally call this routine. > > + return; > > + > > + pdev = tsm->pdev; > > + tsm->ops->remove(tsm); > > + pdev->tsm = NULL; > > +} > > +DEFINE_FREE(tsm_remove, struct pci_tsm *, if (_T) tsm_remove(_T)) > > + > > +static int call_cb_put(struct pci_dev *pdev, void *data, > > Is this combination worth while? I don't like the 'and' aspect of it > and it only saves a few lines... > > vs > if (pdev) { > rc = cb(pdev, data); > pci_dev_put(pdev); > if (rc) > return; > } I think it is worth it, but an even better option is to just let scope-based cleanup handle it by moving the variable inside the loop declaration. > > > + int (*cb)(struct pci_dev *pdev, void *data)) > > +{ > > + int rc; > > + > > + if (!pdev) > > + return 0; > > + rc = cb(pdev, data); > > + pci_dev_put(pdev); > > + return rc; > > +} > > + > > +static void pci_tsm_walk_fns(struct pci_dev *pdev, > > + int (*cb)(struct pci_dev *pdev, void *data), > > + void *data) > > +{ > > + struct pci_dev *fn; > > + int i; > > + > > + /* walk virtual functions */ > > + for (i = 0; i < pci_num_vf(pdev); i++) { > > + fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), > > + pci_iov_virtfn_bus(pdev, i), > > + pci_iov_virtfn_devfn(pdev, i)); > > + if (call_cb_put(fn, data, cb)) > > + return; > > + } > > + > > + /* walk subordinate physical functions */ > > + for (i = 1; i < 8; i++) { > > + fn = pci_get_slot(pdev->bus, > > + PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); > > + if (call_cb_put(fn, data, cb)) > > + return; > > + } > > + > > + /* walk downstream devices */ > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM) > > spaces rather than tabs... Fixed. > > + return; > > + > > + if (!is_dsm(pdev)) > > + return; > > + > > + pci_walk_bus(pdev->subordinate, cb, data); > > +} > > + > > +static void pci_tsm_walk_fns_reverse(struct pci_dev *pdev, > > + int (*cb)(struct pci_dev *pdev, > > + void *data), > > + void *data) > > +{ > > + struct pci_dev *fn; > > + int i; > > + > > + /* reverse walk virtual functions */ > > + for (i = pci_num_vf(pdev) - 1; i >= 0; i--) { > > + fn = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus), > > + pci_iov_virtfn_bus(pdev, i), > > + pci_iov_virtfn_devfn(pdev, i)); > > + if (call_cb_put(fn, data, cb)) > > + return; > > + } > > + > While it probably doesn't matter can we make this strict reverse by doing > the physical functions first? I prefer not to think about whether it matters. Actually, me too, and in that case I also want the downstream devices to be done in strict reverse order. So, I do not have a use in mind where this matters, but changed the order to physical->virtual->downstream and downstream->virtual->physical for the reverse case. Oh, this is missing the potential case of SR-IOV on multiple physical functions of the device, so added that too. > > > + /* reverse walk subordinate physical functions */ > > + for (i = 7; i >= 1; i--) { > > + fn = pci_get_slot(pdev->bus, > > + PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); > > + if (call_cb_put(fn, data, cb)) > > + return; > > + } > > + > > + /* reverse walk downstream devices */ > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_UPSTREAM) > > + return; > > + > > + if (!is_dsm(pdev)) > > + return; > > + > > + pci_walk_bus_reverse(pdev->subordinate, cb, data); > > Likewise, can we do this before the rest. Fixed. > > > +} > > > +/* > > + * Find the PCI Device instance that serves as the Device Security > > + * Manger (DSM) for @pdev. Note that no additional reference is held for > > + * the resulting device because @pdev always has a longer registered > > + * lifetime than its DSM by virtue of being a child of or identical to > > + * its DSM. > > + */ > > +static struct pci_dev *find_dsm_dev(struct pci_dev *pdev) > > +{ > > + struct pci_dev *uport_pf0; > > + > > + if (is_pci_tsm_pf0(pdev)) > > + return pdev; > > + > > + struct pci_dev *pf0 __free(pci_dev_put) = pf0_dev_get(pdev); > > + if (!pf0) > > + return NULL; > > + > > + if (is_dsm(pf0)) > > + return pf0; > > > Unusual for a find command to not hold the device reference on the device > it returns. Maybe just call that out in the comment. It is, "Note that no additional reference..." > > + > > + /* > > + * For cases where a switch may be hosting TDISP services on > > + * behalf of downstream devices, check the first usptream port > > + * relative to this endpoint. > > + */ > Odd alignment. Space rather than tab. Hmm, clang-format does not fixup tabs vs spaces in block comments, fixed. > > > > + if (!pdev->dev.parent || !pdev->dev.parent->parent) > > + return NULL; > > + > > + uport_pf0 = to_pci_dev(pdev->dev.parent->parent); > > + if (is_dsm(uport_pf0)) > > + return uport_pf0; > > + return NULL; > > +} > > > > +/** > > + * pci_tsm_pf0_constructor() - common 'struct pci_tsm_pf0' initialization > > + * @pdev: Physical Function 0 PCI device (as indicated by is_pci_tsm_pf0()) > > + * @tsm: context to initialize > > ops missing. Run kernel-doc or do W=1 build to catch these. TIL I do not need need to create a Documentation file to reference this file to get kdoc build warnings. Fixed. > > > + */ > > +int pci_tsm_pf0_constructor(struct pci_dev *pdev, struct pci_tsm_pf0 *tsm, > > + const struct pci_tsm_ops *ops) > > +{ > > + struct tsm_dev *tsm_dev = ops->owner; > > + > > + mutex_init(&tsm->lock); > Might as well do devm_mutex_init() Hmm, no, this is running out of the driver bind lifetime of @pdev. > > + tsm->doe_mb = pci_find_doe_mailbox(pdev, PCI_VENDOR_ID_PCI_SIG, > > + PCI_DOE_PROTO_CMA); > > + if (!tsm->doe_mb) { > > + pci_warn(pdev, "TSM init failure, no CMA mailbox\n"); > > + return -ENODEV; > > + } > > + > > + if (tsm_pci_group(tsm_dev)) > > + sysfs_merge_group(&pdev->dev.kobj, tsm_pci_group(tsm_dev)); > > + > > + return pci_tsm_constructor(pdev, &tsm->tsm, ops); > > +} > > +EXPORT_SYMBOL_GPL(pci_tsm_pf0_constructor); > > > diff --git a/drivers/virt/coco/tsm-core.c b/drivers/virt/coco/tsm-core.c > > index 1f53b9049e2d..093824dc68dd 100644 > > --- a/drivers/virt/coco/tsm-core.c > > +++ b/drivers/virt/coco/tsm-core.c > > > +/* > > + * Caller responsible for ensuring it does not race tsm_dev > > + * unregistration. > Wrap is a bit early. unregistration fits on the line above. True, fixed up editor settings to automate this. > > + */ > > +struct tsm_dev *find_tsm_dev(int id) > > +{ > > + guard(rcu)(); > > + return idr_find(&tsm_idr, id); > > +} > > > @@ -44,6 +76,29 @@ static struct tsm_dev *alloc_tsm_dev(struct device *parent, > > return no_free_ptr(tsm_dev); > > } > > > > +static struct tsm_dev *tsm_register_pci_or_reset(struct tsm_dev *tsm_dev, > > + struct pci_tsm_ops *pci_ops) > > +{ > > + int rc; > > + > > + if (!pci_ops) > > + return tsm_dev; > > + > > + pci_ops->owner = tsm_dev; > > + tsm_dev->pci_ops = pci_ops; > > + rc = pci_tsm_register(tsm_dev); > > + if (rc) { > > + dev_err(tsm_dev->dev.parent, > > + "PCI/TSM registration failure: %d\n", rc); > > + device_unregister(&tsm_dev->dev); > > As below. I'm fairly sure this device_unregister is nothing to do with > what this function is doing, so having it buried in here is less easy > to follow than pushing it up a layer. I prefer a short function with an early exit and no scope based unwind for this. > > + return ERR_PTR(rc); > > + } > > + > > + /* Notify TSM userspace that PCI/TSM operations are now possible */ > > + kobject_uevent(&tsm_dev->dev.kobj, KOBJ_CHANGE); > > + return tsm_dev; > > +} > > + > > static void put_tsm_dev(struct tsm_dev *tsm_dev) > > { > > if (!IS_ERR_OR_NULL(tsm_dev)) > > @@ -54,7 +109,8 @@ DEFINE_FREE(put_tsm_dev, struct tsm_dev *, > > if (!IS_ERR_OR_NULL(_T)) put_tsm_dev(_T)) > > > > struct tsm_dev *tsm_register(struct device *parent, > > - const struct attribute_group **groups) > > + const struct attribute_group **groups, > > + struct pci_tsm_ops *pci_ops) > > { > > struct tsm_dev *tsm_dev __free(put_tsm_dev) = > > alloc_tsm_dev(parent, groups); > > @@ -73,12 +129,13 @@ struct tsm_dev *tsm_register(struct device *parent, > > if (rc) > > return ERR_PTR(rc); > > > > - return no_free_ptr(tsm_dev); > > + return tsm_register_pci_or_reset(no_free_ptr(tsm_dev), pci_ops); > > Having a function call that either succeeds or cleans up something it > never did on error is odd. devm_add_action_or_reset() is the same pattern. Do the action, or otherwise take care of cleaning up. The action in this case is pci_tsm_register() and the reset is cleaning up the device_add(). > The or_reset hints at that oddity but to me is not enough. If you want > to use __free magic in here maybe hand off the tsm_dev on succesful > device registration. > > struct tsm_dev *registered_tsm_dev __free(unregister_tsm_dev) = > no_free_ptr(tsm_dev); That does not look like an improvement to me. The moment device_add() succeeds the cleanup shifts from put_device() to device_unregister(). I considered wrapping device_add(), but I think it is awkard to redeclare the same variable again vs being able to walk all instances of @tsm_dev in the function. [..] > > + * struct pci_tsm_ops - manage confidential links and security state > > + * @link_ops: Coordinate PCIe SPDM and IDE establishment via a platform TSM. > > + * Provide a secure session transport for TDISP state management > > + * (typically bare metal physical function operations). > > + * @sec_ops: Lock, unlock, and interrogate the security state of the > > + * function via the platform TSM (typically virtual function > > + * operations). > > + * @owner: Back reference to the TSM device that owns this instance. > > + * > > + * This operations are mutually exclusive either a tsm_dev instance > > + * manages phyiscal link properties or it manages function security > > + * states like TDISP lock/unlock. > > + */ > > +struct pci_tsm_ops { > > + /* > Likewise though I'm not sure if kernel-doc deals with struct groups. It does not. > > +/** > > + * struct pci_tsm_pf0 - Physical Function 0 TDISP link context > > + * @tsm: generic core "tsm" context > > + * @lock: protect @state vs pci_tsm_ops invocation > > What is @state referring to? @state was removed with v4 of the PCI/TSM series. The kernel-doc for 'struct pci_tsm_pf0' is now: /** * struct pci_tsm_pf0 - Physical Function 0 TDISP link context * @tsm: generic core "tsm" context * @lock: mutual exclustion for pci_tsm_ops invocation * @doe_mb: PCIe Data Object Exchange mailbox */ struct pci_tsm_pf0 { struct pci_tsm tsm; struct mutex lock; struct pci_doe_mb *doe_mb; };
> > > > > diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c > > > new file mode 100644 > > > index 000000000000..0784cc436dd3 > > > --- /dev/null > > > +++ b/drivers/pci/tsm.c > > > @@ -0,0 +1,554 @@ > > > +// SPDX-License-Identifier: GPL-2.0 > > > +/* > > > + * TEE Security Manager for the TEE Device Interface Security Protocol > > > + * (TDISP, PCIe r6.1 sec 11) > > > + * > > > + * Copyright(c) 2024 Intel Corporation. All rights reserved. > > > + */ > > > > > +static void tsm_remove(struct pci_tsm *tsm) > > > +{ > > > + struct pci_dev *pdev; > > > + > > > + if (!tsm) > > > > You protect against this in the DEFINE_FREE() so probably safe > > to assume it is always set if we get here. > > It is safe, but I would rather not require reading other code to > understand the expectation that some callers may unconditionally call > this routine. I think a function like remove being called on 'nothing' should pretty much always be a bug, but meh, up to you. > > > > + return; > > > + > > > + pdev = tsm->pdev; > > > + tsm->ops->remove(tsm); > > > + pdev->tsm = NULL; > > > +} > > > +DEFINE_FREE(tsm_remove, struct pci_tsm *, if (_T) tsm_remove(_T)) > > > + > > > +static int call_cb_put(struct pci_dev *pdev, void *data, > > > > Is this combination worth while? I don't like the 'and' aspect of it > > and it only saves a few lines... > > > > vs > > if (pdev) { > > rc = cb(pdev, data); > > pci_dev_put(pdev); > > if (rc) > > return; > > } > > I think it is worth it, but an even better option is to just let > scope-based cleanup handle it by moving the variable inside the loop > declaration. I don't follow that lat bit, but will look at next version to see what you mean! > > > > + return; > > > + > > > + if (!is_dsm(pdev)) > > > + return; > > > + > > > + pci_walk_bus(pdev->subordinate, cb, data); > > > +} > > > + > > > > > +/* > > > + * Find the PCI Device instance that serves as the Device Security > > > + * Manger (DSM) for @pdev. Note that no additional reference is held for > > > + * the resulting device because @pdev always has a longer registered > > > + * lifetime than its DSM by virtue of being a child of or identical to > > > + * its DSM. > > > + */ > > > +static struct pci_dev *find_dsm_dev(struct pci_dev *pdev) > > > +{ > > > + struct pci_dev *uport_pf0; > > > + > > > + if (is_pci_tsm_pf0(pdev)) > > > + return pdev; > > > + > > > + struct pci_dev *pf0 __free(pci_dev_put) = pf0_dev_get(pdev); > > > + if (!pf0) > > > + return NULL; > > > + > > > + if (is_dsm(pf0)) > > > + return pf0; > > > > > > Unusual for a find command to not hold the device reference on the device > > it returns. Maybe just call that out in the comment. > > It is, "Note that no additional reference..." Good point :) > > > diff --git a/drivers/virt/coco/tsm-core.c b/drivers/virt/coco/tsm-core.c > > > index 1f53b9049e2d..093824dc68dd 100644 > > > --- a/drivers/virt/coco/tsm-core.c > > > +++ b/drivers/virt/coco/tsm-core.c > > > +static struct tsm_dev *tsm_register_pci_or_reset(struct tsm_dev *tsm_dev, > > > + struct pci_tsm_ops *pci_ops) > > > +{ > > > + int rc; > > > + > > > + if (!pci_ops) > > > + return tsm_dev; > > > + > > > + pci_ops->owner = tsm_dev; > > > + tsm_dev->pci_ops = pci_ops; > > > + rc = pci_tsm_register(tsm_dev); > > > + if (rc) { > > > + dev_err(tsm_dev->dev.parent, > > > + "PCI/TSM registration failure: %d\n", rc); > > > + device_unregister(&tsm_dev->dev); > > > > As below. I'm fairly sure this device_unregister is nothing to do with > > what this function is doing, so having it buried in here is less easy > > to follow than pushing it up a layer. > > I prefer a short function with an early exit and no scope based unwind > for this. > > > > + return ERR_PTR(rc); > > > + } > > > + > > > + /* Notify TSM userspace that PCI/TSM operations are now possible */ > > > + kobject_uevent(&tsm_dev->dev.kobj, KOBJ_CHANGE); > > > + return tsm_dev; > > > +} > > > + > > > static void put_tsm_dev(struct tsm_dev *tsm_dev) > > > { > > > if (!IS_ERR_OR_NULL(tsm_dev)) > > > @@ -54,7 +109,8 @@ DEFINE_FREE(put_tsm_dev, struct tsm_dev *, > > > if (!IS_ERR_OR_NULL(_T)) put_tsm_dev(_T)) > > > > > > struct tsm_dev *tsm_register(struct device *parent, > > > - const struct attribute_group **groups) > > > + const struct attribute_group **groups, > > > + struct pci_tsm_ops *pci_ops) > > > { > > > struct tsm_dev *tsm_dev __free(put_tsm_dev) = > > > alloc_tsm_dev(parent, groups); > > > @@ -73,12 +129,13 @@ struct tsm_dev *tsm_register(struct device *parent, > > > if (rc) > > > return ERR_PTR(rc); > > > > > > - return no_free_ptr(tsm_dev); > > > + return tsm_register_pci_or_reset(no_free_ptr(tsm_dev), pci_ops); > > > > Having a function call that either succeeds or cleans up something it > > never did on error is odd. > > devm_add_action_or_reset() is the same pattern. Do the action, or > otherwise take care of cleaning up. The action in this case is > pci_tsm_register() and the reset is cleaning up the device_add(). It's a pretty obscure pattern but I agree there is that precedence. In my head that one case gets to be 'special' because it is always calling the callback, just a question of now or later (in remove). Here thing callback happens in remove but it's explicit and nothing to do with this function (unlikely the devm version) Anyhow, not a thing I'm going to bother pushing back that hard on. > > > > The or_reset hints at that oddity but to me is not enough. If you want > > to use __free magic in here maybe hand off the tsm_dev on succesful > > device registration. > > > > struct tsm_dev *registered_tsm_dev __free(unregister_tsm_dev) = > > no_free_ptr(tsm_dev); > > That does not look like an improvement to me. > > The moment device_add() succeeds the cleanup shifts from put_device() to > device_unregister(). I considered wrapping device_add(), but I think it > is awkard to redeclare the same variable again vs being able to walk all > instances of @tsm_dev in the function. I agree it's a slightly odd construction and so might cause confusion. So whilst I think I prefer it to the or_reset() pattern I guess I'll just try and remember why this is odd (should I ever read this again after it's merged!) :) > > [..] > > > + * struct pci_tsm_ops - manage confidential links and security state > > > + * @link_ops: Coordinate PCIe SPDM and IDE establishment via a platform TSM. > > > + * Provide a secure session transport for TDISP state management > > > + * (typically bare metal physical function operations). > > > + * @sec_ops: Lock, unlock, and interrogate the security state of the > > > + * function via the platform TSM (typically virtual function > > > + * operations). > > > + * @owner: Back reference to the TSM device that owns this instance. > > > + * > > > + * This operations are mutually exclusive either a tsm_dev instance > > > + * manages phyiscal link properties or it manages function security > > > + * states like TDISP lock/unlock. > > > + */ > > > +struct pci_tsm_ops { > > > + /* > > Likewise though I'm not sure if kernel-doc deals with struct groups. > > It does not. Hmm. Given they are getting common maybe that's one to address, but obviously not in this series. Jonathan
Jonathan Cameron wrote: [..] > > > > diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c > > > > new file mode 100644 > > > > index 000000000000..0784cc436dd3 > > > > --- /dev/null > > > > +++ b/drivers/pci/tsm.c > > > > @@ -0,0 +1,554 @@ > > > > +// SPDX-License-Identifier: GPL-2.0 > > > > +/* > > > > + * TEE Security Manager for the TEE Device Interface Security Protocol > > > > + * (TDISP, PCIe r6.1 sec 11) > > > > + * > > > > + * Copyright(c) 2024 Intel Corporation. All rights reserved. > > > > + */ > > > > > > > +static void tsm_remove(struct pci_tsm *tsm) > > > > +{ > > > > + struct pci_dev *pdev; > > > > + > > > > + if (!tsm) > > > > > > You protect against this in the DEFINE_FREE() so probably safe > > > to assume it is always set if we get here. > > > > It is safe, but I would rather not require reading other code to > > understand the expectation that some callers may unconditionally call > > this routine. > > I think a function like remove being called on 'nothing' should > pretty much always be a bug, but meh, up to you. I should have noted earlier that tsm_probe() on subfunctions might fail without failing the 'connect' operation and unwinding the subfunctions that did probe successfully. tsm_probe() should rarely fail, it is just subject to kmalloc(GFP_KERNEL) failure in most cases. So at shutdown time tsm_remove() will opportunistically cleanup just the subfunctions that probed.
Jonathan Cameron wrote: > > > You protect against this in the DEFINE_FREE() so probably safe > > > to assume it is always set if we get here. > > > > It is safe, but I would rather not require reading other code to > > understand the expectation that some callers may unconditionally call > > this routine. > > I think a function like remove being called on 'nothing' should > pretty much always be a bug, but meh, up to you. ...inspired by kfree(NULL). Potentially saves "if (tsm) tsm_remove(tsm)" checks down the road, but yes, all of those are obviated by the DEFINE_FREE() at present. > > > > + pdev = tsm->pdev; > > > > + tsm->ops->remove(tsm); > > > > + pdev->tsm = NULL; > > > > +} > > > > +DEFINE_FREE(tsm_remove, struct pci_tsm *, if (_T) tsm_remove(_T)) > > > > + > > > > +static int call_cb_put(struct pci_dev *pdev, void *data, > > > > > > Is this combination worth while? I don't like the 'and' aspect of it > > > and it only saves a few lines... > > > > > > vs > > > if (pdev) { > > > rc = cb(pdev, data); > > > pci_dev_put(pdev); > > > if (rc) > > > return; > > > } > > > > I think it is worth it, but an even better option is to just let > > scope-based cleanup handle it by moving the variable inside the loop > > declaration. > I don't follow that lat bit, but will look at next version to see > what you mean! Here is new approach (only compile tested) after understanding that loop declared variables do trigger cleanup on each iteration. static void pci_tsm_walk_fns(struct pci_dev *pdev, int (*cb)(struct pci_dev *pdev, void *data), void *data) { /* Walk subordinate physical functions */ for (int i = 0; i < 8; i++) { struct pci_dev *pf __free(pci_dev_put) = pci_get_slot( pdev->bus, PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); if (!pf) continue; /* on entry function 0 has already run @cb */ if (i > 0) cb(pf, data); /* walk virtual functions of each pf */ for (int j = 0; j < pci_num_vf(pf); j++) { struct pci_dev *vf __free(pci_dev_put) = pci_get_domain_bus_and_slot( pci_domain_nr(pf->bus), pci_iov_virtfn_bus(pf, j), pci_iov_virtfn_devfn(pf, j)); if (!vf) continue; cb(vf, data); } } /* * Walk downstream devices, assumes that an upstream DSM is * limited to downstream physical functions */ if (pci_pcie_type(pdev) == PCI_EXP_TYPE_UPSTREAM && is_dsm(pdev)) pci_walk_bus(pdev->subordinate, cb, data); } static void pci_tsm_walk_fns_reverse(struct pci_dev *pdev, int (*cb)(struct pci_dev *pdev, void *data), void *data) { /* Reverse walk downstream devices */ if (pci_pcie_type(pdev) == PCI_EXP_TYPE_UPSTREAM && is_dsm(pdev)) pci_walk_bus_reverse(pdev->subordinate, cb, data); /* Reverse walk subordinate physical functions */ for (int i = 7; i >= 0; i--) { struct pci_dev *pf __free(pci_dev_put) = pci_get_slot( pdev->bus, PCI_DEVFN(PCI_SLOT(pdev->devfn), i)); if (!pf) continue; /* reverse walk virtual functions */ for (int j = pci_num_vf(pf) - 1; j >= 0; j--) { struct pci_dev *vf __free(pci_dev_put) = pci_get_domain_bus_and_slot( pci_domain_nr(pf->bus), pci_iov_virtfn_bus(pf, j), pci_iov_virtfn_devfn(pf, j)); if (!vf) continue; cb(vf, data); } /* on exit, caller will run @cb on function 0 */ if (i > 0) cb(pf, data); } } [..] > I agree it's a slightly odd construction and so might cause confusion. > So whilst I think I prefer it to the or_reset() pattern I guess I'll just > try and remember why this is odd (should I ever read this again after it's > merged!) :) However, I am interested in these "the trouble with cleanup.h" style discussions. I recently suggested this [1] in another thread which indeed uses multiple local variables of the same object to represent the different phases of the setup. It was easier there because the code was straigtforward to convert to an ERR_PTR() organization. If there was already an alternative device_add() like this: struct device *device_add_or_reset(struct device *dev) That handled the put_device() then you could write: struct device *devreg __free(device_unregister) = device_add_or_reset(no_free_ptr(dev)) ...and help that common pattern of 'struct device' setup transitions from put_device() to device_unregister() at device_add() time. [1]: http://lore.kernel.org/688bcf40215c3_48e5100d6@dwillia2-xfh.jf.intel.com.notmuch [..] > > > > + * struct pci_tsm_ops - manage confidential links and security state > > > > + * @link_ops: Coordinate PCIe SPDM and IDE establishment via a platform TSM. > > > > + * Provide a secure session transport for TDISP state management > > > > + * (typically bare metal physical function operations). > > > > + * @sec_ops: Lock, unlock, and interrogate the security state of the > > > > + * function via the platform TSM (typically virtual function > > > > + * operations). > > > > + * @owner: Back reference to the TSM device that owns this instance. > > > > + * > > > > + * This operations are mutually exclusive either a tsm_dev instance > > > > + * manages phyiscal link properties or it manages function security > > > > + * states like TDISP lock/unlock. > > > > + */ > > > > +struct pci_tsm_ops { > > > > + /* > > > Likewise though I'm not sure if kernel-doc deals with struct groups. > > > > It does not. > > Hmm. Given they are getting common maybe that's one to address, but > obviously not in this series. CXL could use it too...
> > > > > + pdev = tsm->pdev; > > > > > + tsm->ops->remove(tsm); > > > > > + pdev->tsm = NULL; > > > > > +} > > > > > +DEFINE_FREE(tsm_remove, struct pci_tsm *, if (_T) tsm_remove(_T)) > > > > > + > > > > > +static int call_cb_put(struct pci_dev *pdev, void *data, > > > > > > > > Is this combination worth while? I don't like the 'and' aspect of it > > > > and it only saves a few lines... > > > > > > > > vs > > > > if (pdev) { > > > > rc = cb(pdev, data); > > > > pci_dev_put(pdev); > > > > if (rc) > > > > return; > > > > } > > > > > > I think it is worth it, but an even better option is to just let > > > scope-based cleanup handle it by moving the variable inside the loop > > > declaration. > > I don't follow that lat bit, but will look at next version to see > > what you mean! > > Here is new approach (only compile tested) after understanding that loop > declared variables do trigger cleanup on each iteration. Looks good. > > [..] > > I agree it's a slightly odd construction and so might cause confusion. > > So whilst I think I prefer it to the or_reset() pattern I guess I'll just > > try and remember why this is odd (should I ever read this again after it's > > merged!) :) > > However, I am interested in these "the trouble with cleanup.h" style > discussions. > > I recently suggested this [1] in another thread which indeed uses > multiple local variables of the same object to represent the different > phases of the setup. It was easier there because the code was > straigtforward to convert to an ERR_PTR() organization. > > If there was already an alternative device_add() like this: > > struct device *device_add_or_reset(struct device *dev) > > That handled the put_device() then you could write: > > struct device *devreg __free(device_unregister) = device_add_or_reset(no_free_ptr(dev)) > > ...and help that common pattern of 'struct device' setup transitions > from put_device() to device_unregister() at device_add() time. > > [1]: http://lore.kernel.org/688bcf40215c3_48e5100d6@dwillia2-xfh.jf.intel.com.notmuch That's definitely interesting (in a fairly good way) as anything to stop people introducing bugs around the device_add() stuff would be welcome. It'll take a bit of getting used to though. Maybe make it more explicit device_add_or_put()? Naming hard as normal.. > Jonathan
© 2016 - 2025 Red Hat, Inc.