With the introduction of the new object and its infrastructure, update the
doc to reflect that and add a new graph.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
Documentation/userspace-api/iommufd.rst | 69 ++++++++++++++++++++++++-
1 file changed, 68 insertions(+), 1 deletion(-)
diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst
index 2deba93bf159..a8b7766c2849 100644
--- a/Documentation/userspace-api/iommufd.rst
+++ b/Documentation/userspace-api/iommufd.rst
@@ -63,6 +63,37 @@ Following IOMMUFD objects are exposed to userspace:
space usually has mappings from guest-level I/O virtual addresses to guest-
level physical addresses.
+- IOMMUFD_OBJ_VIOMMU, representing a slice of the physical IOMMU instance,
+ passed to or shared with a VM. It may be some HW-accelerated virtualization
+ features and some SW resources used by the VM. For examples:
+ * Security namespace for guest owned ID, e.g. guest-controlled cache tags
+ * Non-device-affiliated event reporting, e.g. invalidation queue errors
+ * Access to a sharable nesting parent pagetable across physical IOMMUs
+ * Virtualization of various platforms IDs, e.g. RIDs and others
+ * Delivery of paravirtualized invalidation
+ * Direct assigned invalidation queues
+ * Direct assigned interrupts
+ Such a vIOMMU object generally has the access to a nesting parent pagetable
+ to support some HW-accelerated virtualization features. So, a vIOMMU object
+ must be created given a nesting parent HWPT_PAGING object, and then it would
+ encapsulate that HWPT_PAGING object. Therefore, a vIOMMU object can be used
+ to allocate an HWPT_NESTED object in place of the encapsulated HWPT_PAGING.
+
+ .. note::
+
+ The name "vIOMMU" isn't necessarily identical to a virtualized IOMMU in a
+ VM. A VM can have one giant virtualized IOMMU running on a machine having
+ multiple physical IOMMUs, in which case the VMM will dispatch the requests
+ or configurations from this single virtualized IOMMU instance to multiple
+ vIOMMU objects created for individual slices of different physical IOMMUs.
+ In other words, a vIOMMU object is always a representation of one physical
+ IOMMU, not necessarily of a virtualized IOMMU. For VMMs that want the full
+ virtualization features from physical IOMMUs, it is suggested to build the
+ same number of virtualized IOMMUs as the number of physical IOMMUs, so the
+ passed-through devices would be connected to their own virtualized IOMMUs
+ backed by corresponding vIOMMU objects, in which case a guest OS would do
+ the "dispatch" naturally instead of VMM trappings.
+
All user-visible objects are destroyed via the IOMMU_DESTROY uAPI.
The diagrams below show relationships between user-visible objects and kernel
@@ -101,6 +132,28 @@ creating the objects and links::
|------------>|iommu_domain|<----|iommu_domain|<----|device|
|____________| |____________| |______|
+ _______________________________________________________________________
+ | iommufd (with vIOMMU) |
+ | |
+ | [5] |
+ | _____________ |
+ | | | |
+ | |----------------| vIOMMU | |
+ | | | | |
+ | | | | |
+ | | [1] | | [4] [2] |
+ | | ______ | | _____________ ________ |
+ | | | | | [3] | | | | | |
+ | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | |
+ | | |______| |_____________| |_____________| |________| |
+ | | | | | | |
+ |______|________|______________|__________________|_______________|_____|
+ | | | | |
+ ______v_____ | ______v_____ ______v_____ ___v__
+ | struct | | PFN | (paging) | | (nested) | |struct|
+ |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device|
+ |____________| storage|____________| |____________| |______|
+
1. IOMMUFD_OBJ_IOAS is created via the IOMMU_IOAS_ALLOC uAPI. An iommufd can
hold multiple IOAS objects. IOAS is the most generic object and does not
expose interfaces that are specific to single IOMMU drivers. All operations
@@ -132,7 +185,8 @@ creating the objects and links::
flag is set.
4. IOMMUFD_OBJ_HWPT_NESTED can be only manually created via the IOMMU_HWPT_ALLOC
- uAPI, provided an hwpt_id via @pt_id to associate the new HWPT_NESTED object
+ uAPI, provided an hwpt_id or a viommu_id of a vIOMMU object encapsulating a
+ nesting parent HWPT_PAGING via @pt_id to associate the new HWPT_NESTED object
to the corresponding HWPT_PAGING object. The associating HWPT_PAGING object
must be a nesting parent manually allocated via the same uAPI previously with
an IOMMU_HWPT_ALLOC_NEST_PARENT flag, otherwise the allocation will fail. The
@@ -149,6 +203,18 @@ creating the objects and links::
created via the same IOMMU_HWPT_ALLOC uAPI. The difference is at the type
of the object passed in via the @pt_id field of struct iommufd_hwpt_alloc.
+5. IOMMUFD_OBJ_VIOMMU can be only manually created via the IOMMU_VIOMMU_ALLOC
+ uAPI, provided a dev_id (for the device's physical IOMMU to back the vIOMMU)
+ and an hwpt_id (to associate the vIOMMU to a nesting parent HWPT_PAGING). The
+ iommufd core will link the vIOMMU object to the struct iommu_device that the
+ struct device is behind. And an IOMMU driver can implement a viommu_alloc op
+ to allocate its own vIOMMU data structure embedding the core-level structure
+ iommufd_viommu and some driver-specific data. If necessary, the driver can
+ also configure its HW virtualization feature for that vIOMMU (and thus for
+ the VM). Successful completion of this operation sets up the linkages between
+ the vIOMMU object and the HWPT_PAGING, then this vIOMMU object can be used
+ as a nesting parent object to allocate an HWPT_NESTED object described above.
+
A device can only bind to an iommufd due to DMA ownership claim and attach to at
most one IOAS object (no support of PASID yet).
@@ -161,6 +227,7 @@ User visible objects are backed by following datastructures:
- iommufd_device for IOMMUFD_OBJ_DEVICE.
- iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING.
- iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED.
+- iommufd_viommu for IOMMUFD_OBJ_VIOMMU.
Several terminologies when looking at these datastructures:
--
2.43.0