[PATCH v1 00/19] Rebase ATS onto latest Qemu mailing list state

CLEMENT MATHIEU--DRIF posted 19 patches 1 week ago
From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>

This series is part of a set of series that add SVM support to VT-d.

As a starting point, we use the series called
'intel_iommu: Enable stage-1 translation for emulated device' by Zhenzhong Duan and Yi Liu.
Ref: https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_stage1_emu_v5

Based-on: 20241111083457.2090664-1-zhenzhong.duan@intel.com

Here we focus on implementing ATS support in the IOMMU and adding a
PCI-level API to be used by virtual devices.

This work is based on the VT-d specification version 4.1 (March 2023).

Here is a link to our GitHub repository where you can find the following elements:
    - QEMU with all the patches for SVM
        - ATS
        - PRI
        - Device IOTLB invalidations
        - Requests with already pre-translated addresses
    - A demo device
    - A simple driver for the demo device
    - A userspace program (for testing and demonstration purposes)

https://github.com/BullSequana/Qemu-in-guest-SVM-demo

===============

Context and design notes
''''''''''''''''''''''''

The main purpose of this work is to enable vVT-d users to make
translation requests to the vIOMMU as described in the PCIe Gen 5.0
specification (section 10). Moreover, we aim to implement a
PCI/Memory-level framework that could be used by other vIOMMUs
to implement the same features.

What is ATS?
''''''''''''

ATS (Address Translation Services) is a PCIe-level protocol that
enables PCIe devices to query an IOMMU for virtual-to-physical
address translations in a specific address space (such as a userland
process address space). When a device receives translation responses
from an IOMMU, it may store them in an internal cache, commonly
called the "ATC" (Address Translation Cache) or "Device IOTLB".
To keep page tables and caches consistent, the IOMMU is allowed to
send asynchronous invalidation requests to its client devices.
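
As an illustration of the caching behaviour described above, here is a
minimal sketch of a device-side ATC. It is not the util/atc.c
implementation from this series: the ToyATC type, the toy_atc_*
helpers, the direct-mapped layout and the fixed 4 KiB page size are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy ATC, NOT the util/atc.c code from this series. */
#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_MASK  ((UINT64_C(1) << TOY_PAGE_SHIFT) - 1)
#define TOY_ATC_SLOTS  64

/* One cached translation: (pasid, virtual page) -> physical page. */
typedef struct ToyATCEntry {
    bool valid;
    uint32_t pasid;
    uint64_t vpn;   /* virtual page number */
    uint64_t ppn;   /* physical page number */
} ToyATCEntry;

typedef struct ToyATC {
    ToyATCEntry slots[TOY_ATC_SLOTS];
} ToyATC;

/* Direct-mapped placement: each (pasid, vpn) pair has exactly one slot. */
static ToyATCEntry *toy_atc_slot(ToyATC *atc, uint32_t pasid, uint64_t vpn)
{
    assert(atc != NULL);
    return &atc->slots[(vpn ^ pasid) % TOY_ATC_SLOTS];
}

/* Store a translation response received from the IOMMU. */
static void toy_atc_update(ToyATC *atc, uint32_t pasid,
                           uint64_t vaddr, uint64_t paddr)
{
    uint64_t vpn = vaddr >> TOY_PAGE_SHIFT;
    ToyATCEntry *e = toy_atc_slot(atc, pasid, vpn);

    e->valid = true;
    e->pasid = pasid;
    e->vpn = vpn;
    e->ppn = paddr >> TOY_PAGE_SHIFT;
}

/* Return true and fill *paddr on a hit, false on a miss. */
static bool toy_atc_lookup(ToyATC *atc, uint32_t pasid,
                           uint64_t vaddr, uint64_t *paddr)
{
    uint64_t vpn = vaddr >> TOY_PAGE_SHIFT;
    ToyATCEntry *e = toy_atc_slot(atc, pasid, vpn);

    if (!e->valid || e->pasid != pasid || e->vpn != vpn) {
        return false;
    }
    *paddr = (e->ppn << TOY_PAGE_SHIFT) | (vaddr & TOY_PAGE_MASK);
    return true;
}

/* Handle an invalidation request: drop every cached page of this PASID
 * that overlaps [vaddr, vaddr + len). */
static void toy_atc_invalidate(ToyATC *atc, uint32_t pasid,
                               uint64_t vaddr, uint64_t len)
{
    uint64_t first, last;

    assert(atc != NULL);
    if (len == 0) {
        return;
    }
    first = vaddr >> TOY_PAGE_SHIFT;
    last = (vaddr + len - 1) >> TOY_PAGE_SHIFT;
    for (size_t i = 0; i < TOY_ATC_SLOTS; i++) {
        ToyATCEntry *e = &atc->slots[i];
        if (e->valid && e->pasid == pasid &&
            e->vpn >= first && e->vpn <= last) {
            e->valid = false;
        }
    }
}
```

A device would call toy_atc_lookup() before issuing a translation
request, toy_atc_update() when a response arrives, and
toy_atc_invalidate() from its invalidation callback. A direct-mapped
layout keeps the sketch short; a real ATC would also have to handle
multiple page sizes.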

To avoid introducing an unnecessarily complicated API, this series
exposes only three functions. The first two form a setup pair that
installs and removes the ATS invalidation callback during the
initialization phase of a process. The third one is used to request
translations. Under the hood, the callback setup API introduced in
this series relies on the IOMMUNotifier API.
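
The install/remove pattern for a per-PASID invalidation callback can be
sketched as follows. This is a standalone toy model, not QEMU's
IOMMUNotifier: ToyNotifier, ToyIOMMU and the toy_* functions are
hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an invalidation notifier
 * (NOT QEMU's IOMMUNotifier). */
typedef struct ToyNotifier ToyNotifier;
typedef void (*ToyInvalidateFn)(ToyNotifier *n, uint64_t addr, uint64_t len);

struct ToyNotifier {
    ToyInvalidateFn notify;  /* device-provided callback */
    void *opaque;            /* device-private data, e.g. its ATC */
    uint32_t pasid;          /* address space this notifier watches */
    ToyNotifier *next;
};

typedef struct ToyIOMMU {
    ToyNotifier *head;       /* singly linked list of notifiers */
} ToyIOMMU;

/* Install the callback for one PASID. */
static void toy_register(ToyIOMMU *iommu, uint32_t pasid, ToyNotifier *n)
{
    assert(iommu != NULL && n != NULL && n->notify != NULL);
    n->pasid = pasid;
    n->next = iommu->head;
    iommu->head = n;
}

/* Remove a previously installed callback; no-op if it is not registered. */
static void toy_unregister(ToyIOMMU *iommu, ToyNotifier *n)
{
    assert(iommu != NULL && n != NULL);
    for (ToyNotifier **p = &iommu->head; *p != NULL; p = &(*p)->next) {
        if (*p == n) {
            *p = n->next;
            n->next = NULL;
            return;
        }
    }
}

/* IOMMU side: forward an invalidation to every notifier of that PASID. */
static void toy_invalidate(ToyIOMMU *iommu, uint32_t pasid,
                           uint64_t addr, uint64_t len)
{
    assert(iommu != NULL);
    for (ToyNotifier *n = iommu->head; n != NULL; n = n->next) {
        if (n->pasid == pasid) {
            n->notify(n, addr, len);
        }
    }
}

/* Example device callback: only counts the invalidations it receives.
 * A real device would purge the matching range from its ATC here. */
static void toy_device_inv(ToyNotifier *n, uint64_t addr, uint64_t len)
{
    unsigned *count = (unsigned *)n->opaque;

    (void)addr;
    (void)len;
    (*count)++;
}
```

In this model a device registers one notifier per PASID it uses and
unregisters it when it stops using that address space; invalidations
for other PASIDs never reach the callback.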

API design
''''''''''

- int pci_register_iommu_tlb_event_notifier(PCIDevice *dev,
                                            uint32_t pasid,
                                            IOMMUNotifier *n);

- int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
                                              IOMMUNotifier *n);

- ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
                                            bool priv_req, bool exec_req,
                                            hwaddr addr, size_t length,
                                            bool no_write,
                                            IOMMUTLBEntry *result,
                                            size_t result_length,
                                            uint32_t *err_count);

Although device developers may want to implement a custom ATC for
testing or performance measurement purposes, we provide a generic
implementation as a utility module.
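
To show how a device model might drive a translation request with the
parameter list above, here is a standalone sketch built on a mock. The
cover letter gives the signature but not the return-value contract, so
the following are assumptions: the function returns the number of
entries written to the result buffer, a negative value means failure
(including a buffer that is too small), and err_count is zero on
success. MockPCIDevice, MockTLBEntry, mock_ats_request_translation_pasid
and demo_translate are hypothetical names; the mock maps each page to
phys_base plus its address instead of walking page tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>   /* ssize_t */

typedef uint64_t hwaddr; /* stand-in for QEMU's hwaddr */

#define MOCK_PAGE_SIZE UINT64_C(4096)

/* Hypothetical stand-ins, NOT the QEMU types. */
typedef struct MockPCIDevice {
    hwaddr phys_base;        /* mock mapping: iova -> phys_base + iova */
} MockPCIDevice;

typedef struct MockTLBEntry {
    hwaddr iova;
    hwaddr translated_addr;
    hwaddr addr_mask;        /* page size - 1 */
    bool writable;
} MockTLBEntry;

/* Mock with the same parameter list as pci_ats_request_translation_pasid().
 * ASSUMED contract: returns the number of entries written, or a negative
 * value on error (empty request or result buffer too small). */
static ssize_t mock_ats_request_translation_pasid(
    MockPCIDevice *dev, uint32_t pasid, bool priv_req, bool exec_req,
    hwaddr addr, size_t length, bool no_write,
    MockTLBEntry *result, size_t result_length, uint32_t *err_count)
{
    hwaddr page, last;
    size_t n = 0;

    (void)pasid;
    (void)priv_req;
    (void)exec_req;
    assert(dev != NULL && result != NULL && err_count != NULL);
    if (length == 0) {
        return -1;
    }
    *err_count = 0;
    last = (addr + length - 1) & ~(MOCK_PAGE_SIZE - 1);
    for (page = addr & ~(MOCK_PAGE_SIZE - 1); ; page += MOCK_PAGE_SIZE) {
        if (n == result_length) {
            return -1;                   /* result buffer too small */
        }
        result[n].iova = page;
        result[n].translated_addr = dev->phys_base + page;
        result[n].addr_mask = MOCK_PAGE_SIZE - 1;
        result[n].writable = !no_write;  /* "no write" asks for read-only */
        n++;
        if (page == last) {
            break;
        }
    }
    return (ssize_t)n;
}

/* Device-side helper: translate a single address for a read or a write. */
static bool demo_translate(MockPCIDevice *dev, uint32_t pasid, hwaddr addr,
                           bool want_write, hwaddr *out)
{
    MockTLBEntry entry;
    uint32_t err_count = 0;
    ssize_t n = mock_ats_request_translation_pasid(
        dev, pasid, false, false, addr, 1, !want_write,
        &entry, 1, &err_count);

    if (n != 1 || err_count != 0) {
        return false;
    }
    if (want_write && !entry.writable) {
        return false;
    }
    *out = entry.translated_addr | (addr & entry.addr_mask);
    return true;
}
```

A request that spans several pages needs a result buffer with one slot
per page; under the assumed contract the caller would retry with a
larger buffer after a negative return.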

Overview
''''''''

Here are the interactions between an ATS-capable PCIe device and the vVT-d:
                                                                                          
  ┌───────────┐                 ┌────────────┐                ┌─────────────────────────┐ 
  │Device     │                 │PCI / Memory│                │vVT-d                    │ 
  │           │ pci_ats_request_│abstraction │ iommu_ats_     │                         │ 
  │           │ translation_    │            │ request_       │                         │ 
  │┌─────────┐│ pasid           │ AS lookup  │ translation    │                         │ 
  ││Logic    ││────────────────>│╶╶╶╶╶╶╶╶╶╶╶>│───────────────>│────────────────┐        │ 
  │└─────────┘│<────────────────│<╶╶╶╶╶╶╶╶╶╶╶│<───────────────│<─────┐         ∨        │ 
  │┌─────────┐│                 │            │                │┌───────────────────────┐│ 
  ││inv func ││<───────┐        │            │                ││Translation logic      ││ 
  │└─────────┘│        │        │            │                │└───────────────────────┘│ 
  │    │      │        │        │            │      ┌─────────│<───────────┐            │ 
  │    ∨      │        │        │            │      │         │            │            │ 
  │┌─────────┐│        │        │            │      │         │┌───────────────────────┐│ 
  ││ATC      ││        │        │            │      │         ││  Invalidation queue   ││ 
  │└─────────┘│        │        │            │      │         │└───────────∧───────────┘│ 
  └───────────┘        │        └────────────┘      │         └────────────┼────────────┘ 
                       │                            │                      │              
                       └────────────────────────────┘                      │              
                                                                           │              
                                                               ┌────────────────────────┐ 
                                                               │Kernel driver           │ 
                                                               │                        │ 
                                                               └────────────────────────┘



Clément Mathieu--Drif (19):
  memory: Add permissions in IOMMUAccessFlags
  intel_iommu: Declare supported PASID size
  memory: Allow to store the PASID in IOMMUTLBEntry
  intel_iommu: Fill the PASID field when creating an IOMMUTLBEntry
  pcie: Add helper to declare PASID capability for a pcie device
  pcie: Helper functions to check if PASID is enabled
  pcie: Helper function to check if ATS is enabled
  pci: Cache the bus mastering status in the device
  pci: Add IOMMU operations to get memory regions with PASID
  intel_iommu: Implement the get_memory_region_pasid iommu operation
  memory: Store user data pointer in the IOMMU notifiers
  pci: Add a pci-level initialization function for iommu notifiers
  atc: Generic ATC that can be used by PCIe devices that support SVM
  atc: Add unit tests
  memory: Add an API for ATS support
  pci: Add a pci-level API for ATS
  intel_iommu: Set address mask when a translation fails and adjust W
    permission
  intel_iommu: Return page walk level even when the translation fails
  intel_iommu: Add support for ATS

 hw/i386/intel_iommu.c          | 128 ++++++--
 hw/i386/intel_iommu_internal.h |   2 +
 hw/pci/pci.c                   | 111 ++++++-
 hw/pci/pcie.c                  |  42 +++
 include/exec/memory.h          |  51 +++-
 include/hw/i386/intel_iommu.h  |   2 +-
 include/hw/pci/pci.h           |  83 ++++++
 include/hw/pci/pci_device.h    |   1 +
 include/hw/pci/pcie.h          |   9 +-
 include/hw/pci/pcie_regs.h     |   5 +
 system/memory.c                |  20 ++
 tests/unit/meson.build         |   1 +
 tests/unit/test-atc.c          | 527 +++++++++++++++++++++++++++++++++
 util/atc.c                     | 211 +++++++++++++
 util/atc.h                     | 117 ++++++++
 util/meson.build               |   1 +
 16 files changed, 1280 insertions(+), 31 deletions(-)
 create mode 100644 tests/unit/test-atc.c
 create mode 100644 util/atc.c
 create mode 100644 util/atc.h

-- 
2.47.0