On Mon, Feb 17, 2025 at 10:18:17AM +0000, Teddy Astie wrote:
> This work has been presented at Xen Summit 2024 during the
> IOMMU paravirtualization and Xen IOMMU subsystem rework
> design session.
>
> Operating systems may want access to an IOMMU in order to do DMA
> protection or implement certain features (e.g. VFIO on Linux).
>
> VFIO support is mandatory for frameworks such as SPDK, which can be useful to
> implement an alternative storage backend for virtual machines [1].
>
> In this patch series, we introduce in Xen the ability to manage several
> contexts per domain and provide a new hypercall interface to allow guests
> to manage IOMMU contexts.
>
> The VT-d and AMD-Vi drivers are updated to support these new features.
>
> [1] Using SPDK with the Xen hypervisor - FOSDEM 2023
> ---
> Cc: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
>
> PCI Passthrough now works on my side, but things still feel quite brittle.

Pipeline:
https://gitlab.com/xen-project/people/marmarek/xen/-/pipelines/1676457780
The passthrough tests on Intel are green, but not on AMD. I don't see
any specific crash, but the device doesn't work (DHCP timeout). Likely
an issue with interrupts.
There are also build failures on ARM...

> Changed in v2:
> * fixed Xen crash when dumping IOMMU contexts (using X debug key)
>   with DomUs without IOMMU
> * s/dettach/detach/
> * removed some unused includes
> * fix dangling devices in contexts with detach
>
> Changed in v3:
> * lock entirely map/unmap in hypercall
> * prevent IOMMU operations on dying contexts (fix race condition)
> * iommu_check_context+iommu_get_context -> iommu_get_context and check for NULL
>
> Changed in v4:
> * Part of initialization logic is moved to domain or toolstack (IOMMU_init)
>   + domain/toolstack now decides on "context count" and "pagetable pool size"
>   + for now, all domains are able to initialize PV-IOMMU
> * introduce "dom0-iommu=no-dma" to make the default context block all DMA
>   (disables HAP and sync-pt), enforcing usage of PV-IOMMU for DMA.
>   Can be used to properly expose "Pre-boot DMA protection"
> * redesigned locking logic for contexts
>   + contexts are accessed using iommu_get_context and released with iommu_put_context
>
> Changed in v5:
> * various PCI Passthrough related fixes
>   + rewrote parts of PCI Passthrough logic
>   + various other related bug fixes
> * simplified VT-d DID (for hardware) management by only having one map instead of two
>   (the pseudo_domid map was previously used for the old quarantine code, then recycled
>   for PV-IOMMU, in addition to another map also tracking Domain<->VT-d DID; now there
>   is only one map tracking both, making things simpler)
> * reworked parts of Xen quarantine logic (needed for PCI Passthrough)
> * added cf_check annotations
> * some changes to PV-IOMMU headers (Alejandro)
>
> Changed in v6:
> * reorganized the patch series to allow bisecting
> * it is split into various smaller patches
> * initial AMD-Vi port (it doesn't completely work with PV-IOMMU yet, but it at
>   least builds)
> * AMD-Vi lacks support for iommu_lookup_page (needed for several PV-IOMMU ops)
>
> TODO:
> * fix some issues with no-dma+PV and grants
> * complete "no-dma" mode (expose to toolstack, add documentation, ...)
> * properly define nested mode and PASID support
> * consider per-iommu domid limit (allocate DID on first attach/reattach?)
> * fix ARM/PPC build issues
>
> * make new quarantine code more unity region aware (isolate devices with
>   different reserved regions using separate 'contexts')
> * find a way to make PV-IOMMU work in DomUs (they don't see machine bdf)
> * there are corner cases with PV-IOMMU and to-domain Xen PCI Passthrough
>   (e.g. pci-assignable-remove will reassign to context 0, while the driver
>   expects the device to be in context X)
>
> Teddy Astie (11):
> docs/designs: Add a design document for IOMMU subsystem redesign
> docs/designs: Add a design document for PV-IOMMU
> x86/domain: Defer domain iommu initialization.
> iommu: Move IOMMU domain related structures to (arch_)iommu_context
> iommu: Simplify quarantine logic
> vtd: Remove MAP_ERROR_RECOVERY code path in domain_context_mapping_one
> iommu: Simplify hardware did management
> iommu: Introduce redesigned IOMMU subsystem
> x86/iommu: Introduce IOMMU arena
> iommu: Introduce PV-IOMMU
> iommu: Introduce no-dma feature
>
> docs/designs/iommu-contexts.md | 403 +++++
> docs/designs/pv-iommu.md | 116 ++
> xen/arch/arm/include/asm/iommu.h | 4 +
> xen/arch/ppc/include/asm/iommu.h | 3 +
> xen/arch/x86/domain.c | 10 +-
> xen/arch/x86/include/asm/arena.h | 54 +
> xen/arch/x86/include/asm/iommu.h | 59 +-
> xen/arch/x86/include/asm/pci.h | 17 -
> xen/arch/x86/mm/p2m-ept.c | 2 +-
> xen/arch/x86/pv/dom0_build.c | 6 +-
> xen/arch/x86/tboot.c | 3 +-
> xen/common/Makefile | 1 +
> xen/common/memory.c | 4 +-
> xen/common/pv-iommu.c | 539 +++++++
> xen/drivers/passthrough/amd/iommu.h | 21 +-
> xen/drivers/passthrough/amd/iommu_cmd.c | 20 +-
> xen/drivers/passthrough/amd/iommu_init.c | 13 +-
> xen/drivers/passthrough/amd/iommu_map.c | 217 +--
> xen/drivers/passthrough/amd/pci_amd_iommu.c | 346 ++--
> xen/drivers/passthrough/iommu.c | 735 ++++++++-
> xen/drivers/passthrough/pci.c | 404 ++---
> xen/drivers/passthrough/vtd/extern.h | 19 +-
> xen/drivers/passthrough/vtd/iommu.c | 1612 ++++++-------------
> xen/drivers/passthrough/vtd/iommu.h | 2 -
> xen/drivers/passthrough/vtd/qinval.c | 2 +-
> xen/drivers/passthrough/vtd/quirks.c | 21 +-
> xen/drivers/passthrough/vtd/vtd.h | 3 +-
> xen/drivers/passthrough/x86/Makefile | 1 +
> xen/drivers/passthrough/x86/arena.c | 157 ++
> xen/drivers/passthrough/x86/iommu.c | 294 +++-
> xen/include/hypercall-defs.c | 6 +
> xen/include/public/pv-iommu.h | 343 ++++
> xen/include/public/xen.h | 1 +
> xen/include/xen/iommu.h | 117 +-
> xen/include/xen/pci.h | 3 +
> 35 files changed, 3585 insertions(+), 1973 deletions(-)
> create mode 100644 docs/designs/iommu-contexts.md
> create mode 100644 docs/designs/pv-iommu.md
> create mode 100644 xen/arch/x86/include/asm/arena.h
> create mode 100644 xen/common/pv-iommu.c
> create mode 100644 xen/drivers/passthrough/x86/arena.c
> create mode 100644 xen/include/public/pv-iommu.h
>
> --
> 2.47.2
>
>
>
> Teddy Astie | Vates XCP-ng Developer
>
> XCP-ng & Xen Orchestra - Vates solutions
>
> web: https://vates.tech
>
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab