[XEN RFC PATCH v4 0/5] IOMMU subsystem redesign and PV-IOMMU interface

[XEN RFC PATCH v4 0/5] IOMMU subsystem redesign and PV-IOMMU interface
Posted by Teddy Astie 1 year, 3 months ago
This work was presented at Xen Summit 2024 during the
  IOMMU paravirtualization and Xen IOMMU subsystem rework
design session.

Operating systems may want to have access to an IOMMU in order to do DMA
protection or implement certain features (e.g. VFIO on Linux).

VFIO support is mandatory for frameworks such as SPDK, which can be useful to
implement an alternative storage backend for virtual machines [1].

In this patch series, we introduce in Xen the ability to manage several
contexts per domain and provide a new hypercall interface to allow guests
to manage IOMMU contexts.
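
As a rough illustration of the guest-facing interface, a context allocation
could look like this from the guest side (all names below are placeholders,
not necessarily the ABI actually defined in xen/include/public/pv-iommu.h):

    /* Hypothetical sketch: subop name, structure layout and hypercall
     * wrapper are illustrative only. */
    uint16_t ctx_id;
    struct pv_iommu_op op = {
        .subop_id = PV_IOMMU_ALLOC_CONTEXT, /* placeholder subop name */
    };

    if ( HYPERVISOR_iommu_op(&op, 1 /* nr_ops */) )
        return -EIO;               /* hypercall failed */

    ctx_id = op.ctx_no;            /* context handle chosen by Xen */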

The VT-d driver is updated to support these new features.

[1] Using SPDK with the Xen hypervisor - FOSDEM 2023
---
Changed in v2:
* fixed Xen crash when dumping IOMMU contexts (using the 'X' debug key)
  with DomUs without IOMMU
* s/dettach/detach/
* removed some unused includes
* fixed dangling devices left in contexts after detach

Changed in v3:
* lock map/unmap entirely in the hypercall
* prevent IOMMU operations on dying contexts (fixes a race condition)
* iommu_check_context+iommu_get_context -> iommu_get_context and check for NULL

Changed in v4:
* Part of the initialization logic is moved to the domain or toolstack (IOMMU_init)
  + the domain/toolstack now decides on "context count" and "pagetable pool size"
  + for now, all domains are able to initialize PV-IOMMU
* introduce "dom0-iommu=no-dma" to make the default context block all DMA
  (disables HAP page-table sharing and sync-pt), enforcing use of PV-IOMMU
  for DMA; can be used to properly expose "Pre-boot DMA protection"
* redesigned locking logic for contexts (see the sketch below)
  + contexts are acquired with iommu_get_context and released with iommu_put_context
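
A minimal sketch of the intended pattern (assuming the v4 signatures, which
may differ in detail):

    struct iommu_context *ctx = iommu_get_context(d, ctx_id);

    if ( !ctx )
        return -ENOENT;     /* context absent or dying */

    /* ... map/unmap operations, safe against concurrent teardown ... */

    iommu_put_context(ctx);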

TODO:
* add stub implementations for bisection needs and non-ported IOMMU implementations
* fix some issues with no-dma+PV and grants
* complete "no-dma" mode (expose to toolstack, add documentation, ...)
* properly define nested mode and PASID support

Teddy Astie (5):
  docs/designs: Add a design document for PV-IOMMU
  docs/designs: Add a design document for IOMMU subsystem redesign
  IOMMU: Introduce redesigned IOMMU subsystem
  VT-d: Port IOMMU driver to new subsystem
  xen/public: Introduce PV-IOMMU hypercall interface

 docs/designs/iommu-contexts.md       |  403 +++++++
 docs/designs/pv-iommu.md             |  116 ++
 xen/arch/x86/domain.c                |    2 +-
 xen/arch/x86/include/asm/arena.h     |   54 +
 xen/arch/x86/include/asm/iommu.h     |   58 +-
 xen/arch/x86/include/asm/pci.h       |   17 -
 xen/arch/x86/mm/p2m-ept.c            |    2 +-
 xen/arch/x86/pv/dom0_build.c         |    4 +-
 xen/arch/x86/tboot.c                 |    4 +-
 xen/common/Makefile                  |    1 +
 xen/common/memory.c                  |    4 +-
 xen/common/pv-iommu.c                |  539 ++++++++++
 xen/drivers/passthrough/Makefile     |    3 +
 xen/drivers/passthrough/context.c    |  711 +++++++++++++
 xen/drivers/passthrough/iommu.c      |  396 +++----
 xen/drivers/passthrough/pci.c        |  117 +-
 xen/drivers/passthrough/quarantine.c |   49 +
 xen/drivers/passthrough/vtd/Makefile |    2 +-
 xen/drivers/passthrough/vtd/extern.h |   14 +-
 xen/drivers/passthrough/vtd/iommu.c  | 1478 +++++++++-----------------
 xen/drivers/passthrough/vtd/quirks.c |   20 +-
 xen/drivers/passthrough/x86/Makefile |    1 +
 xen/drivers/passthrough/x86/arena.c  |  157 +++
 xen/drivers/passthrough/x86/iommu.c  |  270 +++--
 xen/include/hypercall-defs.c         |    6 +
 xen/include/public/pv-iommu.h        |  341 ++++++
 xen/include/public/xen.h             |    1 +
 xen/include/xen/iommu.h              |  117 +-
 xen/include/xen/pci.h                |    3 +
 29 files changed, 3423 insertions(+), 1467 deletions(-)
 create mode 100644 docs/designs/iommu-contexts.md
 create mode 100644 docs/designs/pv-iommu.md
 create mode 100644 xen/arch/x86/include/asm/arena.h
 create mode 100644 xen/common/pv-iommu.c
 create mode 100644 xen/drivers/passthrough/context.c
 create mode 100644 xen/drivers/passthrough/quarantine.c
 create mode 100644 xen/drivers/passthrough/x86/arena.c
 create mode 100644 xen/include/public/pv-iommu.h

-- 
2.45.2



Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
Re: [XEN RFC PATCH v4 0/5] IOMMU subsystem redesign and PV-IOMMU interface
Posted by Marek Marczykowski-Górecki 1 year ago
On Mon, Nov 04, 2024 at 02:28:38PM +0000, Teddy Astie wrote:
> This work has been presented at Xen Summit 2024 during the
>   IOMMU paravirtualization and Xen IOMMU subsystem rework
> design session.
> 
> Operating systems may want to have access to an IOMMU in order to do DMA
> protection or implement certain features (e.g. VFIO on Linux).
> 
> VFIO support is mandatory for frameworks such as SPDK, which can be useful to
> implement an alternative storage backend for virtual machines [1].
> 
> In this patch series, we introduce in Xen the ability to manage several
> contexts per domain and provide a new hypercall interface to allow guests
> to manage IOMMU contexts.
> 
> The VT-d driver is updated to support these new features.
> 
> [1] Using SPDK with the Xen hypervisor - FOSDEM 2023
> ---
> Changed in v2 :
> * fixed Xen crash when dumping IOMMU contexts (using X debug key)
> with DomUs without IOMMU
> * s/dettach/detach/
> * removed some unused includes
> * fix dangling devices in contexts with detach
> 
> Changed in v3 :
> * lock entirely map/unmap in hypercall
> * prevent IOMMU operations on dying contexts (fix race condition)
> * iommu_check_context+iommu_get_context -> iommu_get_context and check for NULL
> 
> Changed in v4 :
> * Part of initialization logic is moved to domain or toolstack (IOMMU_init)
>   + domain/toolstack now decides on "context count" and "pagetable pool size"
>   + for now, all domains are able to initialize PV-IOMMU
> * introduce "dom0-iommu=no-dma" to make default context block all DMA
>   (disables HAP and sync-pt), enforcing usage of PV-IOMMU for DMA
>   Can be used to expose properly "Pre-boot DMA protection"
> * redesigned locking logic for contexts
>   + contexts are accessed using iommu_get_context and released with iommu_put_context
> 
> TODO:
> * add stub implementations for bisection needs and non-ported IOMMU implementations
> * fix some issues with no-dma+PV and grants
> * complete "no-dma" mode (expose to toolstack, add documentation, ...)
> * properly define nested mode and PASID support

Hi,

I finally got time to try this revision (sorry it took so long!). My
goal was to test it this time with some HVM domU too. I didn't get very
far...

Issues I hit:

1. AMD IOMMU driver is not converted (fails to build), for now disabled
   CONFIG_AMD_IOMMU.
2. PV shim build fails (linker fails to find p2m_add_identity_entry
   symbol referenced from iommu.c)
3. Xen complains on boot about missing endbr64 (surprisingly, it didn't
   explode):

    (XEN) alt table ffff82d0404234d8 -> ffff82d040432d82
    (XEN) altcall iommu_get_max_iova+0x11/0x30 dest iommu.c#intel_iommu_get_max_iova has no endbr64
    (XEN) altcall context.c#iommu_reattach_phantom+0x30/0x50 dest iommu.c#intel_iommu_add_devfn has no endbr64
    (XEN) altcall context.c#iommu_detach_phantom+0x25/0x40 dest iommu.c#intel_iommu_remove_devfn has no endbr64
    (XEN) altcall iommu_context_init+0x27/0x40 dest iommu.c#intel_iommu_context_init has no endbr64
    (XEN) altcall iommu_attach_context+0x3c/0xd0 dest iommu.c#intel_iommu_attach has no endbr64
    (XEN) altcall context.c#iommu_attach_context.cold+0x1d/0x53 dest iommu.c#intel_iommu_detach has no endbr64
    (XEN) altcall iommu_detach_context+0x37/0xa0 dest iommu.c#intel_iommu_detach has no endbr64
    (XEN) altcall iommu_reattach_context+0x95/0x240 dest iommu.c#intel_iommu_reattach has no endbr64
    (XEN) altcall context.c#iommu_reattach_context.cold+0x29/0x110 dest iommu.c#intel_iommu_reattach has no endbr64
    (XEN) altcall iommu_context_teardown+0x3f/0xa0 dest iommu.c#intel_iommu_context_teardown has no endbr64
    (XEN) altcall pci.c#deassign_device+0x99/0x270 dest iommu.c#intel_iommu_add_devfn has no endbr64

4. Starting a HVM domU with PCI device fails with:

    libxl: libxl_pci.c:1552:pci_add_dm_done: Domain 1:xc_assign_device failed: No space left on device
    libxl: libxl_pci.c:1875:device_pci_add_done: Domain 1:libxl__device_pci_add failed for PCI device 0:aa:0.0 (rc -3)
    libxl: libxl_create.c:2061:domcreate_attach_devices: Domain 1:unable to add pci devices

I didn't change anything in the toolstack - maybe the default context needs
to be initialized somehow? But the docs suggest the default context
should work out of the box. On the other hand, the changelog for v4 says
some parts are moved to the toolstack, but I don't see any changes in
tools/ in this series...

FWIW, the exact version I tried is this (this series, on top of staging +
qubes patches):
https://github.com/QubesOS/qubes-vmm-xen/pull/200
At this stage, dom0 kernel didn't have PV-IOMMU driver included yet.

Full Xen log, with some debug info collected:
https://gist.github.com/marmarek/e7ac2571df033c7181bf03f21aa5f9ab

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
Re: [XEN RFC PATCH v4 0/5] IOMMU subsystem redesign and PV-IOMMU interface
Posted by Teddy Astie 1 year ago
Thanks for your review.

> Hi,
> 
> I finally got time to try this revision (sorry it took so long!). My
> goal was to test it this time with some HVM domU too. I didn't get very
> far...
> 
> Issues I hit:
> 
> 1. AMD IOMMU driver is not converted (fails to build), for now disabled
>     CONFIG_AMD_IOMMU.

I haven't really worked on the AMD-Vi code yet. I have plans for it, but
there are some specific bits to deal with (especially regarding interrupt
remapping) that I plan to discuss during the Xen Project Winter
Summit 2025.

> 2. PV shim build fails (linker fails to find p2m_add_identity_entry
>     symbol referenced from iommu.c)

I haven't considered PV shim yet, so I am not really surprised that 
there are some issues with it. We probably want to expose some PV-IOMMU 
features for PV guests under PV shim, but it probably needs some 
specific code for it.

> 3. Xen complains on boot about missing endbr64 (surprisingly, it didn't
>     explode):
> 
>      (XEN) alt table ffff82d0404234d8 -> ffff82d040432d82
>      (XEN) altcall iommu_get_max_iova+0x11/0x30 dest iommu.c#intel_iommu_get_max_iova has no endbr64
>      (XEN) altcall context.c#iommu_reattach_phantom+0x30/0x50 dest iommu.c#intel_iommu_add_devfn has no endbr64
>      (XEN) altcall context.c#iommu_detach_phantom+0x25/0x40 dest iommu.c#intel_iommu_remove_devfn has no endbr64
>      (XEN) altcall iommu_context_init+0x27/0x40 dest iommu.c#intel_iommu_context_init has no endbr64
>      (XEN) altcall iommu_attach_context+0x3c/0xd0 dest iommu.c#intel_iommu_attach has no endbr64
>      (XEN) altcall context.c#iommu_attach_context.cold+0x1d/0x53 dest iommu.c#intel_iommu_detach has no endbr64
>      (XEN) altcall iommu_detach_context+0x37/0xa0 dest iommu.c#intel_iommu_detach has no endbr64
>      (XEN) altcall iommu_reattach_context+0x95/0x240 dest iommu.c#intel_iommu_reattach has no endbr64
>      (XEN) altcall context.c#iommu_reattach_context.cold+0x29/0x110 dest iommu.c#intel_iommu_reattach has no endbr64
>      (XEN) altcall iommu_context_teardown+0x3f/0xa0 dest iommu.c#intel_iommu_context_teardown has no endbr64
>      (XEN) altcall pci.c#deassign_device+0x99/0x270 dest iommu.c#intel_iommu_add_devfn has no endbr64
> 

I also see that, but I am not sure what I need to do to fix it.

> 4. Starting a HVM domU with PCI device fails with:
> 
>      libxl: libxl_pci.c:1552:pci_add_dm_done: Domain 1:xc_assign_device failed: No space left on device
>      libxl: libxl_pci.c:1875:device_pci_add_done: Domain 1:libxl__device_pci_add failed for PCI device 0:aa:0.0 (rc -3)
>      libxl: libxl_create.c:2061:domcreate_attach_devices: Domain 1:unable to add pci devices
>
> I didn't change anything in the toolstack - maybe default context needs
> to be initialized somehow? But the docs suggest the default context
> should work out of the box. On the other hand, changelog for v4 says
> some parts are moved to the toolstack, but I don't see any changes in
> tools/ in this series...
> 

I only tried stuff inside Dom0 and haven't really tried passing
through a device. I think I missed some step regarding quarantine domain
initialization, which is probably why you get "-ENOSPC" here. In the
meantime, you can try setting "quarantine=0" to disable this part and see
if it progresses further.
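
For example (the exact spelling is from memory - I believe quarantine is a
boolean sub-option of "iommu=" - so double-check against the command line
documentation):

    # Xen boot entry, e.g. in grub.cfg
    multiboot2 /boot/xen.gz ... iommu=quarantine=0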

I plan to do some testing of regular PCI passthrough to see if there
are issues there.

> FWIW The exact version I tried is this (this series, on top of staging +
> qubes patches):
> https://github.com/QubesOS/qubes-vmm-xen/pull/200
> At this stage, dom0 kernel didn't have PV-IOMMU driver included yet.
> 
> Full Xen log, with some debug info collected:
> https://gist.github.com/marmarek/e7ac2571df033c7181bf03f21aa5f9ab
> 

Thanks
Teddy



Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
Re: [XEN RFC PATCH v4 0/5] IOMMU subsystem redesign and PV-IOMMU interface
Posted by Marek Marczykowski-Górecki 1 year ago
On Thu, Jan 09, 2025 at 11:39:04AM +0000, Teddy Astie wrote:
> Thanks for your review.
> 
> > Hi,
> > 
> > I finally got time to try this revision (sorry it took so long!). My
> > goal was to test it this time with some HVM domU too. I didn't get very
> > far...
> > 
> > Issues I hit:
> > 
> > 1. AMD IOMMU driver is not converted (fails to build), for now disabled
> >     CONFIG_AMD_IOMMU.
> 
> I haven't really worked on the AMD-Vi code yet. I have plans for it but 
> there is some specific bits to deal with (especially regarding interrupt 
> remapping), that I planned to discuss especially during the Xen Project 
> Winter Summit 2025.

:)

> > 2. PV shim build fails (linker fails to find p2m_add_identity_entry
> >     symbol referenced from iommu.c)
> 
> I haven't considered PV shim yet, so I am not really surprised that 
> there are some issues with it. We probably want to expose some PV-IOMMU 
> features for PV guests under PV shim, but it probably needs some 
> specific code for it.

I'm not sure if passthrough is supported with PV shim (never tried). The
current issue is much earlier ;)

> > 3. Xen complains on boot about missing endbr64 (surprisingly, it didn't
> >     explode):
> > 
> >      (XEN) alt table ffff82d0404234d8 -> ffff82d040432d82
> >      (XEN) altcall iommu_get_max_iova+0x11/0x30 dest iommu.c#intel_iommu_get_max_iova has no endbr64
> >      (XEN) altcall context.c#iommu_reattach_phantom+0x30/0x50 dest iommu.c#intel_iommu_add_devfn has no endbr64
> >      (XEN) altcall context.c#iommu_detach_phantom+0x25/0x40 dest iommu.c#intel_iommu_remove_devfn has no endbr64
> >      (XEN) altcall iommu_context_init+0x27/0x40 dest iommu.c#intel_iommu_context_init has no endbr64
> >      (XEN) altcall iommu_attach_context+0x3c/0xd0 dest iommu.c#intel_iommu_attach has no endbr64
> >      (XEN) altcall context.c#iommu_attach_context.cold+0x1d/0x53 dest iommu.c#intel_iommu_detach has no endbr64
> >      (XEN) altcall iommu_detach_context+0x37/0xa0 dest iommu.c#intel_iommu_detach has no endbr64
> >      (XEN) altcall iommu_reattach_context+0x95/0x240 dest iommu.c#intel_iommu_reattach has no endbr64
> >      (XEN) altcall context.c#iommu_reattach_context.cold+0x29/0x110 dest iommu.c#intel_iommu_reattach has no endbr64
> >      (XEN) altcall iommu_context_teardown+0x3f/0xa0 dest iommu.c#intel_iommu_context_teardown has no endbr64
> >      (XEN) altcall pci.c#deassign_device+0x99/0x270 dest iommu.c#intel_iommu_add_devfn has no endbr64
> > 
> 
> I also see that, but I am not sure what I need to do to fix it.

I guess add the "cf_check" annotation to the functions that are called
indirectly.
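
Something like this, I suppose (untested; the signature below is only
guessed from the symbol names in the log):

    /* Hooks reached via Xen's altcall mechanism need cf_check so that
     * the compiler emits an endbr64 landing pad at function entry. */
    static int cf_check intel_iommu_context_init(struct domain *d,
                                                 struct iommu_context *ctx,
                                                 u32 flags)
    {
        /* ... existing body unchanged ... */
        return 0;
    }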

> > 4. Starting a HVM domU with PCI device fails with:
> > 
> >      libxl: libxl_pci.c:1552:pci_add_dm_done: Domain 1:xc_assign_device failed: No space left on device
> >      libxl: libxl_pci.c:1875:device_pci_add_done: Domain 1:libxl__device_pci_add failed for PCI device 0:aa:0.0 (rc -3)
> >      libxl: libxl_create.c:2061:domcreate_attach_devices: Domain 1:unable to add pci devices
> >
> > I didn't change anything in the toolstack - maybe default context needs
> > to be initialized somehow? But the docs suggest the default context
> > should work out of the box. On the other hand, changelog for v4 says
> > some parts are moved to the toolstack, but I don't see any changes in
> > tools/ in this series...
> > 
> 
> I only tried stuff inside Dom0, but I haven't really tried passing 
> through a device. I think I missed some step regarding quarantine domain 
> initialization, which is probably why you have "-ENOSPC" here. You can 
> try in the meantime to set "quarantine=0" to disable this part to see if 
> it progresses further.

That helped a bit. Now the domU starts, but the device doesn't work - qemu
complains:

[2025-01-09 06:52:45] [00:08.0] xen_pt_realize: Real physical device 00:0d.3 registered successfully
...
[2025-01-09 06:52:45] [00:09.0] xen_pt_realize: Real physical device 00:0d.2 registered successfully
...
[2025-01-09 06:52:45] [00:0a.0] xen_pt_realize: Real physical device 00:0d.0 registered successfully
...
[2025-01-09 06:52:59] [00:0a.0] xen_pt_msgctrl_reg_write: setup MSI (register: 87).
[2025-01-09 06:52:59] [00:0a.0] msi_msix_setup: Error: Mapping of MSI (err: 19, vec: 0x25, entry 0x0)
[2025-01-09 06:52:59] [00:0a.0] xen_pt_msgctrl_reg_write: Warning: Can not map MSI (register: 86)!
[2025-01-09 06:54:21] [00:08.0] msix_set_enable: disabling MSI-X.
[2025-01-09 06:54:21] [00:08.0] xen_pt_msixctrl_reg_write: disable MSI-X
[2025-01-09 06:54:21] [00:09.0] xen_pt_msixctrl_reg_write: enable MSI-X
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x0)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x1)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x2)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x3)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x4)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x5)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x6)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x7)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x8)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x9)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0xa)
[2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0xb)

and interestingly, Xen says all devices are still in dom0:

[2025-01-09 06:53:39] (XEN) ==== PCI devices ====
[2025-01-09 06:53:39] (XEN) ==== segment 0000 ====
[2025-01-09 06:53:39] (XEN) 0000:aa:00.0 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:01:00.0 - d0 - node -1  - MSIs < 132 133 134 135 136 >
[2025-01-09 06:53:39] (XEN) 0000:00:1f.5 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:1f.4 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:1f.3 - d0 - node -1  - MSIs < 139 >
[2025-01-09 06:53:39] (XEN) 0000:00:1f.0 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:1d.0 - d0 - node -1  - MSIs < 131 >
[2025-01-09 06:53:39] (XEN) 0000:00:16.0 - d0 - node -1  - MSIs < 138 >
[2025-01-09 06:53:39] (XEN) 0000:00:15.3 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:15.1 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:15.0 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:14.2 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:14.0 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:12.0 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:0d.3 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:0d.2 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:0d.0 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:0a.0 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:08.0 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:07.3 - d0 - node -1  - MSIs < 130 >
[2025-01-09 06:53:39] (XEN) 0000:00:07.2 - d0 - node -1  - MSIs < 129 >
[2025-01-09 06:53:39] (XEN) 0000:00:07.1 - d0 - node -1  - MSIs < 128 >
[2025-01-09 06:53:39] (XEN) 0000:00:07.0 - d0 - node -1  - MSIs < 127 >
[2025-01-09 06:53:39] (XEN) 0000:00:06.0 - d0 - node -1  - MSIs < 126 >
[2025-01-09 06:53:39] (XEN) 0000:00:04.0 - d0 - node -1
[2025-01-09 06:53:39] (XEN) 0000:00:02.0 - d0 - node -1  - MSIs < 137 >
[2025-01-09 06:53:39] (XEN) 0000:00:00.0 - d0 - node -1

I don't see any errors from the toolstack this time.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
Re: [XEN RFC PATCH v4 0/5] IOMMU subsystem redesign and PV-IOMMU interface
Posted by Teddy Astie 1 year ago
Hello,

>>> 3. Xen complains on boot about missing endbr64 (surprisingly, it didn't
>>>      explode):
>>>
>>>       (XEN) alt table ffff82d0404234d8 -> ffff82d040432d82
>>>       (XEN) altcall iommu_get_max_iova+0x11/0x30 dest iommu.c#intel_iommu_get_max_iova has no endbr64
>>>       (XEN) altcall context.c#iommu_reattach_phantom+0x30/0x50 dest iommu.c#intel_iommu_add_devfn has no endbr64
>>>       (XEN) altcall context.c#iommu_detach_phantom+0x25/0x40 dest iommu.c#intel_iommu_remove_devfn has no endbr64
>>>       (XEN) altcall iommu_context_init+0x27/0x40 dest iommu.c#intel_iommu_context_init has no endbr64
>>>       (XEN) altcall iommu_attach_context+0x3c/0xd0 dest iommu.c#intel_iommu_attach has no endbr64
>>>       (XEN) altcall context.c#iommu_attach_context.cold+0x1d/0x53 dest iommu.c#intel_iommu_detach has no endbr64
>>>       (XEN) altcall iommu_detach_context+0x37/0xa0 dest iommu.c#intel_iommu_detach has no endbr64
>>>       (XEN) altcall iommu_reattach_context+0x95/0x240 dest iommu.c#intel_iommu_reattach has no endbr64
>>>       (XEN) altcall context.c#iommu_reattach_context.cold+0x29/0x110 dest iommu.c#intel_iommu_reattach has no endbr64
>>>       (XEN) altcall iommu_context_teardown+0x3f/0xa0 dest iommu.c#intel_iommu_context_teardown has no endbr64
>>>       (XEN) altcall pci.c#deassign_device+0x99/0x270 dest iommu.c#intel_iommu_add_devfn has no endbr64
>>>
>>
>> I also see that, but I am not sure what I need to do to fix it.
> 
> I guess add "cf_check" annotation to functions that are called
> indirectly.
> 

Will add them for v5.

>>> 4. Starting a HVM domU with PCI device fails with:
>>>
>>>       libxl: libxl_pci.c:1552:pci_add_dm_done: Domain 1:xc_assign_device failed: No space left on device
>>>       libxl: libxl_pci.c:1875:device_pci_add_done: Domain 1:libxl__device_pci_add failed for PCI device 0:aa:0.0 (rc -3)
>>>       libxl: libxl_create.c:2061:domcreate_attach_devices: Domain 1:unable to add pci devices
>>>
>>> I didn't change anything in the toolstack - maybe default context needs
>>> to be initialized somehow? But the docs suggest the default context
>>> should work out of the box. On the other hand, changelog for v4 says
>>> some parts are moved to the toolstack, but I don't see any changes in
>>> tools/ in this series...
>>>
>>
>> I only tried stuff inside Dom0, but I haven't really tried passing
>> through a device. I think I missed some step regarding quarantine domain
>> initialization, which is probably why you have "-ENOSPC" here. You can
>> try in the meantime to set "quarantine=0" to disable this part to see if
>> it progresses further.
> 
> That helped a bit. Now domU starts. But device doesn't work - qemu
> complains:
> 
> [2025-01-09 06:52:45] [00:08.0] xen_pt_realize: Real physical device 00:0d.3 registered successfully
> ...
> [2025-01-09 06:52:45] [00:09.0] xen_pt_realize: Real physical device 00:0d.2 registered successfully
> ...
> [2025-01-09 06:52:45] [00:0a.0] xen_pt_realize: Real physical device 00:0d.0 registered successfully
> ...
> [2025-01-09 06:52:59] [00:0a.0] xen_pt_msgctrl_reg_write: setup MSI (register: 87).
> [2025-01-09 06:52:59] [00:0a.0] msi_msix_setup: Error: Mapping of MSI (err: 19, vec: 0x25, entry 0x0)
> [2025-01-09 06:52:59] [00:0a.0] xen_pt_msgctrl_reg_write: Warning: Can not map MSI (register: 86)!
> [2025-01-09 06:54:21] [00:08.0] msix_set_enable: disabling MSI-X.
> [2025-01-09 06:54:21] [00:08.0] xen_pt_msixctrl_reg_write: disable MSI-X
> [2025-01-09 06:54:21] [00:09.0] xen_pt_msixctrl_reg_write: enable MSI-X
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x0)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x1)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x2)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x3)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x4)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x5)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x6)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x7)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x8)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0x9)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0xa)
> [2025-01-09 06:54:21] [00:09.0] msi_msix_setup: Error: Mapping of MSI-X (err: 19, vec: 0xef, entry 0xb)
> 
> and interestingly, Xen says all devices are still in dom0:
> 
> [2025-01-09 06:53:39] (XEN) ==== PCI devices ====
> [2025-01-09 06:53:39] (XEN) ==== segment 0000 ====
> [2025-01-09 06:53:39] (XEN) 0000:aa:00.0 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:01:00.0 - d0 - node -1  - MSIs < 132 133 134 135 136 >
> [2025-01-09 06:53:39] (XEN) 0000:00:1f.5 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:1f.4 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:1f.3 - d0 - node -1  - MSIs < 139 >
> [2025-01-09 06:53:39] (XEN) 0000:00:1f.0 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:1d.0 - d0 - node -1  - MSIs < 131 >
> [2025-01-09 06:53:39] (XEN) 0000:00:16.0 - d0 - node -1  - MSIs < 138 >
> [2025-01-09 06:53:39] (XEN) 0000:00:15.3 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:15.1 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:15.0 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:14.2 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:14.0 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:12.0 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:0d.3 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:0d.2 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:0d.0 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:0a.0 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:08.0 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:07.3 - d0 - node -1  - MSIs < 130 >
> [2025-01-09 06:53:39] (XEN) 0000:00:07.2 - d0 - node -1  - MSIs < 129 >
> [2025-01-09 06:53:39] (XEN) 0000:00:07.1 - d0 - node -1  - MSIs < 128 >
> [2025-01-09 06:53:39] (XEN) 0000:00:07.0 - d0 - node -1  - MSIs < 127 >
> [2025-01-09 06:53:39] (XEN) 0000:00:06.0 - d0 - node -1  - MSIs < 126 >
> [2025-01-09 06:53:39] (XEN) 0000:00:04.0 - d0 - node -1
> [2025-01-09 06:53:39] (XEN) 0000:00:02.0 - d0 - node -1  - MSIs < 137 >
> [2025-01-09 06:53:39] (XEN) 0000:00:00.0 - d0 - node -1
> 

I checked the PCI passthrough logic, and it looks like some bits are
missing in my code. While the devices seem to be set up properly from the
IOMMU subsystem's point of view (DMA remapping at least), their owning
domains (pdev->domain) are not updated. I suppose that is what confuses
the intremap code.
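
Roughly, the reassignment path needs to also do something like this (a
sketch only; the exact locking and list handling may differ in the tree):

    /* Keep device ownership in sync with the IOMMU state, so that the
     * interrupt remapping code sees the new owner. */
    pcidevs_lock();
    list_move(&pdev->domain_list, &target->pdev_list);
    pdev->domain = target;
    pcidevs_unlock();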

Will do some additional testing with PCI passthrough and plan to fix it 
for v5.

Teddy


Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
Re: [XEN RFC PATCH v4 0/5] IOMMU subsystem redesign and PV-IOMMU interface
Posted by Marek Marczykowski-Górecki 1 year ago
On Thu, Jan 09, 2025 at 12:41:35PM +0000, Teddy Astie wrote:
> Will do some additional testing with PCI passthrough and plan to fix it 
> for v5.

There are PCI passthrough tests on gitlab that should cover the cases I
hit. You may want to let the machine do the work for you ;)

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
Re: [XEN RFC PATCH v4 0/5] IOMMU subsystem redesign and PV-IOMMU interface
Posted by Jan Beulich 1 year ago
On 09.01.2025 12:39, Teddy Astie wrote:
>> 3. Xen complains on boot about missing endbr64 (surprisingly, it didn't
>>     explode):
>>
>>      (XEN) alt table ffff82d0404234d8 -> ffff82d040432d82
>>      (XEN) altcall iommu_get_max_iova+0x11/0x30 dest iommu.c#intel_iommu_get_max_iova has no endbr64
>>      (XEN) altcall context.c#iommu_reattach_phantom+0x30/0x50 dest iommu.c#intel_iommu_add_devfn has no endbr64
>>      (XEN) altcall context.c#iommu_detach_phantom+0x25/0x40 dest iommu.c#intel_iommu_remove_devfn has no endbr64
>>      (XEN) altcall iommu_context_init+0x27/0x40 dest iommu.c#intel_iommu_context_init has no endbr64
>>      (XEN) altcall iommu_attach_context+0x3c/0xd0 dest iommu.c#intel_iommu_attach has no endbr64
>>      (XEN) altcall context.c#iommu_attach_context.cold+0x1d/0x53 dest iommu.c#intel_iommu_detach has no endbr64
>>      (XEN) altcall iommu_detach_context+0x37/0xa0 dest iommu.c#intel_iommu_detach has no endbr64
>>      (XEN) altcall iommu_reattach_context+0x95/0x240 dest iommu.c#intel_iommu_reattach has no endbr64
>>      (XEN) altcall context.c#iommu_reattach_context.cold+0x29/0x110 dest iommu.c#intel_iommu_reattach has no endbr64
>>      (XEN) altcall iommu_context_teardown+0x3f/0xa0 dest iommu.c#intel_iommu_context_teardown has no endbr64
>>      (XEN) altcall pci.c#deassign_device+0x99/0x270 dest iommu.c#intel_iommu_add_devfn has no endbr64
>>
> 
> I also see that, but I am not sure what I need to do to fix it.

Add cf_check to the functions in question, I guess.

Jan
Re: [XEN RFC PATCH v4 0/5] IOMMU subsystem redesign and PV-IOMMU interface
Posted by Marek Marczykowski-Górecki 1 year, 3 months ago
On Mon, Nov 04, 2024 at 02:28:38PM +0000, Teddy Astie wrote:
> * introduce "dom0-iommu=no-dma" to make default context block all DMA
>   (disables HAP and sync-pt), enforcing usage of PV-IOMMU for DMA
>   Can be used to expose properly "Pre-boot DMA protection"

This sounds like it disables HAP completely, but actually it looks like
it disables sharing the HAP page tables with the IOMMU ones, right? That
(HAP sharing) is relevant for PVH dom0 only, correct?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
Re: [XEN RFC PATCH v4 0/5] IOMMU subsystem redesign and PV-IOMMU interface
Posted by Teddy Astie 1 year, 3 months ago
Hello,

On 05/11/2024 at 02:10, Marek Marczykowski-Górecki wrote:
> On Mon, Nov 04, 2024 at 02:28:38PM +0000, Teddy Astie wrote:
>> * introduce "dom0-iommu=no-dma" to make default context block all DMA
>>    (disables HAP and sync-pt), enforcing usage of PV-IOMMU for DMA
>>    Can be used to expose properly "Pre-boot DMA protection"
> 
> This sounds like it disables HAP completely, but actually it looks like
> disabling sharing HAP page tables with IOMMU ones, right? That (HAP
> sharing) is relevant for PVH dom0 only, correct?
> 

Yes, that's it.

Teddy


Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech