[PATCH v5 0/2] s390x/pci: relax I/O address translation requirement

Matthew Rosato posted 2 patches 1 month ago
hw/s390x/s390-pci-bus.c         | 39 +++++++++++++++++++++++++++++++--
hw/s390x/s390-pci-inst.c        | 13 +++++++++--
hw/s390x/s390-pci-vfio.c        | 28 ++++++++++++++++++-----
hw/s390x/s390-virtio-ccw.c      |  5 +++++
include/hw/s390x/s390-pci-bus.h |  3 +++
include/hw/s390x/s390-pci-clp.h |  1 +
6 files changed, 80 insertions(+), 9 deletions(-)
Posted by Matthew Rosato 1 month ago
This series introduces the concept of the relaxed translation requirement
for s390x guests in order to allow bypass of the guest IOMMU for more
efficient PCI passthrough.

With this series, QEMU can indicate to the guest that an IOMMU is not
strictly required for a zPCI device.  This subsequently allows a
Linux guest to use iommu.passthrough=1 and bypass its guest IOMMU for
PCI devices.

When this occurs, QEMU will note the behavior via an intercepted MPCIFC
instruction and will fill the host IOMMU with mappings of the entire
guest address space in response.
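
The decision QEMU makes when it intercepts the request can be sketched
roughly as below.  This is a minimal standalone model, not the code from
the patches: the enum, the struct, register_ioat() and the
wants_direct_map flag are hypothetical names for illustration, and only
rtr_avail is a name taken from this series.

```c
#include <stdbool.h>

/* Hypothetical outcomes for illustration; not QEMU's real status codes. */
enum ioat_result {
    IOAT_TRANSLATED,    /* guest keeps using its IOMMU translation tables */
    IOAT_DIRECT_MAPPED, /* host IOMMU is filled with the whole guest space */
    IOAT_REJECTED       /* direct mapping requested but not available */
};

/* Hypothetical per-device state; only rtr_avail is named in the series. */
struct zpci_dev_model {
    bool rtr_avail; /* relaxed translation was advertised for this device */
};

/*
 * Model of the choice made on an intercepted MPCIFC request.
 * How the guest actually signals a direct-mapping request is not
 * modeled here; it is reduced to the wants_direct_map flag.
 */
static enum ioat_result register_ioat(const struct zpci_dev_model *dev,
                                      bool wants_direct_map)
{
    if (!wants_direct_map) {
        return IOAT_TRANSLATED;
    }
    if (!dev->rtr_avail) {
        /* e.g. emulated or ISM devices, or the property set to off */
        return IOAT_REJECTED;
    }
    return IOAT_DIRECT_MAPPED;
}
```

In this model, a passthrough device with rtr_avail set gets the direct
mapping, while a device with rtr_avail cleared has the same request
rejected.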

The kernel series [1] that adds the relevant behavior needed to
exploit this new feature from within an s390x Linux guest is available
in linux-next via iommu-next.

[1]: https://lore.kernel.org/linux-s390/20250212213418.182902-1-mjrosato@linux.ibm.com/

Changes for v5:
- Add some review/test tags (had to drop some due to code changes)
- Dynamically allocate iommu->dm_mr, remove direct_map bool

Changes for v4:
- use get_system_memory() instead of ms->ram
- rename rtr_allowed to rtr_avail
- turn off rtr_avail for emulated devices so the MPCIFC fence properly
  rejects an attempt at direct mapping (we only advertise via CLP
  for passthrough devices)
- turn off rtr_avail for passthrough ISM devices
- various minor changes

Changes for v3:
- use s390_get_memory_limit
- advertise full aperture for relaxed-translation-capable devices

Changes for v2:
- Add relax-translation property, fence for older machines
- Add a new MPCIFC failure case when direct-mapping requested but
  the relax-translation property is set to off.
- For direct mapping, use a memory alias to handle the SMDA offset and
  then just let vfio handle the pinning of memory.
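
The effect of the memory alias can be illustrated with the standalone
sketch below.  It assumes the alias places guest memory at a fixed
offset in the device's address space, so that a device address minus
that offset is a guest physical address.  The struct and function names
are hypothetical and do not come from the patches; the real code uses
QEMU memory regions rather than manual arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the alias window seen by the device. */
struct direct_map_alias {
    uint64_t iova_base; /* offset of the window in device address space */
    uint64_t size;      /* bytes of guest memory covered by the window */
};

/*
 * Translate a device address (IOVA) into a guest physical address.
 * Returns false when the IOVA falls outside the alias window.
 */
static bool direct_map_translate(const struct direct_map_alias *alias,
                                 uint64_t iova, uint64_t *gpa)
{
    if (iova < alias->iova_base || iova - alias->iova_base >= alias->size) {
        return false;
    }
    *gpa = iova - alias->iova_base;
    return true;
}
```

With a window based at 0x100000000 covering 2 GiB, for example, device
address 0x100001000 resolves to guest physical address 0x1000, and
addresses below the base or at base plus size are refused.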

Matthew Rosato (2):
  s390x/pci: add support for guests that request direct mapping
  s390x/pci: indicate QEMU supports relaxed translation for passthrough

 hw/s390x/s390-pci-bus.c         | 39 +++++++++++++++++++++++++++++++--
 hw/s390x/s390-pci-inst.c        | 13 +++++++++--
 hw/s390x/s390-pci-vfio.c        | 28 ++++++++++++++++++-----
 hw/s390x/s390-virtio-ccw.c      |  5 +++++
 include/hw/s390x/s390-pci-bus.h |  3 +++
 include/hw/s390x/s390-pci-clp.h |  1 +
 6 files changed, 80 insertions(+), 9 deletions(-)

-- 
2.48.1
Re: [PATCH v5 0/2] s390x/pci: relax I/O address translation requirement
Posted by Thomas Huth 4 weeks, 1 day ago
On 26/02/2025 22.00, Matthew Rosato wrote:
> This series introduces the concept of the relaxed translation requirement
> for s390x guests in order to allow bypass of the guest IOMMU for more
> efficient PCI passthrough.
> 
> With this series, QEMU can indicate to the guest that an IOMMU is not
> strictly required for a zPCI device.  This subsequently allows a
> Linux guest to use iommu.passthrough=1 and bypass its guest IOMMU for
> PCI devices.
> 
> When this occurs, QEMU will note the behavior via an intercepted MPCIFC
> instruction and will fill the host IOMMU with mappings of the entire
> guest address space in response.
> 
> The kernel series [1] that adds the relevant behavior needed to
> exploit this new feature from within an s390x Linux guest is available
> in linux-next via iommu-next.
> 
> [1]: https://lore.kernel.org/linux-s390/20250212213418.182902-1-mjrosato@linux.ibm.com/
> 
> Changes for v5:
> - Add some review/test tags (had to drop some due to code changes)
> - Dynamically allocate iommu->dm_mr, remove direct_map bool
> 
> Changes for v4:
> - use get_system_memory() instead of ms->ram
> - rename rtr_allowed to rtr_avail
> - turn off rtr_avail for emulated devices so the MPCIFC fence properly
>    rejects an attempt at direct mapping (we only advertise via CLP
>    for passthrough devices)
> - turn off rtr_avail for passthrough ISM devices
> - various minor changes
> 
> Changes for v3:
> - use s390_get_memory_limit
> - advertise full aperture for relaxed-translation-capable devices
> 
> Changes for v2:
> - Add relax-translation property, fence for older machines
> - Add a new MPCIFC failure case when direct-mapping requested but
>    the relax-translation property is set to off.
> - For direct mapping, use a memory alias to handle the SMDA offset and
>    then just let vfio handle the pinning of memory.
> 
> Matthew Rosato (2):
>    s390x/pci: add support for guests that request direct mapping
>    s390x/pci: indicate QEMU supports relaxed translation for passthrough
> 
>   hw/s390x/s390-pci-bus.c         | 39 +++++++++++++++++++++++++++++++--
>   hw/s390x/s390-pci-inst.c        | 13 +++++++++--
>   hw/s390x/s390-pci-vfio.c        | 28 ++++++++++++++++++-----
>   hw/s390x/s390-virtio-ccw.c      |  5 +++++
>   include/hw/s390x/s390-pci-bus.h |  3 +++
>   include/hw/s390x/s390-pci-clp.h |  1 +
>   6 files changed, 80 insertions(+), 9 deletions(-)
> 

Series
Acked-by: Thomas Huth <thuth@redhat.com>

I'll queue it for my next pull request.

  Thomas