[PATCH v3 00/22] AMD vIOMMU: DMA remapping support for VFIO devices

Alejandro Jimenez posted 22 patches 1 week, 1 day ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20250919213515.917111-1-alejandro.j.jimenez@oracle.com
Maintainers: "Michael S. Tsirkin" <mst@redhat.com>, Igor Mammedov <imammedo@redhat.com>, Ani Sinha <anisinha@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Eduardo Habkost <eduardo@habkost.net>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Jason Wang <jasowang@redhat.com>, Yi Liu <yi.l.liu@intel.com>, "Clément Mathieu--Drif" <clement.mathieu--drif@eviden.com>, Peter Xu <peterx@redhat.com>, David Hildenbrand <david@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>
This series adds support for guests using the AMD vIOMMU to enable DMA remapping
for VFIO devices. Please see the v1 cover letter [0] for additional details, such
as the example QEMU command line parameters used in testing.

I have sanity tested on an AMD EPYC Genoa host, booting a Linux guest with
'iommu.passthrough=0' and several CX6 VFs, and there are no issues during
typical guest operation.

When the non-default parameter 'iommu.forcedac=1' is used in the guest kernel
cmdline, this configuration initially fails due to a VFIO integer overflow bug,
which requires the following fix in the host kernel:

https://github.com/aljimenezb/linux/commit/014be8cafe7464d278729583a2dd5d94514e2e2a

That fix is a work in progress, as there are other locations in the driver that
are susceptible to overflows, but it is sufficient to fix the initial problem.
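
As an illustration of the class of bug involved, and not the actual host kernel
fix linked above, the sketch below shows how a range check can avoid the wrap
that occurs when an IOVA near the top of the 64-bit space is added to a size.
The helper names iova_range_last() and iova_ranges_overlap() are hypothetical
and do not exist in the VFIO driver; the assumption is simply that a naive
'iova + size' end computation wraps to a small value for such ranges.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the VFIO driver): compute the last
 * byte covered by [iova, iova + size) without letting the sum wrap.
 * Returns false for an empty range or one that runs past UINT64_MAX.
 */
static bool iova_range_last(uint64_t iova, uint64_t size, uint64_t *last)
{
    if (size == 0) {
        return false;
    }
    /* size - 1 cannot underflow here; compare it to the remaining room. */
    if (size - 1 > UINT64_MAX - iova) {
        return false;
    }
    *last = iova + (size - 1);
    return true;
}

/* Inclusive-end overlap test; it never evaluates iova + size. */
static bool iova_ranges_overlap(uint64_t a_first, uint64_t a_last,
                                uint64_t b_first, uint64_t b_last)
{
    return a_first <= b_last && b_first <= a_last;
}
```

Working with an inclusive last address rather than an exclusive end keeps a
mapping that finishes exactly at the top of the address space representable,
which an exclusive end (wrapping to 0) does not.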

Even after that fix is applied, I see an issue on guest reboot when 'forcedac=1'
is in use. Although the guest boots, the VF is not properly initialized, failing
with a timeout. Once the guest reaches userspace, the VF driver can be reloaded
and it then works as expected. I am still investigating the root cause of this
issue, and will need to discuss all the steps I have tried to eliminate
potential sources of errors in a separate thread.

I am sending v3 despite this known issue since forcedac=1 is not a default or
commonly known/used setting. Having large portions of the infrastructure for
DMA remapping already in place (and working) will make it easier to debug this
corner case and get feedback/testing from the community. I hope this is a viable
approach; otherwise, I am happy to discuss all the steps I have taken to debug
this issue in this thread and test any suggestions to address it.

Changes since v2[2]:
- P5: Fixed missed check for AMDVI_FR_DTE_RTR_ERR in amdvi_do_translate() (Sairaj)
- P6: Reword commit message to clarify the need to discern between empty PTEs and errors (Vasant)
- P9: Use correct enum type for notifier flags and remove whitespace changes (Sairaj)
- P11: Fixed integer overflow bug when guest uses iommu.forcedac=1. Fixed in P8. (Sairaj)
- P15: Fixed typo in commit message (Sairaj)
- P16: On reset, use passthrough mode by default on all address spaces (Sairaj)
- P18: Enforce isolation by using DMA mode on errors retrieving DTE (Ethan & Sairaj)
- P20: Removed unused pte_override_page_mask() and pte_get_page_mask() to avoid -Wunused-function error.
- Add HATDis support patches from Joao Martins (HATDis available in Linux since [1])

Thank you,
Alejandro

[0] https://lore.kernel.org/all/20250414020253.443831-1-alejandro.j.jimenez@oracle.com/
[1] https://lore.kernel.org/all/cover.1749016436.git.Ankit.Soni@amd.com/
[2] https://lore.kernel.org/qemu-devel/20250502021605.1795985-1-alejandro.j.jimenez@oracle.com/

Alejandro Jimenez (20):
  memory: Adjust event ranges to fit within notifier boundaries
  amd_iommu: Document '-device amd-iommu' common options
  amd_iommu: Reorder device and page table helpers
  amd_iommu: Helper to decode size of page invalidation command
  amd_iommu: Add helper function to extract the DTE
  amd_iommu: Return an error when unable to read PTE from guest memory
  amd_iommu: Add helpers to walk AMD v1 Page Table format
  amd_iommu: Add a page walker to sync shadow page tables on
    invalidation
  amd_iommu: Add basic structure to support IOMMU notifier updates
  amd_iommu: Sync shadow page tables on page invalidation
  amd_iommu: Use iova_tree records to determine large page size on UNMAP
  amd_iommu: Unmap all address spaces under the AMD IOMMU on reset
  amd_iommu: Add replay callback
  amd_iommu: Invalidate address translations on INVALIDATE_IOMMU_ALL
  amd_iommu: Toggle memory regions based on address translation mode
  amd_iommu: Set all address spaces to use passthrough mode on reset
  amd_iommu: Add dma-remap property to AMD vIOMMU device
  amd_iommu: Toggle address translation mode on devtab entry
    invalidation
  amd_iommu: Do not assume passthrough translation when DTE[TV]=0
  amd_iommu: Refactor amdvi_page_walk() to use common code for page walk

Joao Martins (2):
  i386/intel-iommu: Move dma_translation to x86-iommu
  amd_iommu: HATDis/HATS=11 support

 hw/i386/acpi-build.c        |    6 +-
 hw/i386/amd_iommu.c         | 1056 ++++++++++++++++++++++++++++++-----
 hw/i386/amd_iommu.h         |   51 ++
 hw/i386/intel_iommu.c       |    5 +-
 hw/i386/x86-iommu.c         |    1 +
 include/hw/i386/x86-iommu.h |    1 +
 qemu-options.hx             |   23 +
 system/memory.c             |   10 +-
 8 files changed, 999 insertions(+), 154 deletions(-)


base-commit: ab8008b231e758e03c87c1c483c03afdd9c02e19
-- 
2.43.5