On x86, when assigning VFIO-PCI devices protected with virtio-iommu we encounter the case where the guest tries to map IOVAs beyond 48b whereas the physical VTD IOMMU only supports 48b. This ends up with VFIO_MAP_DMA failures at QEMU level because, at kernel level, the vfio_iommu_iova_dma_valid() check returns false in vfio_dma_do_map().

This is due to the fact that the virtio-iommu currently unconditionally exposes an IOVA range of 64b through its config input range fields.

This series removes this assumption by retrieving the usable IOVA regions through the VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE UAPI when a VFIO device is attached. This info is communicated to the virtio-iommu memory region and transformed into the inverse info, i.e. the host reserved IOVA regions. The latter are then combined with the reserved IOVA regions set through the virtio-iommu reserved-regions property. That way, the guest virtio-iommu driver, unchanged, is able to probe the whole set of reserved regions and prevent any IOVA belonging to those ranges from being used, achieving the original goal.
Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu/tree/virtio-iommu_geometry_v1

Eric Auger (13):
  memory: Let ReservedRegion use Range
  memory: Introduce memory_region_iommu_set_iova_ranges
  vfio: Collect container iova range info
  virtio-iommu: Rename reserved_regions into prop_resv_regions
  virtio-iommu: Introduce per IOMMUDevice reserved regions
  range: Introduce range_inverse_array()
  virtio-iommu: Implement set_iova_ranges() callback
  range: Make range_compare() public
  util/reserved-region: Add new ReservedRegion helpers
  virtio-iommu: Consolidate host reserved regions and property set ones
  test: Add some tests for range and resv-mem helpers
  virtio-iommu: Resize memory region according to the max iova info
  vfio: Remove 64-bit IOVA address space assumption

 include/exec/memory.h            |  30 ++++-
 include/hw/vfio/vfio-common.h    |   2 +
 include/hw/virtio/virtio-iommu.h |   7 +-
 include/qemu/range.h             |   9 ++
 include/qemu/reserved-region.h   |  32 +++++
 hw/core/qdev-properties-system.c |   9 +-
 hw/vfio/common.c                 |  70 ++++++++---
 hw/virtio/virtio-iommu-pci.c     |   8 +-
 hw/virtio/virtio-iommu.c         |  85 +++++++++++--
 softmmu/memory.c                 |  15 +++
 tests/unit/test-resv-mem.c       | 198 +++++++++++++++++++++++++++++++
 util/range.c                     |  41 ++++++-
 util/reserved-region.c           |  94 +++++++++++++++
 hw/virtio/trace-events           |   1 +
 tests/unit/meson.build           |   1 +
 util/meson.build                 |   1 +
 16 files changed, 562 insertions(+), 41 deletions(-)
 create mode 100644 include/qemu/reserved-region.h
 create mode 100644 tests/unit/test-resv-mem.c
 create mode 100644 util/reserved-region.c

--
2.41.0
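The core idea behind the series' range_inverse_array() patch — turning the usable IOVA ranges reported by VFIO into their complement, i.e. the host reserved regions — can be sketched as below. This is an illustrative, standalone sketch only, not QEMU's actual implementation: the SimpleRange type and the range_inverse() name are hypothetical stand-ins for QEMU's Range/ReservedRegion helpers.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Range: inclusive bounds. */
typedef struct { uint64_t lob; uint64_t upb; } SimpleRange;

/*
 * Write the sub-ranges of [low, high] NOT covered by in[] into out[].
 * in[] must be sorted by lower bound and non-overlapping; out[] must
 * have room for n_in + 1 entries. Returns the number of "reserved"
 * (complement) ranges produced.
 */
static size_t range_inverse(const SimpleRange *in, size_t n_in,
                            uint64_t low, uint64_t high,
                            SimpleRange *out)
{
    size_t n_out = 0;
    uint64_t next = low;   /* first address not yet known to be usable */

    for (size_t i = 0; i < n_in; i++) {
        if (in[i].lob > next) {
            /* gap before this usable range: it is reserved */
            out[n_out].lob = next;
            out[n_out].upb = in[i].lob - 1;
            n_out++;
        }
        if (in[i].upb == UINT64_MAX) {
            return n_out;  /* usable up to the top of the space */
        }
        next = in[i].upb + 1;
    }
    if (next <= high) {
        /* tail gap after the last usable range */
        out[n_out].lob = next;
        out[n_out].upb = high;
        n_out++;
    }
    return n_out;
}
```

For example, a host VTD limited to 48b usable IOVA space inverts into a single reserved region [1 << 48, UINT64_MAX], which is exactly what the guest driver needs to see to stop mapping beyond 48b.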
I have run the following two tests, but both tests failed:

[1] start a VM with virtio-iommu + 2 ice PFs only via qemu-kvm 8.1.5
Test result: qemu-kvm keeps throwing the error:
VFIO_MAP_DMA failed: File exists.
vfio_dma_map(0x56443d20fbe0, 0xffffe000, 0x1000, 0x7fb545709000) = -17 (File exists)

[2] start a VM with virtio-iommu + 2 ice PFs via libvirt-9.5 + qemu-kvm 8.1.5
Test result: qemu-kvm core dumps with:
ERROR:../qom/object.c:1198:object_unref: assertion failed: (obj->ref > 0)
Bail out! ERROR:../qom/object.c:1198:object_unref: assertion failed: (obj->ref > 0)

After removing the 2 PFs from the VM, both tests passed.

Tested-by: Yanghang Liu <yanghliu@redhat.com>

Best Regards,
YangHang Liu


On Mon, Sep 4, 2023 at 4:08 PM Eric Auger <eric.auger@redhat.com> wrote:
>
> On x86, when assigning VFIO-PCI devices protected with virtio-iommu
> we encounter the case where the guest tries to map IOVAs beyond 48b
> whereas the physical VTD IOMMU only supports 48b.
[...]
Hi,

On 9/5/23 10:22, YangHang Liu wrote:
> I have run the following two tests, but both tests failed:
> [1] start a VM with virtio-iommu + 2 ice PFs only via qemu-kvm 8.1.5
> Test result: qemu-kvm keeps throwing the error: VFIO_MAP_DMA
> failed: File exists. vfio_dma_map(0x56443d20fbe0, 0xffffe000, 0x1000,
> 0x7fb545709000) = -17 (File exists)
> [2] start a VM with virtio-iommu + 2 ice PFs via libvirt-9.5 + qemu-kvm 8.1.5
> Test result: qemu-kvm core dumps with
> ERROR:../qom/object.c:1198:object_unref: assertion failed: (obj->ref > 0)

This latter issue is introduced by patch

  [PATCH 12/13] virtio-iommu: Resize memory region according to the max iova info

and especially the call to

  memory_region_set_size(&mr->parent_obj, max_iova + 1);

I don't really get why at the moment but I will investigate ...

Eric

> After removing the 2 PFs from the VM, both tests passed.
>
> Tested-by: Yanghang Liu <yanghliu@redhat.com>
[...]
Hi Yanghang,

On 9/5/23 10:22, YangHang Liu wrote:
> I have run the following two tests, but both tests failed:
[...]
> After removing the 2 PFs from the VM, both tests passed.
>
> Tested-by: Yanghang Liu <yanghliu@redhat.com>

Thank you for testing. If my understanding is correct, you still encountered some issues with/after the series. If that is the case, you should not offer your Tested-by, which would mean you tested the series and it works fine for you / fixes your issue.

Coming back to the above mentioned issues:

1) The "File exists" issue is known and is linked to the replay. This will be handled separately, i.e. I need to resume working on it as my first approach was flawed. See
https://lore.kernel.org/all/20221207133646.635760-1-eric.auger@redhat.com/
This is unrelated to this series. Note it shouldn't prevent your passed-through device from working; those should just be spurious warnings that need to be removed.

2) The object_unref assertion most probably is linked to this series and I will investigate asap.

Thank you again!

Eric

[...]
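The patch "virtio-iommu: Consolidate host reserved regions and property set ones" implies merging two lists of reserved regions (the host ones derived from VFIO and the ones set via the reserved-regions property) into a single sorted, non-overlapping list. A hedged, standalone sketch of such a consolidation — sort by lower bound, then coalesce overlapping or adjacent ranges — could look like this; the SimpleRange type and function names are illustrative, not QEMU's actual API.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's Range: inclusive bounds. */
typedef struct { uint64_t lob; uint64_t upb; } SimpleRange;

/* Order ranges by lower bound, in the spirit of range_compare(). */
static int range_cmp(const void *a, const void *b)
{
    const SimpleRange *ra = a, *rb = b;
    return (ra->lob > rb->lob) - (ra->lob < rb->lob);
}

/*
 * Sort r[] (e.g. host reserved regions concatenated with property-set
 * ones) by lower bound and coalesce overlapping or adjacent ranges in
 * place. Returns the resulting number of ranges.
 */
static size_t resv_consolidate(SimpleRange *r, size_t n)
{
    size_t out = 0;

    if (n == 0) {
        return 0;
    }
    qsort(r, n, sizeof(*r), range_cmp);
    for (size_t i = 1; i < n; i++) {
        if (r[out].upb != UINT64_MAX && r[i].lob > r[out].upb + 1) {
            r[++out] = r[i];          /* disjoint: start a new region */
        } else if (r[i].upb > r[out].upb) {
            r[out].upb = r[i].upb;    /* overlap/adjacent: extend */
        }
    }
    return out + 1;
}
```

With such a helper, overlapping regions (say, a property-set region intersecting the host MSI window) collapse into one entry, so the guest driver probes a minimal, consistent reserved-region list.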
On Mon, 4 Sep 2023 10:03:43 +0200
Eric Auger <eric.auger@redhat.com> wrote:

> On x86, when assigning VFIO-PCI devices protected with virtio-iommu
> we encounter the case where the guest tries to map IOVAs beyond 48b
> whereas the physical VTD IOMMU only supports 48b.
[...]
> This series removes this assumption by retrieving the usable IOVA
> regions through the VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE UAPI when
> a VFIO device is attached. This info is communicated to the
> virtio-iommu memory region, transformed into the inversed info, ie.
> the host reserved IOVA regions.

Hi Eric,

I don't quite follow this relative to device hotplug. Are we manipulating a per-device memory region which is created at device add time? Is that memory region actually shared in some cases, for instance if we have a PCIe-to-PCI bridge aliasing devices on the conventional side?

Thanks,
Alex

[...]
Hi Alex,

On 9/5/23 19:55, Alex Williamson wrote:
> Hi Eric,
>
> I don't quite follow this relative to device hotplug. Are we
> manipulating a per-device memory region which is created at device add
> time? Is that memory region actually shared in some cases, for instance
> if we have a PCIe-to-PCI bridge aliasing devices on the conventional
> side?

I agree this deserves more attention and testing in the case of hotplug and aliasing.

Wrt the PCIe-to-PCI bridge, virtio-iommu and SMMU are known to be broken with the latter due to lack of kernel support (an issue with group probing, but this might change in the future), so this is not a currently supported configuration, as opposed to the virtual Intel IOMMU.

Here I was mostly assuming one device per container and per IOMMU MR, but maybe I have to detect & forbid more complex scenarios.

Thanks

Eric

[...]