This series implements the emulation code for ARM SMMUv3.

Changes since v6:
- DPDK testpmd now running on guest with 2 assigned VFs
- Changed the instantiation method: add the following option to
  the QEMU command line
    -device smmuv3 # for virtio/vhost use cases
    -device smmuv3,caching-mode # for vfio use cases (based on [1])
- split the series into smaller patches to allow the review
- the VFIO integration based on "tlbi-on-map" smmuv3 driver
  is isolated from the rest: last 2 patches, not for upstream.
  This is shipped for testing/bench until a better solution is found.
- Reworked permission flag checks and event generation

testing:
- in dt and ACPI modes
- virtio-net-pci and vhost-net devices using dma ops with various
  guest page sizes [2]
- assigned VFs using dma ops [3]:
  - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
  - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
  - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
    with guest and host page size equal (4kB)

Known limitations:
- no VMSAv8-32 support
- no nested stage support (S1 + S2)
- no support for HYP mappings
- register fine emulation, commands, interrupts and errors were
  not accurately tested. Handling is sufficient to run use cases
  described above though.
- interrupts and event generation not observed yet.
Best Regards

Eric

This series can be found at:
v7: https://github.com/eauger/qemu/tree/v2.10.0-SMMU-v7
Previous version at:
v6: https://github.com/eauger/qemu/tree/v2.10.0-rc2-SMMU-v6

References:
[1] [RFC v2 0/4] arm-smmu-v3 tlbi-on-map option
https://lkml.org/lkml/2017/8/11/426

[2] qemu cmd line excerpt:
    -device smmuv3 \
    -netdev tap,id=tap0,script=no,downscript=no,ifname=tap0,vhost=off \
    -device virtio-net-pci,netdev=tap0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on \

[3] use -device smmuv3,caching-mode


History:
v6 -> v7:
- see above

v5 -> v6:
- Rebase on 2.10 and IOMMUMemoryRegion
- add ACPI TLBI_ON_MAP support (VFIO integration also works in
  ACPI mode)
- fix block replay
- handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
  (goes along with TLBI_ON_MAP FW quirk)
- replay systematically unmap the whole range first
- smmuv3_map_hook does not unmap anymore and the unmap is done
  before the replay
- add and use smmuv3_context_device_invalidate instead of
  blindly replaying everything

v4 -> v5:
- initial_level now part of SMMUTransCfg
- smmu_page_walk_64 takes into account the max input size
- implement sys->iommu_ops.replay and sys->iommu_ops.notify_flag_changed
- smmuv3_translate: bug fix: don't walk on bypass
- smmu_update_qreg: fix PROD index update
- I did not yet address Peter's comments as the code is not mature enough
  to be split into sub patches.

v3 -> v4 [Eric]:
- page table walk rewritten to allow scan of the page table within a
  range of IOVA. This prepares for VFIO integration and replay.
- configuration parsing partially reworked.
- do not advertise unsupported/untested features: S2, S1 + S2, HYP,
  PRI, ATS, ..
- added ACPI table generation
- migrated to dynamic traces
- mingw compilation fix

v2 -> v3 [Eric]:
- rebased on 2.9
- mostly code and patch reorganization to ease the review process
- optional patches removed. They may be handled separately. I am currently
  working on ACPI enablement.
- optional instantiation of the smmu in mach-virt
- removed [2/9] (fdt functions) since not mandated
- start splitting main patch into base and derived object
- no new function feature added

v1 -> v2 [Prem]:
- Adopted review comments from Eric Auger
- Make SMMU_DPRINTF to internally call qemu_log
  (since translation requests are too many, we need control
  on the type of log we want)
- SMMUTransCfg modified to suit simplicity
- Change RegInfo to uint64 register array
- Code cleanup
- Test cleanups
- Reshuffled patches

v0 -> v1 [Prem]:
- As per SMMUv3 spec 16.0 (only is_ste_consistant() is noticeable)
- Reworked register access/update logic
- Factored out translation code for
  - single point bug fix
  - sharing/removal in future
- (optional) Unit tests added, with PCI test device
  - S1 with 4k/64k, S1+S2 with 4k/64k
  - (S1 or S2) only can be verified by Linux 4.7 driver
- (optional) Preliminary ACPI support

v0 [Prem]:
- Implements SMMUv3 spec 11.0
- Supported for PCIe devices,
- Command Queue and Event Queue supported
- LPAE only, S1 is supported and Tested, S2 not tested
- BE mode Translation not supported
- IRQ support (legacy, no MSI)

Eric Auger (18):
  hw/arm/smmu-common: smmu base device and datatypes
  hw/arm/smmu-common: IOMMU memory region and address space setup
  hw/arm/smmu-common: smmu_read/write_sysmem
  hw/arm/smmu-common: VMSAv8-64 page table walk
  hw/arm/smmuv3: Wired IRQ and GERROR helpers
  hw/arm/smmuv3: Queue helpers
  hw/arm/smmuv3: Implement MMIO write operations
  hw/arm/smmuv3: Event queue recording helper
  hw/arm/smmuv3: Implement translate callback
  target/arm/kvm: Translate the MSI doorbell in kvm_arch_fixup_msi_route
  hw/arm/smmuv3: Implement data structure and TLB invalidation
    notifications
  hw/arm/smmuv3: Implement IOMMU memory region replay callback
  hw/arm/virt: Store the PCI host controller dt phandle
  hw/arm/sysbus-fdt: Pass the VirtMachineState to the node creation
    functions
  hw/arm/sysbus-fdt: Pass the platform bus base address in
    PlatformBusFDTData
  hw/arm/sysbus-fdt: Allow smmuv3 dynamic instantiation
  hw/arm/smmuv3: [not for upstream] add SMMU_CMD_TLBI_NH_VA_AM handling
  hw/arm/smmuv3: [not for upstream] Add caching-mode option

Prem Mallappa (2):
  hw/arm/smmuv3: Skeleton
  hw/arm/virt-acpi-build: Add smmuv3 node in IORT table

 default-configs/aarch64-softmmu.mak |    1 +
 hw/arm/Makefile.objs                |    1 +
 hw/arm/smmu-common.c                |  527 ++++++++++++++++
 hw/arm/smmu-internal.h              |  105 ++++
 hw/arm/smmuv3-internal.h            |  584 +++++++++++++++++
 hw/arm/smmuv3.c                     | 1181 +++++++++++++++++++++++++++++++++++
 hw/arm/sysbus-fdt.c                 |  129 +++-
 hw/arm/trace-events                 |   48 ++
 hw/arm/virt-acpi-build.c            |   63 +-
 hw/arm/virt.c                       |    6 +-
 include/hw/acpi/acpi-defs.h         |   15 +
 include/hw/arm/smmu-common.h        |  123 ++++
 include/hw/arm/smmuv3.h             |   80 +++
 include/hw/arm/sysbus-fdt.h         |    2 +
 include/hw/arm/virt.h               |   15 +
 target/arm/kvm.c                    |   27 +
 target/arm/trace-events             |    3 +
 17 files changed, 2886 insertions(+), 24 deletions(-)
 create mode 100644 hw/arm/smmu-common.c
 create mode 100644 hw/arm/smmu-internal.h
 create mode 100644 hw/arm/smmuv3-internal.h
 create mode 100644 hw/arm/smmuv3.c
 create mode 100644 include/hw/arm/smmu-common.h
 create mode 100644 include/hw/arm/smmuv3.h

--
2.5.5
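The history above mentions queue helpers and a fix to the PROD index update. As a rough illustration only, the sketch below models the circular-queue convention used by the SMMUv3 architecture, where each of the PROD and CONS registers holds an index plus a single wrap bit directly above it. The class and method names (SmmuQueue, produce, consume) are hypothetical and are not taken from the series; this is a toy model under those assumptions, not the QEMU implementation.

```python
from dataclasses import dataclass


@dataclass
class SmmuQueue:
    """Toy model of an SMMUv3-style circular queue (hypothetical, not QEMU code)."""

    log2size: int  # the queue holds 2**log2size entries
    prod: int = 0  # producer register: index bits plus one wrap bit above them
    cons: int = 0  # consumer register: same layout as prod

    def _index(self, reg: int) -> int:
        # Low log2size bits select the entry.
        return reg & ((1 << self.log2size) - 1)

    def _wrap(self, reg: int) -> int:
        # The bit just above the index toggles on every wrap-around.
        return (reg >> self.log2size) & 1

    def _advance(self, reg: int) -> int:
        # Incrementing index and wrap together lets the carry flip the wrap bit.
        return (reg + 1) & ((1 << (self.log2size + 1)) - 1)

    def is_empty(self) -> bool:
        # Same index and same wrap bit: nothing to consume.
        return self.prod == self.cons

    def is_full(self) -> bool:
        # Same index but different wrap bit: producer is one lap ahead.
        return (self._index(self.prod) == self._index(self.cons)
                and self._wrap(self.prod) != self._wrap(self.cons))

    def produce(self) -> None:
        if self.is_full():
            raise OverflowError("queue full")
        self.prod = self._advance(self.prod)

    def consume(self) -> None:
        if self.is_empty():
            raise IndexError("queue empty")
        self.cons = self._advance(self.cons)
```

The wrap bit is what lets a reader tell a full queue from an empty one when both indexes are equal, so an update that touches only the index bits would lose that distinction.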
On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
> This series implements the emulation code for ARM SMMUv3.
>
> Changes since v6:
> - DPDK testpmd now running on guest with 2 assigned VFs
> - Changed the instantiation method: add the following option to
>   the QEMU command line
>     -device smmuv3 # for virtio/vhost use cases
>     -device smmuv3,caching-mode # for vfio use cases (based on [1])
> - split the series into smaller patches to allow the review
> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>   is isolated from the rest: last 2 patches, not for upstream.
>   This is shipped for testing/bench until a better solution is found.
> - Reworked permission flag checks and event generation

Hi Eric -- I see you've upgraded this from an RFC to a PATCH set.
Do you want the patches reviewed and (eventually) taken into git
now?

thanks
-- PMM
Hi Peter,

On 07/09/2017 14:39, Peter Maydell wrote:
> On 1 September 2017 at 18:21, Eric Auger <eric.auger@redhat.com> wrote:
>> This series implements the emulation code for ARM SMMUv3.
>>
>> Changes since v6:
>> - DPDK testpmd now running on guest with 2 assigned VFs
>> - Changed the instantiation method: add the following option to
>>   the QEMU command line
>>     -device smmuv3 # for virtio/vhost use cases
>>     -device smmuv3,caching-mode # for vfio use cases (based on [1])
>> - split the series into smaller patches to allow the review
>> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>>   is isolated from the rest: last 2 patches, not for upstream.
>>   This is shipped for testing/bench until a better solution is found.
>> - Reworked permission flag checks and event generation
>
> Hi Eric -- I see you've upgraded this from an RFC to a PATCH set.
> Do you want the patches reviewed and (eventually) taken into git
> now?

Yes, I split the series to make it more reviewable and, from a functional
point of view, I have run all major use cases. So now I would encourage
people to start reviewing the series (focusing on patches 1-18).

Thanks

Eric
>
> thanks
> -- PMM
>
Hi Eric,

On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> This series implements the emulation code for ARM SMMUv3.
>
> Changes since v6:
> - DPDK testpmd now running on guest with 2 assigned VFs
> - Changed the instantiation method: add the following option to
>   the QEMU command line
>     -device smmuv3 # for virtio/vhost use cases
>     -device smmuv3,caching-mode # for vfio use cases (based on [1])
> - split the series into smaller patches to allow the review
> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>   is isolated from the rest: last 2 patches, not for upstream.
>   This is shipped for testing/bench until a better solution is found.
> - Reworked permission flag checks and event generation
>
> testing:
> - in dt and ACPI modes
> - virtio-net-pci and vhost-net devices using dma ops with various
>   guest page sizes [2]
> - assigned VFs using dma ops [3]:
>   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
>   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
>   - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
>     with guest and host page size equal (4kB)
>
> Known limitations:
> - no VMSAv8-32 support
> - no nested stage support (S1 + S2)
> - no support for HYP mappings
> - register fine emulation, commands, interrupts and errors were
>   not accurately tested. Handling is sufficient to run use cases
>   described above though.
> - interrupts and event generation not observed yet.
>
> Best Regards
>
> Eric
>

Was looking at options to get rid of the existing hacks we have
in this implementation (last two patches) and also to reduce the
map/unmap/translation overhead for the guest kernel devices.

Interestingly, the nested stage translation + smmu emulation at kernel
that we were exploring has already been tried by Will Deacon.
https://www.linuxplumbersconf.org/2014/ocw/system/presentations/2019/original/vsmmu-lpc14.pdf
https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg03379.html


It would be nice to understand why this solution was not pursued, at least
for vfio-pci devices.
OR
If you already have plans to do nested stage support in the future, I would
be interested to know about it.

> This series can be found at:
> v7: https://github.com/eauger/qemu/tree/v2.10.0-SMMU-v7
> Previous version at:
> v6: https://github.com/eauger/qemu/tree/v2.10.0-rc2-SMMU-v6
>
> References:
> [1] [RFC v2 0/4] arm-smmu-v3 tlbi-on-map option
> https://lkml.org/lkml/2017/8/11/426
>
> [2] qemu cmd line excerpt:
>     -device smmuv3 \
>     -netdev tap,id=tap0,script=no,downscript=no,ifname=tap0,vhost=off \
>     -device virtio-net-pci,netdev=tap0,mac=6a:f5:10:b1:3d:d2,iommu_platform,disable-modern=off,disable-legacy=on \
>
> [3] use -device smmuv3,caching-mode
>
>
> History:
> v6 -> v7:
> - see above
>
> v5 -> v6:
> - Rebase on 2.10 and IOMMUMemoryRegion
> - add ACPI TLBI_ON_MAP support (VFIO integration also works in
>   ACPI mode)
> - fix block replay
> - handle implementation defined SMMU_CMD_TLBI_NH_VA_AM cmd
>   (goes along with TLBI_ON_MAP FW quirk)
> - replay systematically unmap the whole range first
> - smmuv3_map_hook does not unmap anymore and the unmap is done
>   before the replay
> - add and use smmuv3_context_device_invalidate instead of
>   blindly replaying everything
>
> v4 -> v5:
> - initial_level now part of SMMUTransCfg
> - smmu_page_walk_64 takes into account the max input size
> - implement sys->iommu_ops.replay and sys->iommu_ops.notify_flag_changed
> - smmuv3_translate: bug fix: don't walk on bypass
> - smmu_update_qreg: fix PROD index update
> - I did not yet address Peter's comments as the code is not mature enough
>   to be split into sub patches.
>
> v3 -> v4 [Eric]:
> - page table walk rewritten to allow scan of the page table within a
>   range of IOVA. This prepares for VFIO integration and replay.
--
Linu cherian
On Tue, Oct 24, 2017 at 11:08:02AM +0530, Linu Cherian wrote:
> On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> > This series implements the emulation code for ARM SMMUv3.
> >
> > Changes since v6:
> > - DPDK testpmd now running on guest with 2 assigned VFs
> > - Changed the instantiation method: add the following option to
> >   the QEMU command line
> >     -device smmuv3 # for virtio/vhost use cases
> >     -device smmuv3,caching-mode # for vfio use cases (based on [1])
> > - split the series into smaller patches to allow the review
> > - the VFIO integration based on "tlbi-on-map" smmuv3 driver
> >   is isolated from the rest: last 2 patches, not for upstream.
> >   This is shipped for testing/bench until a better solution is found.
> > - Reworked permission flag checks and event generation
> >
> > testing:
> > - in dt and ACPI modes
> > - virtio-net-pci and vhost-net devices using dma ops with various
> >   guest page sizes [2]
> > - assigned VFs using dma ops [3]:
> >   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
> >   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> >   - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
> >     with guest and host page size equal (4kB)
> >
> > Known limitations:
> > - no VMSAv8-32 support
> > - no nested stage support (S1 + S2)
> > - no support for HYP mappings
> > - register fine emulation, commands, interrupts and errors were
> >   not accurately tested. Handling is sufficient to run use cases
> >   described above though.
> > - interrupts and event generation not observed yet.
> >
> > Best Regards
> >
> > Eric
> >
>
> Was looking at options to get rid of the existing hacks we have
> in this implementation (last two patches) and also to reduce the
> map/unmap/translation overhead for the guest kernel devices.
>
> Interestingly, the nested stage translation + smmu emulation at kernel
> that we were exploring has already been tried by Will Deacon.
> https://www.linuxplumbersconf.org/2014/ocw/system/presentations/2019/original/vsmmu-lpc14.pdf
> https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg03379.html
>
>
> It would be nice to understand why this solution was not pursued, at least
> for vfio-pci devices.
> OR
> If you already have plans to do nested stage support in the future, I would
> be interested to know about it.

I don't plan to revive that code. I got something well on the way to working
for SMMUv2, but it had some pretty major issues:

1. A huge amount of emulation code in the kernel
2. A horribly complicated user ABI
3. Keeping track of internal hardware caching state was a nightmare, so
   over-invalidation was rife
4. Errata workarounds meant trapping all SMMU accesses (inc. for stage 1)
5. I remember having issues with interrupts, but this was likely
   SMMUv2-specific
6. There was no scope for code re-use with other SMMU implementations (e.g.
   SMMUv3)

Overall, it was just an unmaintainable, non-performant
security-flaw-waiting-to-happen so I parked it. That's some of the
background behind me preferring a virtio-iommu approach, because there's
the potential for kernel acceleration using something like vhost.

Will
Hi Will,

On Tue, Oct 24, 2017 at 11:20:29AM +0100, Will Deacon wrote:
> On Tue, Oct 24, 2017 at 11:08:02AM +0530, Linu Cherian wrote:
> > On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> > > This series implements the emulation code for ARM SMMUv3.
> > >
> > > Changes since v6:
> > > - DPDK testpmd now running on guest with 2 assigned VFs
> > > - Changed the instantiation method: add the following option to
> > >   the QEMU command line
> > >     -device smmuv3 # for virtio/vhost use cases
> > >     -device smmuv3,caching-mode # for vfio use cases (based on [1])
> > > - split the series into smaller patches to allow the review
> > > - the VFIO integration based on "tlbi-on-map" smmuv3 driver
> > >   is isolated from the rest: last 2 patches, not for upstream.
> > >   This is shipped for testing/bench until a better solution is found.
> > > - Reworked permission flag checks and event generation
> > >
> > > testing:
> > > - in dt and ACPI modes
> > > - virtio-net-pci and vhost-net devices using dma ops with various
> > >   guest page sizes [2]
> > > - assigned VFs using dma ops [3]:
> > >   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
> > >   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> > >   - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
> > >     with guest and host page size equal (4kB)
> > >
> > > Known limitations:
> > > - no VMSAv8-32 support
> > > - no nested stage support (S1 + S2)
> > > - no support for HYP mappings
> > > - register fine emulation, commands, interrupts and errors were
> > >   not accurately tested. Handling is sufficient to run use cases
> > >   described above though.
> > > - interrupts and event generation not observed yet.
> > >
> > > Best Regards
> > >
> > > Eric
> > >
> >
> > Was looking at options to get rid of the existing hacks we have
> > in this implementation (last two patches) and also to reduce the
> > map/unmap/translation overhead for the guest kernel devices.
> >
> > Interestingly, the nested stage translation + smmu emulation at kernel
> > that we were exploring has already been tried by Will Deacon.
> > https://www.linuxplumbersconf.org/2014/ocw/system/presentations/2019/original/vsmmu-lpc14.pdf
> > https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg03379.html
> >
> >
> > It would be nice to understand why this solution was not pursued, at least
> > for vfio-pci devices.
> > OR
> > If you already have plans to do nested stage support in the future, I would
> > be interested to know about it.
>
> I don't plan to revive that code. I got something well on the way to working
> for SMMUv2, but it had some pretty major issues:
>
> 1. A huge amount of emulation code in the kernel
> 2. A horribly complicated user ABI
> 3. Keeping track of internal hardware caching state was a nightmare, so
>    over-invalidation was rife
> 4. Errata workarounds meant trapping all SMMU accesses (inc. for stage 1)
> 5. I remember having issues with interrupts, but this was likely
>    SMMUv2-specific
> 6. There was no scope for code re-use with other SMMU implementations (e.g.
>    SMMUv3)
>
> Overall, it was just an unmaintainable, non-performant
> security-flaw-waiting-to-happen so I parked it. That's some of the
> background behind me preferring a virtio-iommu approach, because there's
> the potential for kernel acceleration using something like vhost.
>
> Will

Thanks for the explanation.
Hi Eric,

On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> This series implements the emulation code for ARM SMMUv3.
>
> Changes since v6:
> - DPDK testpmd now running on guest with 2 assigned VFs
> - Changed the instantiation method: add the following option to
>   the QEMU command line
>     -device smmuv3 # for virtio/vhost use cases
>     -device smmuv3,caching-mode # for vfio use cases (based on [1])
> - split the series into smaller patches to allow the review
> - the VFIO integration based on "tlbi-on-map" smmuv3 driver
>   is isolated from the rest: last 2 patches, not for upstream.
>   This is shipped for testing/bench until a better solution is found.
> - Reworked permission flag checks and event generation
>
> testing:
> - in dt and ACPI modes
> - virtio-net-pci and vhost-net devices using dma ops with various
>   guest page sizes [2]
> - assigned VFs using dma ops [3]:
>   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
>   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
>   - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
>     with guest and host page size equal (4kB)
>
> Known limitations:
> - no VMSAv8-32 support
> - no nested stage support (S1 + S2)
> - no support for HYP mappings
> - register fine emulation, commands, interrupts and errors were
>   not accurately tested. Handling is sufficient to run use cases
>   described above though.
> - interrupts and event generation not observed yet.

While testing with vfio-pci, I observed that the two QEMU command lines
below result in different behaviour. Is this expected by design?

Case 1:
# -device vfio-pci,host=0002:01:00.3 -device smmuv3,caching-mode
Here iommu is not attached to the pci bus in Qemu backend, since
pci_setup_iommu is not called before vfio_realize.

Case 2:
# -device smmuv3,caching-mode -device vfio-pci,host=0002:01:00.3
This works as expected, iommu is attached to the pci bus.
--
Linu cherian
On Thu, Sep 28, 2017 at 12:13:12PM +0530, Linu Cherian wrote:
> Hi Eric,
>
>
> On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> > This series implements the emulation code for ARM SMMUv3.
> >
> > Changes since v6:
> > - DPDK testpmd now running on guest with 2 assigned VFs
> > - Changed the instantiation method: add the following option to
> >   the QEMU command line
> >     -device smmuv3 # for virtio/vhost use cases
> >     -device smmuv3,caching-mode # for vfio use cases (based on [1])
> > - split the series into smaller patches to allow the review
> > - the VFIO integration based on "tlbi-on-map" smmuv3 driver
> >   is isolated from the rest: last 2 patches, not for upstream.
> >   This is shipped for testing/bench until a better solution is found.
> > - Reworked permission flag checks and event generation
> >
> > testing:
> > - in dt and ACPI modes
> > - virtio-net-pci and vhost-net devices using dma ops with various
> >   guest page sizes [2]
> > - assigned VFs using dma ops [3]:
> >   - AMD Overdrive and igbvf passthrough (using gsi direct mapping)
> >   - Cavium ThunderX and ixgbevf passthrough (using KVM MSI routing)
> >   - DPDK testpmd on guest running with VFIO user space drivers (2 igbvf) [3]
> >     with guest and host page size equal (4kB)
> >
> > Known limitations:
> > - no VMSAv8-32 support
> > - no nested stage support (S1 + S2)
> > - no support for HYP mappings
> > - register fine emulation, commands, interrupts and errors were
> >   not accurately tested. Handling is sufficient to run use cases
> >   described above though.
> > - interrupts and event generation not observed yet.
>
> While testing with vfio-pci, I observed that the two QEMU command lines
> below result in different behaviour. Is this expected by design?
>
> Case 1:
> # -device vfio-pci,host=0002:01:00.3 -device smmuv3,caching-mode
> Here iommu is not attached to the pci bus in Qemu backend, since
> pci_setup_iommu is not called before vfio_realize.
> > Case 2: > # -device smmuv3,caching-mode -device vfio-pci,host=0002:01:00.3 > This works as expected, iommu is attached to the pci bus. Not sure about SMMU, but VT-d should have similar issue - the vIOMMU device needs to be created before the rest of the devices. Now for VT-d the ordering of devices should be assured by Libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=1427005 For your reference only. Thanks, -- Peter Xu
Hi Linu, Peter,

On 28/09/2017 09:13, Peter Xu wrote:
> On Thu, Sep 28, 2017 at 12:13:12PM +0530, Linu Cherian wrote:
>> Hi Eric,
>>
>> [...]
>>
>> While testing with vfio-pci, I observed that the two QEMU command lines
>> below result in different behaviour. Is this expected by design?
>>
>> Case 1:
>> # -device vfio-pci,host=0002:01:00.3 -device smmuv3,caching-mode
>> Here the IOMMU is not attached to the PCI bus in the QEMU backend, since
>> pci_setup_iommu is not called before vfio_realize.
>>
>> Case 2:
>> # -device smmuv3,caching-mode -device vfio-pci,host=0002:01:00.3
>> This works as expected: the IOMMU is attached to the PCI bus.
>
> Not sure about SMMU, but VT-d should have a similar issue: the vIOMMU
> device needs to be created before the rest of the devices.

Yes, this is an expected limitation right now. I should have documented
it though. As you noticed, pci_setup_iommu() is called on smmuv3 realize,
so it only takes effect for PCIe devices that are realized afterwards.

Maybe we could relax this constraint by calling pci_setup_iommu() in a
machine init done notifier.

Thanks

Eric

> Now for VT-d the ordering of devices should be assured by Libvirt:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1427005
>
> For your reference only. Thanks,
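The ordering constraint discussed above can be illustrated with a small toy model. This is not QEMU code: it only assumes, as Linu's observation suggests, that an endpoint looks up its DMA address space once, at realize time. The names Bus, Endpoint, Iommu and dma_as_for_order are hypothetical and exist only for this sketch.

```python
# Toy model (NOT QEMU code): why the order of "-device" options matters when
# an endpoint captures its DMA address space at realize time.
# All class and function names here are hypothetical.


class Bus:
    """Stands in for the PCI bus; holds an optional IOMMU lookup hook."""

    def __init__(self):
        self.iommu_fn = None

    def setup_iommu(self, fn):
        # Plays the role of registering the IOMMU with the bus.
        self.iommu_fn = fn

    def address_space_for(self, dev_name):
        # Without a registered hook, the device sees plain system memory.
        if self.iommu_fn is None:
            return "system-memory"
        return self.iommu_fn(dev_name)


class Endpoint:
    """Stands in for an assigned device such as vfio-pci."""

    def __init__(self, name):
        self.name = name
        self.dma_as = None

    def realize(self, bus):
        # Assumption of this sketch: looked up once and never refreshed.
        self.dma_as = bus.address_space_for(self.name)


class Iommu:
    """Stands in for the smmuv3 device."""

    def realize(self, bus):
        bus.setup_iommu(lambda dev_name: "iommu-as:" + dev_name)


def dma_as_for_order(order):
    """Realize devices in the given order; return the endpoint's address space."""
    bus = Bus()
    endpoint = Endpoint("vf0")
    devices = {"endpoint": endpoint, "iommu": Iommu()}
    for kind in order:
        devices[kind].realize(bus)
    return endpoint.dma_as
```

In this model, dma_as_for_order(["endpoint", "iommu"]) corresponds to Case 1 and leaves the endpoint on "system-memory", while dma_as_for_order(["iommu", "endpoint"]) corresponds to Case 2 and yields "iommu-as:vf0".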
On Thu Sep 28, 2017 at 09:54:20AM +0200, Auger Eric wrote:
> Hi Linu, Peter,
>
> On 28/09/2017 09:13, Peter Xu wrote:
> > On Thu, Sep 28, 2017 at 12:13:12PM +0530, Linu Cherian wrote:
> >> [...]
> >
> > Not sure about SMMU, but VT-d should have a similar issue: the vIOMMU
> > device needs to be created before the rest of the devices.
>
> Yes, this is an expected limitation right now. I should have documented
> it though. As you noticed, pci_setup_iommu() is called on smmuv3 realize,
> so it only takes effect for PCIe devices that are realized afterwards.
>
> Maybe we could relax this constraint by calling pci_setup_iommu() in a
> machine init done notifier.
>
> Thanks
>
> Eric

Thanks for confirming.

> > Now for VT-d the ordering of devices should be assured by Libvirt:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=1427005
> >
> > For your reference only. Thanks,

--
Linu cherian
Hi Eric,

On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> This series implements the emulation code for ARM SMMUv3.
>
> [...]
>
> Known limitations:
> - no VMSAv8-32 support
> - no nested stage support (S1 + S2)
> - no support for HYP mappings
> - register fine emulation, commands, interrupts and errors were
>   not accurately tested. Handling is sufficient to run use cases
>   described above though.
> - interrupts and event generation not observed yet.

By design, shouldn't this work on hardware with SMMUv2 implementations as
well, i.e. a guest with SMMUv3 emulation on a host with SMMUv2 hardware?

Or are there any known limitations for this?
--
Linu cherian
Hi Linu,

On 12/09/2017 08:18, Linu Cherian wrote:
> Hi Eric,
>
> On Fri Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
>> This series implements the emulation code for ARM SMMUv3.
>>
>> [...]
>
> By design, shouldn't this work on hardware with SMMUv2 implementations as
> well, i.e. a guest with SMMUv3 emulation on a host with SMMUv2 hardware?

Yes indeed. I am mostly testing with a host featuring SMMUv2 at the moment.

Thanks

Eric

> Or are there any known limitations for this?
On Fri, Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
> This series implements the emulation code for ARM SMMUv3.

Can you add some code to block using vfio with this
until patches 19+20 are ready?
Then 1-18 could be applied.

> [...]
Hi Michael,

On 08/09/2017 07:47, Michael S. Tsirkin wrote:
> On Fri, Sep 01, 2017 at 07:21:03PM +0200, Eric Auger wrote:
>> This series implements the emulation code for ARM SMMUv3.
>
> Can you add some code to block using vfio with this
> until patches 19+20 are ready?

Sure.

Thanks

Eric

> Then 1-18 could be applied.
>
>> [...]