[PATCH v2 0/9] SMMUv3.2 Range-based TLB Invalidation Support

Eric Auger posted 9 patches 3 years, 9 months ago
Test FreeBSD passed
Test docker-quick@centos7 passed
Test checkpatch passed
Test docker-mingw@fedora passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20200702152659.8522-1-eric.auger@redhat.com
Maintainers: Eric Auger <eric.auger@redhat.com>, Peter Maydell <peter.maydell@linaro.org>
There is a newer version of this series
hw/arm/smmu-internal.h       |  13 +++
hw/arm/smmuv3-internal.h     |   5 +
include/hw/arm/smmu-common.h |  21 ++--
hw/arm/smmu-common.c         | 200 +++++++++++++++++++++++------------
hw/arm/smmuv3.c              | 134 ++++++++++++-----------
hw/arm/trace-events          |  10 +-
6 files changed, 235 insertions(+), 148 deletions(-)
[PATCH v2 0/9] SMMUv3.2 Range-based TLB Invalidation Support
Posted by Eric Auger 3 years, 9 months ago
SMMU3.2 brings the support of range-based TLB invalidation and
level hint. When this feature is supported, the SMMUv3 driver
is allowed to send TLB invalidations for a range of IOVAs instead
of using page based invalidation.

Implementing this feature in the virtual SMMUv3 device is
mandated for DPDK on guest use case: DPDK uses hugepage
buffers and guest sends invalidations for blocks. Without
this feature, a guest invalidation of a block of 1GB for instance
translates into a storm of page invalidations. Each of them
is trapped by the VMM and cascaded downto the physical IOMMU.
This completely stalls the execution. This integration issue
was initially reported in [1].

Now SMMUv3.2 specifies additional parameters to NH_VA and NH_VAA
stage 1 invalidation commands so we can support those extensions.

patches [1, 5] are cleanup patches.
patches [6] changes the implementation of the VSMMUV3 IOTLB
   This IOTLB is a minimalist IOTLB implementation that avoids to
   do the page table walk in case we have an entry in the TLB.
   Previously entries were page mappings only. Now they can be
   blocks.
patches [7, 9] bring support for range invalidation.

Supporting block mappings in the IOTLB look sensible in terms of
TLB entry consumption. However looking at virtio/vhost device usage,
without block mapping and without range invalidation (< 5.7 kernels
it may be less performant. However for recent guest kernels
supporting range invalidations [2], the performance should be similar.

Best Regards

Eric

This series can be found at:
https://github.com/eauger/qemu.git
branch: v5.0.0-smmuv3-ril-v2

References:
[1] [RFC v2 4/4] iommu/arm-smmu-v3: add CMD_TLBI_NH_VA_AM command
for iova range invalidation
(https://lists.linuxfoundation.org/pipermail/iommu/2017-August/023679.html

[2] 5.7+ kernels featuring
6a481a95d4c1 iommu/arm-smmu-v3: Add SMMUv3.2 range invalidation support

History:
v1 -> v2:
- added "hw/arm/smmu: Introduce smmu_get_iotlb_key()"
- removed "[PATCH 5/9] hw/arm/smmuv3: Store the starting level in
  SMMUTransTableInfo"
- Collected Peter's R-b
- In this version the key still features TG/LVL.
- More details in individual history logs


Eric Auger (9):
  hw/arm/smmu-common: Factorize some code in smmu_ptw_64()
  hw/arm/smmu-common: Add IOTLB helpers
  hw/arm/smmu: Introduce smmu_get_iotlb_key()
  hw/arm/smmu: Simplify the IOTLB key format
  hw/arm/smmu: Introduce SMMUTLBEntry for PTW and IOTLB value
  hw/arm/smmu-common: Manage IOTLB block entries
  hw/arm/smmuv3: Introduce smmuv3_s1_range_inval() helper
  hw/arm/smmuv3: Get prepared for range invalidation
  hw/arm/smmuv3: Advertise SMMUv3.2 range invalidation

 hw/arm/smmu-internal.h       |  13 +++
 hw/arm/smmuv3-internal.h     |   5 +
 include/hw/arm/smmu-common.h |  21 ++--
 hw/arm/smmu-common.c         | 200 +++++++++++++++++++++++------------
 hw/arm/smmuv3.c              | 134 ++++++++++++-----------
 hw/arm/trace-events          |  10 +-
 6 files changed, 235 insertions(+), 148 deletions(-)

-- 
2.21.3