This series introduces the iova_to_phys_length interface across the IOMMU
subsystem. The new callback returns both the physical address and the PTE
mapping size in a single page table walk, enabling callers such as iommufd
and VFIO to traverse IOVA space efficiently by actual mapping granularity
instead of fixed PAGE_SIZE steps.
Motivation
==========
The current iova_to_phys interface returns only a physical address, with no
indication of the mapping page size. Callers must therefore iterate one
PAGE_SIZE at a time when collecting PFNs or unmapping IOVA ranges. For large
mappings this is wasteful: with a 4KB PAGE_SIZE, a single 2MB or 1GB huge
page requires 512 or 262144 page-table walks respectively.
The new iova_to_phys_length interface solves this by providing the
contiguous mapping size alongside the physical address in a single walk.
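The walk counts quoted in the motivation can be checked with a few lines of arithmetic. The helper below is only an illustrative sketch: `lookups_needed` is a hypothetical name, not a kernel function, and it assumes one page-table walk per step.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of lookups needed to cover `len` bytes when each lookup
 * advances the cursor by `step` bytes (rounded up).
 * Hypothetical helper for illustration only.
 */
static uint64_t lookups_needed(uint64_t len, uint64_t step)
{
	assert(step != 0);
	return (len + step - 1) / step;
}
```

Stepping by 4KB over a 2MB or 1GB mapping gives 512 and 262144 lookups; stepping by the mapping size itself gives one.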
Design
======
Core layer (patch 1):
- Adds iova_to_phys_length to iommu_domain_ops
- iommu_iova_to_phys_length() detects invalid states by checking
ops->iova_to_phys_length (not domain->type), returns PHYS_ADDR_MAX
on error
- iommu_iova_to_phys() is preserved as a thin wrapper calling
iova_to_phys_length internally, converting PHYS_ADDR_MAX back to 0
for historical API compatibility
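The reason for the PHYS_ADDR_MAX sentinel, and for the wrapper that maps it back to 0, can be modelled in user space. This is a minimal sketch under stated assumptions: the `demo_*` functions and the single hard-coded mapping are hypothetical stand-ins for a driver callback, and the typedef only mimics the kernel's phys_addr_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;
#define PHYS_ADDR_MAX (~(phys_addr_t)0)

/*
 * Hypothetical driver callback: a single 2MB mapping of IOVA 0x200000
 * onto physical address 0. Everything else is unmapped.
 */
static phys_addr_t demo_iova_to_phys_length(uint64_t iova,
					    size_t *mapped_length)
{
	const uint64_t base = 0x200000, size = 0x200000;

	if (iova < base || iova >= base + size)
		return PHYS_ADDR_MAX;	/* no translation */
	if (mapped_length)
		*mapped_length = size;
	return iova - base;		/* physical base is 0 */
}

/* Legacy-style wrapper: folds the error sentinel back into 0. */
static phys_addr_t demo_iova_to_phys(uint64_t iova)
{
	phys_addr_t phys = demo_iova_to_phys_length(iova, NULL);

	return (phys == PHYS_ADDR_MAX) ? 0 : phys;
}
```

With this mapping the legacy-style wrapper returns 0 both for the valid IOVA 0x200000 and for the unmapped IOVA 0, while the new-style function keeps the two cases apart.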
io-pgtable backends (patches 2-4):
- ARM LPAE, ARM v7s, and DART each implement iova_to_phys_length
returning phys + page size, with PHYS_ADDR_MAX for error paths
- The old iova_to_phys is kept temporarily as a wrapper
generic_pt framework (patch 5):
- Implements iova_to_phys_length using pt_entry_oa_lg2sz() which
already accounts for contiguous PTE hints, returning the correct
mapping size in a single leaf-entry lookup
Per-driver migration (patches 6-22):
- Each IOMMU driver (arm-smmu-v3, arm-smmu, qcom, apple-dart,
ipmmu-vmsa, mtk, exynos, fsl_pamu, msm, mtk_v1, omap, rockchip,
s390, sprd, sun50i, tegra-smmu, virtio) implements
iova_to_phys_length in its own atomic commit
- All error paths return PHYS_ADDR_MAX
Caller conversion (patches 23-25):
- iommufd/pages.c uses iova_to_phys_length to batch PFN collection
by actual mapping granularity
- VFIO type1 uses iova_to_phys_length for efficient unmap traversal
- drm/panfrost and drm/panthor switch to the new interface
Cleanup (patches 26-30):
- io-pgtable selftests switch to iova_to_phys_length
- Remove deprecated iova_to_phys wrappers from io-pgtable backends
- Remove iova_to_phys from iommu_domain_ops and io_pgtable_ops
Changes in v2:
- Use PHYS_ADDR_MAX (~(phys_addr_t)0) as error return instead of 0
throughout the entire call chain, per review feedback (Jason)
- Detect invalid domain states by checking ops->iova_to_phys_length
rather than domain->type == IOMMU_DOMAIN_BLOCKED (Jason)
- iommu_iova_to_phys() wrapper converts PHYS_ADDR_MAX -> 0 to
maintain historical semantic for existing callers
- generic_pt: use pt_entry_oa_lg2sz() which already handles
contiguous PTE hints natively (Jason)
- All drivers updated to return PHYS_ADDR_MAX for error/fault paths
instead of 0
Guanghui Feng (30):
iommu: introduce iova_to_phys_length in iommu_domain_ops
iommu/io-pgtable-arm: introduce iova_to_phys_length in io_pgtable_ops
iommu/io-pgtable-arm-v7s: introduce iova_to_phys_length in
io_pgtable_ops
iommu/io-pgtable-dart: introduce iova_to_phys_length in io_pgtable_ops
iommu/generic_pt: implement iova_to_phys_length
iommu/arm-smmu-v3: implement iova_to_phys_length
iommu/arm-smmu: implement iova_to_phys_length
iommu/qcom_iommu: implement iova_to_phys_length
iommu/apple-dart: implement iova_to_phys_length
iommu/ipmmu-vmsa: implement iova_to_phys_length
iommu/mtk_iommu: implement iova_to_phys_length
iommu/exynos: implement iova_to_phys_length
iommu/fsl_pamu: implement iova_to_phys_length
iommu/msm: implement iova_to_phys_length
iommu/mtk_v1: implement iova_to_phys_length
iommu/omap: implement iova_to_phys_length
iommu/rockchip: implement iova_to_phys_length
iommu/s390: implement iova_to_phys_length
iommu/sprd: implement iova_to_phys_length
iommu/sun50i: implement iova_to_phys_length
iommu/tegra-smmu: implement iova_to_phys_length
iommu/virtio: implement iova_to_phys_length
vfio/iommufd: use iova_to_phys_length for efficient unmap
drm/panfrost: switch to iova_to_phys_length
drm/panthor: switch to iova_to_phys_length
iommu/io-pgtable: selftests switch to iova_to_phys_length
iommu/io-pgtable-arm: remove deprecated iova_to_phys wrapper
iommu/io-pgtable-arm-v7s: remove deprecated iova_to_phys wrapper
iommu/io-pgtable-dart: remove deprecated iova_to_phys wrapper
iommu: remove iova_to_phys from domain_ops and io_pgtable_ops
drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
drivers/gpu/drm/panthor/panthor_mmu.c | 2 +-
drivers/iommu/apple-dart.c | 13 ++--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 12 ++-
.../iommu/arm/arm-smmu/arm-smmu-qcom-debug.c | 2 +-
drivers/iommu/arm/arm-smmu/arm-smmu.c | 15 ++--
drivers/iommu/arm/arm-smmu/qcom_iommu.c | 13 ++--
drivers/iommu/exynos-iommu.c | 23 ++++--
drivers/iommu/fsl_pamu_domain.c | 28 ++++++-
drivers/iommu/generic_pt/iommu_pt.h | 57 ++++++++------
drivers/iommu/io-pgtable-arm-selftests.c | 12 +--
drivers/iommu/io-pgtable-arm-v7s.c | 27 ++++---
drivers/iommu/io-pgtable-arm.c | 16 ++--
drivers/iommu/io-pgtable-dart.c | 21 ++++--
drivers/iommu/iommu.c | 32 +++++++-
drivers/iommu/iommufd/pages.c | 75 ++++++++++++++++---
drivers/iommu/iommufd/selftest.c | 2 +-
drivers/iommu/ipmmu-vmsa.c | 12 ++-
drivers/iommu/msm_iommu.c | 29 +++++--
drivers/iommu/mtk_iommu.c | 14 +++-
drivers/iommu/mtk_iommu_v1.c | 16 +++-
drivers/iommu/omap-iommu.c | 34 ++++++---
drivers/iommu/rockchip-iommu.c | 13 +++-
drivers/iommu/s390-iommu.c | 21 ++++--
drivers/iommu/sprd-iommu.c | 20 +++--
drivers/iommu/sun50i-iommu.c | 17 +++--
drivers/iommu/tegra-smmu.c | 14 +++-
drivers/iommu/virtio-iommu.c | 15 +++-
drivers/vfio/vfio_iommu_type1.c | 26 +++++--
include/linux/generic_pt/iommu.h | 13 ++--
include/linux/io-pgtable.h | 10 ++-
include/linux/iommu.h | 12 ++-
32 files changed, 443 insertions(+), 175 deletions(-)
--
2.43.7
This series introduces a new iova_to_phys_length() interface to the IOMMU
subsystem, migrates all drivers and callers to it, and finally removes the
legacy iova_to_phys() interface.
Motivation
==========
The existing iommu_iova_to_phys() returns only the physical address for a
given IOVA. Callers that need to walk a range of IOVA space (most notably
VFIO and IOMMUFD during unmap) have no way to learn the size of the mapping
backing an IOVA, so they are forced to iterate in fixed PAGE_SIZE steps.
When the underlying mapping uses large pages (e.g. 2MB or 1GB), this results
in a separate page table walk for every 4KB, which is highly inefficient for
large regions.
iommu_iova_to_phys_length() returns both the physical address and the page
size of the PTE backing the IOVA in a single page table walk. Callers can
then advance by the actual mapping granularity instead of PAGE_SIZE,
dramatically reducing the number of page table walks during teardown of
large mappings.
The new helper translates a missing mapping to PHYS_ADDR_MAX (rather than 0)
so that a zero physical address can be distinguished from a translation
failure, and the legacy iommu_iova_to_phys() is reimplemented on top of it.
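A caller that advances by the reported mapping size might look roughly like the sketch below. It is a user-space model, not the VFIO or IOMMUFD code: the table, `demo_translate()` and `demo_count_lookups()` are hypothetical, and the choice to skip holes in 4KB steps is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;
#define PHYS_ADDR_MAX (~(phys_addr_t)0)
#define DEMO_PAGE_SIZE 4096u

struct demo_mapping {
	dma_addr_t iova;
	phys_addr_t phys;
	size_t pgsize;
};

/* Hypothetical page table: one 2MB block followed by two 4KB pages. */
static const struct demo_mapping demo_table[] = {
	{ 0x200000, 0x80000000, 0x200000 },
	{ 0x400000, 0x90000000, 0x1000 },
	{ 0x401000, 0x90005000, 0x1000 },
};

/* Models the new interface: physical address plus PTE page size. */
static phys_addr_t demo_translate(dma_addr_t iova, size_t *mapped_length)
{
	size_t i;

	for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
		const struct demo_mapping *m = &demo_table[i];

		if (iova >= m->iova && iova - m->iova < m->pgsize) {
			if (mapped_length)
				*mapped_length = m->pgsize;
			return m->phys + (iova - m->iova);
		}
	}
	return PHYS_ADDR_MAX;
}

/* Walk [iova, iova + len) and count how many lookups were issued. */
static unsigned int demo_count_lookups(dma_addr_t iova, size_t len)
{
	dma_addr_t end = iova + len;
	unsigned int lookups = 0;

	while (iova < end) {
		size_t pgsize = 0;
		phys_addr_t phys = demo_translate(iova, &pgsize);

		lookups++;
		if (phys == PHYS_ADDR_MAX) {
			iova += DEMO_PAGE_SIZE;	/* hole: skip one page */
			continue;
		}
		/* Page sizes are powers of two; jump to the entry's end. */
		assert(pgsize != 0 && (pgsize & (pgsize - 1)) == 0);
		iova = (iova & ~((dma_addr_t)pgsize - 1)) + pgsize;
	}
	return lookups;
}
```

Covering the 2MB block and both 4KB pages takes three lookups in this model, where fixed 4KB stepping would take 514.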
Approach
========
The series is structured to keep the tree bisectable at every step: the new
interface is added first with a fallback to the legacy callback, every
driver and caller is migrated, and only then is the old interface removed.
- Core and page table layer (patches 1-5):
Add iova_to_phys_length to iommu_domain_ops and io_pgtable_ops, the
core iommu_iova_to_phys_length() helper with a fallback to the legacy
iova_to_phys, and implement it in the io-pgtable arm, arm-v7s and dart
formats as well as the generic_pt framework.
- Driver migration (patches 6-22):
Implement iova_to_phys_length in all remaining IOMMU drivers:
arm-smmu-v3, arm-smmu, qcom_iommu, apple-dart, ipmmu-vmsa, mtk_iommu,
exynos, fsl_pamu, msm, mtk_v1, omap, rockchip, s390, sprd, sun50i,
tegra-smmu and virtio.
- Caller migration (patches 23-28):
Switch VFIO and IOMMUFD to iova_to_phys_length for efficient,
granularity-aware unmap, update the iommufd selftest, migrate the
DRM panfrost and panthor drivers, and switch the io-pgtable selftests
to the new interface.
- Removal (patches 29-32):
Drop the now-unused iova_to_phys wrappers from the io-pgtable arm,
arm-v7s and dart formats, and finally remove iova_to_phys from
iommu_domain_ops and io_pgtable_ops.
No functional change is intended for translation results; the change is
purely about exposing mapping size and using it to make range operations
more efficient.
Guanghui Feng (32):
iommu: introduce iova_to_phys_length in iommu_domain_ops
iommu/io-pgtable-arm: introduce iova_to_phys_length in io_pgtable_ops
iommu/io-pgtable-arm-v7s: introduce iova_to_phys_length in
io_pgtable_ops
iommu/io-pgtable-dart: introduce iova_to_phys_length in io_pgtable_ops
iommu/generic_pt: implement iova_to_phys_length
iommu/arm-smmu-v3: implement iova_to_phys_length
iommu/arm-smmu: implement iova_to_phys_length
iommu/qcom_iommu: implement iova_to_phys_length
iommu/apple-dart: implement iova_to_phys_length
iommu/ipmmu-vmsa: implement iova_to_phys_length
iommu/mtk_iommu: implement iova_to_phys_length
iommu/exynos: implement iova_to_phys_length
iommu/fsl_pamu: implement iova_to_phys_length
iommu/msm: implement iova_to_phys_length
iommu/mtk_v1: implement iova_to_phys_length
iommu/omap: implement iova_to_phys_length
iommu/rockchip: implement iova_to_phys_length
iommu/s390: implement iova_to_phys_length
iommu/sprd: implement iova_to_phys_length
iommu/sun50i: implement iova_to_phys_length
iommu/tegra-smmu: implement iova_to_phys_length
iommu/virtio: implement iova_to_phys_length
vfio: use iova_to_phys_length for efficient unmap
iommufd: use iova_to_phys_length for efficient unmap
iommufd/selftest: switch to iommu_iova_to_phys_length
drm/panfrost: switch to iova_to_phys_length
drm/panthor: switch to iova_to_phys_length
iommu/io-pgtable: selftests switch to iova_to_phys_length
iommu/io-pgtable-arm: remove deprecated iova_to_phys wrapper
iommu/io-pgtable-arm-v7s: remove deprecated iova_to_phys wrapper
iommu/io-pgtable-dart: remove deprecated iova_to_phys wrapper
iommu: remove iova_to_phys from domain_ops and io_pgtable_ops
drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
drivers/gpu/drm/panthor/panthor_mmu.c | 2 +-
drivers/iommu/apple-dart.c | 11 +--
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 ++-
.../iommu/arm/arm-smmu/arm-smmu-qcom-debug.c | 2 +-
drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +--
drivers/iommu/arm/arm-smmu/qcom_iommu.c | 11 +--
drivers/iommu/exynos-iommu.c | 20 +++--
drivers/iommu/fsl_pamu_domain.c | 26 +++++-
drivers/iommu/generic_pt/iommu_pt.h | 84 ++++++++++++++-----
drivers/iommu/io-pgtable-arm-selftests.c | 12 +--
drivers/iommu/io-pgtable-arm-v7s.c | 25 +++---
drivers/iommu/io-pgtable-arm.c | 16 ++--
drivers/iommu/io-pgtable-dart.c | 21 +++--
drivers/iommu/iommu.c | 38 +++++++--
drivers/iommu/iommufd/pages.c | 74 +++++++++++++---
drivers/iommu/iommufd/selftest.c | 2 +-
drivers/iommu/ipmmu-vmsa.c | 10 ++-
drivers/iommu/msm_iommu.c | 21 +++--
drivers/iommu/mtk_iommu.c | 11 ++-
drivers/iommu/mtk_iommu_v1.c | 13 ++-
drivers/iommu/omap-iommu.c | 29 ++++---
drivers/iommu/rockchip-iommu.c | 10 ++-
drivers/iommu/s390-iommu.c | 18 ++--
drivers/iommu/sprd-iommu.c | 17 ++--
drivers/iommu/sun50i-iommu.c | 14 ++--
drivers/iommu/tegra-smmu.c | 11 ++-
drivers/iommu/virtio-iommu.c | 12 ++-
drivers/vfio/vfio_iommu_type1.c | 27 ++++--
include/linux/generic_pt/iommu.h | 13 +--
include/linux/io-pgtable.h | 11 ++-
include/linux/iommu.h | 12 ++-
32 files changed, 423 insertions(+), 175 deletions(-)
--
2.43.7
Add iova_to_phys_length callback to struct iommu_domain_ops alongside
the existing iova_to_phys. The new callback returns both the physical
address and the PTE mapping page size in a single page table walk.
Add iommu_iova_to_phys_length() core function that:
- Checks ops->iova_to_phys_length first (preferred path)
- Falls back to ops->iova_to_phys for unmigrated drivers
This enables callers like VFIO to efficiently traverse IOVA space
by actual mapping granularity instead of fixed PAGE_SIZE steps.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
Acked-by: Shiqiang Zhang <shiyu.zsq@linux.alibaba.com>
Acked-by: Simon Guo <wei.guo.simon@linux.alibaba.com>
---
drivers/iommu/iommu.c | 50 ++++++++++++++++++++++++++++++++++++++-----
include/linux/iommu.h | 9 ++++++++
2 files changed, 54 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d1a9e713d3a0..320ea13488e7 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2545,15 +2545,55 @@ void iommu_detach_group(struct iommu_domain *domain, struct iommu_group *group)
}
EXPORT_SYMBOL_GPL(iommu_detach_group);
-phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
+/**
+ * iommu_iova_to_phys_length - Translate IOVA and return mapping page size
+ * @domain: IOMMU domain to query
+ * @iova: IO virtual address to translate
+ * @mapped_length: Output parameter for the PTE page size (e.g. 4KB/2MB/1GB)
+ *
+ * Like iommu_iova_to_phys() but additionally returns the page size of the
+ * PTE mapping at @iova through @mapped_length.
+ *
+ * Return: The physical address for the given IOVA, or PHYS_ADDR_MAX if no
+ * translation exists.
+ */
+phys_addr_t iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length)
{
- if (domain->type == IOMMU_DOMAIN_IDENTITY)
+ phys_addr_t phys;
+
+ if (domain->type == IOMMU_DOMAIN_IDENTITY) {
+ if (mapped_length)
+ *mapped_length = PAGE_SIZE;
return iova;
+ }
- if (domain->type == IOMMU_DOMAIN_BLOCKED)
- return 0;
+ if (mapped_length)
+ *mapped_length = 0;
+
+ if (domain->ops->iova_to_phys_length)
+ return domain->ops->iova_to_phys_length(domain, iova, mapped_length);
+
+ /* Fallback to legacy iova_to_phys without length info */
+ if (!domain->ops->iova_to_phys)
+ return PHYS_ADDR_MAX;
+
+ phys = domain->ops->iova_to_phys(domain, iova);
+ if (!phys)
+ return PHYS_ADDR_MAX;
+
+ if (mapped_length)
+ *mapped_length = PAGE_SIZE;
+ return phys;
+}
+EXPORT_SYMBOL_GPL(iommu_iova_to_phys_length);
+
+phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
+{
+ phys_addr_t phys = iommu_iova_to_phys_length(domain, iova, NULL);
- return domain->ops->iova_to_phys(domain, iova);
+ return (phys == PHYS_ADDR_MAX) ? 0 : phys;
}
EXPORT_SYMBOL_GPL(iommu_iova_to_phys);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e587d4ac4d33..19da84c2922c 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -747,6 +747,9 @@ struct iommu_ops {
* invalidation requests. The driver data structure
* must be defined in include/uapi/linux/iommufd.h
* @iova_to_phys: translate iova to physical address
+ * @iova_to_phys_length: translate iova to physical address and additionally
+ * return the page size of the PTE mapping at @iova
+ * through @mapped_length.
* @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
* including no-snoop TLPs on PCIe or other platform
* specific mechanisms.
@@ -776,6 +779,9 @@ struct iommu_domain_ops {
phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
dma_addr_t iova);
+ phys_addr_t (*iova_to_phys_length)(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length);
bool (*enforce_cache_coherency)(struct iommu_domain *domain);
int (*set_pgtable_quirks)(struct iommu_domain *domain,
@@ -930,6 +936,9 @@ extern ssize_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
struct scatterlist *sg, unsigned int nents,
int prot, gfp_t gfp);
extern phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova);
+extern phys_addr_t iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length);
extern void iommu_set_fault_handler(struct iommu_domain *domain,
iommu_fault_handler_t handler, void *token);
--
2.43.7
On Wed, Jun 03, 2026 at 11:17:33PM +0800, Guanghui Feng wrote:
> +phys_addr_t iommu_iova_to_phys_length(struct iommu_domain *domain,
> + dma_addr_t iova,
> + size_t *mapped_length)
> {
This should take in an ending point so the accumulation knows when to
stop, otherwise it is too hard to use.
> - if (domain->type == IOMMU_DOMAIN_IDENTITY)
> + phys_addr_t phys;
> +
> + if (domain->type == IOMMU_DOMAIN_IDENTITY) {
> + if (mapped_length)
> + *mapped_length = PAGE_SIZE;
> return iova;
> + }
>
> - if (domain->type == IOMMU_DOMAIN_BLOCKED)
> - return 0;
> + if (mapped_length)
> + *mapped_length = 0;
> +
> + if (domain->ops->iova_to_phys_length)
> + return domain->ops->iova_to_phys_length(domain, iova, mapped_length);
> +
> + /* Fallback to legacy iova_to_phys without length info */
> + if (!domain->ops->iova_to_phys)
> + return PHYS_ADDR_MAX;
> +
> + phys = domain->ops->iova_to_phys(domain, iova);
> + if (!phys)
> + return PHYS_ADDR_MAX;
And to properly clean up the callers all the non-iommupt paths should
manually do accumulation here as well.
Basically if you call this function you get a maximal contiguous
physical range as efficiently as possible.
Jason
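One possible reading of the review comment above is an interface that takes an end bound and accumulates physically contiguous pages on behalf of the caller, including for drivers that can only translate one page at a time. The sketch below models that idea in user space; `demo_page_phys()` and `demo_iova_to_phys_range()` are hypothetical names, and nothing here is taken from a later revision of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;
#define PHYS_ADDR_MAX (~(phys_addr_t)0)
#define DEMO_PAGE_SIZE ((dma_addr_t)4096)

/*
 * Hypothetical per-page lookup, standing in for a legacy iova_to_phys:
 * IOVA pages 0..2 are physically contiguous, page 3 lives elsewhere,
 * and everything above is unmapped.
 */
static phys_addr_t demo_page_phys(dma_addr_t iova)
{
	static const phys_addr_t pages[] = {
		0x100000, 0x101000, 0x102000, 0x500000,
	};
	dma_addr_t idx = iova / DEMO_PAGE_SIZE;

	if (idx >= sizeof(pages) / sizeof(pages[0]))
		return PHYS_ADDR_MAX;
	return pages[idx] + (iova % DEMO_PAGE_SIZE);
}

/*
 * Translate @iova and report in @length how many bytes starting at
 * @iova are physically contiguous, never extending past @end
 * (exclusive).
 */
static phys_addr_t demo_iova_to_phys_range(dma_addr_t iova, dma_addr_t end,
					   size_t *length)
{
	phys_addr_t phys, next;
	dma_addr_t cur;

	assert(iova < end);
	*length = 0;

	phys = demo_page_phys(iova);
	if (phys == PHYS_ADDR_MAX)
		return PHYS_ADDR_MAX;

	/* The first chunk runs to the end of the page containing iova. */
	cur = (iova & ~(DEMO_PAGE_SIZE - 1)) + DEMO_PAGE_SIZE;
	while (cur < end) {
		next = demo_page_phys(cur);
		/* Stop at a hole or at a physical discontinuity. */
		if (next == PHYS_ADDR_MAX || next != phys + (cur - iova))
			break;
		cur += DEMO_PAGE_SIZE;
	}
	if (cur > end)
		cur = end;	/* clamp to the caller's bound */

	*length = (size_t)(cur - iova);
	return phys;
}
```

A backend that knows its PTE sizes could replace the per-page loop with larger steps; the end bound and the contiguity check would stay the same.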
On 6/3/26 23:17, Guanghui Feng wrote:
> Add iova_to_phys_length callback to struct iommu_domain_ops alongside
> the existing iova_to_phys. The new callback returns both the physical
> address and the PTE mapping page size in a single page table walk.
>
> Add iommu_iova_to_phys_length() core function that:
> - Checks ops->iova_to_phys_length first (preferred path)
> - Falls back to ops->iova_to_phys for unmigrated drivers
>
> This enables callers like VFIO to efficiently traverse IOVA space
> by actual mapping granularity instead of fixed PAGE_SIZE steps.
>
> Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
> Acked-by: Shiqiang Zhang <shiyu.zsq@linux.alibaba.com>
> Acked-by: Simon Guo <wei.guo.simon@linux.alibaba.com>
> ---
> drivers/iommu/iommu.c | 50 ++++++++++++++++++++++++++++++++++++++-----
> include/linux/iommu.h | 9 ++++++++
> 2 files changed, 54 insertions(+), 5 deletions(-)
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Add iova_to_phys_length to struct io_pgtable_ops alongside iova_to_phys.
Implement it in the ARM LPAE backend, where it reports ARM_LPAE_BLOCK_SIZE
for the level at which the walk resolved. The old iova_to_phys is kept as a
thin wrapper for backward compatibility.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/io-pgtable-arm.c | 23 +++++++++++++++++++++--
include/linux/io-pgtable.h | 7 +++++++
2 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 0208e5897c29..f33a86fa0f6c 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -731,8 +731,21 @@ static int visit_iova_to_phys(struct io_pgtable_walk_data *walk_data, int lvl,
return 0;
}
+static phys_addr_t arm_lpae_iova_to_phys_length(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t *mapped_length);
+
static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
unsigned long iova)
+{
+ phys_addr_t phys = arm_lpae_iova_to_phys_length(ops, iova, NULL);
+
+ return (phys == PHYS_ADDR_MAX) ? 0 : phys;
+}
+
+static phys_addr_t arm_lpae_iova_to_phys_length(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t *mapped_length)
{
struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
struct iova_to_phys_data d;
@@ -742,13 +755,18 @@ static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
.addr = iova,
.end = iova + 1,
};
+ size_t block_size;
int ret;
ret = __arm_lpae_iopte_walk(data, &walk_data, data->pgd, data->start_level);
if (ret)
- return 0;
+ return PHYS_ADDR_MAX;
+
+ block_size = ARM_LPAE_BLOCK_SIZE(d.lvl, data);
+ if (mapped_length)
+ *mapped_length = block_size;
- iova &= (ARM_LPAE_BLOCK_SIZE(d.lvl, data) - 1);
+ iova &= (block_size - 1);
return iopte_to_paddr(d.pte, data) | iova;
}
@@ -948,6 +966,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
.map_pages = arm_lpae_map_pages,
.unmap_pages = arm_lpae_unmap_pages,
.iova_to_phys = arm_lpae_iova_to_phys,
+ .iova_to_phys_length = arm_lpae_iova_to_phys_length,
.read_and_clear_dirty = arm_lpae_read_and_clear_dirty,
.pgtable_walk = arm_lpae_pgtable_walk,
};
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index e19872e37e06..42bcdd309b88 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -203,6 +203,10 @@ struct arm_lpae_io_pgtable_walk_data {
* @map_pages: Map a physically contiguous range of pages of the same size.
* @unmap_pages: Unmap a range of virtually contiguous pages of the same size.
* @iova_to_phys: Translate iova to physical address.
+ * @iova_to_phys_length: Translate iova to physical address and return the
+ * remaining mapped length from iova to the end of the
+ * mapping entry via @mapped_length. If @mapped_length is
+ * NULL, only the physical address is returned.
* @pgtable_walk: (optional) Perform a page table walk for a given iova.
* @read_and_clear_dirty: Record dirty info per IOVA. If an IOVA is dirty,
* clear its dirty state from the PTE unless the
@@ -220,6 +224,9 @@ struct io_pgtable_ops {
struct iommu_iotlb_gather *gather);
phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops,
unsigned long iova);
+ phys_addr_t (*iova_to_phys_length)(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t *mapped_length);
int (*pgtable_walk)(struct io_pgtable_ops *ops, unsigned long iova, void *wd);
int (*read_and_clear_dirty)(struct io_pgtable_ops *ops,
unsigned long iova, size_t size,
--
2.43.7
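The LPAE patch above reports the block size of the level where the walk stopped and uses the same value to mask the in-block offset. The relationship between level, granule and block size can be sketched as follows. This is an illustration that assumes 8-byte PTEs and a four-level layout; the `demo_*` helpers are hypothetical and only mirror what ARM_LPAE_BLOCK_SIZE computes.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_LEVELS 4	/* levels 0..3, level 3 holds pages */

/*
 * Size covered by one entry at @lvl for a granule of
 * (1 << granule_shift) bytes. With 8-byte PTEs each table resolves
 * (granule_shift - 3) bits of the address.
 */
static uint64_t demo_lpae_block_size(unsigned int lvl,
				     unsigned int granule_shift)
{
	unsigned int bits_per_level = granule_shift - 3;

	assert(lvl < DEMO_MAX_LEVELS);
	return 1ULL << ((DEMO_MAX_LEVELS - 1 - lvl) * bits_per_level +
			granule_shift);
}

/* Combine the PTE output address with the offset inside the block. */
static uint64_t demo_compose_phys(uint64_t pte_paddr, uint64_t iova,
				  uint64_t block_size)
{
	return pte_paddr | (iova & (block_size - 1));
}
```

With a 4KB granule this yields 4KB, 2MB and 1GB for levels 3, 2 and 1, which are the sizes a caller would see through @mapped_length.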
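Implement iova_to_phys_length in the ARM v7s backend. The mapping size is
derived from the level mask, widened by ARM_V7S_CONT_PAGES for contiguous
entries. The old iova_to_phys is kept as a thin wrapper, and the selftests
in this file are switched to the new callback.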
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/io-pgtable-arm-v7s.c | 32 +++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 1dbef8c55007..62198e31a393 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -641,8 +641,21 @@ static size_t arm_v7s_unmap_pages(struct io_pgtable_ops *ops, unsigned long iova
return unmapped;
}
+static phys_addr_t arm_v7s_iova_to_phys_length(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t *mapped_length);
+
static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops,
unsigned long iova)
+{
+ phys_addr_t phys = arm_v7s_iova_to_phys_length(ops, iova, NULL);
+
+ return (phys == PHYS_ADDR_MAX) ? 0 : phys;
+}
+
+static phys_addr_t arm_v7s_iova_to_phys_length(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t *mapped_length)
{
struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
arm_v7s_iopte *ptep = data->pgd, pte;
@@ -656,11 +669,15 @@ static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops,
} while (ARM_V7S_PTE_IS_TABLE(pte, lvl));
if (!ARM_V7S_PTE_IS_VALID(pte))
- return 0;
+ return PHYS_ADDR_MAX;
mask = ARM_V7S_LVL_MASK(lvl);
if (arm_v7s_pte_is_cont(pte, lvl))
mask *= ARM_V7S_CONT_PAGES;
+
+ if (mapped_length)
+ *mapped_length = ~mask + 1U;
+
return iopte_to_paddr(pte, lvl, &data->iop.cfg) | (iova & ~mask);
}
@@ -714,6 +731,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
.map_pages = arm_v7s_map_pages,
.unmap_pages = arm_v7s_unmap_pages,
.iova_to_phys = arm_v7s_iova_to_phys,
+ .iova_to_phys_length = arm_v7s_iova_to_phys_length,
};
/* We have to do this early for __arm_v7s_alloc_table to work... */
@@ -843,13 +861,13 @@ static int __init arm_v7s_do_selftests(void)
* Initial sanity checks.
* Empty page tables shouldn't provide any translations.
*/
- if (ops->iova_to_phys(ops, 42))
+ if (ops->iova_to_phys_length(ops, 42, NULL) != PHYS_ADDR_MAX)
return __FAIL(ops);
- if (ops->iova_to_phys(ops, SZ_1G + 42))
+ if (ops->iova_to_phys_length(ops, SZ_1G + 42, NULL) != PHYS_ADDR_MAX)
return __FAIL(ops);
- if (ops->iova_to_phys(ops, SZ_2G + 42))
+ if (ops->iova_to_phys_length(ops, SZ_2G + 42, NULL) != PHYS_ADDR_MAX)
return __FAIL(ops);
/*
@@ -870,7 +888,7 @@ static int __init arm_v7s_do_selftests(void)
&mapped))
return __FAIL(ops);
- if (ops->iova_to_phys(ops, iova + 42) != (iova + 42))
+ if (ops->iova_to_phys_length(ops, iova + 42, NULL) != (iova + 42))
return __FAIL(ops);
iova += SZ_16M;
@@ -884,7 +902,7 @@ static int __init arm_v7s_do_selftests(void)
if (ops->unmap_pages(ops, iova, size, 1, NULL) != size)
return __FAIL(ops);
- if (ops->iova_to_phys(ops, iova + 42))
+ if (ops->iova_to_phys_length(ops, iova + 42, NULL) != PHYS_ADDR_MAX)
return __FAIL(ops);
/* Remap full block */
@@ -892,7 +910,7 @@ static int __init arm_v7s_do_selftests(void)
GFP_KERNEL, &mapped))
return __FAIL(ops);
- if (ops->iova_to_phys(ops, iova + 42) != (iova + 42))
+ if (ops->iova_to_phys_length(ops, iova + 42, NULL) != (iova + 42))
return __FAIL(ops);
iova += SZ_16M;
--
2.43.7
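The v7s patch above turns an address mask into a size with `~mask + 1U`. The sketch below shows why that works for the short-descriptor sizes. It is a user-space model under assumptions: the shifts (20 for level 1, 12 for level 2) and the factor of 16 are hard-coded here to mirror the kernel's ARM_V7S_LVL_MASK and ARM_V7S_CONT_PAGES, and the `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CONT_PAGES 16u	/* entries folded into one contiguous mapping */

/* Mask of the address bits translated by an entry at @lvl. */
static uint32_t demo_v7s_lvl_mask(unsigned int lvl)
{
	assert(lvl == 1 || lvl == 2);
	/* Level 1 entries cover 1MB, level 2 entries cover 4KB. */
	return ~0u << (lvl == 1 ? 20 : 12);
}

/* Size in bytes of the mapping described by one entry. */
static uint32_t demo_v7s_entry_size(unsigned int lvl, int is_cont)
{
	uint32_t mask = demo_v7s_lvl_mask(lvl);

	/*
	 * Multiplying by 16 shifts the mask left by four bits (modulo
	 * 2^32), clearing four more low bits and so widening the mapping.
	 */
	if (is_cont)
		mask *= DEMO_CONT_PAGES;

	/* Two's complement of the mask is the size of the region. */
	return ~mask + 1u;
}
```

The four results correspond to small pages (4KB), large pages (64KB), sections (1MB) and supersections (16MB).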
Implement iova_to_phys_length in the DART backend. It returns the page size
from cfg.pgsize_bitmap, since DART uses a single fixed page size. The old
iova_to_phys is kept as a thin wrapper.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/io-pgtable-dart.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/io-pgtable-dart.c b/drivers/iommu/io-pgtable-dart.c
index cbc5d6aa2daa..2dac21a578a7 100644
--- a/drivers/iommu/io-pgtable-dart.c
+++ b/drivers/iommu/io-pgtable-dart.c
@@ -333,29 +333,46 @@ static size_t dart_unmap_pages(struct io_pgtable_ops *ops, unsigned long iova,
return i * pgsize;
}
+static phys_addr_t dart_iova_to_phys_length(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t *mapped_length);
+
static phys_addr_t dart_iova_to_phys(struct io_pgtable_ops *ops,
- unsigned long iova)
+ unsigned long iova)
+{
+ phys_addr_t phys = dart_iova_to_phys_length(ops, iova, NULL);
+
+ return (phys == PHYS_ADDR_MAX) ? 0 : phys;
+}
+
+static phys_addr_t dart_iova_to_phys_length(struct io_pgtable_ops *ops,
+ unsigned long iova,
+ size_t *mapped_length)
{
struct dart_io_pgtable *data = io_pgtable_ops_to_data(ops);
dart_iopte pte, *ptep;
+ size_t pgsize;
ptep = dart_get_last(data, iova);
/* Valid L2 IOPTE pointer? */
if (!ptep)
- return 0;
+ return PHYS_ADDR_MAX;
ptep += dart_get_last_index(data, iova);
pte = READ_ONCE(*ptep);
/* Found translation */
if (pte) {
- iova &= (data->iop.cfg.pgsize_bitmap - 1);
+ pgsize = data->iop.cfg.pgsize_bitmap;
+ if (mapped_length)
+ *mapped_length = pgsize;
+ iova &= (pgsize - 1);
return iopte_to_paddr(pte, data) | iova;
}
/* Ran out of page tables to walk */
- return 0;
+ return PHYS_ADDR_MAX;
}
static struct dart_io_pgtable *
@@ -397,9 +414,10 @@ dart_alloc_pgtable(struct io_pgtable_cfg *cfg)
data->bits_per_level = bits_per_level;
data->iop.ops = (struct io_pgtable_ops) {
- .map_pages = dart_map_pages,
- .unmap_pages = dart_unmap_pages,
- .iova_to_phys = dart_iova_to_phys,
+ .map_pages = dart_map_pages,
+ .unmap_pages = dart_unmap_pages,
+ .iova_to_phys = dart_iova_to_phys,
+ .iova_to_phys_length = dart_iova_to_phys_length,
};
return data;
--
2.43.7
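Extend the Generic Page Table framework to implement iova_to_phys_length.
Use pt_entry_oa_lg2sz() to determine the size of the leaf entry, then
accumulate the following entries in the same table that have the same size
and are physically contiguous. Update the IOMMU_PT_DOMAIN_OPS macro to set
.iova_to_phys_length.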
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
Acked-by: Shiqiang Zhang <shiyu.zsq@linux.alibaba.com>
Acked-by: Simon Guo <wei.guo.simon@linux.alibaba.com>
---
drivers/iommu/generic_pt/iommu_pt.h | 84 +++++++++++++++++++++--------
include/linux/generic_pt/iommu.h | 13 ++---
2 files changed, 69 insertions(+), 28 deletions(-)
diff --git a/drivers/iommu/generic_pt/iommu_pt.h b/drivers/iommu/generic_pt/iommu_pt.h
index dc91fb4e2f61..e362e819ef9c 100644
--- a/drivers/iommu/generic_pt/iommu_pt.h
+++ b/drivers/iommu/generic_pt/iommu_pt.h
@@ -145,13 +145,21 @@ static inline unsigned int compute_best_pgsize(struct pt_state *pts,
pts->range->va, pts->range->last_va, oa);
}
-static __always_inline int __do_iova_to_phys(struct pt_range *range, void *arg,
- unsigned int level,
- struct pt_table_p *table,
- pt_level_fn_t descend_fn)
+struct iova_to_phys_length_data {
+ pt_oaddr_t phys;
+ size_t length;
+};
+
+static __always_inline int __do_iova_to_phys_length(struct pt_range *range,
+ void *arg, unsigned int level,
+ struct pt_table_p *table,
+ pt_level_fn_t descend_fn)
{
struct pt_state pts = pt_init(range, level, table);
- pt_oaddr_t *res = arg;
+ struct iova_to_phys_length_data *data = arg;
+ unsigned int entry_lg2sz;
+ size_t entry_sz;
+ pt_oaddr_t expected_oa;
switch (pt_load_single_entry(&pts)) {
case PT_ENTRY_EMPTY:
@@ -159,45 +167,77 @@ static __always_inline int __do_iova_to_phys(struct pt_range *range, void *arg,
case PT_ENTRY_TABLE:
return pt_descend(&pts, arg, descend_fn);
case PT_ENTRY_OA:
- *res = pt_entry_oa_exact(&pts);
- return 0;
+ break;
}
- return -ENOENT;
+
+ data->phys = pt_entry_oa_exact(&pts);
+ entry_lg2sz = pt_entry_oa_lg2sz(&pts);
+ entry_sz = log2_to_int(entry_lg2sz);
+
+ /* Start with the full mapping size of the first entry */
+ data->length = entry_sz;
+
+ /* Accumulate subsequent physically contiguous entries */
+ expected_oa = pt_entry_oa(&pts) + entry_sz;
+ pts.end_index = log2_to_int(pt_num_items_lg2(&pts));
+ pt_next_entry(&pts);
+
+ while (pts.index < pts.end_index) {
+ pt_load_entry(&pts);
+ if (pts.type != PT_ENTRY_OA)
+ break;
+ if (pt_entry_oa_lg2sz(&pts) != entry_lg2sz)
+ break;
+ if (pt_entry_oa(&pts) != expected_oa)
+ break;
+ data->length += entry_sz;
+ expected_oa += entry_sz;
+ pt_next_entry(&pts);
+ }
+
+ return 0;
}
-PT_MAKE_LEVELS(__iova_to_phys, __do_iova_to_phys);
+PT_MAKE_LEVELS(__iova_to_phys_length, __do_iova_to_phys_length);
/**
- * iova_to_phys() - Return the output address for the given IOVA
+ * iova_to_phys_length() - Translate IOVA returning phys and contiguous length
* @domain: Table to query
* @iova: IO virtual address to query
+ * @mapped_length: Output for the total contiguous mapped length in bytes
*
- * Determine the output address from the given IOVA. @iova may have any
- * alignment, the returned physical will be adjusted with any sub page offset.
+ * Walk the IOMMU page table to translate @iova to a physical address while
+ * also returning the total contiguous physically mapped length through
+ * @mapped_length. The function accumulates consecutive page table entries that
+ * are physically contiguous, so callers can determine the full contiguous
+ * mapping extent with a single call.
*
* Context: The caller must hold a read range lock that includes @iova.
*
- * Return: 0 if there is no translation for the given iova.
+ * Return: The physical address, or PHYS_ADDR_MAX if there is no translation.
*/
-phys_addr_t DOMAIN_NS(iova_to_phys)(struct iommu_domain *domain,
- dma_addr_t iova)
+phys_addr_t DOMAIN_NS(iova_to_phys_length)(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length)
{
struct pt_iommu *iommu_table =
container_of(domain, struct pt_iommu, domain);
struct pt_range range;
- pt_oaddr_t res;
+ struct iova_to_phys_length_data data;
int ret;
ret = make_range(common_from_iommu(iommu_table), &range, iova, 1);
if (ret)
- return ret;
+ return PHYS_ADDR_MAX;
- ret = pt_walk_range(&range, __iova_to_phys, &res);
- /* PHYS_ADDR_MAX would be a better error code */
+ ret = pt_walk_range(&range, __iova_to_phys_length, &data);
if (ret)
- return 0;
- return res;
+ return PHYS_ADDR_MAX;
+
+ if (mapped_length)
+ *mapped_length = data.length;
+ return data.phys;
}
-EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(iova_to_phys), "GENERIC_PT_IOMMU");
+EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(iova_to_phys_length), "GENERIC_PT_IOMMU");
struct pt_iommu_dirty_args {
struct iommu_dirty_bitmap *dirty;
diff --git a/include/linux/generic_pt/iommu.h b/include/linux/generic_pt/iommu.h
index dd0edd02a48a..859b853e9dc7 100644
--- a/include/linux/generic_pt/iommu.h
+++ b/include/linux/generic_pt/iommu.h
@@ -249,8 +249,9 @@ struct pt_iommu_cfg {
/* Generate the exported function signatures from iommu_pt.h */
#define IOMMU_PROTOTYPES(fmt) \
- phys_addr_t pt_iommu_##fmt##_iova_to_phys(struct iommu_domain *domain, \
- dma_addr_t iova); \
+ phys_addr_t pt_iommu_##fmt##_iova_to_phys_length( \
+ struct iommu_domain *domain, dma_addr_t iova, \
+ size_t *mapped_length); \
int pt_iommu_##fmt##_read_and_clear_dirty( \
struct iommu_domain *domain, unsigned long iova, size_t size, \
unsigned long flags, struct iommu_dirty_bitmap *dirty); \
@@ -267,11 +268,11 @@ struct pt_iommu_cfg {
IOMMU_PROTOTYPES(fmt)
/*
- * A driver uses IOMMU_PT_DOMAIN_OPS to populate the iommu_domain_ops for the
- * iommu_pt
+ * A driver uses IOMMU_PT_DOMAIN_OPS to populate the iommu_domain_ops for
+ * the iommu_pt
*/
-#define IOMMU_PT_DOMAIN_OPS(fmt) \
- .iova_to_phys = &pt_iommu_##fmt##_iova_to_phys
+#define IOMMU_PT_DOMAIN_OPS(fmt) \
+ .iova_to_phys_length = &pt_iommu_##fmt##_iova_to_phys_length
#define IOMMU_PT_DIRTY_OPS(fmt) \
.read_and_clear_dirty = &pt_iommu_##fmt##_read_and_clear_dirty
--
2.43.7
On 6/3/26 23:17, Guanghui Feng wrote:
> Extend the Generic Page Table framework to implement iova_to_phys_length.
> Use pt_entry_oa_lg2sz() to determine PTE block size. Update
> IOMMU_PT_DOMAIN_OPS macro to set .iova_to_phys_length.
>
> Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
> Acked-by: Shiqiang Zhang <shiyu.zsq@linux.alibaba.com>
> Acked-by: Simon Guo <wei.guo.simon@linux.alibaba.com>
> ---
> drivers/iommu/generic_pt/iommu_pt.h | 84 +++++++++++++++++++++--------
> include/linux/generic_pt/iommu.h | 13 ++---
> 2 files changed, 69 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/iommu/generic_pt/iommu_pt.h b/drivers/iommu/generic_pt/iommu_pt.h
> index dc91fb4e2f61..e362e819ef9c 100644
> --- a/drivers/iommu/generic_pt/iommu_pt.h
> +++ b/drivers/iommu/generic_pt/iommu_pt.h
> @@ -145,13 +145,21 @@ static inline unsigned int compute_best_pgsize(struct pt_state *pts,
> pts->range->va, pts->range->last_va, oa);
> }
>
> -static __always_inline int __do_iova_to_phys(struct pt_range *range, void *arg,
> - unsigned int level,
> - struct pt_table_p *table,
> - pt_level_fn_t descend_fn)
> +struct iova_to_phys_length_data {
> + pt_oaddr_t phys;
> + size_t length;
> +};
> +
> +static __always_inline int __do_iova_to_phys_length(struct pt_range *range,
> + void *arg, unsigned int level,
> + struct pt_table_p *table,
> + pt_level_fn_t descend_fn)
> {
> struct pt_state pts = pt_init(range, level, table);
> - pt_oaddr_t *res = arg;
> + struct iova_to_phys_length_data *data = arg;
> + unsigned int entry_lg2sz;
> + size_t entry_sz;
> + pt_oaddr_t expected_oa;
>
> switch (pt_load_single_entry(&pts)) {
> case PT_ENTRY_EMPTY:
> @@ -159,45 +167,77 @@ static __always_inline int __do_iova_to_phys(struct pt_range *range, void *arg,
> case PT_ENTRY_TABLE:
> return pt_descend(&pts, arg, descend_fn);
> case PT_ENTRY_OA:
> - *res = pt_entry_oa_exact(&pts);
> - return 0;
> + break;
> }
> - return -ENOENT;
> +
> + data->phys = pt_entry_oa_exact(&pts);
> + entry_lg2sz = pt_entry_oa_lg2sz(&pts);
> + entry_sz = log2_to_int(entry_lg2sz);
> +
> + /* Start with the full mapping size of the first entry */
> + data->length = entry_sz;
data->length doesn't account for iova offset. Is this by design? We
should document this clearly somewhere.
Sashiko reported the same issue too.
[Severity: High]
Does this calculation overstate the mapped length for unaligned IOVAs?
If the IOVA is not aligned to the PTE block size, pt_entry_oa_exact()
includes the intra-page offset in data->phys. However, data->length
is unconditionally initialized to the full entry_sz rather than
entry_sz - offset. Callers relying on mapped_length might operate
on out-of-bounds memory because data->phys + data->length extends
beyond the valid mapped physical memory by the unaligned offset amount.
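As a rough illustration of the arithmetic in question (a userspace sketch,
not the generic_pt code; remaining_in_entry() is a hypothetical helper and
is not part of the patch): for an entry of 2^entry_lg2sz bytes, only the
bytes from the IOVA to the end of the entry are actually covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes from iova to the end of the
 * page table entry that covers it, given the entry size as a log2.
 */
static size_t remaining_in_entry(uint64_t iova, unsigned int entry_lg2sz)
{
	uint64_t entry_sz;
	uint64_t offset;

	assert(entry_lg2sz < 64);
	entry_sz = 1ULL << entry_lg2sz;
	/* Offset of iova inside its (naturally aligned) entry */
	offset = iova & (entry_sz - 1);

	return (size_t)(entry_sz - offset);
}
```

With a 2MB entry (entry_lg2sz = 21) and an IOVA 0x1000 bytes into it, this
gives 0x1ff000 rather than the full 0x200000 that data->length is set to.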
> +
> + /* Accumulate subsequent physically contiguous entries */
> + expected_oa = pt_entry_oa(&pts) + entry_sz;
> + pts.end_index = log2_to_int(pt_num_items_lg2(&pts));
> + pt_next_entry(&pts);
> +
> + while (pts.index < pts.end_index) {
> + pt_load_entry(&pts);
> + if (pts.type != PT_ENTRY_OA)
> + break;
> + if (pt_entry_oa_lg2sz(&pts) != entry_lg2sz)
> + break;
> + if (pt_entry_oa(&pts) != expected_oa)
> + break;
> + data->length += entry_sz;
> + expected_oa += entry_sz;
> + pt_next_entry(&pts);
> + }
> +
> + return 0;
> }
> -PT_MAKE_LEVELS(__iova_to_phys, __do_iova_to_phys);
> +PT_MAKE_LEVELS(__iova_to_phys_length, __do_iova_to_phys_length);
>
> /**
> - * iova_to_phys() - Return the output address for the given IOVA
> + * iova_to_phys_length() - Translate IOVA returning phys and contiguous length
> * @domain: Table to query
> * @iova: IO virtual address to query
> + * @mapped_length: Output for the total contiguous mapped length in bytes
> *
> - * Determine the output address from the given IOVA. @iova may have any
> - * alignment, the returned physical will be adjusted with any sub page offset.
> + * Walk the IOMMU page table to translate @iova to a physical address while
> + * also returning the total contiguous physically mapped length through
> + * @mapped_length. The function accumulates consecutive page table entries that
> + * are physically contiguous, so callers can determine the full contiguous
> + * mapping extent with a single call.
> *
> * Context: The caller must hold a read range lock that includes @iova.
> *
> - * Return: 0 if there is no translation for the given iova.
> + * Return: The physical address, or PHYS_ADDR_MAX if there is no translation.
> */
> -phys_addr_t DOMAIN_NS(iova_to_phys)(struct iommu_domain *domain,
> - dma_addr_t iova)
> +phys_addr_t DOMAIN_NS(iova_to_phys_length)(struct iommu_domain *domain,
> + dma_addr_t iova,
> + size_t *mapped_length)
> {
> struct pt_iommu *iommu_table =
> container_of(domain, struct pt_iommu, domain);
> struct pt_range range;
> - pt_oaddr_t res;
> + struct iova_to_phys_length_data data;
> int ret;
>
> ret = make_range(common_from_iommu(iommu_table), &range, iova, 1);
> if (ret)
> - return ret;
> + return PHYS_ADDR_MAX;
>
> - ret = pt_walk_range(&range, __iova_to_phys, &res);
> - /* PHYS_ADDR_MAX would be a better error code */
> + ret = pt_walk_range(&range, __iova_to_phys_length, &data);
> if (ret)
> - return 0;
> - return res;
> + return PHYS_ADDR_MAX;
> +
> + if (mapped_length)
> + *mapped_length = data.length;
> + return data.phys;
> }
> -EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(iova_to_phys), "GENERIC_PT_IOMMU");
> +EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(iova_to_phys_length), "GENERIC_PT_IOMMU");
>
> struct pt_iommu_dirty_args {
> struct iommu_dirty_bitmap *dirty;
> diff --git a/include/linux/generic_pt/iommu.h b/include/linux/generic_pt/iommu.h
> index dd0edd02a48a..859b853e9dc7 100644
> --- a/include/linux/generic_pt/iommu.h
> +++ b/include/linux/generic_pt/iommu.h
> @@ -249,8 +249,9 @@ struct pt_iommu_cfg {
>
> /* Generate the exported function signatures from iommu_pt.h */
> #define IOMMU_PROTOTYPES(fmt) \
> - phys_addr_t pt_iommu_##fmt##_iova_to_phys(struct iommu_domain *domain, \
> - dma_addr_t iova); \
> + phys_addr_t pt_iommu_##fmt##_iova_to_phys_length( \
> + struct iommu_domain *domain, dma_addr_t iova, \
> + size_t *mapped_length); \
> int pt_iommu_##fmt##_read_and_clear_dirty( \
> struct iommu_domain *domain, unsigned long iova, size_t size, \
> unsigned long flags, struct iommu_dirty_bitmap *dirty); \
> @@ -267,11 +268,11 @@ struct pt_iommu_cfg {
> IOMMU_PROTOTYPES(fmt)
>
> /*
> - * A driver uses IOMMU_PT_DOMAIN_OPS to populate the iommu_domain_ops for the
> - * iommu_pt
> + * A driver uses IOMMU_PT_DOMAIN_OPS to populate the iommu_domain_ops for
> + * the iommu_pt
> */
> -#define IOMMU_PT_DOMAIN_OPS(fmt) \
> - .iova_to_phys = &pt_iommu_##fmt##_iova_to_phys
> +#define IOMMU_PT_DOMAIN_OPS(fmt) \
> + .iova_to_phys_length = &pt_iommu_##fmt##_iova_to_phys_length
> #define IOMMU_PT_DIRTY_OPS(fmt) \
> .read_and_clear_dirty = &pt_iommu_##fmt##_read_and_clear_dirty
>
Thanks,
baolu
On Thu, Jun 04, 2026 at 11:30:37AM +0800, Baolu Lu wrote:
> > -static __always_inline int __do_iova_to_phys(struct pt_range *range, void *arg,
> > - unsigned int level,
> > - struct pt_table_p *table,
> > - pt_level_fn_t descend_fn)
> > +struct iova_to_phys_length_data {
> > + pt_oaddr_t phys;
> > + size_t length;
> > +};
> > +
> > +static __always_inline int __do_iova_to_phys_length(struct pt_range *range,
> > + void *arg, unsigned int level,
> > + struct pt_table_p *table,
> > + pt_level_fn_t descend_fn)
> > {
> > struct pt_state pts = pt_init(range, level, table);
> > - pt_oaddr_t *res = arg;
> > + struct iova_to_phys_length_data *data = arg;
> > + unsigned int entry_lg2sz;
> > + size_t entry_sz;
> > + pt_oaddr_t expected_oa;
> > switch (pt_load_single_entry(&pts)) {
> > case PT_ENTRY_EMPTY:
> > @@ -159,45 +167,77 @@ static __always_inline int __do_iova_to_phys(struct pt_range *range, void *arg,
> > case PT_ENTRY_TABLE:
> > return pt_descend(&pts, arg, descend_fn);
> > case PT_ENTRY_OA:
> > - *res = pt_entry_oa_exact(&pts);
> > - return 0;
> > + break;
> > }
> > - return -ENOENT;
> > +
> > + data->phys = pt_entry_oa_exact(&pts);
> > + entry_lg2sz = pt_entry_oa_lg2sz(&pts);
> > + entry_sz = log2_to_int(entry_lg2sz);
> > +
> > + /* Start with the full mapping size of the first entry */
> > + data->length = entry_sz;
>
> data->length doesn't account for iova offset. Is this by design? We
> should document this clearly somewhere.
That's definitely a mistake; the phys has to be offset by the iova in all cases,
it is part of the API.
Also add kunit tests to the iommupt selftest to cover various
scenarios please.
Also this doesn't look quite right, the walk should look more like
unmap where we just walk and stop walking when we hit a physical
address discontiguity. The stop point defines the result length.
Jason
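A minimal userspace sketch of the shape suggested above, under loud
assumptions: the flat one-level table, struct toy_table,
toy_iova_to_phys_length() and TOY_NO_PHYS are all hypothetical and only
stand in for the generic_pt walker and PHYS_ADDR_MAX. It walks forward from
the entry covering the IOVA and stops at the first hole or physical
discontiguity; the stop point defines the length, and the first entry only
contributes the bytes from the IOVA to its end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_LG2SZ 12u
#define TOY_PAGE_SZ (1ULL << TOY_PAGE_LG2SZ)
#define TOY_NO_PHYS UINT64_MAX /* stands in for PHYS_ADDR_MAX */

/* Hypothetical flat table: one entry per 4K page, 0 means not present. */
struct toy_table {
	const uint64_t *oa; /* page-aligned output address per index */
	size_t num_items;
};

static uint64_t toy_iova_to_phys_length(const struct toy_table *tbl,
					uint64_t iova, size_t *mapped_length)
{
	size_t first = (size_t)(iova >> TOY_PAGE_LG2SZ);
	uint64_t offset = iova & (TOY_PAGE_SZ - 1);
	uint64_t expected_oa;
	size_t index;
	size_t length;

	if (first >= tbl->num_items || !tbl->oa[first])
		return TOY_NO_PHYS;

	/* The toy format only stores page-aligned output addresses */
	assert(!(tbl->oa[first] & (TOY_PAGE_SZ - 1)));

	/* The first entry contributes only the bytes from iova to its end */
	length = (size_t)(TOY_PAGE_SZ - offset);
	expected_oa = tbl->oa[first] + TOY_PAGE_SZ;

	/* Keep walking until a hole or a physical discontiguity */
	for (index = first + 1; index < tbl->num_items; index++) {
		if (tbl->oa[index] != expected_oa)
			break;
		length += TOY_PAGE_SZ;
		expected_oa += TOY_PAGE_SZ;
	}

	if (mapped_length)
		*mapped_length = length;
	/* The returned phys always carries the sub-page offset of iova */
	return tbl->oa[first] + offset;
}
```

The sketch ignores mixed entry sizes and multi-level descent, which the real
walker has to handle; it is only meant to show where the offset and the stop
condition enter the length calculation.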
Migrate ARM SMMUv3 to implement iova_to_phys_length, calling
ops->iova_to_phys_length on the io-pgtable layer.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index e8d7dbe495f0..616e7057ec7f 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -4069,14 +4069,16 @@ static void arm_smmu_iotlb_sync(struct iommu_domain *domain,
}
static phys_addr_t
-arm_smmu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
+arm_smmu_iova_to_phys_length(struct iommu_domain *domain, dma_addr_t iova,
+ size_t *mapped_length)
{
struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
+
if (!ops)
- return 0;
+ return PHYS_ADDR_MAX;
- return ops->iova_to_phys(ops, iova);
+ return ops->iova_to_phys_length(ops, iova, mapped_length);
}
static struct platform_driver arm_smmu_driver;
@@ -4396,7 +4398,7 @@ static const struct iommu_ops arm_smmu_ops = {
.unmap_pages = arm_smmu_unmap_pages,
.flush_iotlb_all = arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,
- .iova_to_phys = arm_smmu_iova_to_phys,
+ .iova_to_phys_length = arm_smmu_iova_to_phys_length,
.free = arm_smmu_domain_free_paging,
}
};
--
2.43.7
Migrate ARM SMMU to implement iova_to_phys_length, calling
ops->iova_to_phys_length on the io-pgtable layer. Update qcom-debug
caller accordingly.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c | 2 +-
drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +++++++------
2 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c
index 65e0ef6539fe..4fd01341157f 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom-debug.c
@@ -415,7 +415,7 @@ irqreturn_t qcom_smmu_context_fault(int irq, void *dev)
return IRQ_HANDLED;
}
- phys_soft = ops->iova_to_phys(ops, cfi.iova);
+ phys_soft = ops->iova_to_phys_length(ops, cfi.iova, NULL);
tmp = report_iommu_fault(&smmu_domain->domain, NULL, cfi.iova,
cfi.fsynr & ARM_SMMU_CB_FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ);
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 0bd21d206eb3..5c9ec7c93763 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1366,7 +1366,7 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
"iova to phys timed out on %pad. Falling back to software table walk.\n",
&iova);
arm_smmu_rpm_put(smmu);
- return ops->iova_to_phys(ops, iova);
+ return ops->iova_to_phys_length(ops, iova, NULL);
}
phys = arm_smmu_cb_readq(smmu, idx, ARM_SMMU_CB_PAR);
@@ -1384,20 +1384,21 @@ static phys_addr_t arm_smmu_iova_to_phys_hard(struct iommu_domain *domain,
return addr;
}
-static phys_addr_t arm_smmu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t arm_smmu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova, size_t *mapped_length)
{
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
+
if (!ops)
- return 0;
+ return PHYS_ADDR_MAX;
if (smmu_domain->smmu->features & ARM_SMMU_FEAT_TRANS_OPS &&
smmu_domain->stage == ARM_SMMU_DOMAIN_S1)
return arm_smmu_iova_to_phys_hard(domain, iova);
- return ops->iova_to_phys(ops, iova);
+ return ops->iova_to_phys_length(ops, iova, mapped_length);
}
static bool arm_smmu_capable(struct device *dev, enum iommu_cap cap)
@@ -1652,7 +1653,7 @@ static const struct iommu_ops arm_smmu_ops = {
.unmap_pages = arm_smmu_unmap_pages,
.flush_iotlb_all = arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,
- .iova_to_phys = arm_smmu_iova_to_phys,
+ .iova_to_phys_length = arm_smmu_iova_to_phys_length,
.set_pgtable_quirks = arm_smmu_set_pgtable_quirks,
.free = arm_smmu_domain_free,
}
--
2.43.7
Migrate Qualcomm IOMMU to implement iova_to_phys_length, calling
ops->iova_to_phys_length on the io-pgtable layer.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/arm/arm-smmu/qcom_iommu.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu/qcom_iommu.c b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
index a1e8cf29f594..9e0e8b5caec1 100644
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -489,19 +489,20 @@ static void qcom_iommu_iotlb_sync(struct iommu_domain *domain,
qcom_iommu_flush_iotlb_all(domain);
}
-static phys_addr_t qcom_iommu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t qcom_iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova, size_t *mapped_length)
{
phys_addr_t ret;
unsigned long flags;
struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
struct io_pgtable_ops *ops = qcom_domain->pgtbl_ops;
+
if (!ops)
- return 0;
+ return PHYS_ADDR_MAX;
spin_lock_irqsave(&qcom_domain->pgtbl_lock, flags);
- ret = ops->iova_to_phys(ops, iova);
+ ret = ops->iova_to_phys_length(ops, iova, mapped_length);
spin_unlock_irqrestore(&qcom_domain->pgtbl_lock, flags);
return ret;
@@ -602,7 +603,7 @@ static const struct iommu_ops qcom_iommu_ops = {
.unmap_pages = qcom_iommu_unmap,
.flush_iotlb_all = qcom_iommu_flush_iotlb_all,
.iotlb_sync = qcom_iommu_iotlb_sync,
- .iova_to_phys = qcom_iommu_iova_to_phys,
+ .iova_to_phys_length = qcom_iommu_iova_to_phys_length,
.free = qcom_iommu_domain_free,
}
};
--
2.43.7
Migrate Apple DART to implement iova_to_phys_length, passing through
mapped_length from io-pgtable.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/apple-dart.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/apple-dart.c b/drivers/iommu/apple-dart.c
index 17bdadb6b504..fdc533ba72da 100644
--- a/drivers/iommu/apple-dart.c
+++ b/drivers/iommu/apple-dart.c
@@ -528,16 +528,17 @@ static int apple_dart_iotlb_sync_map(struct iommu_domain *domain,
return 0;
}
-static phys_addr_t apple_dart_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t apple_dart_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova, size_t *mapped_length)
{
struct apple_dart_domain *dart_domain = to_dart_domain(domain);
struct io_pgtable_ops *ops = dart_domain->pgtbl_ops;
+
if (!ops)
- return 0;
+ return PHYS_ADDR_MAX;
- return ops->iova_to_phys(ops, iova);
+ return ops->iova_to_phys_length(ops, iova, mapped_length);
}
static int apple_dart_map_pages(struct iommu_domain *domain, unsigned long iova,
@@ -1018,7 +1019,7 @@ static const struct iommu_ops apple_dart_iommu_ops = {
.flush_iotlb_all = apple_dart_flush_iotlb_all,
.iotlb_sync = apple_dart_iotlb_sync,
.iotlb_sync_map = apple_dart_iotlb_sync_map,
- .iova_to_phys = apple_dart_iova_to_phys,
+ .iova_to_phys_length = apple_dart_iova_to_phys_length,
.free = apple_dart_domain_free,
}
};
--
2.43.7
Migrate IPMMU-VMSA to implement iova_to_phys_length, passing through
mapped_length from io-pgtable.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/ipmmu-vmsa.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 9386b752dea2..6e2e5922ab1b 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -699,14 +699,16 @@ static void ipmmu_iotlb_sync(struct iommu_domain *io_domain,
ipmmu_flush_iotlb_all(io_domain);
}
-static phys_addr_t ipmmu_iova_to_phys(struct iommu_domain *io_domain,
- dma_addr_t iova)
+static phys_addr_t ipmmu_iova_to_phys_length(struct iommu_domain *io_domain,
+ dma_addr_t iova, size_t *mapped_length)
{
struct ipmmu_vmsa_domain *domain = to_vmsa_domain(io_domain);
+
/* TODO: Is locking needed ? */
- return domain->iop->iova_to_phys(domain->iop, iova);
+ return domain->iop->iova_to_phys_length(domain->iop, iova,
+ mapped_length);
}
static int ipmmu_init_platform_device(struct device *dev,
@@ -892,7 +894,7 @@ static const struct iommu_ops ipmmu_ops = {
.unmap_pages = ipmmu_unmap,
.flush_iotlb_all = ipmmu_flush_iotlb_all,
.iotlb_sync = ipmmu_iotlb_sync,
- .iova_to_phys = ipmmu_iova_to_phys,
+ .iova_to_phys_length = ipmmu_iova_to_phys_length,
.free = ipmmu_domain_free,
}
};
--
2.43.7
Migrate MediaTek IOMMU to implement iova_to_phys_length, passing through
mapped_length from io-pgtable.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/mtk_iommu.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 2be990c108de..6ca31f8d4d96 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -858,13 +858,16 @@ static int mtk_iommu_sync_map(struct iommu_domain *domain, unsigned long iova,
return 0;
}
-static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t mtk_iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova, size_t *mapped_length)
{
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
phys_addr_t pa;
- pa = dom->iop->iova_to_phys(dom->iop, iova);
+ pa = dom->iop->iova_to_phys_length(dom->iop, iova, mapped_length);
+ if (pa == PHYS_ADDR_MAX)
+ return PHYS_ADDR_MAX;
+
if (IS_ENABLED(CONFIG_PHYS_ADDR_T_64BIT) &&
dom->bank->parent_data->enable_4GB &&
pa >= MTK_IOMMU_4GB_MODE_REMAP_BASE)
@@ -1070,7 +1073,7 @@ static const struct iommu_ops mtk_iommu_ops = {
.flush_iotlb_all = mtk_iommu_flush_iotlb_all,
.iotlb_sync = mtk_iommu_iotlb_sync,
.iotlb_sync_map = mtk_iommu_sync_map,
- .iova_to_phys = mtk_iommu_iova_to_phys,
+ .iova_to_phys_length = mtk_iommu_iova_to_phys_length,
.free = mtk_iommu_domain_free,
}
};
--
2.43.7
Implement iova_to_phys_length for Exynos IOMMU driver,
returning the actual PTE mapping size.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/exynos-iommu.c | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 874d05f4b396..17d77b9114c4 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1372,13 +1372,14 @@ static size_t exynos_iommu_unmap(struct iommu_domain *iommu_domain,
return 0;
}
-static phys_addr_t exynos_iommu_iova_to_phys(struct iommu_domain *iommu_domain,
- dma_addr_t iova)
+static phys_addr_t exynos_iommu_iova_to_phys_length(struct iommu_domain *iommu_domain,
+ dma_addr_t iova,
+ size_t *mapped_length)
{
struct exynos_iommu_domain *domain = to_exynos_domain(iommu_domain);
sysmmu_pte_t *entry;
unsigned long flags;
- phys_addr_t phys = 0;
+ phys_addr_t phys = PHYS_ADDR_MAX;
spin_lock_irqsave(&domain->pgtablelock, flags);
@@ -1386,13 +1387,20 @@ static phys_addr_t exynos_iommu_iova_to_phys(struct iommu_domain *iommu_domain,
if (lv1ent_section(entry)) {
phys = section_phys(entry) + section_offs(iova);
+ if (mapped_length)
+ *mapped_length = SECT_SIZE;
} else if (lv1ent_page(entry)) {
entry = page_entry(entry, iova);
- if (lv2ent_large(entry))
+ if (lv2ent_large(entry)) {
phys = lpage_phys(entry) + lpage_offs(iova);
- else if (lv2ent_small(entry))
+ if (mapped_length)
+ *mapped_length = LPAGE_SIZE;
+ } else if (lv2ent_small(entry)) {
phys = spage_phys(entry) + spage_offs(iova);
+ if (mapped_length)
+ *mapped_length = SPAGE_SIZE;
+ }
}
spin_unlock_irqrestore(&domain->pgtablelock, flags);
@@ -1484,7 +1492,7 @@ static const struct iommu_ops exynos_iommu_ops = {
.attach_dev = exynos_iommu_attach_device,
.map_pages = exynos_iommu_map,
.unmap_pages = exynos_iommu_unmap,
- .iova_to_phys = exynos_iommu_iova_to_phys,
+ .iova_to_phys_length = exynos_iommu_iova_to_phys_length,
.free = exynos_iommu_domain_free,
}
};
--
2.43.7
Implement iova_to_phys_length for FSL PAMU IOMMU driver,
returning the actual PTE mapping size.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/fsl_pamu_domain.c | 26 ++++++++++++++++++++++----
1 file changed, 22 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/fsl_pamu_domain.c b/drivers/iommu/fsl_pamu_domain.c
index 9664ef9840d2..60abd497dc63 100644
--- a/drivers/iommu/fsl_pamu_domain.c
+++ b/drivers/iommu/fsl_pamu_domain.c
@@ -169,12 +169,30 @@ static void attach_device(struct fsl_dma_domain *dma_domain, int liodn, struct d
spin_unlock_irqrestore(&device_domain_lock, flags);
}
-static phys_addr_t fsl_pamu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t fsl_pamu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length)
{
+
if (iova < domain->geometry.aperture_start ||
iova > domain->geometry.aperture_end)
- return 0;
+ return PHYS_ADDR_MAX;
+
+ /*
+ * PAMU configures exactly one Primary PAACE entry per LIODN with the
+ * Multi-Window (MW) bit cleared and ATM = PAACE_ATM_WINDOW_XLATE,
+ * WBAL/TWBAL = 0. That is, every LIODN is backed by a single hardware
+ * mapping window of fixed size (1ULL << 36, i.e. 64GB, see
+ * pamu_config_ppaace()) performing an identity translation. The driver
+ * does not split this window into SPAACE sub-windows, so the entire
+ * aperture is one PAACE "PTE" entry. Return that single entry's size
+ * as mapped_length, matching the iova_to_phys_length contract that
+ * mapped_length reports the full size of the mapping entry which
+ * covers iova (not the remaining bytes from iova to its end).
+ */
+ if (mapped_length)
+ *mapped_length = 1ULL << 36;
+
return iova;
}
@@ -435,7 +453,7 @@ static const struct iommu_ops fsl_pamu_ops = {
.device_group = fsl_pamu_device_group,
.default_domain_ops = &(const struct iommu_domain_ops) {
.attach_dev = fsl_pamu_attach_device,
- .iova_to_phys = fsl_pamu_iova_to_phys,
+ .iova_to_phys_length = fsl_pamu_iova_to_phys_length,
.free = fsl_pamu_domain_free,
}
};
--
2.43.7
Implement iova_to_phys_length for MSM IOMMU driver,
returning the actual PTE mapping size.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/msm_iommu.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 0ad5ff431d5b..1038e8141223 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -523,15 +523,16 @@ static size_t msm_iommu_unmap(struct iommu_domain *domain, unsigned long iova,
return ret;
}
-static phys_addr_t msm_iommu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t va)
+static phys_addr_t msm_iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t va,
+ size_t *mapped_length)
{
struct msm_priv *priv;
struct msm_iommu_dev *iommu;
struct msm_iommu_ctx_dev *master;
unsigned int par;
unsigned long flags;
- phys_addr_t ret = 0;
+ phys_addr_t ret = PHYS_ADDR_MAX;
spin_lock_irqsave(&msm_iommu_lock, flags);
@@ -558,13 +559,19 @@ static phys_addr_t msm_iommu_iova_to_phys(struct iommu_domain *domain,
par = GET_PAR(iommu->base, master->num);
/* We are dealing with a supersection */
- if (GET_NOFAULT_SS(iommu->base, master->num))
+ if (GET_NOFAULT_SS(iommu->base, master->num)) {
ret = (par & 0xFF000000) | (va & 0x00FFFFFF);
- else /* Upper 20 bits from PAR, lower 12 from VA */
+ if (mapped_length)
+ *mapped_length = SZ_16M;
+ } else {
+ /* Upper 20 bits from PAR, lower 12 from VA */
ret = (par & 0xFFFFF000) | (va & 0x00000FFF);
+ if (mapped_length)
+ *mapped_length = SZ_4K;
+ }
if (GET_FAULT(iommu->base, master->num))
- ret = 0;
+ ret = PHYS_ADDR_MAX;
__disable_clocks(iommu);
fail:
@@ -706,7 +713,7 @@ static struct iommu_ops msm_iommu_ops = {
*/
.iotlb_sync = NULL,
.iotlb_sync_map = msm_iommu_sync_map,
- .iova_to_phys = msm_iommu_iova_to_phys,
+ .iova_to_phys_length = msm_iommu_iova_to_phys_length,
.free = msm_iommu_domain_free,
}
};
--
2.43.7
Implement iova_to_phys_length for MediaTek v1 IOMMU driver,
returning the actual PTE mapping size.
Also fix pre-existing bug: add page offset to physical address.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/mtk_iommu_v1.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index ac97dd2868d4..da41dda7620b 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -393,7 +393,9 @@ static size_t mtk_iommu_v1_unmap(struct iommu_domain *domain, unsigned long iova
return size;
}
-static phys_addr_t mtk_iommu_v1_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
+static phys_addr_t mtk_iommu_v1_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length)
{
struct mtk_iommu_v1_domain *dom = to_mtk_domain(domain);
unsigned long flags;
@@ -404,6 +406,13 @@ static phys_addr_t mtk_iommu_v1_iova_to_phys(struct iommu_domain *domain, dma_ad
pa = pa & (~(MT2701_IOMMU_PAGE_SIZE - 1));
spin_unlock_irqrestore(&dom->pgtlock, flags);
+ if (!pa)
+ return PHYS_ADDR_MAX;
+
+ pa |= (iova & (MT2701_IOMMU_PAGE_SIZE - 1));
+ if (mapped_length)
+ *mapped_length = MT2701_IOMMU_PAGE_SIZE;
+
return pa;
}
@@ -590,7 +599,7 @@ static const struct iommu_ops mtk_iommu_v1_ops = {
.attach_dev = mtk_iommu_v1_attach_device,
.map_pages = mtk_iommu_v1_map,
.unmap_pages = mtk_iommu_v1_unmap,
- .iova_to_phys = mtk_iommu_v1_iova_to_phys,
+ .iova_to_phys_length = mtk_iommu_v1_iova_to_phys_length,
.free = mtk_iommu_v1_domain_free,
}
};
--
2.43.7
Implement iova_to_phys_length for OMAP IOMMU driver,
returning the actual PTE mapping size.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/omap-iommu.c | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 8231d7d6bb6a..f4a416326f7c 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1592,15 +1592,16 @@ static void omap_iommu_domain_free(struct iommu_domain *domain)
kfree(omap_domain);
}
-static phys_addr_t omap_iommu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t da)
+static phys_addr_t omap_iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t da,
+ size_t *mapped_length)
{
struct omap_iommu_domain *omap_domain = to_omap_domain(domain);
struct omap_iommu_device *iommu = omap_domain->iommus;
struct omap_iommu *oiommu = iommu->iommu_dev;
struct device *dev = oiommu->dev;
u32 *pgd, *pte;
- phys_addr_t ret = 0;
+ phys_addr_t ret = PHYS_ADDR_MAX;
/*
* all the iommus within the domain will have identical programming,
@@ -1609,19 +1610,27 @@ static phys_addr_t omap_iommu_iova_to_phys(struct iommu_domain *domain,
iopgtable_lookup_entry(oiommu, da, &pgd, &pte);
if (pte) {
- if (iopte_is_small(*pte))
+ if (iopte_is_small(*pte)) {
ret = omap_iommu_translate(*pte, da, IOPTE_MASK);
- else if (iopte_is_large(*pte))
+ if (mapped_length)
+ *mapped_length = IOPTE_SIZE;
+ } else if (iopte_is_large(*pte)) {
ret = omap_iommu_translate(*pte, da, IOLARGE_MASK);
- else
+ if (mapped_length)
+ *mapped_length = IOLARGE_SIZE;
+ } else
dev_err(dev, "bogus pte 0x%x, da 0x%llx", *pte,
(unsigned long long)da);
} else {
- if (iopgd_is_section(*pgd))
+ if (iopgd_is_section(*pgd)) {
ret = omap_iommu_translate(*pgd, da, IOSECTION_MASK);
- else if (iopgd_is_super(*pgd))
+ if (mapped_length)
+ *mapped_length = IOSECTION_SIZE;
+ } else if (iopgd_is_super(*pgd)) {
ret = omap_iommu_translate(*pgd, da, IOSUPER_MASK);
- else
+ if (mapped_length)
+ *mapped_length = IOSUPER_SIZE;
+ } else
dev_err(dev, "bogus pgd 0x%x, da 0x%llx", *pgd,
(unsigned long long)da);
}
@@ -1723,7 +1732,7 @@ static const struct iommu_ops omap_iommu_ops = {
.attach_dev = omap_iommu_attach_dev,
.map_pages = omap_iommu_map,
.unmap_pages = omap_iommu_unmap,
- .iova_to_phys = omap_iommu_iova_to_phys,
+ .iova_to_phys_length = omap_iommu_iova_to_phys_length,
.free = omap_iommu_domain_free,
}
};
--
2.43.7
Implement iova_to_phys_length for Rockchip IOMMU driver,
returning the actual PTE mapping size.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/rockchip-iommu.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 0013cf196c57..a51c29340b98 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -648,12 +648,12 @@ static irqreturn_t rk_iommu_irq(int irq, void *dev_id)
return ret;
}
-static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t rk_iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova, size_t *mapped_length)
{
struct rk_iommu_domain *rk_domain = to_rk_domain(domain);
unsigned long flags;
- phys_addr_t pt_phys, phys = 0;
+ phys_addr_t pt_phys, phys = PHYS_ADDR_MAX;
u32 dte, pte;
u32 *page_table;
@@ -670,6 +670,8 @@ static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain,
goto out;
phys = rk_ops->pt_address(pte) + rk_iova_page_offset(iova);
+ if (mapped_length)
+ *mapped_length = SPAGE_SIZE;
out:
spin_unlock_irqrestore(&rk_domain->dt_lock, flags);
@@ -1187,7 +1189,7 @@ static const struct iommu_ops rk_iommu_ops = {
.attach_dev = rk_iommu_attach_device,
.map_pages = rk_iommu_map,
.unmap_pages = rk_iommu_unmap,
- .iova_to_phys = rk_iommu_iova_to_phys,
+ .iova_to_phys_length = rk_iommu_iova_to_phys_length,
.free = rk_iommu_domain_free,
}
};
--
2.43.7
Implement iova_to_phys_length for s390 IOMMU driver,
returning the actual PTE mapping size.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/s390-iommu.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index f148f559ac56..6cfcc55ef59f 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -986,22 +986,23 @@ static unsigned long *get_rto_from_iova(struct s390_domain *domain,
}
}
-static phys_addr_t s390_iommu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t s390_iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length)
{
struct s390_domain *s390_domain = to_s390_domain(domain);
unsigned long *rto, *sto, *pto;
unsigned long ste, pte, rte;
unsigned int rtx, sx, px;
- phys_addr_t phys = 0;
+ phys_addr_t phys = PHYS_ADDR_MAX;
if (iova < domain->geometry.aperture_start ||
iova > domain->geometry.aperture_end)
- return 0;
+ return PHYS_ADDR_MAX;
rto = get_rto_from_iova(s390_domain, iova);
if (!rto)
- return 0;
+ return PHYS_ADDR_MAX;
rtx = calc_rtx(iova);
sx = calc_sx(iova);
@@ -1014,8 +1015,11 @@ static phys_addr_t s390_iommu_iova_to_phys(struct iommu_domain *domain,
if (reg_entry_isvalid(ste)) {
pto = get_st_pto(ste);
pte = READ_ONCE(pto[px]);
- if (pt_entry_isvalid(pte))
+ if (pt_entry_isvalid(pte)) {
phys = pte & ZPCI_PTE_ADDR_MASK;
+ if (mapped_length)
+ *mapped_length = SZ_4K;
+ }
}
}
@@ -1183,7 +1187,7 @@ static struct iommu_domain blocking_domain = {
.flush_iotlb_all = s390_iommu_flush_iotlb_all, \
.iotlb_sync = s390_iommu_iotlb_sync, \
.iotlb_sync_map = s390_iommu_iotlb_sync_map, \
- .iova_to_phys = s390_iommu_iova_to_phys, \
+ .iova_to_phys_length = s390_iommu_iova_to_phys_length, \
.free = s390_domain_free, \
}
--
2.43.7
Implement iova_to_phys_length for Spreadtrum IOMMU driver,
returning the actual PTE mapping size.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/sprd-iommu.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/sprd-iommu.c b/drivers/iommu/sprd-iommu.c
index c1a34445d244..3c04ec040d96 100644
--- a/drivers/iommu/sprd-iommu.c
+++ b/drivers/iommu/sprd-iommu.c
@@ -366,8 +366,9 @@ static void sprd_iommu_sync(struct iommu_domain *domain,
sprd_iommu_sync_map(domain, 0, 0);
}
-static phys_addr_t sprd_iommu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t sprd_iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length)
{
struct sprd_iommu_domain *dom = to_sprd_domain(domain);
unsigned long flags;
@@ -376,13 +377,19 @@ static phys_addr_t sprd_iommu_iova_to_phys(struct iommu_domain *domain,
unsigned long end = domain->geometry.aperture_end;
if (WARN_ON(iova < start || iova > end))
- return 0;
+ return PHYS_ADDR_MAX;
spin_lock_irqsave(&dom->pgtlock, flags);
pa = *(dom->pgt_va + ((iova - start) >> SPRD_IOMMU_PAGE_SHIFT));
- pa = (pa << SPRD_IOMMU_PAGE_SHIFT) + ((iova - start) & (SPRD_IOMMU_PAGE_SIZE - 1));
spin_unlock_irqrestore(&dom->pgtlock, flags);
+ if (!pa)
+ return PHYS_ADDR_MAX;
+
+ pa = (pa << SPRD_IOMMU_PAGE_SHIFT) + ((iova - start) & (SPRD_IOMMU_PAGE_SIZE - 1));
+ if (mapped_length)
+ *mapped_length = SPRD_IOMMU_PAGE_SIZE;
+
return pa;
}
@@ -420,7 +427,7 @@ static const struct iommu_ops sprd_iommu_ops = {
.unmap_pages = sprd_iommu_unmap,
.iotlb_sync_map = sprd_iommu_sync_map,
.iotlb_sync = sprd_iommu_sync,
- .iova_to_phys = sprd_iommu_iova_to_phys,
+ .iova_to_phys_length = sprd_iommu_iova_to_phys_length,
.free = sprd_iommu_domain_free,
}
};
--
2.43.7
Implement iova_to_phys_length for sun50i IOMMU driver,
returning the actual PTE mapping size.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/sun50i-iommu.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/sun50i-iommu.c b/drivers/iommu/sun50i-iommu.c
index be3f1ce696ba..9f39fe4a9d4f 100644
--- a/drivers/iommu/sun50i-iommu.c
+++ b/drivers/iommu/sun50i-iommu.c
@@ -659,8 +659,9 @@ static size_t sun50i_iommu_unmap(struct iommu_domain *domain, unsigned long iova
return SZ_4K;
}
-static phys_addr_t sun50i_iommu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t sun50i_iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length)
{
struct sun50i_iommu_domain *sun50i_domain = to_sun50i_domain(domain);
phys_addr_t pt_phys;
@@ -669,13 +670,16 @@ static phys_addr_t sun50i_iommu_iova_to_phys(struct iommu_domain *domain,
dte = sun50i_domain->dt[sun50i_iova_get_dte_index(iova)];
if (!sun50i_dte_is_pt_valid(dte))
- return 0;
+ return PHYS_ADDR_MAX;
pt_phys = sun50i_dte_get_pt_address(dte);
page_table = (u32 *)phys_to_virt(pt_phys);
pte = page_table[sun50i_iova_get_pte_index(iova)];
if (!sun50i_pte_is_page_valid(pte))
- return 0;
+ return PHYS_ADDR_MAX;
+
+ if (mapped_length)
+ *mapped_length = SZ_4K;
return sun50i_pte_get_page_address(pte) +
sun50i_iova_get_page_offset(iova);
@@ -857,7 +861,7 @@ static const struct iommu_ops sun50i_iommu_ops = {
.flush_iotlb_all = sun50i_iommu_flush_iotlb_all,
.iotlb_sync_map = sun50i_iommu_iotlb_sync_map,
.iotlb_sync = sun50i_iommu_iotlb_sync,
- .iova_to_phys = sun50i_iommu_iova_to_phys,
+ .iova_to_phys_length = sun50i_iommu_iova_to_phys_length,
.map_pages = sun50i_iommu_map,
.unmap_pages = sun50i_iommu_unmap,
.free = sun50i_iommu_domain_free,
--
2.43.7
Implement iova_to_phys_length for Tegra SMMU IOMMU driver,
returning the actual PTE mapping size.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/tegra-smmu.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 67e7a7b925f0..12f9bb623d87 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -803,8 +803,8 @@ static size_t tegra_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
return size;
}
-static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t tegra_smmu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova, size_t *mapped_length)
{
struct tegra_smmu_as *as = to_smmu_as(domain);
unsigned long pfn;
@@ -813,10 +813,13 @@ static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
pte = tegra_smmu_pte_lookup(as, iova, &pte_dma);
if (!pte || !*pte)
- return 0;
+ return PHYS_ADDR_MAX;
pfn = *pte & as->smmu->pfn_mask;
+ if (mapped_length)
+ *mapped_length = SZ_4K;
+
return SMMU_PFN_PHYS(pfn) + SMMU_OFFSET_IN_PAGE(iova);
}
@@ -1007,7 +1010,7 @@ static const struct iommu_ops tegra_smmu_ops = {
.attach_dev = tegra_smmu_attach_dev,
.map_pages = tegra_smmu_map,
.unmap_pages = tegra_smmu_unmap,
- .iova_to_phys = tegra_smmu_iova_to_phys,
+ .iova_to_phys_length = tegra_smmu_iova_to_phys_length,
.free = tegra_smmu_domain_free,
}
};
--
2.43.7
Implement iova_to_phys_length for the virtio IOMMU driver, returning
the full length of the interval-tree mapping that covers the IOVA.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/virtio-iommu.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
index 587fc13197f1..b92316257e42 100644
--- a/drivers/iommu/virtio-iommu.c
+++ b/drivers/iommu/virtio-iommu.c
@@ -912,10 +912,11 @@ static size_t viommu_unmap_pages(struct iommu_domain *domain, unsigned long iova
return ret ? 0 : unmapped;
}
-static phys_addr_t viommu_iova_to_phys(struct iommu_domain *domain,
- dma_addr_t iova)
+static phys_addr_t viommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length)
{
- u64 paddr = 0;
+ u64 paddr = PHYS_ADDR_MAX;
unsigned long flags;
struct viommu_mapping *mapping;
struct interval_tree_node *node;
@@ -926,6 +927,9 @@ static phys_addr_t viommu_iova_to_phys(struct iommu_domain *domain,
if (node) {
mapping = container_of(node, struct viommu_mapping, iova);
paddr = mapping->paddr + (iova - mapping->iova.start);
+ if (mapped_length)
+ *mapped_length = mapping->iova.last -
+ mapping->iova.start + 1;
}
spin_unlock_irqrestore(&vdomain->mappings_lock, flags);
@@ -1102,7 +1106,7 @@ static const struct iommu_ops viommu_ops = {
.attach_dev = viommu_attach_dev,
.map_pages = viommu_map_pages,
.unmap_pages = viommu_unmap_pages,
- .iova_to_phys = viommu_iova_to_phys,
+ .iova_to_phys_length = viommu_iova_to_phys_length,
.flush_iotlb_all = viommu_flush_iotlb_all,
.iotlb_sync = viommu_iotlb_sync,
.iotlb_sync_map = viommu_iotlb_sync_map,
--
2.43.7
Use iommu_iova_to_phys_length() to get PTE page size, allowing
traversal by actual mapping granularity instead of PAGE_SIZE steps.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
Acked-by: Shiqiang Zhang <shiyu.zsq@linux.alibaba.com>
Acked-by: Simon Guo <wei.guo.simon@linux.alibaba.com>
---
drivers/vfio/vfio_iommu_type1.c | 27 ++++++++++++++++++++++-----
1 file changed, 22 insertions(+), 5 deletions(-)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index c8151ba54de3..115d88d7003e 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -1177,25 +1177,42 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
iommu_iotlb_gather_init(&iotlb_gather);
while (pos < dma->size) {
- size_t unmapped, len;
+ size_t unmapped, len, pgsize;
phys_addr_t phys, next;
dma_addr_t iova = dma->iova + pos;
- phys = iommu_iova_to_phys(domain->domain, iova);
- if (WARN_ON(!phys)) {
+ /* Single page table walk returns both phys and PTE size */
+ phys = iommu_iova_to_phys_length(domain->domain, iova,
+ &pgsize);
+ if (WARN_ON(phys == PHYS_ADDR_MAX)) {
pos += PAGE_SIZE;
continue;
}
+ if (WARN_ON(!pgsize || pgsize < PAGE_SIZE))
+ pgsize = PAGE_SIZE;
/*
* To optimize for fewer iommu_unmap() calls, each of which
* may require hardware cache flushing, try to find the
* largest contiguous physical memory chunk to unmap.
+ *
+ * mapped_length already accounts for contiguous entries
+ * from iova, then try to join following physically
+ * contiguous PTEs.
*/
- for (len = PAGE_SIZE; pos + len < dma->size; len += PAGE_SIZE) {
- next = iommu_iova_to_phys(domain->domain, iova + len);
+ len = min_t(size_t, pgsize, dma->size - pos);
+ for (; pos + len < dma->size; ) {
+ size_t next_pgsize;
+
+ next = iommu_iova_to_phys_length(domain->domain,
+ iova + len,
+ &next_pgsize);
if (next != phys + len)
break;
+ if (WARN_ON(!next_pgsize || next_pgsize < PAGE_SIZE))
+ next_pgsize = PAGE_SIZE;
+ len += min_t(size_t, next_pgsize,
+ dma->size - pos - len);
}
/*
--
2.43.7
On Wed, Jun 03, 2026 at 11:17:55PM +0800, Guanghui Feng wrote:
> Use iommu_iova_to_phys_length() to get PTE page size, allowing
> traversal by actual mapping granularity instead of PAGE_SIZE steps.
>
> Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
> Acked-by: Shiqiang Zhang <shiyu.zsq@linux.alibaba.com>
> Acked-by: Simon Guo <wei.guo.simon@linux.alibaba.com>
> ---
> drivers/vfio/vfio_iommu_type1.c | 27 ++++++++++++++++++++++-----
> 1 file changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index c8151ba54de3..115d88d7003e 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1177,25 +1177,42 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
>
> iommu_iotlb_gather_init(&iotlb_gather);
> while (pos < dma->size) {
> - size_t unmapped, len;
> + size_t unmapped, len, pgsize;
> phys_addr_t phys, next;
> dma_addr_t iova = dma->iova + pos;
>
> - phys = iommu_iova_to_phys(domain->domain, iova);
> - if (WARN_ON(!phys)) {
> + /* Single page table walk returns both phys and PTE size */
> + phys = iommu_iova_to_phys_length(domain->domain, iova,
> + &pgsize);
> + if (WARN_ON(phys == PHYS_ADDR_MAX)) {
> pos += PAGE_SIZE;
> continue;
> }
> + if (WARN_ON(!pgsize || pgsize < PAGE_SIZE))
> + pgsize = PAGE_SIZE;
>
> /*
> * To optimize for fewer iommu_unmap() calls, each of which
> * may require hardware cache flushing, try to find the
> * largest contiguous physical memory chunk to unmap.
> + *
> + * mapped_length already accounts for contiguous entries
> + * from iova, then try to join following physically
> + * contiguous PTEs.
> */
> - for (len = PAGE_SIZE; pos + len < dma->size; len += PAGE_SIZE) {
> - next = iommu_iova_to_phys(domain->domain, iova + len);
> + len = min_t(size_t, pgsize, dma->size - pos);
> + for (; pos + len < dma->size; ) {
> + size_t next_pgsize;
> +
> + next = iommu_iova_to_phys_length(domain->domain,
> + iova + len,
> + &next_pgsize);
VFIO should not be calling it twice. The core code needs to give the
best length as efficiently as it can, rather than having this open-coded
in callers. I think I've said this three times now.
Jason
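To illustrate the review comment above, here is a minimal userspace sketch
of what a single core helper that reports the longest physically contiguous
length could look like, so that callers never walk the tables twice
themselves. Everything named toy_* is hypothetical and only models the idea;
it is not the kernel API, and the linear table stands in for a real page
table walk. It assumes power-of-two, naturally aligned entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PHYS_INVALID UINT64_MAX /* stands in for PHYS_ADDR_MAX */

/* Hypothetical model of one page table entry: a naturally aligned block. */
struct toy_pte {
	uint64_t iova;	/* start of the block, aligned to size */
	uint64_t phys;	/* physical address backing iova */
	size_t size;	/* block size, a power of two (e.g. 4K or 2M) */
};

struct toy_domain {
	const struct toy_pte *ptes;
	size_t nr;
};

/* Models iova_to_phys_length(): translate and report the entry size. */
static uint64_t toy_iova_to_phys_length(const struct toy_domain *d,
					uint64_t iova, size_t *mapped_length)
{
	for (size_t i = 0; i < d->nr; i++) {
		const struct toy_pte *p = &d->ptes[i];

		if (iova >= p->iova && iova - p->iova < p->size) {
			if (mapped_length)
				*mapped_length = p->size;
			return p->phys + (iova - p->iova);
		}
	}
	if (mapped_length)
		*mapped_length = 0;
	return TOY_PHYS_INVALID;
}

/*
 * Hypothetical core helper: return the physical address for iova and, via
 * contig, how many bytes (at most max) are physically contiguous from iova.
 */
static uint64_t toy_iova_to_phys_contig(const struct toy_domain *d,
					uint64_t iova, size_t max,
					size_t *contig)
{
	size_t size, len;
	uint64_t phys;

	assert(contig != NULL);
	*contig = 0;

	phys = toy_iova_to_phys_length(d, iova, &size);
	if (phys == TOY_PHYS_INVALID)
		return phys;

	/* Only the tail of the entry, from iova to its end, is usable. */
	len = size - (size_t)(iova & (size - 1));
	while (len < max) {
		uint64_t next = toy_iova_to_phys_length(d, iova + len, &size);

		if (next == TOY_PHYS_INVALID || next != phys + len)
			break;
		len += size;
	}
	*contig = len < max ? len : max;
	return phys;
}
```

With a helper of this shape, a caller such as an unmap loop would make one
call per contiguous chunk and advance by the returned length, instead of
re-querying each following entry on its own.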
Use iommu_iova_to_phys_length() to get PTE page size in
batch_from_domain and raw_pages_from_domain, allowing traversal
by actual mapping granularity instead of PAGE_SIZE steps.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
Acked-by: Shiqiang Zhang <shiyu.zsq@linux.alibaba.com>
Acked-by: Simon Guo <wei.guo.simon@linux.alibaba.com>
---
drivers/iommu/iommufd/pages.c | 74 +++++++++++++++++++++++++++++------
1 file changed, 62 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 9bdb2945afe1..40a2fe9adf9c 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -417,17 +417,42 @@ static void batch_from_domain(struct pfn_batch *batch,
if (start_index == iopt_area_index(area))
page_offset = area->page_offset;
while (start_index <= last_index) {
+ size_t pgsize;
+ unsigned long npages;
+ unsigned long i;
+
/*
- * This is pretty slow, it would be nice to get the page size
- * back from the driver, or have the driver directly fill the
- * batch.
+ * Use iova_to_phys_length to get both the physical address
+ * and the contiguous mapped length in a single page table
+ * walk, allowing us to skip ahead by the contiguous region
+ * size instead of walking page tables for every PAGE_SIZE.
+ * Query at page-aligned iova so pgsize covers from page start.
*/
- phys = iommu_iova_to_phys(domain, iova) - page_offset;
- if (!batch_add_pfn(batch, PHYS_PFN(phys)))
- return;
- iova += PAGE_SIZE - page_offset;
+ phys = iommu_iova_to_phys_length(domain, iova - page_offset,
+ &pgsize);
+ if (WARN_ON(phys == PHYS_ADDR_MAX))
+ break;
+ if (WARN_ON(!pgsize || pgsize < PAGE_SIZE))
+ pgsize = PAGE_SIZE;
+
+ /*
+ * pgsize is the contiguous length from the page-aligned
+ * iova, so npages is simply pgsize / PAGE_SIZE.
+ */
+ npages = pgsize / PAGE_SIZE;
+ npages = min_t(unsigned long, npages,
+ last_index - start_index + 1);
+ if (!npages)
+ npages = 1;
+
+ for (i = 0; i < npages; i++) {
+ if (!batch_add_pfn(batch, PHYS_PFN(phys) + i))
+ return;
+ }
+
+ iova += npages * PAGE_SIZE - page_offset;
page_offset = 0;
- start_index++;
+ start_index += npages;
}
}
@@ -445,11 +470,36 @@ static struct page **raw_pages_from_domain(struct iommu_domain *domain,
if (start_index == iopt_area_index(area))
page_offset = area->page_offset;
while (start_index <= last_index) {
- phys = iommu_iova_to_phys(domain, iova) - page_offset;
- *(out_pages++) = pfn_to_page(PHYS_PFN(phys));
- iova += PAGE_SIZE - page_offset;
+ size_t pgsize;
+ unsigned long npages;
+ unsigned long i;
+
+ /*
+ * Resolve the contiguous mapped length together with the
+ * physical address so we can fill multiple struct page
+ * pointers per page table walk when the IOMMU uses large
+ * pages. Query at page-aligned iova so pgsize covers from
+ * page start.
+ */
+ phys = iommu_iova_to_phys_length(domain, iova - page_offset,
+ &pgsize);
+ if (WARN_ON(phys == PHYS_ADDR_MAX))
+ break;
+ if (WARN_ON(!pgsize || pgsize < PAGE_SIZE))
+ pgsize = PAGE_SIZE;
+
+ npages = pgsize / PAGE_SIZE;
+ npages = min_t(unsigned long, npages,
+ last_index - start_index + 1);
+ if (!npages)
+ npages = 1;
+
+ for (i = 0; i < npages; i++)
+ *(out_pages++) = pfn_to_page(PHYS_PFN(phys) + i);
+
+ iova += npages * PAGE_SIZE - page_offset;
page_offset = 0;
- start_index++;
+ start_index += npages;
}
return out_pages;
}
--
2.43.7
On Wed, Jun 03, 2026 at 11:17:56PM +0800, Guanghui Feng wrote:
> Use iommu_iova_to_phys_length() to get PTE page size in
> + for (i = 0; i < npages; i++) {
> + if (!batch_add_pfn(batch, PHYS_PFN(phys) + i))
> + return;
batch_add_pfn_num()
Be mindful that the num is purposefully a u32, so that will need some
attention.
Jason
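As a rough illustration of the u32 concern raised above, the sketch below
adds a run of contiguous PFNs to a batch in one call per run and splits the
run when its length does not fit in 32 bits. The toy_batch structure and
both functions are hypothetical stand-ins written for this example; they
only model the general shape of a run-length PFN batch and are not the
iommufd implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PFN batch: runs of (first pfn, count). */
struct toy_batch {
	uint64_t pfns[8];
	uint32_t npfns[8];	/* per-run count is deliberately 32 bits wide */
	size_t end;		/* number of runs in use */
};

/* Append nr contiguous PFNs, merging with the last run when possible. */
static bool toy_batch_add_pfn_num(struct toy_batch *b, uint64_t pfn,
				  uint32_t nr)
{
	const size_t cap = sizeof(b->pfns) / sizeof(b->pfns[0]);

	assert(b != NULL);
	if (b->end) {
		size_t last = b->end - 1;

		/* Merge only if contiguous and the count cannot overflow. */
		if (pfn == b->pfns[last] + b->npfns[last] &&
		    nr <= UINT32_MAX - b->npfns[last]) {
			b->npfns[last] += nr;
			return true;
		}
	}
	if (b->end == cap)
		return false;
	b->pfns[b->end] = pfn;
	b->npfns[b->end] = nr;
	b->end++;
	return true;
}

/*
 * Add a run whose length may exceed 32 bits by splitting it into chunks.
 * Returns the number of pages actually added, which is less than npages
 * when the batch fills up.
 */
static uint64_t toy_batch_add_run(struct toy_batch *b, uint64_t pfn,
				  uint64_t npages)
{
	uint64_t done = 0;

	while (done < npages) {
		uint64_t left = npages - done;
		uint32_t nr = left > UINT32_MAX ? UINT32_MAX : (uint32_t)left;

		if (!toy_batch_add_pfn_num(b, pfn + done, nr))
			break;
		done += nr;
	}
	return done;
}
```

The point of the split is that the caller's page count is wider than the
per-run counter, so the clamp has to happen before the value is narrowed,
and a partially filled batch has to be reported back so the caller can
resume from the right IOVA.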
Replace direct domain->ops->iova_to_phys() call with the new
iommu_iova_to_phys_length() interface in selftest.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
Acked-by: Shiqiang Zhang <shiyu.zsq@linux.alibaba.com>
Acked-by: Simon Guo <wei.guo.simon@linux.alibaba.com>
---
drivers/iommu/iommufd/selftest.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index af07c642a526..d4dd39930224 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -1214,7 +1214,7 @@ static int iommufd_test_md_check_pa(struct iommufd_ucmd *ucmd,
pfn = page_to_pfn(pages[0]);
put_page(pages[0]);
- io_phys = mock->domain.ops->iova_to_phys(&mock->domain, iova);
+ io_phys = iommu_iova_to_phys_length(&mock->domain, iova, NULL);
if (io_phys !=
pfn * PAGE_SIZE + ((uintptr_t)uptr % PAGE_SIZE)) {
rc = -EINVAL;
--
2.43.7
Migrate panfrost_mmu to use ops->iova_to_phys_length(ops, iova, NULL)
instead of the deprecated ops->iova_to_phys.
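
The caller-side change is a change of failure sentinel: the old callback
reported "no translation" as 0, while the new one reports PHYS_ADDR_MAX.
The userspace sketch below models why the two checks are not
interchangeable. All toy_* names are hypothetical, and the mapping to
physical address 0 is an assumption made purely to show the difference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PHYS_ADDR_MAX UINT64_MAX /* models the kernel's PHYS_ADDR_MAX */

/* Hypothetical one-entry translation: a 4K page at iova 0x1000 -> phys 0. */
static uint64_t toy_iova_to_phys_length(uint64_t iova, size_t *mapped_length)
{
	if (iova >= 0x1000 && iova < 0x2000) {
		if (mapped_length)
			*mapped_length = 0x1000;
		return iova - 0x1000; /* physical page 0 is the target here */
	}
	if (mapped_length)
		*mapped_length = 0;
	return TOY_PHYS_ADDR_MAX;
}

/* New-style check, as used by the migrated callers. */
static bool toy_is_mapped(uint64_t iova)
{
	return toy_iova_to_phys_length(iova, NULL) != TOY_PHYS_ADDR_MAX;
}

/* Old-style check: a translation to physical address 0 looks unmapped. */
static bool toy_is_mapped_legacy(uint64_t iova)
{
	uint64_t phys = toy_iova_to_phys_length(iova, NULL);

	return phys != TOY_PHYS_ADDR_MAX && phys != 0;
}
```

In this model the two checks disagree only for the byte that translates to
physical address 0, which is exactly the case a zero sentinel cannot
express.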
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 4a3162c3b659..aa0bc82deaf6 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -514,7 +514,7 @@ void panfrost_mmu_unmap(struct panfrost_gem_mapping *mapping)
if (bo->is_heap)
pgcount = 1;
- if (!bo->is_heap || ops->iova_to_phys(ops, iova)) {
+ if (!bo->is_heap || ops->iova_to_phys_length(ops, iova, NULL) != PHYS_ADDR_MAX) {
unmapped_page = ops->unmap_pages(ops, iova, pgsize, pgcount, NULL);
WARN_ON(unmapped_page != pgsize * pgcount);
}
--
2.43.7
Migrate panthor_mmu to use ops->iova_to_phys_length(ops, iova, NULL)
instead of the deprecated ops->iova_to_phys.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/gpu/drm/panthor/panthor_mmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panthor/panthor_mmu.c b/drivers/gpu/drm/panthor/panthor_mmu.c
index 75d98dad7b1d..3b635fc1f651 100644
--- a/drivers/gpu/drm/panthor/panthor_mmu.c
+++ b/drivers/gpu/drm/panthor/panthor_mmu.c
@@ -903,7 +903,7 @@ static void panthor_vm_unmap_pages(struct panthor_vm *vm, u64 iova, u64 size)
* are out-of-sync. This is not supposed to happen, hence the
* above WARN_ON().
*/
- while (!ops->iova_to_phys(ops, iova + unmapped_sz) &&
+ while (ops->iova_to_phys_length(ops, iova + unmapped_sz, NULL) == PHYS_ADDR_MAX &&
unmapped_sz < pgsize * pgcount)
unmapped_sz += SZ_4K;
--
2.43.7
Migrate io-pgtable ARM selftests to use ops->iova_to_phys_length
instead of the deprecated ops->iova_to_phys.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/io-pgtable-arm-selftests.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/io-pgtable-arm-selftests.c b/drivers/iommu/io-pgtable-arm-selftests.c
index 334e70350924..78252344c3d0 100644
--- a/drivers/iommu/io-pgtable-arm-selftests.c
+++ b/drivers/iommu/io-pgtable-arm-selftests.c
@@ -72,13 +72,13 @@ static int arm_lpae_run_tests(struct kunit *test, struct io_pgtable_cfg *cfg)
* Initial sanity checks.
* Empty page tables shouldn't provide any translations.
*/
- if (ops->iova_to_phys(ops, 42))
+ if (ops->iova_to_phys_length(ops, 42, NULL) != PHYS_ADDR_MAX)
return __FAIL(test, i);
- if (ops->iova_to_phys(ops, SZ_1G + 42))
+ if (ops->iova_to_phys_length(ops, SZ_1G + 42, NULL) != PHYS_ADDR_MAX)
return __FAIL(test, i);
- if (ops->iova_to_phys(ops, SZ_2G + 42))
+ if (ops->iova_to_phys_length(ops, SZ_2G + 42, NULL) != PHYS_ADDR_MAX)
return __FAIL(test, i);
/*
@@ -100,7 +100,7 @@ static int arm_lpae_run_tests(struct kunit *test, struct io_pgtable_cfg *cfg)
GFP_KERNEL, &mapped))
return __FAIL(test, i);
- if (ops->iova_to_phys(ops, iova + 42) != (iova + 42))
+ if (ops->iova_to_phys_length(ops, iova + 42, NULL) != (iova + 42))
return __FAIL(test, i);
iova += SZ_1G;
@@ -114,7 +114,7 @@ static int arm_lpae_run_tests(struct kunit *test, struct io_pgtable_cfg *cfg)
if (ops->unmap_pages(ops, iova, size, 1, NULL) != size)
return __FAIL(test, i);
- if (ops->iova_to_phys(ops, iova + 42))
+ if (ops->iova_to_phys_length(ops, iova + 42, NULL) != PHYS_ADDR_MAX)
return __FAIL(test, i);
/* Remap full block */
@@ -122,7 +122,7 @@ static int arm_lpae_run_tests(struct kunit *test, struct io_pgtable_cfg *cfg)
IOMMU_WRITE, GFP_KERNEL, &mapped))
return __FAIL(test, i);
- if (ops->iova_to_phys(ops, iova + 42) != (iova + 42))
+ if (ops->iova_to_phys_length(ops, iova + 42, NULL) != (iova + 42))
return __FAIL(test, i);
iova += SZ_1G;
--
2.43.7
Remove the iova_to_phys wrapper function and .iova_to_phys assignment
from ARM LPAE io-pgtable, as all callers now use iova_to_phys_length.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/io-pgtable-arm.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index f33a86fa0f6c..55a32346b586 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -731,18 +731,6 @@ static int visit_iova_to_phys(struct io_pgtable_walk_data *walk_data, int lvl,
return 0;
}
-static phys_addr_t arm_lpae_iova_to_phys_length(struct io_pgtable_ops *ops,
- unsigned long iova,
- size_t *mapped_length);
-
-static phys_addr_t arm_lpae_iova_to_phys(struct io_pgtable_ops *ops,
- unsigned long iova)
-{
- phys_addr_t phys = arm_lpae_iova_to_phys_length(ops, iova, NULL);
-
- return (phys == PHYS_ADDR_MAX) ? 0 : phys;
-}
-
static phys_addr_t arm_lpae_iova_to_phys_length(struct io_pgtable_ops *ops,
unsigned long iova,
size_t *mapped_length)
@@ -965,7 +953,6 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
data->iop.ops = (struct io_pgtable_ops) {
.map_pages = arm_lpae_map_pages,
.unmap_pages = arm_lpae_unmap_pages,
- .iova_to_phys = arm_lpae_iova_to_phys,
.iova_to_phys_length = arm_lpae_iova_to_phys_length,
.read_and_clear_dirty = arm_lpae_read_and_clear_dirty,
.pgtable_walk = arm_lpae_pgtable_walk,
--
2.43.7
Remove the iova_to_phys wrapper function and .iova_to_phys assignment
from ARM v7s io-pgtable, as all callers now use iova_to_phys_length.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/io-pgtable-arm-v7s.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 62198e31a393..da065747e37c 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -641,18 +641,6 @@ static size_t arm_v7s_unmap_pages(struct io_pgtable_ops *ops, unsigned long iova
return unmapped;
}
-static phys_addr_t arm_v7s_iova_to_phys_length(struct io_pgtable_ops *ops,
- unsigned long iova,
- size_t *mapped_length);
-
-static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops,
- unsigned long iova)
-{
- phys_addr_t phys = arm_v7s_iova_to_phys_length(ops, iova, NULL);
-
- return (phys == PHYS_ADDR_MAX) ? 0 : phys;
-}
-
static phys_addr_t arm_v7s_iova_to_phys_length(struct io_pgtable_ops *ops,
unsigned long iova,
size_t *mapped_length)
@@ -730,7 +718,6 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
data->iop.ops = (struct io_pgtable_ops) {
.map_pages = arm_v7s_map_pages,
.unmap_pages = arm_v7s_unmap_pages,
- .iova_to_phys = arm_v7s_iova_to_phys,
.iova_to_phys_length = arm_v7s_iova_to_phys_length,
};
--
2.43.7
Remove the iova_to_phys wrapper function and .iova_to_phys assignment
from DART io-pgtable, as all callers now use iova_to_phys_length.
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/io-pgtable-dart.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/drivers/iommu/io-pgtable-dart.c b/drivers/iommu/io-pgtable-dart.c
index 2dac21a578a7..01c4c022830b 100644
--- a/drivers/iommu/io-pgtable-dart.c
+++ b/drivers/iommu/io-pgtable-dart.c
@@ -333,18 +333,6 @@ static size_t dart_unmap_pages(struct io_pgtable_ops *ops, unsigned long iova,
return i * pgsize;
}
-static phys_addr_t dart_iova_to_phys_length(struct io_pgtable_ops *ops,
- unsigned long iova,
- size_t *mapped_length);
-
-static phys_addr_t dart_iova_to_phys(struct io_pgtable_ops *ops,
- unsigned long iova)
-{
- phys_addr_t phys = dart_iova_to_phys_length(ops, iova, NULL);
-
- return (phys == PHYS_ADDR_MAX) ? 0 : phys;
-}
-
static phys_addr_t dart_iova_to_phys_length(struct io_pgtable_ops *ops,
unsigned long iova,
size_t *mapped_length)
@@ -416,7 +404,6 @@ dart_alloc_pgtable(struct io_pgtable_cfg *cfg)
data->iop.ops = (struct io_pgtable_ops) {
.map_pages = dart_map_pages,
.unmap_pages = dart_unmap_pages,
- .iova_to_phys = dart_iova_to_phys,
.iova_to_phys_length = dart_iova_to_phys_length,
};
--
2.43.7
Now that all drivers implement iova_to_phys_length and all callers
have migrated, remove the deprecated interfaces:
- Remove .iova_to_phys from struct iommu_domain_ops
- Remove .iova_to_phys from struct io_pgtable_ops
- Remove fallback path in iommu_iova_to_phys_length()
- iommu_iova_to_phys() remains as a thin wrapper calling _length with NULL
Signed-off-by: Guanghui Feng <guanghuifeng@linux.alibaba.com>
---
drivers/iommu/iommu.c | 16 ++--------------
include/linux/io-pgtable.h | 12 +++++-------
include/linux/iommu.h | 3 ---
3 files changed, 7 insertions(+), 24 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 320ea13488e7..1ad4787925cd 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2561,8 +2561,6 @@ phys_addr_t iommu_iova_to_phys_length(struct iommu_domain *domain,
dma_addr_t iova,
size_t *mapped_length)
{
- phys_addr_t phys;
-
if (domain->type == IOMMU_DOMAIN_IDENTITY) {
if (mapped_length)
*mapped_length = PAGE_SIZE;
@@ -2572,20 +2570,10 @@ phys_addr_t iommu_iova_to_phys_length(struct iommu_domain *domain,
if (mapped_length)
*mapped_length = 0;
- if (domain->ops->iova_to_phys_length)
- return domain->ops->iova_to_phys_length(domain, iova, mapped_length);
-
- /* Fallback to legacy iova_to_phys without length info */
- if (!domain->ops->iova_to_phys)
+ if (!domain->ops->iova_to_phys_length)
return PHYS_ADDR_MAX;
- phys = domain->ops->iova_to_phys(domain, iova);
- if (!phys)
- return PHYS_ADDR_MAX;
-
- if (mapped_length)
- *mapped_length = PAGE_SIZE;
- return phys;
+ return domain->ops->iova_to_phys_length(domain, iova, mapped_length);
}
EXPORT_SYMBOL_GPL(iommu_iova_to_phys_length);
diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 42bcdd309b88..ea7e473146e4 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -202,11 +202,11 @@ struct arm_lpae_io_pgtable_walk_data {
*
* @map_pages: Map a physically contiguous range of pages of the same size.
* @unmap_pages: Unmap a range of virtually contiguous pages of the same size.
- * @iova_to_phys: Translate iova to physical address.
- * @iova_to_phys_length: Translate iova to physical address and return the
- * remaining mapped length from iova to the end of the
- * mapping entry via @mapped_length. If @mapped_length is
- * NULL, only the physical address is returned.
+ * @iova_to_phys_length: Translate iova to physical address and return, via
+ * @mapped_length, the full size of the mapping entry
+ * that covers @iova (e.g. 4KB/2MB/1GB). If
+ * @mapped_length is NULL, only the physical address
+ * is returned.
* @pgtable_walk: (optional) Perform a page table walk for a given iova.
* @read_and_clear_dirty: Record dirty info per IOVA. If an IOVA is dirty,
* clear its dirty state from the PTE unless the
@@ -222,8 +222,6 @@ struct io_pgtable_ops {
size_t (*unmap_pages)(struct io_pgtable_ops *ops, unsigned long iova,
size_t pgsize, size_t pgcount,
struct iommu_iotlb_gather *gather);
- phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops,
- unsigned long iova);
phys_addr_t (*iova_to_phys_length)(struct io_pgtable_ops *ops,
unsigned long iova,
size_t *mapped_length);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 19da84c2922c..ca585647180b 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -746,7 +746,6 @@ struct iommu_ops {
* array->entry_num to report the number of handled
* invalidation requests. The driver data structure
* must be defined in include/uapi/linux/iommufd.h
- * @iova_to_phys: translate iova to physical address
* @iova_to_phys_length: translate iova to physical address and additionally
* return the page size of the PTE mapping at @iova
* through @mapped_length.
@@ -777,8 +776,6 @@ struct iommu_domain_ops {
int (*cache_invalidate_user)(struct iommu_domain *domain,
struct iommu_user_data_array *array);
- phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
- dma_addr_t iova);
phys_addr_t (*iova_to_phys_length)(struct iommu_domain *domain,
dma_addr_t iova,
size_t *mapped_length);
--
2.43.7