This patch series aims to fix vfio_iommu_type1.c to support
VFIO_IOMMU_MAP_DMA and VFIO_IOMMU_UNMAP_DMA operations targeting IOVA
ranges which lie against the addressable limit, i.e. ranges where
iova_start + iova_size would overflow to exactly zero.
Today, the VFIO UAPI has an inconsistency: the
VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE capability of VFIO_IOMMU_GET_INFO
reports that ranges up to the end of the address space are available
for use, but they are not actually usable due to bugs in handling
boundary conditions.
For example:
vfio_find_dma_first_node() is called to find the first dma node to unmap
given an unmap range of [iova..iova+size). The check at the end of the
function intends to test if the dma result lies beyond the end of the
unmap range. The condition is incorrectly satisfied when iova+size
overflows to zero, causing the function to return NULL.
The same issue happens inside vfio_dma_do_unmap()'s while loop.
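The boundary condition described above can be reproduced in userspace. The following is a minimal sketch, not the kernel code: the helper names are hypothetical, and the inclusive-last-address comparison is shown only as one common way to avoid the wrap, assuming size > 0 and a range that does not extend past the top of the address space.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: does a dma node starting at dma_iova begin before
 * the end of [iova, iova + size)? The exclusive end wraps to 0 when the
 * range ends exactly at the top of the 64-bit space, so the test fails.
 */
static bool starts_before_end_exclusive(uint64_t dma_iova, uint64_t iova,
					uint64_t size)
{
	return dma_iova < iova + size;
}

/*
 * Same question using the inclusive last address. Assumes size > 0 and
 * that iova + (size - 1) itself does not wrap.
 */
static bool starts_before_end_inclusive(uint64_t dma_iova, uint64_t iova,
					uint64_t size)
{
	uint64_t last = iova + (size - 1);

	return dma_iova <= last;
}
```

For iova = 0xfffffffffffff000 and size = 0x1000, the exclusive form compares against 0 and reports that the node lies outside the range, while the inclusive form compares against 0xffffffffffffffff and accepts it. For ranges that do not touch the limit, the two forms agree.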
This bug was also reported by Alejandro Jimenez in [1][2].
Of primary concern are locations in the current code which perform
comparisons against (iova + size) expressions, where overflow to zero
is possible.
The initial list of candidate locations to audit was taken from the
following:
$ rg 'iova.*\+.*size' -n drivers/vfio/vfio_iommu_type1.c | rg -v '\- 1'
173: else if (start >= dma->iova + dma->size)
192: if (start < dma->iova + dma->size) {
216: if (new->iova + new->size <= dma->iova)
1060: dma_addr_t iova = dma->iova, end = dma->iova + dma->size;
1233: if (dma && dma->iova + dma->size != iova + size)
1380: if (dma && dma->iova + dma->size != iova + size)
1501: ret = vfio_iommu_map(iommu, iova + dma->size, pfn, npage,
1504: vfio_unpin_pages_remote(dma, iova + dma->size, pfn,
1721: while (iova < dma->iova + dma->size) {
1743: i = iova + size;
1744: while (i < dma->iova + dma->size &&
1754: size_t n = dma->iova + dma->size - iova;
1785: iova += size;
1810: while (iova < dma->iova + dma->size) {
1823: i = iova + size;
1824: while (i < dma->iova + dma->size &&
2919: if (range.iova + range.size < range.iova)
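The last hit in the list above is a wrap check of the form iova + size < iova. A sketch of why that idiom also rejects a range ending exactly at the limit, and of an overflow-checked alternative, is below. It is an illustration under assumptions: range_last() and naive_wraps() are hypothetical names, and the GCC/Clang builtin __builtin_add_overflow() stands in for the kernel's check_add_overflow(). The error codes follow the changelog entries (-EINVAL for a zero size, -EOVERFLOW on overflow), but the exact kernel logic may differ.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* The idiom from the audit list: true whenever iova + size wraps. */
static bool naive_wraps(uint64_t iova, uint64_t size)
{
	return iova + size < iova;
}

/*
 * Hypothetical helper: compute the inclusive last address of the range.
 * Returns 0 on success, -EINVAL for an empty range, and -EOVERFLOW if
 * even the last byte of the range lies beyond the 64-bit space.
 */
static int range_last(uint64_t iova, uint64_t size, uint64_t *last)
{
	if (size == 0)
		return -EINVAL;
	if (__builtin_add_overflow(iova, size - 1, last))
		return -EOVERFLOW;
	return 0;
}
```

With iova = 0xfffffffffffff000 and size = 0x1000, naive_wraps() returns true even though every byte of the range is addressable, whereas range_last() succeeds and yields 0xffffffffffffffff. Doubling the size to 0x2000 makes range_last() return -EOVERFLOW, which is the case that actually needs rejecting.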
This series spends the first couple commits making mechanical
preparations before the fix lands in the third commit. Selftests are
added in the last two commits.
[1] https://lore.kernel.org/qemu-devel/20250919213515.917111-1-alejandro.j.jimenez@oracle.com/
[2] https://lore.kernel.org/all/68e18f2c-79ad-45ec-99b9-99ff68ba5438@oracle.com/
Signed-off-by: Alex Mastro <amastro@fb.com>
---
Changes in v6:
- Fix nits in selftests
- Clarify function calls with '()' in commit messages
- Link to v5: https://lore.kernel.org/r/20251027-fix-unmap-v5-0-4f0fcf8ffb7d@fb.com
Changes in v5:
- Add vfio selftests
- Clarify commit message
- Link to v4: https://lore.kernel.org/r/20251012-fix-unmap-v4-0-9eefc90ed14c@fb.com
Changes in v4:
- Fix type assigned to iova_end
- Clarify overflow checking, add checks to vfio_iommu_type1_dirty_pages
- Consider npage==0 an error for vfio_iommu_type1_pin_pages
- Link to v3: https://lore.kernel.org/r/20251010-fix-unmap-v3-0-306c724d6998@fb.com
Changes in v3:
- Fix handling of unmap_all in vfio_dma_do_unmap
- Fix !range.size to return -EINVAL for VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP
- Dedup !range.size checking
- Return -EOVERFLOW on check_*_overflow
- Link to v2: https://lore.kernel.org/r/20251007-fix-unmap-v2-0-759bceb9792e@fb.com
Changes in v2:
- Change to patch series rather than single commit
- Expand scope to fix more than just the unmap discovery path
- Link to v1: https://lore.kernel.org/r/20251005-fix-unmap-v1-1-6687732ed44e@fb.com
---
Alex Mastro (5):
      vfio/type1: sanitize for overflow using check_*_overflow()
      vfio/type1: move iova increment to unmap_unpin_*() caller
      vfio/type1: handle DMA map/unmap up to the addressable limit
      vfio: selftests: update DMA map/unmap helpers to support more test kinds
      vfio: selftests: add end of address space DMA map/unmap tests

 drivers/vfio/vfio_iommu_type1.c                    | 173 +++++++++++++--------
 .../testing/selftests/vfio/lib/include/vfio_util.h |  27 +++-
 tools/testing/selftests/vfio/lib/vfio_pci_device.c | 104 ++++++++++---
 .../testing/selftests/vfio/vfio_dma_mapping_test.c |  95 ++++++++++-
 4 files changed, 308 insertions(+), 91 deletions(-)
---
base-commit: 451bb96328981808463405d436bd58de16dd967d
change-id: 20251005-fix-unmap-c3f3e87dabfa
Best regards,
--
Alex Mastro <amastro@fb.com>
On Tue, 28 Oct 2025 09:14:59 -0700
Alex Mastro <amastro@fb.com> wrote:
> This patch series aims to fix vfio_iommu_type1.c to support
> VFIO_IOMMU_MAP_DMA and VFIO_IOMMU_UNMAP_DMA operations targeting IOVA
> ranges which lie against the addressable limit. i.e. ranges where
> iova_start + iova_size would overflow to exactly zero.
>
> Today, the VFIO UAPI has an inconsistency: The
> VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE capability of VFIO_IOMMU_GET_INFO
> reports that ranges up to the end of the address space are available
> for use, but are not really due to bugs in handling boundary conditions.
>
> For example:
>
> vfio_find_dma_first_node() is called to find the first dma node to unmap
> given an unmap range of [iova..iova+size). The check at the end of the
> function intends to test if the dma result lies beyond the end of the
> unmap range. The condition is incorrectly satisfied when iova+size
> overflows to zero, causing the function to return NULL.
>
> The same issue happens inside vfio_dma_do_unmap()'s while loop.
>
> This bug was also reported by Alejandro Jimenez in [1][2].
>
> Of primary concern are locations in the current code which perform
> comparisons against (iova + size) expressions, where overflow to zero
> is possible.
>
> The initial list of candidate locations to audit was taken from the
> following:
>
> $ rg 'iova.*\+.*size' -n drivers/vfio/vfio_iommu_type1.c | rg -v '\- 1'
> 173: else if (start >= dma->iova + dma->size)
> 192: if (start < dma->iova + dma->size) {
> 216: if (new->iova + new->size <= dma->iova)
> 1060: dma_addr_t iova = dma->iova, end = dma->iova + dma->size;
> 1233: if (dma && dma->iova + dma->size != iova + size)
> 1380: if (dma && dma->iova + dma->size != iova + size)
> 1501: ret = vfio_iommu_map(iommu, iova + dma->size, pfn, npage,
> 1504: vfio_unpin_pages_remote(dma, iova + dma->size, pfn,
> 1721: while (iova < dma->iova + dma->size) {
> 1743: i = iova + size;
> 1744: while (i < dma->iova + dma->size &&
> 1754: size_t n = dma->iova + dma->size - iova;
> 1785: iova += size;
> 1810: while (iova < dma->iova + dma->size) {
> 1823: i = iova + size;
> 1824: while (i < dma->iova + dma->size &&
> 2919: if (range.iova + range.size < range.iova)
>
> This series spends the first couple commits making mechanical
> preparations before the fix lands in the third commit. Selftests are
> added in the last two commits.
>
> [1] https://lore.kernel.org/qemu-devel/20250919213515.917111-1-alejandro.j.jimenez@oracle.com/
> [2] https://lore.kernel.org/all/68e18f2c-79ad-45ec-99b9-99ff68ba5438@oracle.com/
>
> Signed-off-by: Alex Mastro <amastro@fb.com>
>
> ---
> Changes in v6:
> - Fix nits in selftests
> - Clarify function calls with '()' in commit messages
> - Link to v5: https://lore.kernel.org/r/20251027-fix-unmap-v5-0-4f0fcf8ffb7d@fb.com
Applied to vfio for-linus branch for v6.18. Thanks!
Alex
On 2025-10-28 09:14 AM, Alex Mastro wrote:
> This series spends the first couple commits making mechanical
> preparations before the fix lands in the third commit. Selftests are
> added in the last two commits.
Hi Alex,
The new unmap_range and unmap_all selftests are failing for me. They all fail
when attempting to map in region at the top of the IOVA address space.
Here's one example:
$ ./run.sh -d 0000:6a:01.0 -- ./vfio_dma_mapping_test -r vfio_dma_map_limit_test.iommufd.unmap_range
+ echo "vfio-pci" > /sys/bus/pci/devices/0000:6a:01.0/driver_override
+ echo "0000:6a:01.0" > /sys/bus/pci/drivers/vfio-pci/bind
TAP version 13
1..1
# Starting 1 tests from 1 test cases.
# RUN vfio_dma_map_limit_test.iommufd.unmap_range ...
Driver found: dsa
tools/testing/selftests/vfio/lib/include/vfio_util.h:219: Assertion Failure
Expression: __vfio_pci_dma_map(device, region) == 0
Observed: 0xffffffffffffffea == 0
[errno: 22 - Invalid argument]
# unmap_range: Test failed
# FAIL vfio_dma_map_limit_test.iommufd.unmap_range
not ok 1 vfio_dma_map_limit_test.iommufd.unmap_range
# FAILED: 0 / 1 tests passed.
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
+ echo "0000:6a:01.0" > /sys/bus/pci/drivers/vfio-pci/unbind
+ echo "" > /sys/bus/pci/devices/0000:6a:01.0/driver_override
I am testing at the tip of Linus' tree at commit a1388fcb52fc ("Merge tag
'libcrypto-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux").
On 2025-11-07 12:44 AM, David Matlack wrote:
> On 2025-10-28 09:14 AM, Alex Mastro wrote:
>
> > This series spends the first couple commits making mechanical
> > preparations before the fix lands in the third commit. Selftests are
> > added in the last two commits.
>
> The new unmap_range and unmap_all selftests are failing for me. They all fail
> when attempting to map in region at the top of the IOVA address space.
>
> # RUN vfio_dma_map_limit_test.iommufd.unmap_range ...
> Driver found: dsa
> tools/testing/selftests/vfio/lib/include/vfio_util.h:219: Assertion Failure
>
> Expression: __vfio_pci_dma_map(device, region) == 0
> Observed: 0xffffffffffffffea == 0
> [errno: 22 - Invalid argument]

For type1, I tracked down -EINVAL as coming from
vfio_iommu_iova_dma_valid() returning false.

The system I tested on only supports IOVAs up through
0x00ffffffffffffff.

Do you know what systems supports up to 0xffffffffffffffff? I would like
to try to make sure I am getting test coverage there when running these
tests.

In the meantime, I sent out a fix to skip this test instead of failing:

https://lore.kernel.org/kvm/20251107222058.2009244-1-dmatlack@google.com/
On Fri, Nov 07, 2025 at 10:24:27PM +0000, David Matlack wrote:
> On 2025-11-07 12:44 AM, David Matlack wrote:
> > On 2025-10-28 09:14 AM, Alex Mastro wrote:
> For type1, I tracked down -EINVAL as coming from
> vfio_iommu_iova_dma_valid() returning false.
>
> The system I tested on only supports IOVAs up through
> 0x00ffffffffffffff.
>
> Do you know what systems supports up to 0xffffffffffffffff? I would like
> to try to make sure I am getting test coverage there when running these
> tests.

I observed this on an AMD EPYC 9654 server.

> In the meantime, I sent out a fix to skip this test instead of failing:
>
> https://lore.kernel.org/kvm/20251107222058.2009244-1-dmatlack@google.com/

Thanks for the fix -- acked. My tests were making too strong an assumption
about availability of those ranges.

Alex
On Fri, Nov 7, 2025 at 4:36 PM Alex Mastro <amastro@fb.com> wrote:
>
> On Fri, Nov 07, 2025 at 10:24:27PM +0000, David Matlack wrote:
> > On 2025-11-07 12:44 AM, David Matlack wrote:
> > > On 2025-10-28 09:14 AM, Alex Mastro wrote:
> > For type1, I tracked down -EINVAL as coming from
> > vfio_iommu_iova_dma_valid() returning false.
> >
> > The system I tested on only supports IOVAs up through
> > 0x00ffffffffffffff.
> >
> > Do you know what systems supports up to 0xffffffffffffffff? I would like
> > to try to make sure I am getting test coverage there when running these
> > tests.
>
> I observed this on an AMD EPYC 9654 server.

Thanks. I was able to find a similar server to that that supports iova
0xffffffffffffffff and have added that to my test workflow.
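The availability assumption discussed in this thread can be made explicit before a test tries to map at the limit. The sketch below is illustrative only and is not taken from the linked fix: struct iova_range and top_of_space_available() are hypothetical, and the ranges are assumed to be inclusive [start, last] pairs such as a test might build from the IOVA ranges the IOMMU reports.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive IOVA range, [start, last]. */
struct iova_range {
	uint64_t start;
	uint64_t last;
};

/*
 * Returns true if the final `size` bytes of the 64-bit IOVA space fall
 * entirely inside one of the given ranges. A test could skip, rather
 * than fail, when this returns false.
 */
static bool top_of_space_available(const struct iova_range *ranges,
				   size_t nr_ranges, uint64_t size)
{
	uint64_t first;
	size_t i;

	if (size == 0)
		return false;

	/* First address of the region that ends at UINT64_MAX. */
	first = UINT64_MAX - (size - 1);

	for (i = 0; i < nr_ranges; i++) {
		if (ranges[i].last == UINT64_MAX && ranges[i].start <= first)
			return true;
	}
	return false;
}
```

On a system whose only range ends at 0x00ffffffffffffff, as in the first report above, the check returns false for any size. On a system whose range extends to 0xffffffffffffffff it returns true for a page-sized region.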