[PATCH v3 0/7] qemu: Implement support for iommufd

Nathan Chen via Devel posted 7 patches 2 weeks, 5 days ago
git fetch https://github.com/patchew-project/libvirt tags/patchew/20251219021925.1864433-1-nathanc@nvidia.com
There is a newer version of this series
docs/formatdomain.rst                         |   7 +
po/POTFILES                                   |   1 +
src/bhyve/bhyve_parse_command.c               |   2 +-
src/conf/device_conf.c                        |  11 ++
src/conf/device_conf.h                        |   1 +
src/conf/domain_conf.c                        |  13 +-
src/conf/domain_conf.h                        |   5 +-
src/conf/schemas/basictypes.rng               |   5 +
src/libvirt_private.syms                      |   4 +
src/libxl/xen_common.c                        |   2 +-
src/libxl/xen_xl.c                            |   2 +-
src/lxc/lxc_native.c                          |   2 +-
src/qemu/qemu_cgroup.c                        |  26 ++--
src/qemu/qemu_command.c                       |  74 ++++++++++
src/qemu/qemu_domain.c                        |  41 ++++++
src/qemu/qemu_domain.h                        |  20 +++
src/qemu/qemu_namespace.c                     |  16 ++-
src/qemu/qemu_process.c                       | 126 ++++++++++++++++++
src/security/security_apparmor.c              |  33 ++++-
src/security/security_dac.c                   |  60 +++++++--
src/security/security_selinux.c               |  58 ++++++--
src/security/virt-aa-helper.c                 |  32 ++++-
src/util/meson.build                          |   1 +
src/util/viriommufd.c                         |  89 +++++++++++++
src/util/viriommufd.h                         |  23 ++++
src/util/virpci.c                             |  69 ++++++++++
src/util/virpci.h                             |   2 +
src/vbox/vbox_common.c                        |   2 +-
.../iommufd-q35.x86_64-latest.args            |  41 ++++++
.../iommufd-q35.x86_64-latest.xml             |  60 +++++++++
tests/qemuxmlconfdata/iommufd-q35.xml         |  38 ++++++
.../iommufd-virt.aarch64-latest.args          |  33 +++++
.../iommufd-virt.aarch64-latest.xml           |  34 +++++
tests/qemuxmlconfdata/iommufd-virt.xml        |  22 +++
.../iommufd.x86_64-latest.args                |  35 +++++
.../qemuxmlconfdata/iommufd.x86_64-latest.xml |  38 ++++++
tests/qemuxmlconfdata/iommufd.xml             |  30 +++++
tests/qemuxmlconftest.c                       |  33 +++++
tests/virhostdevtest.c                        |   2 +-
39 files changed, 1031 insertions(+), 62 deletions(-)
create mode 100644 src/util/viriommufd.c
create mode 100644 src/util/viriommufd.h
create mode 100644 tests/qemuxmlconfdata/iommufd-q35.x86_64-latest.args
create mode 100644 tests/qemuxmlconfdata/iommufd-q35.x86_64-latest.xml
create mode 100644 tests/qemuxmlconfdata/iommufd-q35.xml
create mode 100644 tests/qemuxmlconfdata/iommufd-virt.aarch64-latest.args
create mode 100644 tests/qemuxmlconfdata/iommufd-virt.aarch64-latest.xml
create mode 100644 tests/qemuxmlconfdata/iommufd-virt.xml
create mode 100644 tests/qemuxmlconfdata/iommufd.x86_64-latest.args
create mode 100644 tests/qemuxmlconfdata/iommufd.x86_64-latest.xml
create mode 100644 tests/qemuxmlconfdata/iommufd.xml
[PATCH v3 0/7] qemu: Implement support for iommufd
Posted by Nathan Chen via Devel 2 weeks, 5 days ago
From: Nathan Chen <nathanc@nvidia.com>

Hi,

This is a follow-up to the second patch series [0] for using iommufd
to propagate DMA mappings to the kernel for host devices assigned
to a qemu VM.

We add a new 'iommufd' attribute to the hostdev <driver> element, which
associates the device with the iommufd object.

For instance, specifying the iommufd object and associated hostdev in a
VM definition:

  <devices>
...
    <hostdev mode='subsystem' type='pci' managed='no'>
      <driver iommufd='yes'/>
      <source>
        <address domain='0x0009' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x15' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='no'>
      <driver iommufd='yes'/>
      <source>
        <address domain='0x0019' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x16' slot='0x00' function='0x0'/>
    </hostdev>
...
  </devices>
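
As a rough illustration of how a yes/no attribute such as iommufd='yes'
could be mapped to a tristate value, here is a minimal sketch. It is not
the code from this series: the type and function names are hypothetical,
and the real implementation lives in src/conf/.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical tristate for a yes/no attribute; names are illustrative
 * only and are not taken from the series. */
typedef enum {
    SKETCH_TRISTATE_ABSENT = 0, /* attribute not present in the XML */
    SKETCH_TRISTATE_YES,
    SKETCH_TRISTATE_NO,
} sketchTristate;

/* Map the attribute string (NULL when the attribute is absent) to a
 * tristate. Returns 0 on success, -1 on an unrecognised value. */
static int
sketchParseIommufd(const char *value, sketchTristate *out)
{
    assert(out != NULL);

    if (value == NULL) {
        *out = SKETCH_TRISTATE_ABSENT;
        return 0;
    }
    if (strcmp(value, "yes") == 0) {
        *out = SKETCH_TRISTATE_YES;
        return 0;
    }
    if (strcmp(value, "no") == 0) {
        *out = SKETCH_TRISTATE_NO;
        return 0;
    }
    return -1;
}
```

Keeping "absent" distinct from "no" lets a caller tell an unset attribute
apart from an explicit opt-out.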

This is translated to a qemu command line with the arguments below.
Note that libvirt itself opens /dev/iommu and each VFIO cdev, passing the
associated fd numbers to qemu:

 -object '{"qom-type":"iommufd","id":"iommufd0","fd":"24"}' \
 -device '{"driver":"vfio-pci","host":"0009:01:00.0","id":"hostdev0","iommufd":"iommufd0","fd":"22","bus":"pci.21","addr":"0x0"}' \
 -device '{"driver":"vfio-pci","host":"0019:01:00.0","id":"hostdev1","iommufd":"iommufd0","fd":"25","bus":"pci.22","addr":"0x0"}' \
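
To make the shape of these arguments concrete, the sketch below formats
the same two JSON strings from an already-opened fd number. It is an
assumption-laden illustration, not the series' qemu_command.c code: the
function names are hypothetical, and plain snprintf() stands in for
libvirt's JSON helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format the -object JSON for an iommufd backend whose fd was opened by
 * the caller. Returns 0 on success, -1 if the fd is unset (-1) or the
 * buffer is too small. Hypothetical helper, for illustration only. */
static int
sketchFormatIommufdObject(char *buf, size_t buflen, const char *id, int fd)
{
    int n;

    assert(buf != NULL && id != NULL);

    if (fd == -1)
        return -1;

    n = snprintf(buf, buflen,
                 "{\"qom-type\":\"iommufd\",\"id\":\"%s\",\"fd\":\"%d\"}",
                 id, fd);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}

/* Format the -device JSON for a vfio-pci hostdev that refers to the
 * iommufd object by id and carries its own VFIO cdev fd. */
static int
sketchFormatVfioDevice(char *buf, size_t buflen, const char *host,
                       const char *id, const char *iommufdId, int fd,
                       const char *bus, const char *addr)
{
    int n;

    assert(buf != NULL && host != NULL && id != NULL);
    assert(iommufdId != NULL && bus != NULL && addr != NULL);

    if (fd == -1)
        return -1;

    n = snprintf(buf, buflen,
                 "{\"driver\":\"vfio-pci\",\"host\":\"%s\",\"id\":\"%s\","
                 "\"iommufd\":\"%s\",\"fd\":\"%d\",\"bus\":\"%s\","
                 "\"addr\":\"%s\"}",
                 host, id, iommufdId, fd, bus, addr);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

Both hostdevs in the example share the single "iommufd0" object, while
each has its own device fd.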

Changes from v2:
- Set per-process memory accounting mode for iommufd
- Separated out formatting of iommufd object from qemuBuildHostdevCommandLine
- Placed hostdev private data implementation in a separate commit
- Allocate hostdev private data unconditionally
- Compare FDs against -1
- Integrated callback function in virQEMUDriverPrivateDataCallbacks for qemuDomainHostdevPrivateNew
- Dropped qemuProcessCloseVfioFds
- Addressed other feedback from v2 (formatting, includes, etc.)
- Revised seclabel logic to be device-specific for AppArmor and to allow paths for SELinux/DAC

Thanks to Ján Tomko for sharing some of the above changes in a personal repo. I have included
changes directly from that repo and added Suggested-by or Signed-off-by tags on various commits
containing the changes.

This series is on GitHub:
https://github.com/NathanChenNVIDIA/libvirt/tree/iommufd-12-25

Thanks,
Nathan

[0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/KFYUQGMXWV64QPI245H66GKRNAYL7LGB/

Signed-off-by: Nathan Chen <nathanc@nvidia.com>

Nathan Chen (7):
  qemu: Implement support for associating iommufd to hostdev
  qemu: Introduce privateData for hostdevs
  qemu: open VFIO FDs from libvirt backend
  qemu: open iommufd FD from libvirt backend
  qemu: Set per-process memory accounting for iommufd
  qemu: Update Cgroup, namespace, and seclabel for iommufd
  tests: qemuxmlconfdata: provide iommufd sample XML and CLI args
  cover letter: qemu: Implement support for iommufd

-- 
2.43.0

Re: [PATCH v3 0/7] qemu: Implement support for iommufd
Posted by Nathan Chen via Devel 2 weeks, 4 days ago

On 12/18/2025 6:19 PM, Nathan Chen wrote:
> Changes from v2:
> - Set per-process memory accounting mode for iommufd
> - Separated out formatting of iommufd object from qemuBuildHostdevCommandLine
> - Placed hostdev private data implementation in a separate commit
> - Allocate hostdev private data unconditionally
> - Compare FDs against -1
> - Integrated callback function in virQEMUDriverPrivateDataCallbacks for qemuDomainHostdevPrivateNew
> - Dropped qemuProcessCloseVfioFds
> - Addressed other feedback from v2 (formatting, includes, etc.)
> - Revised seclabel logic to be device-specific for AppArmor and to allow paths for SELinux/DAC

I have found a bug with this series where attempting to boot an iommufd 
VM a second time immediately after shutdown fails with an error binding 
device fd to iommufd with "File descriptor in bad state". It does not 
occur with the previous libvirt revisions. I will investigate and 
resolve this before the next refresh.

Nathan

Re: [PATCH v3 0/7] qemu: Implement support for iommufd
Posted by Ján Tomko via Devel 2 weeks, 1 day ago
On a Friday in 2025, Nathan Chen via Devel wrote:
>
>
>On 12/18/2025 6:19 PM, Nathan Chen wrote:
>>Changes from v2:
>>- Set per-process memory accounting mode for iommufd
>>- Separated out formatting of iommufd object from qemuBuildHostdevCommandLine
>>- Placed hostdev private data implementation in a separate commit
>>- Allocate hostdev private data unconditionally
>>- Compare FDs against -1
>>- Integrated callback function in virQEMUDriverPrivateDataCallbacks for qemuDomainHostdevPrivateNew
>>- Dropped qemuProcessCloseVfioFds
>>- Addressed other feedback from v2 (formatting, includes, etc.)
>>- Revised seclabel logic to be device-specific for AppArmor and to allow paths for SELinux/DAC
>
>I have found a bug with this series where attempting to boot an 
>iommufd VM a second time immediately after shutdown fails with an 
>error binding device fd to iommufd with "File descriptor in bad 
>state". It does not occur with the previous libvirt revisions. I will 
>investigate and resolve this before the next refresh.
>

Interesting. Could it be related to closing the file descriptor later,
as I suggested (i.e. in qemuDomainHostdevPrivateDispose)? Maybe it gets
called too late by libvirt, or some cleanup in the kernel takes too
long?

Jano

>Nathan
>