[RFC PATCH 0/5] cover letter: qemu: Implement support for iommufd and multiple vSMMUs

Nathan Chen via Devel posted 5 patches 3 weeks, 2 days ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20250815025415.2805374-1-nathanc@nvidia.com
Posted by Nathan Chen via Devel 3 weeks, 2 days ago
Hi,

This is a follow-up to the second RFC patchset [0] for supporting multiple
vSMMU instances and using iommufd to propagate DMA mappings to the kernel
for VM-assigned host devices in a qemu VM.

This patchset implements support for specifying multiple <iommu> devices
within the VM definition when the smmuv3Dev IOMMU model is specified, and
is tested with Shameer's latest qemu RFC for HW-accelerated vSMMU devices [1].

Moreover, it adds a new 'iommufdId' element for hostdev devices that
associates each device with an iommufd object.

For instance, the following VM definition specifies the iommufd object and
its associated hostdevs alongside multiple IOMMUs. Each IOMMU is routed to
a pcie-expander-bus controller so that the VFIO device to SMMUv3
associations match those on the host:

  <devices>
...
    <controller type='pci' index='1' model='pcie-expander-bus'>
      <model name='pxb-pcie'/>
      <target busNr='252'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
    </controller>
    <controller type='pci' index='2' model='pcie-expander-bus'>
      <model name='pxb-pcie'/>
      <target busNr='248'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </controller>
...
    <controller type='pci' index='21' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='21' port='0x0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </controller>
    <controller type='pci' index='22' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='22' port='0xa8'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </controller>
...
    <hostdev mode='subsystem' type='pci' managed='no'>
      <source>
        <address domain='0x0009' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <iommufdId>iommufd0</iommufdId>
      <address type='pci' domain='0x0000' bus='0x15' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='no'>
      <source>
        <address domain='0x0019' bus='0x01' slot='0x00' function='0x0'/>
      </source>
      <iommufdId>iommufd0</iommufdId>
      <address type='pci' domain='0x0000' bus='0x16' slot='0x00' function='0x0'/>
    </hostdev>
    <iommu model='smmuv3Dev' parentIdx='1' accel='on'/>
    <iommu model='smmuv3Dev' parentIdx='2' accel='on'/>
  </devices>

This is translated to a qemu command line with the arguments below.
Note that libvirt opens /dev/iommu and each VFIO cdev itself and passes
the associated fd numbers to qemu:

 -device '{"driver":"pxb-pcie","bus_nr":252,"id":"pci.1","bus":"pcie.0","addr":"0x1"}' \
 -device '{"driver":"pxb-pcie","bus_nr":248,"id":"pci.2","bus":"pcie.0","addr":"0x2"}' \
 -device '{"driver":"pcie-root-port","port":0,"chassis":21,"id":"pci.21","bus":"pci.1","addr":"0x0"}' \
 -device '{"driver":"pcie-root-port","port":168,"chassis":22,"id":"pci.22","bus":"pci.2","addr":"0x0"}' \
 -object '{"qom-type":"iommufd","id":"iommufd0","fd":"24"}' \
 -device '{"driver":"arm-smmuv3-accel","primary-bus":"pci.1","id":"smmuv3.0","accel":true}' \
 -device '{"driver":"arm-smmuv3-accel","primary-bus":"pci.2","id":"smmuv3.1","accel":true}' \
 -device '{"driver":"vfio-pci","host":"0009:01:00.0","id":"hostdev0","iommufd":"iommufd0","fd":"22","bus":"pci.21","addr":"0x0"}' \
 -device '{"driver":"vfio-pci","host":"0019:01:00.0","id":"hostdev1","iommufd":"iommufd0","fd":"25","bus":"pci.22","addr":"0x0"}' \
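
As a rough illustration of the mapping above (not the code in this series),
the Python sketch below derives the two SMMUv3 -device arguments from the
<iommu> elements. It assumes that parentIdx N becomes primary-bus "pci.N"
and that the devices are numbered smmuv3.0, smmuv3.1, ... in definition
order, which is inferred from the example output only. The helper name
smmuv3_device_args is hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed copy of the <devices> example: only the <iommu> elements matter here.
DEVICES_XML = """
<devices>
  <iommu model='smmuv3Dev' parentIdx='1' accel='on'/>
  <iommu model='smmuv3Dev' parentIdx='2' accel='on'/>
</devices>
"""

def smmuv3_device_args(devices_xml):
    """Return one compact JSON string per smmuv3Dev <iommu> element.

    Assumption: the driver name and property names are copied from the
    example command line; the real generator may differ.
    """
    root = ET.fromstring(devices_xml)
    args = []
    for iommu in root.findall("iommu"):
        if iommu.get("model") != "smmuv3Dev":
            continue
        props = {
            "driver": "arm-smmuv3-accel",
            # parentIdx refers to the index of the parent PCI controller
            "primary-bus": "pci." + iommu.get("parentIdx"),
            # number the devices in the order they were defined
            "id": "smmuv3.%d" % len(args),
            "accel": iommu.get("accel") == "on",
        }
        args.append(json.dumps(props, separators=(",", ":")))
    return args

for arg in smmuv3_device_args(DEVICES_XML):
    print("-device '%s'" % arg)
```

The point of referencing the controller index instead of a BDF is visible
here: the only input needed per IOMMU is the index of its pxb-pcie parent.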

Summary of changes:
- Separated out commits for smmuv3Dev iommu model support and
  supporting multiple IOMMU definitions
- Made iommufd only a hostdev attribute
- Revised smmuv3Dev iommu model definition to reference the controller
  index instead of assigning it a BDF
- Opened iommufd FDs from the libvirt backend without exposing FDs to XML users
- Fixed iommufd path permissions
- Matched qemu usage of Shameer's latest RFCv3
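
The fd handling mentioned in the list can be pictured with the following
Python sketch. It only shows the general pattern (the parent opens a device
node, lets the child inherit the fd, and puts the fd number into the
-object argument); libvirt itself is written in C and uses its own command
helpers. The function names are hypothetical, and os.devnull stands in for
/dev/iommu so the sketch runs on a host without iommufd.

```python
import json
import os
import subprocess
import sys

def iommufd_object_arg(fd):
    """Build the JSON for -object, with the fd number passed as a string."""
    props = {"qom-type": "iommufd", "id": "iommufd0", "fd": str(fd)}
    return json.dumps(props, separators=(",", ":"))

def launch_with_fd(path, child_argv):
    """Open path in the parent and hand the fd to the child process."""
    fd = os.open(path, os.O_RDWR)
    try:
        arg = iommufd_object_arg(fd)
        # pass_fds keeps this fd open in the child under the same number
        proc = subprocess.run(child_argv + ["-object", arg], pass_fds=(fd,),
                              capture_output=True, text=True, check=True)
        return fd, proc.stdout
    finally:
        os.close(fd)

# Stand-in for qemu: parse the fd number from the argument and check that
# the descriptor is really open in the child (os.fstat fails otherwise).
CHILD = [sys.executable, "-c",
         "import json, os, sys; "
         "fd = int(json.loads(sys.argv[2])['fd']); "
         "os.fstat(fd); print('child sees fd', fd)"]

fd, out = launch_with_fd(os.devnull, CHILD)
print(out.strip())
```

Because the parent owns the open() call, the user-visible XML never has to
carry an fd number, which matches the intent of the change above.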

This series is on GitHub:
https://github.com/NathanChenNVIDIA/libvirt/tree/smmuv3Dev-iommufd-08-12-25

Thanks,
Nathan

[0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/EASBQHPCLPK5G3PF3DEU57G6CI4GSC74/
[1] https://lore.kernel.org/qemu-devel/20250714155941.22176-1-shameerali.kolothum.thodi@huawei.com/

Signed-off-by: Nathan Chen <nathanc@nvidia.com>


Nathan Chen (5):
  qemu: add IOMMU model smmuv3Dev
  conf: Support multiple smmuv3Dev IOMMU devices
  qemu: Implement support for associating iommufd to hostdev
  qemu: open iommufd FDs from libvirt backend
  qemu: Update Cgroup, namespace, and seclabel for qemu to access
    iommufd paths

 docs/formatdomain.rst             |  22 ++-
 src/conf/domain_conf.c            | 208 ++++++++++++++++++++++--
 src/conf/domain_conf.h            |  13 +-
 src/conf/domain_validate.c        |  58 +++++--
 src/conf/schemas/domaincommon.rng |  24 ++-
 src/libvirt_private.syms          |   2 +
 src/qemu/qemu_alias.c             |  15 +-
 src/qemu/qemu_cgroup.c            |  61 +++++++
 src/qemu/qemu_cgroup.h            |   1 +
 src/qemu/qemu_command.c           | 261 ++++++++++++++++++++++--------
 src/qemu/qemu_command.h           |   3 +-
 src/qemu/qemu_domain.c            |   8 +
 src/qemu/qemu_domain.h            |   7 +
 src/qemu/qemu_domain_address.c    |  33 ++--
 src/qemu/qemu_driver.c            |   8 +-
 src/qemu/qemu_hotplug.c           |   2 +-
 src/qemu/qemu_namespace.c         |  44 +++++
 src/qemu/qemu_postparse.c         |  11 +-
 src/qemu/qemu_process.c           | 232 ++++++++++++++++++++++++++
 src/qemu/qemu_validate.c          |  18 ++-
 src/security/security_apparmor.c  |  11 ++
 src/security/security_dac.c       |  23 +++
 src/security/security_selinux.c   |  24 +++
 src/util/virpci.c                 |  68 ++++++++
 src/util/virpci.h                 |   1 +
 25 files changed, 1020 insertions(+), 138 deletions(-)

-- 
2.43.0
Re: [RFC PATCH 0/5] cover letter: qemu: Implement support for iommufd and multiple vSMMUs
Posted by Daniel P. Berrangé via Devel 1 week, 3 days ago
On Thu, Aug 14, 2025 at 07:54:09PM -0700, Nathan Chen via Devel wrote:
> [...]

We could do with some changes to the test suite to provide sample XML
and CLI args for the iommufd XML schema.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [RFC PATCH 0/5] cover letter: qemu: Implement support for iommufd and multiple vSMMUs
Posted by Nathan Chen via Devel 1 week, 2 days ago

On 8/27/2025 7:01 AM, Daniel P. Berrangé wrote:
> We could do with some changes to the test suite to provide sample XML
> and CLI args for the iommufd XML schema.

Yes, I will include some sample XML and CLI args in the next revision. 
We will have to mock the fd numbers generated for the CLI command.
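
One possible way to make the fd numbers deterministic, shown as a Python
sketch rather than as the approach the test suite will actually take:
rewrite every "fd" value in the generated arguments to a stable number in
order of first appearance. The normalize_fds helper and the starting value
of 20 are hypothetical.

```python
import re

# Matches the fd property as it appears in the JSON arguments, e.g. "fd":"24"
FD_RE = re.compile(r'"fd":"(\d+)"')

def normalize_fds(cmdline, first=20):
    """Replace real fd numbers with stable ones, in order of first use."""
    mapping = {}

    def repl(match):
        real = match.group(1)
        if real not in mapping:
            mapping[real] = str(first + len(mapping))
        return '"fd":"%s"' % mapping[real]

    return FD_RE.sub(repl, cmdline)

generated = (
    '-object \'{"qom-type":"iommufd","id":"iommufd0","fd":"24"}\' '
    '-device \'{"driver":"vfio-pci","id":"hostdev0","fd":"22"}\' '
    '-device \'{"driver":"vfio-pci","id":"hostdev1","fd":"25"}\''
)
print(normalize_fds(generated))
```

With this, the expected CLI args in a test file would not depend on which
descriptors happen to be free when the test runs.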

Thanks,
Nathan