[RFC PATCH v3 0/4] qemu: Implement support for EGM

Nathan Chen via Devel posted 4 patches 2 weeks, 2 days ago
git fetch https://github.com/patchew-project/libvirt tags/patchew/20251125191704.644477-1-nathanc@nvidia.com
[RFC PATCH v3 0/4] qemu: Implement support for EGM
Posted by Nathan Chen via Devel 2 weeks, 2 days ago
The Grace SOC introduces Extended GPU Memory (EGM) [3], a feature that enables
GPUs to efficiently access system memory within and across nodes. This patch
series adds support for virtualizing EGM (vEGM) in libvirt, allowing VMs to
utilize dedicated EGM memory regions through ACPI.

This series is a follow-up to the second EGM RFC series [0], posted to gather
feedback from the libvirt community on the overall approach and implementation
details. Kernel EGM driver support and QEMU acpi-egm-memory device support are
not yet upstream, but reference implementations are available [1][2] to enable
testing and validation of the libvirt integration.

Any community feedback is appreciated.

Background and Use Cases
=========================

EGM allows host memory to be partitioned into two regions:
1. Standard memory for Host OS usage
2. EGM region assigned to VMs as their system memory

This technology enables various high-performance computing scenarios [3]:
- Large memory pools for AI/ML workloads
- High-performance computing applications
- Memory extension for systems with limited main memory
- GPU-accelerated workloads requiring large addressable memory

Implementation Overview
=======================

This series adds a new memory device model, VIR_DOMAIN_MEMORY_MODEL_EGM, with
a 'path' source element that denotes the host EGM device backing path and a
'pciDev' target element that denotes the alias of the PCI device the vEGM is
associated with.

For instance, given the XML stanzas below:

    <memory model='egm' access='shared'>
      <source>
        <path>/dev/egm4</path>
      </source>
      <target>
        <size unit='KiB'>8388608</size>
        <node>0</node>
        <pciDev>ua-hostdev0</pciDev>
      </target>
    </memory>
    <memory model='egm' access='shared'>
      <source>
        <path>/dev/egm5</path>
      </source>
      <target>
        <size unit='KiB'>8388608</size>
        <node>1</node>
        <pciDev>ua-hostdev1</pciDev>
      </target>
    </memory>

The corresponding qemu command line will include the following arguments:

-object '{"qom-type":"memory-backend-file","id":"memegm0","mem-path":"/dev/egm4","share":true,"prealloc":true,"size":8589934592}' \
-object acpi-egm-memory,id=egm0,pci-dev=ua-hostdev0,node=0 \
-object '{"qom-type":"memory-backend-file","id":"memegm1","mem-path":"/dev/egm5","share":true,"prealloc":true,"size":8589934592}' \
-object acpi-egm-memory,id=egm1,pci-dev=ua-hostdev1,node=1 \
-numa node,nodeid=0,cpus=0-1,memdev=memegm0 \
-numa node,nodeid=1,cpus=2-3,memdev=memegm1 \
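
To make the mapping above concrete, here is a minimal Python sketch that parses one such stanza and derives the two -object values. It is only an illustration, not the series' implementation (which is C code in src/qemu/qemu_command.c). The function name egm_objects, its index parameter, and the rule that 'share' follows access='shared' are assumptions made for this sketch; the memegmN/egmN ID scheme and prealloc=true are copied from the example output above.

```python
import json
import xml.etree.ElementTree as ET

# Byte multipliers for the units handled by this sketch (assumption:
# only these three; libvirt itself accepts more unit spellings).
UNIT_BYTES = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}

def egm_objects(stanza, index):
    """Return the two -object values for one <memory model='egm'> stanza.

    'index' is a hypothetical guest-side counter used to build the IDs.
    """
    mem = ET.fromstring(stanza.strip())
    size = mem.find("target/size")
    size_bytes = int(size.text) * UNIT_BYTES[size.get("unit", "KiB")]
    backend = {
        "qom-type": "memory-backend-file",
        "id": f"memegm{index}",
        "mem-path": mem.findtext("source/path"),
        "share": mem.get("access") == "shared",  # assumed mapping
        "prealloc": True,                        # as in the example above
        "size": size_bytes,
    }
    egm = (f"acpi-egm-memory,id=egm{index},"
           f"pci-dev={mem.findtext('target/pciDev')},"
           f"node={mem.findtext('target/node')}")
    # Compact separators reproduce the JSON spacing used on the command line.
    return [json.dumps(backend, separators=(",", ":")), egm]

STANZA = """
<memory model='egm' access='shared'>
  <source><path>/dev/egm4</path></source>
  <target>
    <size unit='KiB'>8388608</size>
    <node>0</node>
    <pciDev>ua-hostdev0</pciDev>
  </target>
</memory>
"""

for value in egm_objects(STANZA, 0):
    print("-object", value)
```

The size conversion is the only arithmetic involved: 8388608 KiB * 1024 = 8589934592 bytes (8 GiB), which is the "size" value in the memory-backend-file object.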

For a system where multiple GPUs are associated with a single host socket/NUMA node/EGM chardev,
we consolidate the memory backing into a single memory-backend-file object per host EGM chardev.
For instance, given the XML stanzas below:

    <memory model='egm' access='shared'>
      <source>
        <path>/dev/egm4</path>
      </source>
      <target>
        <size unit='KiB'>8388608</size>
        <node>0</node>
        <pciDev>ua-hostdev0</pciDev>
      </target>
    </memory>
    <memory model='egm' access='shared'>
      <source>
        <path>/dev/egm4</path>
      </source>
      <target>
        <size unit='KiB'>8388608</size>
        <node>0</node>
        <pciDev>ua-hostdev1</pciDev>
      </target>
    </memory>

The corresponding qemu command line will include the following arguments:

-object '{"qom-type":"memory-backend-file","id":"memegm0","mem-path":"/dev/egm4","share":true,"prealloc":true,"size":17179869184}' \
-object acpi-egm-memory,id=egm0,pci-dev=ua-hostdev0,node=0 \
-object acpi-egm-memory,id=egm1,pci-dev=ua-hostdev1,node=0 \
-numa node,nodeid=0,cpus=0-4,memdev=memegm0 \
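
The consolidation described above can be sketched as a group-by on the host chardev path. This is a simplified Python illustration under stated assumptions, not the code in this series: the function name consolidate, the tuple layout of its input, and numbering the backends in order of first appearance are all hypothetical choices that happen to reproduce the IDs in the examples.

```python
def consolidate(devices):
    """Group EGM devices by host chardev path and sum their sizes.

    devices: list of (path, size_bytes, pci_dev, node) tuples in XML order.
    Returns (backends, egms): backends maps each path to the ID and total
    size of its single memory-backend-file object, and egms lists one
    acpi-egm-memory argument per device.
    """
    backends = {}  # insertion-ordered: path -> {"id": ..., "size": ...}
    egms = []
    for idx, (path, size, pci_dev, node) in enumerate(devices):
        if path not in backends:
            # First device seen for this chardev: allocate a new backend.
            backends[path] = {"id": f"memegm{len(backends)}", "size": 0}
        backends[path]["size"] += size
        egms.append(
            f"acpi-egm-memory,id=egm{idx},pci-dev={pci_dev},node={node}")
    return backends, egms

GIB = 1024**3
backends, egms = consolidate([
    ("/dev/egm4", 8 * GIB, "ua-hostdev0", 0),
    ("/dev/egm4", 8 * GIB, "ua-hostdev1", 0),
])
print(backends)
print(egms)
```

With the two stanzas above, both devices share /dev/egm4, so the sketch yields one backend of 2 * 8589934592 = 17179869184 bytes and two acpi-egm-memory objects, matching the command line shown.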

Changes from RFCv2:
- Decouple host EGM chardev name from guest EGM ID
- Consolidate all acpi-egm-memory objects' memory into a single memory-backend-file per EGM chardev specified.

Changes from RFCv1:
- Use existing memory device infrastructure to represent EGM configuration
- Added support for multiple EGM devices

This series is on Github:
https://github.com/NathanChenNVIDIA/libvirt/tree/egm-11-24-25

Thanks,
Nathan

[0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/6RU7R2NQEDEUU7JFPM6DTXJBWUDXTYWE/
[1] https://github.com/ianm-nv/qemu/tree/6.8_ghvirt_egm_may2025
[2] https://github.com/NVIDIA/QEMU/commit/32db1b74fb99c0571724c7e69485e89098c14874
[3] https://developer.nvidia.com/blog/nvidia-grace-hopper-superchip-architecture-in-depth/#extended_gpu_memory

Ian May (1):
  tests: Add qemuxmlconftest for ACPI EGM memory device

Nathan Chen (3):
  conf: Support EGM memory device model
  qemu: Add cgroup, namespace, and seclabel setup for EGM memory device
    model
  qemu: Add qemu CLI support for EGM

 docs/formatdomain.rst                         |  18 +-
 src/conf/domain_conf.c                        |  34 +++-
 src/conf/domain_conf.h                        |   7 +
 src/conf/domain_postparse.c                   |   6 +-
 src/conf/domain_validate.c                    |  15 ++
 src/conf/schemas/domaincommon.rng             |   6 +
 src/qemu/qemu_alias.c                         |   7 +-
 src/qemu/qemu_capabilities.c                  |   2 +
 src/qemu/qemu_capabilities.h                  |   1 +
 src/qemu/qemu_cgroup.c                        |  10 ++
 src/qemu/qemu_command.c                       | 158 ++++++++++++++++--
 src/qemu/qemu_domain.c                        |  15 +-
 src/qemu/qemu_domain_address.c                |   3 +
 src/qemu/qemu_driver.c                        |   1 +
 src/qemu/qemu_hotplug.c                       |   1 +
 src/qemu/qemu_monitor_json.c                  |   1 +
 src/qemu/qemu_namespace.c                     |   3 +
 src/qemu/qemu_postparse.c                     |   1 +
 src/qemu/qemu_process.c                       |   2 +
 src/qemu/qemu_validate.c                      |   6 +
 src/security/apparmor/usr.sbin.libvirtd.in    |   3 +
 src/security/security_apparmor.c              |   2 +
 src/security/security_dac.c                   |   8 +
 src/security/security_selinux.c               |   6 +
 src/security/virt-aa-helper.c                 |   4 +
 src/util/virfile.h                            |   2 +-
 tests/meson.build                             |   1 +
 tests/qemuegmmock.c                           |  67 ++++++++
 .../acpi-egm-memory.aarch64-latest.args       |  47 ++++++
 .../acpi-egm-memory.aarch64-latest.xml        | 124 ++++++++++++++
 tests/qemuxmlconfdata/acpi-egm-memory.xml     | 124 ++++++++++++++
 tests/qemuxmlconftest.c                       |   8 +-
 32 files changed, 672 insertions(+), 21 deletions(-)
 create mode 100644 tests/qemuegmmock.c
 create mode 100644 tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.args
 create mode 100644 tests/qemuxmlconfdata/acpi-egm-memory.aarch64-latest.xml
 create mode 100644 tests/qemuxmlconfdata/acpi-egm-memory.xml

-- 
2.43.0
Re: [RFC PATCH v3 0/4] qemu: Implement support for EGM
Posted by Peter Krempa via Devel 2 weeks, 1 day ago
On Tue, Nov 25, 2025 at 11:17:00 -0800, Nathan Chen via Devel wrote:

[...]

> The corresponding qemu command line will include the following arguments:
> 
> -object '{"qom-type":"memory-backend-file","id":"memegm0","mem-path":"/dev/egm4","share":true,"prealloc":true,"size":17179869184}' \
> -object acpi-egm-memory,id=egm0,pci-dev=ua-hostdev0,node=0 \
> -object acpi-egm-memory,id=egm1,pci-dev=ua-hostdev1,node=0 \
> -numa node,nodeid=0,cpus=0-4,memdev=memegm0 \
> 
> Changes from RFCv2:
> - Decouple host EGM chardev name from guest EGM ID
> - Consolidate all acpi-egm-memory objects' memory into a single memory-backend-file per EGM chardev specified.
> 
> Changes from RFCv1:
> - Use existing memory device infrastructure to represent EGM configuration
> - Added support for multiple EGM devices
> 
> This series is on Github:
> https://github.com/NathanChenNVIDIA/libvirt/tree/egm-11-24-25
> 
> Thanks,
> Nathan
> 
> [0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/6RU7R2NQEDEUU7JFPM6DTXJBWUDXTYWE/
> [1] https://github.com/ianm-nv/qemu/tree/6.8_ghvirt_egm_may2025
> [2] https://github.com/NVIDIA/QEMU/commit/32db1b74fb99c0571724c7e69485e89098c14874

What is the state of this qemu series? I was looking at the aarch64
capability update and didn't see any of the 'acpi-egm-memory' so I went
looking at the qemu mailing list and didn't find any submission adding
the aforementioned object type.

Note that unless the qemu functionality is committed to the upstream
repository we will not be taking any patches using it.
Re: [RFC PATCH v3 0/4] qemu: Implement support for EGM
Posted by Nathan Chen via Devel 1 week, 3 days ago

On 11/27/2025 1:05 AM, Peter Krempa wrote:
> On Tue, Nov 25, 2025 at 11:17:00 -0800, Nathan Chen via Devel wrote:
> 
> [...]
> 
>> The corresponding qemu command line will include the following arguments:
>>
>> -object '{"qom-type":"memory-backend-file","id":"memegm0","mem-path":"/dev/egm4","share":true,"prealloc":true,"size":17179869184}' \
>> -object acpi-egm-memory,id=egm0,pci-dev=ua-hostdev0,node=0 \
>> -object acpi-egm-memory,id=egm1,pci-dev=ua-hostdev1,node=0 \
>> -numa node,nodeid=0,cpus=0-4,memdev=memegm0 \
>>
>> Changes from RFCv2:
>> - Decouple host EGM chardev name from guest EGM ID
>> - Consolidate all acpi-egm-memory objects' memory into a single memory-backend-file per EGM chardev specified.
>>
>> Changes from RFCv1:
>> - Use existing memory device infrastructure to represent EGM configuration
>> - Added support for multiple EGM devices
>>
>> This series is on Github:
>> https://github.com/NathanChenNVIDIA/libvirt/tree/egm-11-24-25
>>
>> Thanks,
>> Nathan
>>
>> [0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/6RU7R2NQEDEUU7JFPM6DTXJBWUDXTYWE/
>> [1] https://github.com/ianm-nv/qemu/tree/6.8_ghvirt_egm_may2025
>> [2] https://github.com/NVIDIA/QEMU/commit/32db1b74fb99c0571724c7e69485e89098c14874
> What is the state of this qemu series? I was looking at the aarch64
> capability update and didn't see any of the 'acpi-egm-memory' so I went
> looking at the qemu mailing list and didn't find any submission adding
> the aforementioned object type.
> 

The associated qemu series has not been submitted for upstream feedback 
yet, and the underlying kernel series had its first RFC [0] submitted 
for upstream review in September of this year.

[0] https://lore.kernel.org/all/20250904040828.319452-1-ankita@nvidia.com/

> Note that unless the qemu functionality is committed to the upstream
> repository we will not be taking any patches using it.

Understood. We are submitting patches early to identify potential
issues with our libvirt implementation while the qemu portion is being
developed.