[PATCH v11 0/4] hw/mem: add sp-mem device for Specific Purpose Memory

fanhuang posted 4 patches 2 weeks, 1 day ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20260611100637.2460507-1-FangSheng.Huang@amd.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, "Michael S. Tsirkin" <mst@redhat.com>, Igor Mammedov <imammedo@redhat.com>, Ani Sinha <anisinha@redhat.com>, David Hildenbrand <david@kernel.org>, FangSheng Huang <FangSheng.Huang@amd.com>, "Philippe Mathieu-Daudé" <philmd@mailo.com>, Zhao Liu <zhao1.liu@intel.com>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>
There is a newer version of this series
This series adds a TYPE_MEMORY_DEVICE subclass `sp-mem` for boot-time
SOFT_RESERVED guest memory, following the direction from the v7
thread [1] and the v8 / v9 / v10 reviews [2][3][4].

Background
----------

This series targets coherent CPU + accelerator shared-address-space
systems, where the accelerator's HBM is not a device-private
framebuffer behind a PCIe BAR but a tier of host system memory:
visible to the CPU in the platform physical address space, shared
coherently with the accelerator over the platform fabric, and bound
to a NUMA proximity domain that platform firmware assigns during
boot-time fabric training.

For such a region to function correctly in the guest, two things
must hold simultaneously: the CPU memory subsystem has to see it in
the system memory map (so the CPU side can address it), and it has
to be reserved exclusively for the accelerator's driver (so the
kernel's general allocator does not hand Specific Purpose Memory
(SPM) pages to unrelated workloads). The SOFT_RESERVED memory type
in E820 plus a matching SRAT memory-affinity entry is the mechanism
that delivers both: a firmware-produced topology that the CPU memory
subsystem honors and the accelerator's driver consumes for its own
range.
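
As a rough illustration of the two firmware records involved, the
sketch below fills one E820 entry and one SRAT Memory Affinity
structure for a single SPM range. It is not code from this series.
The struct and function names (e820_entry, srat_mem_affinity,
sp_mem_describe) are hypothetical, the E820 type value is the one
Linux uses for soft-reserved ranges, the SRAT layout follows the
40-byte Memory Affinity Structure from the ACPI specification, and
emitting the entry with only the Enabled flag set is an assumption
based on the device being boot-time only. The packed attribute is a
GCC/Clang extension.

```c
#include <stdint.h>
#include <string.h>

/* E820 type value Linux uses for soft-reserved ranges. */
#define E820_SOFT_RESERVED      0xEFFFFFFFu

/* SRAT Memory Affinity Structure flag bits (ACPI specification). */
#define SRAT_MEM_ENABLED        (1u << 0)
#define SRAT_MEM_HOT_PLUGGABLE  (1u << 1)

/* Hypothetical in-memory form of one E820 map entry. */
struct e820_entry {
    uint64_t address;
    uint64_t length;
    uint32_t type;
} __attribute__((packed));

/* SRAT Memory Affinity Structure: type 1, 40 bytes. */
struct srat_mem_affinity {
    uint8_t  type;
    uint8_t  length;
    uint32_t proximity_domain;
    uint16_t reserved1;
    uint64_t base;
    uint64_t range_length;
    uint32_t reserved2;
    uint32_t flags;
    uint64_t reserved3;
} __attribute__((packed));

_Static_assert(sizeof(struct srat_mem_affinity) == 40,
               "Memory Affinity Structure must be 40 bytes");

/*
 * Describe one SPM range in both tables: the E820 entry keeps the
 * range out of the general allocator, and the SRAT entry ties the
 * same range to its proximity domain.
 */
static void sp_mem_describe(uint64_t base, uint64_t size, uint32_t pxm,
                            struct e820_entry *e820,
                            struct srat_mem_affinity *srat)
{
    e820->address = base;
    e820->length = size;
    e820->type = E820_SOFT_RESERVED;

    memset(srat, 0, sizeof(*srat));
    srat->type = 1;
    srat->length = sizeof(*srat);
    srat->proximity_domain = pxm;
    srat->base = base;
    srat->range_length = size;
    /* Assumption: boot-time only, so Enabled but not Hot Pluggable. */
    srat->flags = SRAT_MEM_ENABLED;
}
```

Both records carry the same base and length, which is what lets the
guest match the reserved range to its NUMA node.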

Approach
--------

The series introduces a new TYPE_MEMORY_DEVICE subclass `sp-mem`.
Each instance binds one host memory backend to a single NUMA
proximity domain and is boot-time only; placement, mapped-state
enforcement, and QMP introspection come from the existing
memory-device framework.

Testing
-------

Verified end-to-end on q35 + KVM, with both SeaBIOS and OVMF, for:

- single sp-mem instance
- two sp-mem instances on different NUMA nodes

Guest observations: /proc/iomem shows one SOFT_RESERVED entry per
sp-mem device, the SRAT parsing messages in dmesg report the
matching memory-affinity entries with the correct PXM, and the
umbrella HOTPLUGGABLE entry covers the remaining hotplug-memory
window without overlapping the sp-mem ranges.
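
The non-overlap property above can be pictured as a simple interval
partition of the device-memory window. The following is a minimal
sketch of that idea, not the code in patch 2: the types sp_range and
srat_range, the function partition_window, and the umbrella_node
parameter are all hypothetical, and the sketch assumes the sp-mem
ranges do not overlap each other and lie inside the window. The
caller provides room for up to 2 * n_sp + 1 output ranges.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One sp-mem range, half-open: [start, end). */
struct sp_range {
    uint64_t start;
    uint64_t end;
    uint32_t node;
};

/* One SRAT range to emit for the device-memory window. */
struct srat_range {
    uint64_t start;
    uint64_t end;
    uint32_t node;
    bool hotpluggable;
};

static int sp_range_cmp(const void *a, const void *b)
{
    const struct sp_range *x = a, *y = b;

    return (x->start > y->start) - (x->start < y->start);
}

/*
 * Split [win_start, win_end) into per-device entries for the sp-mem
 * ranges and HOTPLUGGABLE placeholder entries for every gap around
 * them, including the trailing remainder. Returns the entry count.
 */
static size_t partition_window(uint64_t win_start, uint64_t win_end,
                               struct sp_range *sp, size_t n_sp,
                               uint32_t umbrella_node,
                               struct srat_range *out)
{
    uint64_t region_start = win_start;
    size_t n = 0;

    if (n_sp > 1) {
        qsort(sp, n_sp, sizeof(*sp), sp_range_cmp);
    }
    for (size_t i = 0; i < n_sp; i++) {
        if (sp[i].start > region_start) {
            /* Gap before this device: hotpluggable placeholder. */
            out[n++] = (struct srat_range){ region_start, sp[i].start,
                                            umbrella_node, true };
        }
        out[n++] = (struct srat_range){ sp[i].start, sp[i].end,
                                        sp[i].node, false };
        region_start = sp[i].end;
    }
    if (region_start < win_end) {
        /* Trailing remainder of the window. */
        out[n++] = (struct srat_range){ region_start, win_end,
                                        umbrella_node, true };
    }
    return n;
}
```

With two devices in the middle of the window this yields five
entries: gap, device, gap, device, trailing remainder. Each address
in the window is covered by exactly one entry.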

Changes since v10
-----------------

  - QAPI: rename SpMemDeviceInfo's `memaddr` to `addr` (match
    PCDIMMDeviceInfo).
  - sp_mem_get_addr/set_addr go through object_property_get/set_uint
    instead of dereferencing the struct directly (match pc-dimm).
  - Drop the hostmem conditionals in fill_device_info (hostmem is set
    by the time the device is realized).
  - Use memory_device_get_region_size for get_plugged_size instead of
    a custom helper.
  - Add the sp-mem case to the HMP "info memory-devices" printer
    (hw/core/machine-hmp-cmds.c), which otherwise hit
    g_assert_not_reached() for the new device kind.
  - Rename collect_sp_mem_ranges_cb to sp_mem_collect_ranges_cb;
    rename the local cursor/end to region_start/region_end.
  - Drop the redundant numa_state guard and the duplicate
    device_memory guard inside build_srat_device_memory(); keep a
    single device_memory guard at the build_srat() call site.
  - Add an inline comment on the trailing placeholder range
    (HOTPLUGGABLE remainder of the device_memory window).
  - Move the is_mapped check and host_memory_backend_set_mapped() into
    sp_mem_realize()/sp_mem_unrealize(), matching pc-dimm; the pc plug
    handler keeps only the NUMA node-range check, which needs
    MachineState.

Patch 4 (MAINTAINERS) is unchanged and retains the v10 Acked-by.

Previous versions
-----------------

  v1: https://lore.kernel.org/qemu-devel/20250924103324.2074819-1-FangSheng.Huang@amd.com/
  v2: https://lore.kernel.org/qemu-devel/20251020090701.4036748-1-FangSheng.Huang@amd.com/
  v3: https://lore.kernel.org/qemu-devel/20251208105137.2058928-1-FangSheng.Huang@amd.com/
  v4: https://lore.kernel.org/qemu-devel/20251209093841.2250527-1-FangSheng.Huang@amd.com/
  v5: https://lore.kernel.org/qemu-devel/20260123024312.1601732-1-FangSheng.Huang@amd.com/
  v6: https://lore.kernel.org/qemu-devel/20260226105023.256568-1-FangSheng.Huang@amd.com/
  v7: https://lore.kernel.org/qemu-devel/20260306082735.1106690-1-FangSheng.Huang@amd.com/
  v8: https://lore.kernel.org/qemu-devel/20260527074215.229119-1-FangSheng.Huang@amd.com/
  v9: https://lore.kernel.org/qemu-devel/20260602084447.1100554-1-FangSheng.Huang@amd.com/
  v10: https://lore.kernel.org/qemu-devel/20260605104609.1739911-1-FangSheng.Huang@amd.com/

  [1] v7 thread closeout:
      https://lore.kernel.org/qemu-devel/666a7ba1-5d3a-4732-b872-0d9fb2fe8461@amd.com/
  [2] v8 review:
      https://lore.kernel.org/qemu-devel/20260601105057.2d764e55@imammedo/
  [3] v9 review:
      https://lore.kernel.org/qemu-devel/20260602084447.1100554-1-FangSheng.Huang@amd.com/T/
  [4] v10 review:
      https://lore.kernel.org/qemu-devel/20260605104609.1739911-1-FangSheng.Huang@amd.com/T/

fanhuang (4):
  hw/mem: add sp-mem device for Specific Purpose Memory
  i386/acpi-build: partition device_memory SRAT umbrella for sp-mem
  hw/i386: hook sp-mem into the pc machine plug path
  MAINTAINERS: cover sp-mem under Memory devices, add R: tag

 MAINTAINERS                  |   3 +
 qapi/machine.json            |  43 ++++++++++-
 hw/i386/e820_memory_layout.h |  11 +--
 include/hw/mem/sp-mem.h      |  35 +++++++++
 hw/core/machine-hmp-cmds.c   |  11 +++
 hw/i386/acpi-build.c         |  96 +++++++++++++++++++++++--
 hw/i386/pc.c                 |  36 ++++++++++
 hw/mem/sp-mem.c              | 136 +++++++++++++++++++++++++++++++++++
 hw/i386/Kconfig              |   2 +
 hw/mem/Kconfig               |   4 ++
 hw/mem/meson.build           |   1 +
 11 files changed, 367 insertions(+), 11 deletions(-)
 create mode 100644 include/hw/mem/sp-mem.h
 create mode 100644 hw/mem/sp-mem.c


base-commit: de5d8bfd6105d3dd3ae668df9762df244a6d1506
--
2.34.1