[PATCH v4 0/9] dax/kmem: atomic whole-device hotplug via sysfs

Gregory Price posted 9 patches 2 days, 8 hours ago
The dax kmem driver onlines memory during probe using the system
default policy. There is no atomic control over the state of an entire
region at runtime; the only option is toggling individual memory blocks.

Offlining and removing a whole region therefore races with other
userland controllers, which can interfere between the offline and
remove steps. This was discussed in the LPC2025 device memory
sessions [1].

This series adds a sysfs "hotplug" attribute for atomic whole-device
hotplug control, plus the mm and dax plumbing to support it.

Transitions are atomic across every range of the device. The state
names mirror the per-block memoryX/state ABI with one modification:

  - "unplugged":      memory blocks are not present
  - "online":         online as system RAM, zone chosen by the kernel
  - "online_kernel":  online in ZONE_NORMAL
  - "online_movable": online in ZONE_MOVABLE

"offline" (blocks present but offline) is reportable for backward
compatibility but is not writable, because it invites the race condition
we are trying to solve (offlining all the memory blocks in one atomic
operation and unplugging them in another).
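
The read/write asymmetry above can be modeled in a few lines. This is a
minimal userspace sketch, not the code from the series: the enum, the
table, and the hotplug_parse_write()/hotplug_state_name() helpers are
hypothetical names chosen for illustration. Only the state strings and
the rule that "offline" is reportable but not writable come from the
description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of the attribute's states; not kernel code. */
enum hotplug_state {
	HP_UNPLUGGED,
	HP_OFFLINE,
	HP_ONLINE,
	HP_ONLINE_KERNEL,
	HP_ONLINE_MOVABLE,
	HP_NR_STATES,
};

struct hotplug_state_desc {
	const char *name;
	bool writable;	/* may userspace request this state? */
};

static const struct hotplug_state_desc hotplug_states[HP_NR_STATES] = {
	[HP_UNPLUGGED]      = { "unplugged",      true  },
	[HP_OFFLINE]        = { "offline",        false }, /* report only */
	[HP_ONLINE]         = { "online",         true  },
	[HP_ONLINE_KERNEL]  = { "online_kernel",  true  },
	[HP_ONLINE_MOVABLE] = { "online_movable", true  },
};

/*
 * Map a written string to a state. Only the text up to the first
 * newline is considered. Returns the state index, or -1 if the name
 * is unknown or names a state that cannot be written.
 */
static int hotplug_parse_write(const char *buf)
{
	size_t len = strcspn(buf, "\n");
	int i;

	for (i = 0; i < HP_NR_STATES; i++) {
		const char *name = hotplug_states[i].name;

		if (strlen(name) == len && !strncmp(buf, name, len))
			return hotplug_states[i].writable ? i : -1;
	}
	return -1;
}

/* Every state, including "offline", can be reported on read. */
static const char *hotplug_state_name(enum hotplug_state s)
{
	assert((int)s >= 0 && s < HP_NR_STATES);
	return hotplug_states[s].name;
}
```

In this model, hotplug_parse_write("offline") is rejected, while
hotplug_state_name(HP_OFFLINE) still yields "offline" for a device whose
blocks are present but offline.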

mm preparation:
  1. mm/memory: add memory_block_aligned_range() helper.
  2. mm/memory_hotplug: pass online_type to online_memory_block().
  3. mm/memory_hotplug: export mhp_get_default_online_type().
  4. mm/memory_hotplug: add __add_memory_driver_managed() so a driver can
     select the online policy.  The override is restricted to in-tree
     modules via EXPORT_SYMBOL_FOR_MODULES().
  5. mm/memory_hotplug: add offline_and_remove_memory_ranges() for atomic,
     all-or-nothing offline+remove of several ranges under a single
     lock_device_hotplug().
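
For readers unfamiliar with the alignment step in item 1, here is a
userspace sketch of shrinking a range inward to memory-block boundaries.
It assumes the helper behaves like the existing kmem range computation
(start rounded up, end rounded down); the struct, the function name
block_aligned_range(), and its signature are hypothetical and are not
taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive physical address range: [start, end]. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Shrink 'in' inward so that it covers only whole memory blocks.
 * Assumes block_size is a power of two and that the arithmetic does
 * not overflow. Returns false if no whole block fits inside 'in'.
 */
static bool block_aligned_range(const struct range *in, uint64_t block_size,
				struct range *out)
{
	uint64_t mask, start, end_excl;

	assert(block_size && (block_size & (block_size - 1)) == 0);

	mask = block_size - 1;
	start = (in->start + mask) & ~mask;	/* round start up */
	end_excl = (in->end + 1) & ~mask;	/* round exclusive end down */

	if (start >= end_excl)
		return false;

	out->start = start;
	out->end = end_excl - 1;
	return true;
}
```

With a 128 MiB block size, the range [0x1000, 0x17ffffff] shrinks to
[0x8000000, 0x17ffffff], and a range that does not contain a full block
is rejected.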

dax/kmem feature:
  6. Plumb online_type through the dax device creation path.
  7. Extract hotplug/hotremove into helper functions.
  8. Add the "hotplug" sysfs attribute.
  9. selftests/dax: regression test for the attribute.

To preserve backward compatibility, dax kmem probe still creates the
memory blocks by default, even when the default policy is "offline".

Unplug (atomic offline+remove of the whole device) is the new
capability provided by the attribute.
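
The all-or-nothing behaviour can be sketched as a two-phase operation
with rollback. This is a simplified userspace model, not the
implementation of offline_and_remove_memory_ranges(): the types and the
model_* names are hypothetical, the "busy" flag merely simulates an
offline failure, every range is assumed to start online, and the
locking that the real function does under lock_device_hotplug() is
omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum blk_state { BLK_ONLINE, BLK_OFFLINE, BLK_REMOVED };

/* Hypothetical stand-in for one range of the device. */
struct model_range {
	enum blk_state state;
	bool busy;	/* simulates memory that cannot be offlined */
};

static int model_offline(struct model_range *r)
{
	if (r->busy)
		return -EBUSY;
	r->state = BLK_OFFLINE;
	return 0;
}

/*
 * Phase 1: offline every range. On the first failure, re-online the
 * ranges already offlined (in reverse order) and return the error.
 * Phase 2: only once all ranges are offline, remove them all.
 */
static int model_offline_and_remove(struct model_range *r, size_t n)
{
	size_t i;
	int rc = 0;

	assert(r || n == 0);

	for (i = 0; i < n; i++) {
		rc = model_offline(&r[i]);
		if (rc)
			break;
	}

	if (rc) {
		/* Roll back: r[i] failed, so undo r[i-1] down to r[0]. */
		while (i-- > 0)
			r[i].state = BLK_ONLINE;
		return rc;
	}

	for (i = 0; i < n; i++)
		r[i].state = BLK_REMOVED;
	return 0;
}
```

A caller in this model observes either every range removed or every
range back online, never a mix of the two.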

I downgraded a BUG() to a WARN() when unbind is called while the device
is not unplugged.  The old per-block toggling pattern is still used by
userland tools and disconnects the 'hotplug' value from the real region
state; until per-block control is deprecated or restricted in some way,
WARN() flags that tools should move to the new atomic pattern.

Changes since v3 [2]:
  - Dropped the memory-tier dedup patch - mt_get_memory_type().
  - Added offline_and_remove_memory_ranges() with rollback.
  - Added online_kernel so we mirror memoryX/state ABI.
  - Fixed a backward-compatibility regression: probe now always creates
    memory blocks (offline policy -> present+offline) instead of leaving
    the device unplugged, which broke tools expecting the blocks to be
    present.
  - Restricted the __add_memory_driver_managed() export to the kmem module
    only (was "kmem,cxl_core"); cxl_core can be added when it grows a user.
  - Renamed the alignment helper to memory_block_aligned_range() and
    dropped a dead enum->string helper (reusing online_type_to_str[]).
  - Added an in-tree selftest.

[1] https://lpc.events/event/19/contributions/2016/
[2] https://lore.kernel.org/all/20260321150404.3288786-1-gourry@gourry.net/

Gregory Price (9):
  mm/memory: add memory_block_aligned_range() helper
  mm/memory_hotplug: pass online_type to online_memory_block() via arg
  mm/memory_hotplug: export mhp_get_default_online_type
  mm/memory_hotplug: add __add_memory_driver_managed() with online_type
    arg
  mm/memory_hotplug: add offline_and_remove_memory_ranges()
  dax: plumb hotplug online_type through dax
  dax/kmem: extract hotplug/hotremove helper functions
  dax/kmem: add sysfs interface for atomic whole-device hotplug
  selftests/dax: add dax/kmem hotplug sysfs regression test

 Documentation/ABI/testing/sysfs-bus-dax       |  25 +
 drivers/dax/bus.c                             |   3 +
 drivers/dax/bus.h                             |   2 +
 drivers/dax/cxl.c                             |   1 +
 drivers/dax/dax-private.h                     |   3 +
 drivers/dax/hmem/hmem.c                       |   1 +
 drivers/dax/kmem.c                            | 500 ++++++++++++++----
 drivers/dax/pmem.c                            |   1 +
 include/linux/memory.h                        |  22 +
 include/linux/memory_hotplug.h                |  12 +
 mm/memory_hotplug.c                           | 163 +++++-
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/dax/Makefile          |   6 +
 tools/testing/selftests/dax/config            |   4 +
 .../testing/selftests/dax/dax-kmem-hotplug.sh | 145 +++++
 tools/testing/selftests/dax/settings          |   1 +
 16 files changed, 774 insertions(+), 116 deletions(-)
 create mode 100644 tools/testing/selftests/dax/Makefile
 create mode 100644 tools/testing/selftests/dax/config
 create mode 100755 tools/testing/selftests/dax/dax-kmem-hotplug.sh
 create mode 100644 tools/testing/selftests/dax/settings


base-commit: 7f981ca4cef222e26fc2b4ceb2d2bfe7a6153d3a
-- 
2.54.0