[PATCH 8/9] cxl/core: Add dax_kmem_region and sysram_region drivers

Gregory Price posted 9 patches 1 week, 4 days ago
[PATCH 8/9] cxl/core: Add dax_kmem_region and sysram_region drivers
Posted by Gregory Price 1 week, 4 days ago
In the current kmem driver binding process, the only way for users
to define hotplug policy is via a build-time option, or by not
onlining memory by default and setting each individual memory block
online after hotplug occurs.  We can solve this with a configuration
step between region-probe and dax-probe.

Add the infrastructure for a two-stage driver binding for kmem-mode
dax regions. The cxl_dax_kmem_region driver probes cxl_sysram_region
devices and creates cxl_dax_region with dax_driver=kmem.

This creates an interposition step where users can configure policy.

Device hierarchy:
  region0 -> sysram_region0 -> dax_region0 -> dax0.0

The sysram_region device exposes a sysfs 'online_type' attribute
that allows users to configure the memory online type before the
underlying dax_region is created and memory is hotplugged.

  sysram_region0/online_type:
      invalid:        not configured, blocks probe
      offline:        memory will not be onlined automatically
      online:         memory will be onlined in ZONE_NORMAL
      online_movable: memory will be onlined in ZONE_MMOVABLE

The device initializes with online_type=invalid which prevents the
cxl_dax_kmem_region driver from binding until the user explicitly
configures a valid online_type.

This enables a two-step binding process:
  echo region0 > cxl_sysram_region/bind
  echo online_movable > sysram_region0/online_type
  echo sysram_region0 > cxl_dax_kmem_region/bind

Signed-off-by: Gregory Price <gourry@gourry.net>
---
 Documentation/ABI/testing/sysfs-bus-cxl |  21 +++
 drivers/cxl/core/Makefile               |   1 +
 drivers/cxl/core/core.h                 |   6 +
 drivers/cxl/core/dax_region.c           |  50 +++++++
 drivers/cxl/core/port.c                 |   2 +
 drivers/cxl/core/region.c               |  14 ++
 drivers/cxl/core/sysram_region.c        | 180 ++++++++++++++++++++++++
 drivers/cxl/cxl.h                       |  25 ++++
 8 files changed, 299 insertions(+)
 create mode 100644 drivers/cxl/core/sysram_region.c

diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
index c80a1b5a03db..a051cb86bdfc 100644
--- a/Documentation/ABI/testing/sysfs-bus-cxl
+++ b/Documentation/ABI/testing/sysfs-bus-cxl
@@ -624,3 +624,24 @@ Description:
 		The count is persistent across power loss and wraps back to 0
 		upon overflow. If this file is not present, the device does not
 		have the necessary support for dirty tracking.
+
+
+What:		/sys/bus/cxl/devices/sysram_regionZ/online_type
+Date:		January, 2026
+KernelVersion:	v7.1
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RW) This attribute allows users to configure the memory online
+		type before the underlying dax_region engages in hotplug.
+
+		Valid values:
+		  'invalid': Not configured (default). Blocks probe.
+		  'offline': Memory will not be onlined automatically.
+		  'online' : Memory will be onlined in ZONE_NORMAL.
+		  'online_movable': Memory will be onlined in ZONE_MOVABLE.
+
+		The device initializes with online_type='invalid' which prevents
+		the cxl_dax_kmem_region driver from binding until the user
+		explicitly configures a valid online_type. This enables a
+		two-step binding process that gives users control over memory
+		hotplug policy before memory is added to the system.
diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
index 36f284d7c500..faf662c7d88b 100644
--- a/drivers/cxl/core/Makefile
+++ b/drivers/cxl/core/Makefile
@@ -18,6 +18,7 @@ cxl_core-y += ras.o
 cxl_core-$(CONFIG_TRACING) += trace.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o
 cxl_core-$(CONFIG_CXL_REGION) += dax_region.o
+cxl_core-$(CONFIG_CXL_REGION) += sysram_region.o
 cxl_core-$(CONFIG_CXL_REGION) += pmem_region.o
 cxl_core-$(CONFIG_CXL_MCE) += mce.o
 cxl_core-$(CONFIG_CXL_FEATURES) += features.o
diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
index ea4df8abc2ad..04b32015e9b1 100644
--- a/drivers/cxl/core/core.h
+++ b/drivers/cxl/core/core.h
@@ -26,6 +26,7 @@ extern struct device_attribute dev_attr_delete_region;
 extern struct device_attribute dev_attr_region;
 extern const struct device_type cxl_pmem_region_type;
 extern const struct device_type cxl_dax_region_type;
+extern const struct device_type cxl_sysram_region_type;
 extern const struct device_type cxl_region_type;
 
 int cxl_decoder_detach(struct cxl_region *cxlr,
@@ -37,6 +38,7 @@ int cxl_decoder_detach(struct cxl_region *cxlr,
 #define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
 #define CXL_PMEM_REGION_TYPE(x) (&cxl_pmem_region_type)
 #define CXL_DAX_REGION_TYPE(x) (&cxl_dax_region_type)
+#define CXL_SYSRAM_REGION_TYPE(x) (&cxl_sysram_region_type)
 int cxl_region_init(void);
 void cxl_region_exit(void);
 int cxl_get_poison_by_endpoint(struct cxl_port *port);
@@ -44,9 +46,12 @@ struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
 u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
 		   u64 dpa);
 int devm_cxl_add_dax_region(struct cxl_region *cxlr, enum dax_driver_type);
+int devm_cxl_add_sysram_region(struct cxl_region *cxlr);
 int devm_cxl_add_pmem_region(struct cxl_region *cxlr);
 
 extern struct cxl_driver cxl_devdax_region_driver;
+extern struct cxl_driver cxl_dax_kmem_region_driver;
+extern struct cxl_driver cxl_sysram_region_driver;
 
 #else
 static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
@@ -81,6 +86,7 @@ static inline void cxl_region_exit(void)
 #define SET_CXL_REGION_ATTR(x)
 #define CXL_PMEM_REGION_TYPE(x) NULL
 #define CXL_DAX_REGION_TYPE(x) NULL
+#define CXL_SYSRAM_REGION_TYPE(x) NULL
 #endif
 
 struct cxl_send_command;
diff --git a/drivers/cxl/core/dax_region.c b/drivers/cxl/core/dax_region.c
index 391d51e5ec37..a379f5b85e3d 100644
--- a/drivers/cxl/core/dax_region.c
+++ b/drivers/cxl/core/dax_region.c
@@ -127,3 +127,53 @@ struct cxl_driver cxl_devdax_region_driver = {
 	.probe = cxl_devdax_region_driver_probe,
 	.id = CXL_DEVICE_REGION,
 };
+
+static int cxl_dax_kmem_region_driver_probe(struct device *dev)
+{
+	struct cxl_sysram_region *cxlr_sysram = to_cxl_sysram_region(dev);
+	struct cxl_dax_region *cxlr_dax;
+	struct cxl_region *cxlr;
+	int rc;
+
+	if (!cxlr_sysram)
+		return -ENODEV;
+
+	/* Require explicit online_type configuration before binding */
+	if (cxlr_sysram->online_type == -1)
+		return -ENODEV;
+
+	cxlr = cxlr_sysram->cxlr;
+
+	cxlr_dax = cxl_dax_region_alloc(cxlr);
+	if (IS_ERR(cxlr_dax))
+		return PTR_ERR(cxlr_dax);
+
+	/* Inherit online_type from parent sysram_region */
+	cxlr_dax->online_type = cxlr_sysram->online_type;
+	cxlr_dax->dax_driver = DAXDRV_KMEM_TYPE;
+
+	/* Parent is the sysram_region device */
+	cxlr_dax->dev.parent = dev;
+
+	rc = dev_set_name(&cxlr_dax->dev, "dax_region%d", cxlr->id);
+	if (rc)
+		goto err;
+
+	rc = device_add(&cxlr_dax->dev);
+	if (rc)
+		goto err;
+
+	dev_dbg(dev, "%s: register %s\n", dev_name(dev),
+		dev_name(&cxlr_dax->dev));
+
+	return devm_add_action_or_reset(dev, cxlr_dax_unregister, cxlr_dax);
+err:
+	put_device(&cxlr_dax->dev);
+	return rc;
+}
+
+struct cxl_driver cxl_dax_kmem_region_driver = {
+	.name = "cxl_dax_kmem_region",
+	.probe = cxl_dax_kmem_region_driver_probe,
+	.id = CXL_DEVICE_SYSRAM_REGION,
+};
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 3310dbfae9d6..dc7262a5efd6 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -66,6 +66,8 @@ static int cxl_device_id(const struct device *dev)
 		return CXL_DEVICE_PMEM_REGION;
 	if (dev->type == CXL_DAX_REGION_TYPE())
 		return CXL_DEVICE_DAX_REGION;
+	if (dev->type == CXL_SYSRAM_REGION_TYPE())
+		return CXL_DEVICE_SYSRAM_REGION;
 	if (is_cxl_port(dev)) {
 		if (is_cxl_root(to_cxl_port(dev)))
 			return CXL_DEVICE_ROOT;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 6200ca1cc2dd..8bef91dc726c 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3734,8 +3734,20 @@ int cxl_region_init(void)
 	if (rc)
 		goto err_dax;
 
+	rc = cxl_driver_register(&cxl_sysram_region_driver);
+	if (rc)
+		goto err_sysram;
+
+	rc = cxl_driver_register(&cxl_dax_kmem_region_driver);
+	if (rc)
+		goto err_dax_kmem;
+
 	return 0;
 
+err_dax_kmem:
+	cxl_driver_unregister(&cxl_sysram_region_driver);
+err_sysram:
+	cxl_driver_unregister(&cxl_devdax_region_driver);
 err_dax:
 	cxl_driver_unregister(&cxl_region_driver);
 	return rc;
@@ -3743,6 +3755,8 @@ int cxl_region_init(void)
 
 void cxl_region_exit(void)
 {
+	cxl_driver_unregister(&cxl_dax_kmem_region_driver);
+	cxl_driver_unregister(&cxl_sysram_region_driver);
 	cxl_driver_unregister(&cxl_devdax_region_driver);
 	cxl_driver_unregister(&cxl_region_driver);
 }
diff --git a/drivers/cxl/core/sysram_region.c b/drivers/cxl/core/sysram_region.c
new file mode 100644
index 000000000000..5665db238d0f
--- /dev/null
+++ b/drivers/cxl/core/sysram_region.c
@@ -0,0 +1,180 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2026 Meta Platforms, Inc. All rights reserved. */
+/*
+ * CXL Sysram Region - Intermediate device for kmem hotplug configuration
+ *
+ * This provides an intermediate device between cxl_region and cxl_dax_region
+ * that allows users to configure memory hotplug parameters (like online_type)
+ * before the underlying dax_region is created and memory is hotplugged.
+ */
+
+#include <linux/memory_hotplug.h>
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <cxlmem.h>
+#include <cxl.h>
+#include "core.h"
+
+static void cxl_sysram_region_release(struct device *dev)
+{
+	struct cxl_sysram_region *cxlr_sysram = to_cxl_sysram_region(dev);
+
+	kfree(cxlr_sysram);
+}
+
+static ssize_t online_type_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct cxl_sysram_region *cxlr_sysram = to_cxl_sysram_region(dev);
+
+	switch (cxlr_sysram->online_type) {
+	case MMOP_OFFLINE:
+		return sysfs_emit(buf, "offline\n");
+	case MMOP_ONLINE:
+		return sysfs_emit(buf, "online\n");
+	case MMOP_ONLINE_MOVABLE:
+		return sysfs_emit(buf, "online_movable\n");
+	default:
+		return sysfs_emit(buf, "invalid\n");
+	}
+}
+
+static ssize_t online_type_store(struct device *dev,
+				 struct device_attribute *attr,
+				 const char *buf, size_t len)
+{
+	struct cxl_sysram_region *cxlr_sysram = to_cxl_sysram_region(dev);
+
+	if (sysfs_streq(buf, "offline"))
+		cxlr_sysram->online_type = MMOP_OFFLINE;
+	else if (sysfs_streq(buf, "online"))
+		cxlr_sysram->online_type = MMOP_ONLINE;
+	else if (sysfs_streq(buf, "online_movable"))
+		cxlr_sysram->online_type = MMOP_ONLINE_MOVABLE;
+	else
+		return -EINVAL;
+
+	return len;
+}
+
+static DEVICE_ATTR_RW(online_type);
+
+static struct attribute *cxl_sysram_region_attrs[] = {
+	&dev_attr_online_type.attr,
+	NULL,
+};
+
+static const struct attribute_group cxl_sysram_region_attribute_group = {
+	.attrs = cxl_sysram_region_attrs,
+};
+
+static const struct attribute_group *cxl_sysram_region_attribute_groups[] = {
+	&cxl_base_attribute_group,
+	&cxl_sysram_region_attribute_group,
+	NULL,
+};
+
+const struct device_type cxl_sysram_region_type = {
+	.name = "cxl_sysram_region",
+	.release = cxl_sysram_region_release,
+	.groups = cxl_sysram_region_attribute_groups,
+};
+
+static bool is_cxl_sysram_region(struct device *dev)
+{
+	return dev->type == &cxl_sysram_region_type;
+}
+
+struct cxl_sysram_region *to_cxl_sysram_region(struct device *dev)
+{
+	if (dev_WARN_ONCE(dev, !is_cxl_sysram_region(dev),
+			  "not a cxl_sysram_region device\n"))
+		return NULL;
+	return container_of(dev, struct cxl_sysram_region, dev);
+}
+EXPORT_SYMBOL_NS_GPL(to_cxl_sysram_region, "CXL");
+
+static struct lock_class_key cxl_sysram_region_key;
+
+static struct cxl_sysram_region *cxl_sysram_region_alloc(struct cxl_region *cxlr)
+{
+	struct cxl_region_params *p = &cxlr->params;
+	struct cxl_sysram_region *cxlr_sysram;
+	struct device *dev;
+
+	guard(rwsem_read)(&cxl_rwsem.region);
+	if (p->state != CXL_CONFIG_COMMIT)
+		return ERR_PTR(-ENXIO);
+
+	cxlr_sysram = kzalloc(sizeof(*cxlr_sysram), GFP_KERNEL);
+	if (!cxlr_sysram)
+		return ERR_PTR(-ENOMEM);
+
+	cxlr_sysram->hpa_range.start = p->res->start;
+	cxlr_sysram->hpa_range.end = p->res->end;
+	cxlr_sysram->online_type = -1;  /* Require explicit configuration */
+
+	dev = &cxlr_sysram->dev;
+	cxlr_sysram->cxlr = cxlr;
+	device_initialize(dev);
+	lockdep_set_class(&dev->mutex, &cxl_sysram_region_key);
+	device_set_pm_not_required(dev);
+	dev->parent = &cxlr->dev;
+	dev->bus = &cxl_bus_type;
+	dev->type = &cxl_sysram_region_type;
+
+	return cxlr_sysram;
+}
+
+static void cxlr_sysram_unregister(void *_cxlr_sysram)
+{
+	struct cxl_sysram_region *cxlr_sysram = _cxlr_sysram;
+
+	device_unregister(&cxlr_sysram->dev);
+}
+
+int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
+{
+	struct cxl_sysram_region *cxlr_sysram;
+	struct device *dev;
+	int rc;
+
+	cxlr_sysram = cxl_sysram_region_alloc(cxlr);
+	if (IS_ERR(cxlr_sysram))
+		return PTR_ERR(cxlr_sysram);
+
+	dev = &cxlr_sysram->dev;
+	rc = dev_set_name(dev, "sysram_region%d", cxlr->id);
+	if (rc)
+		goto err;
+
+	rc = device_add(dev);
+	if (rc)
+		goto err;
+
+	dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
+		dev_name(dev));
+
+	return devm_add_action_or_reset(&cxlr->dev, cxlr_sysram_unregister,
+					cxlr_sysram);
+err:
+	put_device(dev);
+	return rc;
+}
+
+static int cxl_sysram_region_driver_probe(struct device *dev)
+{
+	struct cxl_region *cxlr = to_cxl_region(dev);
+
+	/* Only handle RAM regions */
+	if (cxlr->mode != CXL_PARTMODE_RAM)
+		return -ENODEV;
+
+	return devm_cxl_add_sysram_region(cxlr);
+}
+
+struct cxl_driver cxl_sysram_region_driver = {
+	.name = "cxl_sysram_region",
+	.probe = cxl_sysram_region_driver_probe,
+	.id = CXL_DEVICE_REGION,
+};
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 674d5f870c70..1544c27e9c89 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -596,6 +596,25 @@ struct cxl_dax_region {
 	enum dax_driver_type dax_driver;
 };
 
+/**
+ * struct cxl_sysram_region - CXL RAM region for system memory hotplug
+ * @dev: device for this sysram_region
+ * @cxlr: parent cxl_region
+ * @hpa_range: Host physical address range for the region
+ * @online_type: Memory online type (MMOP_* 0-3, or -1 if not configured)
+ *
+ * Intermediate device that allows configuration of memory hotplug
+ * parameters before the underlying dax_region is created. The device
+ * starts with online_type=-1 which prevents the cxl_dax_kmem_region
+ * driver from binding until the user explicitly sets online_type.
+ */
+struct cxl_sysram_region {
+	struct device dev;
+	struct cxl_region *cxlr;
+	struct range hpa_range;
+	int online_type;
+};
+
 /**
  * struct cxl_port - logical collection of upstream port devices and
  *		     downstream port devices to construct a CXL memory
@@ -890,6 +909,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
 #define CXL_DEVICE_PMEM_REGION		7
 #define CXL_DEVICE_DAX_REGION		8
 #define CXL_DEVICE_PMU			9
+#define CXL_DEVICE_SYSRAM_REGION	10
 
 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
 #define CXL_MODALIAS_FMT "cxl:t%d"
@@ -907,6 +927,7 @@ bool is_cxl_pmem_region(struct device *dev);
 struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
 int cxl_add_to_region(struct cxl_endpoint_decoder *cxled);
 struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
+struct cxl_sysram_region *to_cxl_sysram_region(struct device *dev);
 u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
 #else
 static inline bool is_cxl_pmem_region(struct device *dev)
@@ -925,6 +946,10 @@ static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
 {
 	return NULL;
 }
+static inline struct cxl_sysram_region *to_cxl_sysram_region(struct device *dev)
+{
+	return NULL;
+}
 static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
 					       u64 spa)
 {
-- 
2.52.0
Re: [PATCH 8/9] cxl/core: Add dax_kmem_region and sysram_region drivers
Posted by Jonathan Cameron 1 week ago
On Thu, 29 Jan 2026 16:04:41 -0500
Gregory Price <gourry@gourry.net> wrote:

> In the current kmem driver binding process, the only way for users
> to define hotplug policy is via a build-time option, or by not
> onlining memory by default and setting each individual memory block
> online after hotplug occurs.  We can solve this with a configuration
> step between region-probe and dax-probe.
> 
> Add the infrastructure for a two-stage driver binding for kmem-mode
> dax regions. The cxl_dax_kmem_region driver probes cxl_sysram_region
> devices and creates cxl_dax_region with dax_driver=kmem.
> 
> This creates an interposition step where users can configure policy.
> 
> Device hierarchy:
>   region0 -> sysram_region0 -> dax_region0 -> dax0.0
> 
> The sysram_region device exposes a sysfs 'online_type' attribute
> that allows users to configure the memory online type before the
> underlying dax_region is created and memory is hotplugged.
> 
>   sysram_region0/online_type:
>       invalid:        not configured, blocks probe
>       offline:        memory will not be onlined automatically
>       online:         memory will be onlined in ZONE_NORMAL
>       online_movable: memory will be onlined in ZONE_MMOVABLE

ZONE_MOVABLE

> 
> The device initializes with online_type=invalid which prevents the
> cxl_dax_kmem_region driver from binding until the user explicitly
> configures a valid online_type.
> 
> This enables a two-step binding process:
>   echo region0 > cxl_sysram_region/bind
>   echo online_movable > sysram_region0/online_type
>   echo sysram_region0 > cxl_dax_kmem_region/bind
> 
> Signed-off-by: Gregory Price <gourry@gourry.net>
Trivial stuff. Will mull over this series as a whole...
My first instinctive reaction is positive - I'm just wondering
where additional drivers fit into this and whether it has the
right degree of flexibility.



> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6200ca1cc2dd..8bef91dc726c 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3734,8 +3734,20 @@ int cxl_region_init(void)
>  	if (rc)
>  		goto err_dax;
>  
> +	rc = cxl_driver_register(&cxl_sysram_region_driver);

This smells like a loop over an array of drivers is becoming sensible.

> +	if (rc)
> +		goto err_sysram;
> +
> +	rc = cxl_driver_register(&cxl_dax_kmem_region_driver);
> +	if (rc)
> +		goto err_dax_kmem;
> +
>  	return 0;
>  
> +err_dax_kmem:
> +	cxl_driver_unregister(&cxl_sysram_region_driver);
> +err_sysram:
> +	cxl_driver_unregister(&cxl_devdax_region_driver);
>  err_dax:
>  	cxl_driver_unregister(&cxl_region_driver);
>  	return rc;
> @@ -3743,6 +3755,8 @@ int cxl_region_init(void)
>  
>  void cxl_region_exit(void)
>  {
> +	cxl_driver_unregister(&cxl_dax_kmem_region_driver);
> +	cxl_driver_unregister(&cxl_sysram_region_driver);
>  	cxl_driver_unregister(&cxl_devdax_region_driver);
>  	cxl_driver_unregister(&cxl_region_driver);
>  }
> diff --git a/drivers/cxl/core/sysram_region.c b/drivers/cxl/core/sysram_region.c
> new file mode 100644
> index 000000000000..5665db238d0f
> --- /dev/null
> +++ b/drivers/cxl/core/sysram_region.c
> @@ -0,0 +1,180 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2026 Meta Platforms, Inc. All rights reserved. */
> +/*
> + * CXL Sysram Region - Intermediate device for kmem hotplug configuration
> + *
> + * This provides an intermediate device between cxl_region and cxl_dax_region
> + * that allows users to configure memory hotplug parameters (like online_type)
> + * before the underlying dax_region is created and memory is hotplugged.
> + */
> +
> +#include <linux/memory_hotplug.h>
> +#include <linux/device.h>
> +#include <linux/slab.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +#include "core.h"

> +
> +static DEVICE_ATTR_RW(online_type);
> +
> +static struct attribute *cxl_sysram_region_attrs[] = {
> +	&dev_attr_online_type.attr,
> +	NULL,

As below.

> +};
> +
> +static const struct attribute_group cxl_sysram_region_attribute_group = {
> +	.attrs = cxl_sysram_region_attrs,
> +};
> +
> +static const struct attribute_group *cxl_sysram_region_attribute_groups[] = {
> +	&cxl_base_attribute_group,
> +	&cxl_sysram_region_attribute_group,
> +	NULL,

Trivial, but don't want a comma on that NULL.

> +};

> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 674d5f870c70..1544c27e9c89 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -596,6 +596,25 @@ struct cxl_dax_region {
>  	enum dax_driver_type dax_driver;
>  };
>  
> +/**
> + * struct cxl_sysram_region - CXL RAM region for system memory hotplug
> + * @dev: device for this sysram_region
> + * @cxlr: parent cxl_region
> + * @hpa_range: Host physical address range for the region
> + * @online_type: Memory online type (MMOP_* 0-3, or -1 if not configured)

Ah. An there's our reason for an int.   Can we just add a MMOP enum value
for not configured yet and so let us use it as an enum?
Or have a separate bool for that and ignore the online_type until it's set.


> + *
> + * Intermediate device that allows configuration of memory hotplug
> + * parameters before the underlying dax_region is created. The device
> + * starts with online_type=-1 which prevents the cxl_dax_kmem_region
> + * driver from binding until the user explicitly sets online_type.
> + */
> +struct cxl_sysram_region {
> +	struct device dev;
> +	struct cxl_region *cxlr;
> +	struct range hpa_range;
> +	int online_type;
> +};
Re: [PATCH 8/9] cxl/core: Add dax_kmem_region and sysram_region drivers
Posted by Gregory Price 1 week ago
On Mon, Feb 02, 2026 at 06:20:15PM +0000, Jonathan Cameron wrote:
> >  
> > +/**
> > + * struct cxl_sysram_region - CXL RAM region for system memory hotplug
> > + * @dev: device for this sysram_region
> > + * @cxlr: parent cxl_region
> > + * @hpa_range: Host physical address range for the region
> > + * @online_type: Memory online type (MMOP_* 0-3, or -1 if not configured)
> 
> Ah. An there's our reason for an int.   Can we just add a MMOP enum value
> for not configured yet and so let us use it as an enum?
> Or have a separate bool for that and ignore the online_type until it's set.
> 

I think the latter is more reasonably, MMOP_UNCONFIGURED doesn't much
make sense for memory_hotplug.c

ack.

~Gregory
Re: [PATCH 8/9] cxl/core: Add dax_kmem_region and sysram_region drivers
Posted by Cheatham, Benjamin 1 week, 3 days ago
On 1/29/2026 3:04 PM, Gregory Price wrote:
> In the current kmem driver binding process, the only way for users
> to define hotplug policy is via a build-time option, or by not
> onlining memory by default and setting each individual memory block
> online after hotplug occurs.  We can solve this with a configuration
> step between region-probe and dax-probe.
> 
> Add the infrastructure for a two-stage driver binding for kmem-mode
> dax regions. The cxl_dax_kmem_region driver probes cxl_sysram_region
> devices and creates cxl_dax_region with dax_driver=kmem.
> 
> This creates an interposition step where users can configure policy.
> 
> Device hierarchy:
>   region0 -> sysram_region0 -> dax_region0 -> dax0.0

This technically comes up in the devdax_region driver patch first, but I noticed it here
so this is where I'm putting it:

I like the idea here, but the implementation is all off. Firstly, devm_cxl_add_sysram_region()
is never called outside of sysram_region_driver::probe(), so I'm not sure how they ever get
added to the system (same with devdax regions).

Second, there's this weird pattern of adding sub-region (sysram, devdax, etc.) devices being added
inside of the sub-region driver probe. I would expect the devices are added then the probe function
is called. What I think should be going on here (and correct me if I'm wrong) is:
	1. a cxl_region device is added to the system
	2. cxl_region::probe() is called on said device (one in cxl/core/region.c)
	3. Said probe function figures out the device is a dax_region or whatever else and creates that type of region device
	(i.e. cxl_region::probe() -> device_add(&cxl_sysram_device))
	4. if the device's dax driver type is DAXDRV_DEVICE_TYPE it gets sent to the daxdev_region driver
	5a. if the device's dax driver type is DAXDRV_KMEM_TYPE it gets sent to the sysram_region driver which holds it until
	the online_type is set
	5b. Once the online_type is set, the device is forwarded to the dax_kmem_region driver? Not sure on this part

What seems to be happening is that the cxl_region is added, all of these region drivers try
to bind to it since they all use the same device id (CXL_DEVICE_REGION) and the correct one is
figured out by magic? I'm somewhat confused at this point :/.

> 
> The sysram_region device exposes a sysfs 'online_type' attribute
> that allows users to configure the memory online type before the
> underlying dax_region is created and memory is hotplugged.
> 
>   sysram_region0/online_type:
>       invalid:        not configured, blocks probe
>       offline:        memory will not be onlined automatically
>       online:         memory will be onlined in ZONE_NORMAL
>       online_movable: memory will be onlined in ZONE_MMOVABLE
> 
> The device initializes with online_type=invalid which prevents the
> cxl_dax_kmem_region driver from binding until the user explicitly
> configures a valid online_type.
> 
> This enables a two-step binding process:
>   echo region0 > cxl_sysram_region/bind
>   echo online_movable > sysram_region0/online_type
>   echo sysram_region0 > cxl_dax_kmem_region/bind
> 
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
>  Documentation/ABI/testing/sysfs-bus-cxl |  21 +++
>  drivers/cxl/core/Makefile               |   1 +
>  drivers/cxl/core/core.h                 |   6 +
>  drivers/cxl/core/dax_region.c           |  50 +++++++
>  drivers/cxl/core/port.c                 |   2 +
>  drivers/cxl/core/region.c               |  14 ++
>  drivers/cxl/core/sysram_region.c        | 180 ++++++++++++++++++++++++
>  drivers/cxl/cxl.h                       |  25 ++++
>  8 files changed, 299 insertions(+)
>  create mode 100644 drivers/cxl/core/sysram_region.c
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl
> index c80a1b5a03db..a051cb86bdfc 100644
> --- a/Documentation/ABI/testing/sysfs-bus-cxl
> +++ b/Documentation/ABI/testing/sysfs-bus-cxl
> @@ -624,3 +624,24 @@ Description:
>  		The count is persistent across power loss and wraps back to 0
>  		upon overflow. If this file is not present, the device does not
>  		have the necessary support for dirty tracking.
> +
> +
> +What:		/sys/bus/cxl/devices/sysram_regionZ/online_type
> +Date:		January, 2026
> +KernelVersion:	v7.1
> +Contact:	linux-cxl@vger.kernel.org
> +Description:
> +		(RW) This attribute allows users to configure the memory online
> +		type before the underlying dax_region engages in hotplug.
> +
> +		Valid values:
> +		  'invalid': Not configured (default). Blocks probe.

This should be removed from the valid values section since it's not a valid value
to write to the attribute. The mention of the default in the paragraph below should
be enough.

> +		  'offline': Memory will not be onlined automatically.
> +		  'online' : Memory will be onlined in ZONE_NORMAL.
> +		  'online_movable': Memory will be onlined in ZONE_MOVABLE.
> +
> +		The device initializes with online_type='invalid' which prevents
> +		the cxl_dax_kmem_region driver from binding until the user
> +		explicitly configures a valid online_type. This enables a
> +		two-step binding process that gives users control over memory
> +		hotplug policy before memory is added to the system.
> diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile
> index 36f284d7c500..faf662c7d88b 100644
> --- a/drivers/cxl/core/Makefile
> +++ b/drivers/cxl/core/Makefile
> @@ -18,6 +18,7 @@ cxl_core-y += ras.o
>  cxl_core-$(CONFIG_TRACING) += trace.o
>  cxl_core-$(CONFIG_CXL_REGION) += region.o
>  cxl_core-$(CONFIG_CXL_REGION) += dax_region.o
> +cxl_core-$(CONFIG_CXL_REGION) += sysram_region.o
>  cxl_core-$(CONFIG_CXL_REGION) += pmem_region.o
>  cxl_core-$(CONFIG_CXL_MCE) += mce.o
>  cxl_core-$(CONFIG_CXL_FEATURES) += features.o
> diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h
> index ea4df8abc2ad..04b32015e9b1 100644
> --- a/drivers/cxl/core/core.h
> +++ b/drivers/cxl/core/core.h
> @@ -26,6 +26,7 @@ extern struct device_attribute dev_attr_delete_region;
>  extern struct device_attribute dev_attr_region;
>  extern const struct device_type cxl_pmem_region_type;
>  extern const struct device_type cxl_dax_region_type;
> +extern const struct device_type cxl_sysram_region_type;
>  extern const struct device_type cxl_region_type;
>  
>  int cxl_decoder_detach(struct cxl_region *cxlr,
> @@ -37,6 +38,7 @@ int cxl_decoder_detach(struct cxl_region *cxlr,
>  #define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
>  #define CXL_PMEM_REGION_TYPE(x) (&cxl_pmem_region_type)
>  #define CXL_DAX_REGION_TYPE(x) (&cxl_dax_region_type)
> +#define CXL_SYSRAM_REGION_TYPE(x) (&cxl_sysram_region_type)
>  int cxl_region_init(void);
>  void cxl_region_exit(void);
>  int cxl_get_poison_by_endpoint(struct cxl_port *port);
> @@ -44,9 +46,12 @@ struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
>  u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
>  		   u64 dpa);
>  int devm_cxl_add_dax_region(struct cxl_region *cxlr, enum dax_driver_type);
> +int devm_cxl_add_sysram_region(struct cxl_region *cxlr);
>  int devm_cxl_add_pmem_region(struct cxl_region *cxlr);
>  
>  extern struct cxl_driver cxl_devdax_region_driver;
> +extern struct cxl_driver cxl_dax_kmem_region_driver;
> +extern struct cxl_driver cxl_sysram_region_driver;
>  
>  #else
>  static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
> @@ -81,6 +86,7 @@ static inline void cxl_region_exit(void)
>  #define SET_CXL_REGION_ATTR(x)
>  #define CXL_PMEM_REGION_TYPE(x) NULL
>  #define CXL_DAX_REGION_TYPE(x) NULL
> +#define CXL_SYSRAM_REGION_TYPE(x) NULL
>  #endif
>  
>  struct cxl_send_command;
> diff --git a/drivers/cxl/core/dax_region.c b/drivers/cxl/core/dax_region.c
> index 391d51e5ec37..a379f5b85e3d 100644
> --- a/drivers/cxl/core/dax_region.c
> +++ b/drivers/cxl/core/dax_region.c
> @@ -127,3 +127,53 @@ struct cxl_driver cxl_devdax_region_driver = {
>  	.probe = cxl_devdax_region_driver_probe,
>  	.id = CXL_DEVICE_REGION,
>  };
> +
> +static int cxl_dax_kmem_region_driver_probe(struct device *dev)
> +{
> +	struct cxl_sysram_region *cxlr_sysram = to_cxl_sysram_region(dev);
> +	struct cxl_dax_region *cxlr_dax;
> +	struct cxl_region *cxlr;
> +	int rc;
> +
> +	if (!cxlr_sysram)
> +		return -ENODEV;
> +
> +	/* Require explicit online_type configuration before binding */
> +	if (cxlr_sysram->online_type == -1)
> +		return -ENODEV;
> +
> +	cxlr = cxlr_sysram->cxlr;
> +
> +	cxlr_dax = cxl_dax_region_alloc(cxlr);
> +	if (IS_ERR(cxlr_dax))
> +		return PTR_ERR(cxlr_dax);

You can use cleanup.h here to remove the goto's (I think). Following should work:

#DEFINE_FREE(cxlr_dax_region_put, struct cxl_dax_region *, if (!IS_ERR_OR_NULL(_T)) put_device(&cxlr_dax->dev))
static int cxl_dax_kmem_region_driver_probe(struct device *dev)
{
	...

	struct cxl_dax_region *cxlr_dax __free(cxlr_dax_region_put) = cxl_dax_region_alloc(cxlr);
	if (IS_ERR(cxlr_dax))
		return PTR_ERR(cxlr_dax);

	...

	rc = dev_set_name(&cxlr_dax->dev, "dax_region%d", cxlr->id);
	if (rc)
		return rc;

	rc = device_add(&cxlr_dax->dev);
	if (rc)
		return rc;

	dev_dbg(dev, "%s: register %s\n", dev_name(dev), dev_name(&cxlr_dax->dev));

	return devm_add_action_or_reset(dev, cxlr_dax_unregister, no_free_ptr(cxlr_dax));
}
> +
> +	/* Inherit online_type from parent sysram_region */
> +	cxlr_dax->online_type = cxlr_sysram->online_type;
> +	cxlr_dax->dax_driver = DAXDRV_KMEM_TYPE;
> +
> +	/* Parent is the sysram_region device */
> +	cxlr_dax->dev.parent = dev;
> +
> +	rc = dev_set_name(&cxlr_dax->dev, "dax_region%d", cxlr->id);
> +	if (rc)
> +		goto err;
> +
> +	rc = device_add(&cxlr_dax->dev);
> +	if (rc)
> +		goto err;
> +
> +	dev_dbg(dev, "%s: register %s\n", dev_name(dev),
> +		dev_name(&cxlr_dax->dev));
> +
> +	return devm_add_action_or_reset(dev, cxlr_dax_unregister, cxlr_dax);
> +err:
> +	put_device(&cxlr_dax->dev);
> +	return rc;
> +}
> +
> +struct cxl_driver cxl_dax_kmem_region_driver = {
> +	.name = "cxl_dax_kmem_region",
> +	.probe = cxl_dax_kmem_region_driver_probe,
> +	.id = CXL_DEVICE_SYSRAM_REGION,
> +};
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 3310dbfae9d6..dc7262a5efd6 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -66,6 +66,8 @@ static int cxl_device_id(const struct device *dev)
>  		return CXL_DEVICE_PMEM_REGION;
>  	if (dev->type == CXL_DAX_REGION_TYPE())
>  		return CXL_DEVICE_DAX_REGION;
> +	if (dev->type == CXL_SYSRAM_REGION_TYPE())
> +		return CXL_DEVICE_SYSRAM_REGION;
>  	if (is_cxl_port(dev)) {
>  		if (is_cxl_root(to_cxl_port(dev)))
>  			return CXL_DEVICE_ROOT;
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6200ca1cc2dd..8bef91dc726c 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -3734,8 +3734,20 @@ int cxl_region_init(void)
>  	if (rc)
>  		goto err_dax;
>  
> +	rc = cxl_driver_register(&cxl_sysram_region_driver);
> +	if (rc)
> +		goto err_sysram;
> +
> +	rc = cxl_driver_register(&cxl_dax_kmem_region_driver);
> +	if (rc)
> +		goto err_dax_kmem;
> +
>  	return 0;
>  
> +err_dax_kmem:
> +	cxl_driver_unregister(&cxl_sysram_region_driver);
> +err_sysram:
> +	cxl_driver_unregister(&cxl_devdax_region_driver);
>  err_dax:
>  	cxl_driver_unregister(&cxl_region_driver);
>  	return rc;
> @@ -3743,6 +3755,8 @@ int cxl_region_init(void)
>  
>  void cxl_region_exit(void)
>  {
> +	cxl_driver_unregister(&cxl_dax_kmem_region_driver);
> +	cxl_driver_unregister(&cxl_sysram_region_driver);
>  	cxl_driver_unregister(&cxl_devdax_region_driver);
>  	cxl_driver_unregister(&cxl_region_driver);
>  }
> diff --git a/drivers/cxl/core/sysram_region.c b/drivers/cxl/core/sysram_region.c
> new file mode 100644
> index 000000000000..5665db238d0f
> --- /dev/null
> +++ b/drivers/cxl/core/sysram_region.c
> @@ -0,0 +1,180 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright(c) 2026 Meta Platforms, Inc. All rights reserved. */
> +/*
> + * CXL Sysram Region - Intermediate device for kmem hotplug configuration
> + *
> + * This provides an intermediate device between cxl_region and cxl_dax_region
> + * that allows users to configure memory hotplug parameters (like online_type)
> + * before the underlying dax_region is created and memory is hotplugged.
> + */
> +
> +#include <linux/memory_hotplug.h>
> +#include <linux/device.h>
> +#include <linux/slab.h>
> +#include <cxlmem.h>
> +#include <cxl.h>
> +#include "core.h"
> +
> +static void cxl_sysram_region_release(struct device *dev)
> +{
> +	struct cxl_sysram_region *cxlr_sysram = to_cxl_sysram_region(dev);
> +
> +	kfree(cxlr_sysram);
> +}
> +
> +static ssize_t online_type_show(struct device *dev,
> +				struct device_attribute *attr, char *buf)
> +{
> +	struct cxl_sysram_region *cxlr_sysram = to_cxl_sysram_region(dev);
> +
> +	switch (cxlr_sysram->online_type) {
> +	case MMOP_OFFLINE:
> +		return sysfs_emit(buf, "offline\n");
> +	case MMOP_ONLINE:
> +		return sysfs_emit(buf, "online\n");
> +	case MMOP_ONLINE_MOVABLE:
> +		return sysfs_emit(buf, "online_movable\n");
> +	default:
> +		return sysfs_emit(buf, "invalid\n");
> +	}
> +}
> +
> +static ssize_t online_type_store(struct device *dev,
> +				 struct device_attribute *attr,
> +				 const char *buf, size_t len)
> +{
> +	struct cxl_sysram_region *cxlr_sysram = to_cxl_sysram_region(dev);
> +
> +	if (sysfs_streq(buf, "offline"))
> +		cxlr_sysram->online_type = MMOP_OFFLINE;
> +	else if (sysfs_streq(buf, "online"))
> +		cxlr_sysram->online_type = MMOP_ONLINE;
> +	else if (sysfs_streq(buf, "online_movable"))
> +		cxlr_sysram->online_type = MMOP_ONLINE_MOVABLE;
> +	else
> +		return -EINVAL;
> +
> +	return len;
> +}
> +
> +static DEVICE_ATTR_RW(online_type);
> +
> +static struct attribute *cxl_sysram_region_attrs[] = {
> +	&dev_attr_online_type.attr,
> +	NULL,
> +};
> +
> +static const struct attribute_group cxl_sysram_region_attribute_group = {
> +	.attrs = cxl_sysram_region_attrs,
> +};
> +
> +static const struct attribute_group *cxl_sysram_region_attribute_groups[] = {
> +	&cxl_base_attribute_group,
> +	&cxl_sysram_region_attribute_group,
> +	NULL,
> +};
> +
> +const struct device_type cxl_sysram_region_type = {
> +	.name = "cxl_sysram_region",
> +	.release = cxl_sysram_region_release,
> +	.groups = cxl_sysram_region_attribute_groups,
> +};
> +
> +static bool is_cxl_sysram_region(struct device *dev)
> +{
> +	return dev->type == &cxl_sysram_region_type;
> +}
> +
> +struct cxl_sysram_region *to_cxl_sysram_region(struct device *dev)
> +{
> +	if (dev_WARN_ONCE(dev, !is_cxl_sysram_region(dev),
> +			  "not a cxl_sysram_region device\n"))
> +		return NULL;
> +	return container_of(dev, struct cxl_sysram_region, dev);
> +}
> +EXPORT_SYMBOL_NS_GPL(to_cxl_sysram_region, "CXL");
> +
> +static struct lock_class_key cxl_sysram_region_key;
> +
> +static struct cxl_sysram_region *cxl_sysram_region_alloc(struct cxl_region *cxlr)
> +{
> +	struct cxl_region_params *p = &cxlr->params;
> +	struct cxl_sysram_region *cxlr_sysram;
> +	struct device *dev;
> +
> +	guard(rwsem_read)(&cxl_rwsem.region);
> +	if (p->state != CXL_CONFIG_COMMIT)
> +		return ERR_PTR(-ENXIO);
> +
> +	cxlr_sysram = kzalloc(sizeof(*cxlr_sysram), GFP_KERNEL);
> +	if (!cxlr_sysram)
> +		return ERR_PTR(-ENOMEM);
> +
> +	cxlr_sysram->hpa_range.start = p->res->start;
> +	cxlr_sysram->hpa_range.end = p->res->end;
> +	cxlr_sysram->online_type = -1;  /* Require explicit configuration */
> +
> +	dev = &cxlr_sysram->dev;
> +	cxlr_sysram->cxlr = cxlr;
> +	device_initialize(dev);
> +	lockdep_set_class(&dev->mutex, &cxl_sysram_region_key);
> +	device_set_pm_not_required(dev);
> +	dev->parent = &cxlr->dev;
> +	dev->bus = &cxl_bus_type;
> +	dev->type = &cxl_sysram_region_type;
> +
> +	return cxlr_sysram;
> +}
> +
> +static void cxlr_sysram_unregister(void *_cxlr_sysram)
> +{
> +	struct cxl_sysram_region *cxlr_sysram = _cxlr_sysram;
> +
> +	device_unregister(&cxlr_sysram->dev);
> +}
> +
> +int devm_cxl_add_sysram_region(struct cxl_region *cxlr)
> +{
> +	struct cxl_sysram_region *cxlr_sysram;
> +	struct device *dev;
> +	int rc;
> +
> +	cxlr_sysram = cxl_sysram_region_alloc(cxlr);
> +	if (IS_ERR(cxlr_sysram))
> +		return PTR_ERR(cxlr_sysram);

Same thing as above

Thanks,
Ben

> +
> +	dev = &cxlr_sysram->dev;
> +	rc = dev_set_name(dev, "sysram_region%d", cxlr->id);
> +	if (rc)
> +		goto err;
> +
> +	rc = device_add(dev);
> +	if (rc)
> +		goto err;
> +
> +	dev_dbg(&cxlr->dev, "%s: register %s\n", dev_name(dev->parent),
> +		dev_name(dev));
> +
> +	return devm_add_action_or_reset(&cxlr->dev, cxlr_sysram_unregister,
> +					cxlr_sysram);
> +err:
> +	put_device(dev);
> +	return rc;
> +}
> +
> +static int cxl_sysram_region_driver_probe(struct device *dev)
> +{
> +	struct cxl_region *cxlr = to_cxl_region(dev);
> +
> +	/* Only handle RAM regions */
> +	if (cxlr->mode != CXL_PARTMODE_RAM)
> +		return -ENODEV;
> +
> +	return devm_cxl_add_sysram_region(cxlr);
> +}
> +
> +struct cxl_driver cxl_sysram_region_driver = {
> +	.name = "cxl_sysram_region",
> +	.probe = cxl_sysram_region_driver_probe,
> +	.id = CXL_DEVICE_REGION,
> +};
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 674d5f870c70..1544c27e9c89 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -596,6 +596,25 @@ struct cxl_dax_region {
>  	enum dax_driver_type dax_driver;
>  };
>  
> +/**
> + * struct cxl_sysram_region - CXL RAM region for system memory hotplug
> + * @dev: device for this sysram_region
> + * @cxlr: parent cxl_region
> + * @hpa_range: Host physical address range for the region
> + * @online_type: Memory online type (MMOP_* 0-3, or -1 if not configured)
> + *
> + * Intermediate device that allows configuration of memory hotplug
> + * parameters before the underlying dax_region is created. The device
> + * starts with online_type=-1 which prevents the cxl_dax_kmem_region
> + * driver from binding until the user explicitly sets online_type.
> + */
> +struct cxl_sysram_region {
> +	struct device dev;
> +	struct cxl_region *cxlr;
> +	struct range hpa_range;
> +	int online_type;
> +};
> +
>  /**
>   * struct cxl_port - logical collection of upstream port devices and
>   *		     downstream port devices to construct a CXL memory
> @@ -890,6 +909,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv);
>  #define CXL_DEVICE_PMEM_REGION		7
>  #define CXL_DEVICE_DAX_REGION		8
>  #define CXL_DEVICE_PMU			9
> +#define CXL_DEVICE_SYSRAM_REGION	10
>  
>  #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*")
>  #define CXL_MODALIAS_FMT "cxl:t%d"
> @@ -907,6 +927,7 @@ bool is_cxl_pmem_region(struct device *dev);
>  struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
>  int cxl_add_to_region(struct cxl_endpoint_decoder *cxled);
>  struct cxl_dax_region *to_cxl_dax_region(struct device *dev);
> +struct cxl_sysram_region *to_cxl_sysram_region(struct device *dev);
>  u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint, u64 spa);
>  #else
>  static inline bool is_cxl_pmem_region(struct device *dev)
> @@ -925,6 +946,10 @@ static inline struct cxl_dax_region *to_cxl_dax_region(struct device *dev)
>  {
>  	return NULL;
>  }
> +static inline struct cxl_sysram_region *to_cxl_sysram_region(struct device *dev)
> +{
> +	return NULL;
> +}
>  static inline u64 cxl_port_get_spa_cache_alias(struct cxl_port *endpoint,
>  					       u64 spa)
>  {
Re: [PATCH 8/9] cxl/core: Add dax_kmem_region and sysram_region drivers
Posted by Gregory Price 1 week, 3 days ago
On Fri, Jan 30, 2026 at 03:27:12PM -0600, Cheatham, Benjamin wrote:
> On 1/29/2026 3:04 PM, Gregory Price wrote:
> > In the current kmem driver binding process, the only way for users
> > to define hotplug policy is via a build-time option, or by not
> > onlining memory by default and setting each individual memory block
> > online after hotplug occurs.  We can solve this with a configuration
> > step between region-probe and dax-probe.
> > 
> > Add the infrastructure for a two-stage driver binding for kmem-mode
> > dax regions. The cxl_dax_kmem_region driver probes cxl_sysram_region
> > devices and creates cxl_dax_region with dax_driver=kmem.
> > 
> > This creates an interposition step where users can configure policy.
> > 
> > Device hierarchy:
> >   region0 -> sysram_region0 -> dax_region0 -> dax0.0
> 
> This technically comes up in the devdax_region driver patch first, but I noticed it here
> so this is where I'm putting it:
> 
> I like the idea here, but the implementation is all off. Firstly, devm_cxl_add_sysram_region()
> is never called outside of sysram_region_driver::probe(), so I'm not sure how they ever get
> added to the system (same with devdax regions).
> 
> Second, there's this weird pattern of adding sub-region (sysram, devdax, etc.) devices being added
> inside of the sub-region driver probe. I would expect the devices are added then the probe function
> is called. 

I originally tried doing with region0/region_driver, but that design
pattern is also confusing - and it creates differently bad patterns.

    echo region0 > decoder0.0/create_ram_region   -> creates region0

    # Current pattern
    echo region > driver/region/probe  /* auto-region behavior */

    # region_driver attribute pattern
    echo "sysram" > region0/region_driver
    echo region0 > driver/region/probe   /* uses sysram region driver */

https://lore.kernel.org/linux-cxl/20260113202138.3021093-1-gourry@gourry.net/

Ira pointed out that this design makes the "implicit" design of the
driver worse.  The user doesn't actually know what driver is being used
under the hood - it just knows something is being used.

This at least makes it explicit which driver is being used - and splits
the uses-case logic up into discrete drivers (dax users don't have to
worry about sysram users breaking their stuff).

If it makes more sense, you could swap the ordering of the names

    echo region0 > region/bind
    echo region0 > region_sysram/bind
    echo region0 > region_daxdev/bind
    echo region0 > region_dax_kmem/bind
    echo region0 > region_pony/bind

--- 

The  underlying issue is that region::probe() is trying to be a
god-function for every possible use case, and hiding the use case
behind an attribute vs a driver is not good.

(also the default behavior for region::probe() in an otherwise
 unconfigured region is required for backwards compatibility)

> What I think should be going on here (and correct me if I'm wrong) is:
> 	1. a cxl_region device is added to the system
> 	2. cxl_region::probe() is called on said device (one in cxl/core/region.c)
> 	3. Said probe function figures out the device is a dax_region or whatever else and creates that type of region device
> 	(i.e. cxl_region::probe() -> device_add(&cxl_sysram_device))
> 	4. if the device's dax driver type is DAXDRV_DEVICE_TYPE it gets sent to the daxdev_region driver
> 	5a. if the device's dax driver type is DAXDRV_KMEM_TYPE it gets sent to the sysram_region driver which holds it until
> 	the online_type is set
> 	5b. Once the online_type is set, the device is forwarded to the dax_kmem_region driver? Not sure on this part
> 
> What seems to be happening is that the cxl_region is added, all of these region drivers try
> to bind to it since they all use the same device id (CXL_DEVICE_REGION) and the correct one is
> figured out by magic? I'm somewhat confused at this point :/.
> 

For auto-regions:
   region_probe() eats it and you get the default behavior.

For non-auto regions:
   create_x_region generates an un-configured region and fails to probe
   until the user commits it and probes it.

auto-regions are evil and should be discouraged.

~Gregory
Re: [PATCH 8/9] cxl/core: Add dax_kmem_region and sysram_region drivers
Posted by Cheatham, Benjamin 1 week ago
On 1/30/2026 4:12 PM, Gregory Price wrote:
> On Fri, Jan 30, 2026 at 03:27:12PM -0600, Cheatham, Benjamin wrote:
>> On 1/29/2026 3:04 PM, Gregory Price wrote:
>>> In the current kmem driver binding process, the only way for users
>>> to define hotplug policy is via a build-time option, or by not
>>> onlining memory by default and setting each individual memory block
>>> online after hotplug occurs.  We can solve this with a configuration
>>> step between region-probe and dax-probe.
>>>
>>> Add the infrastructure for a two-stage driver binding for kmem-mode
>>> dax regions. The cxl_dax_kmem_region driver probes cxl_sysram_region
>>> devices and creates cxl_dax_region with dax_driver=kmem.
>>>
>>> This creates an interposition step where users can configure policy.
>>>
>>> Device hierarchy:
>>>   region0 -> sysram_region0 -> dax_region0 -> dax0.0
>>
>> This technically comes up in the devdax_region driver patch first, but I noticed it here
>> so this is where I'm putting it:
>>
>> I like the idea here, but the implementation is all off. Firstly, devm_cxl_add_sysram_region()
>> is never called outside of sysram_region_driver::probe(), so I'm not sure how they ever get
>> added to the system (same with devdax regions).
>>
>> Second, there's this weird pattern of adding sub-region (sysram, devdax, etc.) devices being added
>> inside of the sub-region driver probe. I would expect the devices are added then the probe function
>> is called. 
> 
> I originally tried doing with region0/region_driver, but that design
> pattern is also confusing - and it creates differently bad patterns.
> 
>     echo region0 > decoder0.0/create_ram_region   -> creates region0
> 
>     # Current pattern
>     echo region > driver/region/probe  /* auto-region behavior */
> 
>     # region_driver attribute pattern
>     echo "sysram" > region0/region_driver
>     echo region0 > driver/region/probe   /* uses sysram region driver */
> 
> https://lore.kernel.org/linux-cxl/20260113202138.3021093-1-gourry@gourry.net/
> 
> Ira pointed out that this design makes the "implicit" design of the
> driver worse.  The user doesn't actually know what driver is being used
> under the hood - it just knows something is being used.
> 
> This at least makes it explicit which driver is being used - and splits
> the uses-case logic up into discrete drivers (dax users don't have to
> worry about sysram users breaking their stuff).
> 
> If it makes more sense, you could swap the ordering of the names
> 
>     echo region0 > region/bind
>     echo region0 > region_sysram/bind
>     echo region0 > region_daxdev/bind
>     echo region0 > region_dax_kmem/bind
>     echo region0 > region_pony/bind
> 
> --- 
> 
> The  underlying issue is that region::probe() is trying to be a
> god-function for every possible use case, and hiding the use case
> behind an attribute vs a driver is not good.
> 
> (also the default behavior for region::probe() in an otherwise
>  unconfigured region is required for backwards compatibility)

Ok, that makes sense. I think I just got lost in the sauce while looking at this last
week and this explanation helped a lot.> 
>> What I think should be going on here (and correct me if I'm wrong) is:
>> 	1. a cxl_region device is added to the system
>> 	2. cxl_region::probe() is called on said device (one in cxl/core/region.c)
>> 	3. Said probe function figures out the device is a dax_region or whatever else and creates that type of region device
>> 	(i.e. cxl_region::probe() -> device_add(&cxl_sysram_device))
>> 	4. if the device's dax driver type is DAXDRV_DEVICE_TYPE it gets sent to the daxdev_region driver
>> 	5a. if the device's dax driver type is DAXDRV_KMEM_TYPE it gets sent to the sysram_region driver which holds it until
>> 	the online_type is set
>> 	5b. Once the online_type is set, the device is forwarded to the dax_kmem_region driver? Not sure on this part
>>
>> What seems to be happening is that the cxl_region is added, all of these region drivers try
>> to bind to it since they all use the same device id (CXL_DEVICE_REGION) and the correct one is
>> figured out by magic? I'm somewhat confused at this point :/.
>>
> 
> For auto-regions:
>    region_probe() eats it and you get the default behavior.
> 
> For non-auto regions:
>    create_x_region generates an un-configured region and fails to probe
>    until the user commits it and probes it.

I think this was the source of my misunderstanding. I was trying to understand how it
works for auto regions when it's never meant to apply to them.

Sorry if this is a stupid question, but what stops auto regions from binding to the
sysram/dax region drivers? They all bind to region devices, so I assume there's something
keeping them from binding before the core region driver gets a chance.

Thanks,
Ben
> 
> auto-regions are evil and should be discouraged.
> 
> ~Gregory
Re: [PATCH 8/9] cxl/core: Add dax_kmem_region and sysram_region drivers
Posted by Gregory Price 1 week ago
On Mon, Feb 02, 2026 at 11:02:37AM -0600, Cheatham, Benjamin wrote:
> Sorry if this is a stupid question, but what stops auto regions from binding to the
> sysram/dax region drivers? They all bind to region devices, so I assume there's something
> keeping them from binding before the core region driver gets a chance.
> 

I just grok'd the implications of this question.

I *think* the answer is "probe order" - which is bad.

I need to think about this a bit more and get a more definitive answer.

~Gregory
Re: [PATCH 8/9] cxl/core: Add dax_kmem_region and sysram_region drivers
Posted by Gregory Price 1 week ago
On Mon, Feb 02, 2026 at 11:02:37AM -0600, Cheatham, Benjamin wrote:
> > 
> > For auto-regions:
> >    region_probe() eats it and you get the default behavior.
> > 
> > For non-auto regions:
> >    create_x_region generates an un-configured region and fails to probe
> >    until the user commits it and probes it.
> 
> I think this was the source of my misunderstanding. I was trying to understand how it
> works for auto regions when it's never meant to apply to them.
> 
> Sorry if this is a stupid question, but what stops auto regions from binding to the
> sysram/dax region drivers? They all bind to region devices, so I assume there's something
> keeping them from binding before the core region driver gets a chance.
> 

Auto regions explicitly use the dax_kmem path (all existing code,
unchanged)- which auto-plugs into dax/hotplug.

I do get what you're saying that everything binds on a region type,
I will look a little closer at this and see if there's something more
reasonable we can do.

I think i can update `region/bind` to use the sysram driver with
   online_type=mhp_default_online_type

so you'd end up with effective the auto-region logic:

cxlcli create-region -m ram ... existing argument set
------
    echo region0 > create_ram_region
    /* program decoders */
    echo region0 > region/bind
    /* 
     * region_bind():
     *    1) alloc sysram_region object
     *    2) sysram_regionN->online_type=mhp_default_online_type()
     *    3) add device to bus
     *    4) device auto-probes all the way down to dax
     *    5) dax auto-onlines with system default setting
     */
------

and Non-auto-region logic (approximation)

cxlcli creation-region -m ram --type sysram --online-type=movable
-----
   echo region0 > create_ram_region
   /* program decoders */
   echo region0 > sysram/bind
   echo online_movable > sysram_region0/online_type
   echo sysram_region0 > dax_kmem/bind
-----


I want to retain the dax_kmem driver because there may be multiple users
other than sysram.   For example, a compressed memory region wants to
utilize dax_kmem, but has its own complex policy (via N_MEMORY_PRIVATE)
so it doesn't want to abstract through sysram_region, but it does want
to abstract through dax_kmem.

weeeee "software defined memory" weeeee

~Gregory