[PATCH v2 6/6] cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation

Dan Williams posted 6 patches 1 month, 3 weeks ago
[PATCH v2 6/6] cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Posted by Dan Williams 1 month, 3 weeks ago
Unlike the cxl_pci class driver that opportunistically enables memory
expansion with no other dependent functionality, CXL accelerator drivers
have distinct PCIe-only and CXL-enhanced operation states. If CXL is
available some additional coherent memory/cache operations can be enabled,
otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.

Allow for a driver to pass a routine to be called in cxl_mem_probe()
context. This ability is inspired by and mirrors the semantics of
faux_device_create(). It allows for the caller to run CXL-topology
attach-dependent logic on behalf of the caller.

The probe callback runs after the port topology is successfully attached
for the given memdev.

Additionally the presence of @cxlmd->attach indicates that the accelerator
driver be detached when CXL operation ends. This conceptually makes a CXL
link loss event mirror a PCIe link loss event which results in triggering
the ->remove() callback of affected devices+drivers. A driver can re-attach
to recover back to PCIe-only operation. Live recovery, i.e. without a
->remove()/->probe() cycle, is left as a future consideration.

Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com> (✓ DKIM/intel.com)
Tested-by: Alejandro Lucero <alucerop@amd.com> (✓ DKIM/amd.com)
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/cxl/cxlmem.h         | 12 ++++++++++--
 drivers/cxl/core/memdev.c    | 33 +++++++++++++++++++++++++++++----
 drivers/cxl/mem.c            | 20 ++++++++++++++++----
 drivers/cxl/pci.c            |  2 +-
 tools/testing/cxl/test/mem.c |  2 +-
 5 files changed, 57 insertions(+), 12 deletions(-)

diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 9db31c7993c4..ef202b34e5ea 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -34,6 +34,10 @@
 	(FIELD_GET(CXLMDEV_RESET_NEEDED_MASK, status) !=                       \
 	 CXLMDEV_RESET_NEEDED_NOT)
 
+struct cxl_memdev_attach {
+	int (*probe)(struct cxl_memdev *cxlmd);
+};
+
 /**
  * struct cxl_memdev - CXL bus object representing a Type-3 Memory Device
  * @dev: driver core device object
@@ -43,6 +47,7 @@
  * @cxl_nvb: coordinate removal of @cxl_nvd if present
  * @cxl_nvd: optional bridge to an nvdimm if the device supports pmem
  * @endpoint: connection to the CXL port topology for this memory device
+ * @attach: creator of this memdev depends on CXL link attach to operate
  * @id: id number of this memdev instance.
  * @depth: endpoint port depth
  * @scrub_cycle: current scrub cycle set for this device
@@ -59,6 +64,7 @@ struct cxl_memdev {
 	struct cxl_nvdimm_bridge *cxl_nvb;
 	struct cxl_nvdimm *cxl_nvd;
 	struct cxl_port *endpoint;
+	const struct cxl_memdev_attach *attach;
 	int id;
 	int depth;
 	u8 scrub_cycle;
@@ -95,8 +101,10 @@ static inline bool is_cxl_endpoint(struct cxl_port *port)
 	return is_cxl_memdev(port->uport_dev);
 }
 
-struct cxl_memdev *__devm_cxl_add_memdev(struct cxl_dev_state *cxlds);
-struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds);
+struct cxl_memdev *__devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
+					 const struct cxl_memdev_attach *attach);
+struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
+				       const struct cxl_memdev_attach *attach);
 int devm_cxl_sanitize_setup_notifier(struct device *host,
 				     struct cxl_memdev *cxlmd);
 struct cxl_memdev_state;
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 63da2bd4436e..3ab4cd8f19ed 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -641,14 +641,24 @@ static void detach_memdev(struct work_struct *work)
 	struct cxl_memdev *cxlmd;
 
 	cxlmd = container_of(work, typeof(*cxlmd), detach_work);
-	device_release_driver(&cxlmd->dev);
+
+	/*
+	 * When the creator of @cxlmd sets ->attach it indicates CXL operation
+	 * is required. In that case, @cxlmd detach escalates to parent device
+	 * detach.
+	 */
+	if (cxlmd->attach)
+		device_release_driver(cxlmd->dev.parent);
+	else
+		device_release_driver(&cxlmd->dev);
 	put_device(&cxlmd->dev);
 }
 
 static struct lock_class_key cxl_memdev_key;
 
 static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
-					   const struct file_operations *fops)
+					   const struct file_operations *fops,
+					   const struct cxl_memdev_attach *attach)
 {
 	struct cxl_memdev *cxlmd;
 	struct device *dev;
@@ -664,6 +674,8 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
 		goto err;
 	cxlmd->id = rc;
 	cxlmd->depth = -1;
+	cxlmd->attach = attach;
+	cxlmd->endpoint = ERR_PTR(-ENXIO);
 
 	dev = &cxlmd->dev;
 	device_initialize(dev);
@@ -1081,6 +1093,18 @@ static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
 {
 	int rc;
 
+	/*
+	 * If @attach is provided fail if the driver is not attached upon
+	 * return. Note that failure here could be the result of a race to
+	 * teardown the CXL port topology. I.e. cxl_mem_probe() could have
+	 * succeeded and then cxl_mem unbound before the lock is acquired.
+	 */
+	guard(device)(&cxlmd->dev);
+	if (cxlmd->attach && !cxlmd->dev.driver) {
+		cxl_memdev_unregister(cxlmd);
+		return ERR_PTR(-ENXIO);
+	}
+
 	rc = devm_add_action_or_reset(cxlmd->cxlds->dev, cxl_memdev_unregister,
 				      cxlmd);
 	if (rc)
@@ -1093,13 +1117,14 @@ static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
  * Core helper for devm_cxl_add_memdev() that wants to both create a device and
  * assert to the caller that upon return cxl_mem::probe() has been invoked.
  */
-struct cxl_memdev *__devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
+struct cxl_memdev *__devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
+					 const struct cxl_memdev_attach *attach)
 {
 	struct device *dev;
 	int rc;
 
 	struct cxl_memdev *cxlmd __free(put_cxlmd) =
-		cxl_memdev_alloc(cxlds, &cxl_memdev_fops);
+		cxl_memdev_alloc(cxlds, &cxl_memdev_fops, attach);
 	if (IS_ERR(cxlmd))
 		return cxlmd;
 
diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
index 677996c65272..333c366b69e7 100644
--- a/drivers/cxl/mem.c
+++ b/drivers/cxl/mem.c
@@ -142,6 +142,12 @@ static int cxl_mem_probe(struct device *dev)
 			return rc;
 	}
 
+	if (cxlmd->attach) {
+		rc = cxlmd->attach->probe(cxlmd);
+		if (rc)
+			return rc;
+	}
+
 	rc = devm_cxl_memdev_edac_register(cxlmd);
 	if (rc)
 		dev_dbg(dev, "CXL memdev EDAC registration failed rc=%d\n", rc);
@@ -166,17 +172,23 @@ static int cxl_mem_probe(struct device *dev)
 /**
  * devm_cxl_add_memdev - Add a CXL memory device
  * @cxlds: CXL device state to associate with the memdev
+ * @attach: Caller depends on CXL topology attachment
  *
  * Upon return the device will have had a chance to attach to the
- * cxl_mem driver, but may fail if the CXL topology is not ready
- * (hardware CXL link down, or software platform CXL root not attached)
+ * cxl_mem driver, but may fail to attach if the CXL topology is not ready
+ * (hardware CXL link down, or software platform CXL root not attached).
+ *
+ * When @attach is NULL it indicates the caller wants the memdev to remain
+ * registered even if it does not immediately attach to the CXL hierarchy. When
+ * @attach is provided a cxl_mem_probe() failure leads to failure of this routine.
  *
  * The parent of the resulting device and the devm context for allocations is
  * @cxlds->dev.
  */
-struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
+struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
+				       const struct cxl_memdev_attach *attach)
 {
-	return __devm_cxl_add_memdev(cxlds);
+	return __devm_cxl_add_memdev(cxlds, attach);
 }
 EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL");
 
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 1c6fc5334806..549368a9c868 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -1006,7 +1006,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		dev_dbg(&pdev->dev, "No CXL Features discovered\n");
 
-	cxlmd = devm_cxl_add_memdev(cxlds);
+	cxlmd = devm_cxl_add_memdev(cxlds, NULL);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
index 8a22b7601627..cb87e8c0e63c 100644
--- a/tools/testing/cxl/test/mem.c
+++ b/tools/testing/cxl/test/mem.c
@@ -1767,7 +1767,7 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
 
 	cxl_mock_add_event_logs(&mdata->mes);
 
-	cxlmd = devm_cxl_add_memdev(cxlds);
+	cxlmd = devm_cxl_add_memdev(cxlds, NULL);
 	if (IS_ERR(cxlmd))
 		return PTR_ERR(cxlmd);
 
-- 
2.51.1

Re: [PATCH v2 6/6] cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Posted by Alejandro Lucero Palau 1 month, 2 weeks ago
On 12/16/25 00:56, Dan Williams wrote:
> Unlike the cxl_pci class driver that opportunistically enables memory
> expansion with no other dependent functionality, CXL accelerator drivers
> have distinct PCIe-only and CXL-enhanced operation states. If CXL is
> available some additional coherent memory/cache operations can be enabled,
> otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.


This can be misleading. I bet most Type2 drivers will indeed use CXL.io 
and traditional DMA along with some special functionality with CXL.mem 
and CXL.cache.


>
> Allow for a driver to pass a routine to be called in cxl_mem_probe()
> context. This ability is inspired by and mirrors the semantics of
> faux_device_create(). It allows for the caller to run CXL-topology
> attach-dependent logic on behalf of the caller.
>
> The probe callback runs after the port topology is successfully attached
> for the given memdev.


As discussed at LPC, I would like to reflect here this potential 
problem, with port not ready, is mainly due to CXL switches being in the 
path to the device, unlike an accelerator directly connected to a CXL 
Host Bridge which will have all this sorted out at the time of driver 
probing. If there exist other cases, that should be also documented here 
or in the future when discovered.


>
> Additionally the presence of @cxlmd->attach indicates that the accelerator
> driver be detached when CXL operation ends. This conceptually makes a CXL
> link loss event mirror a PCIe link loss event which results in triggering
> the ->remove() callback of affected devices+drivers.


Also as discussed at LPC, I do not think this should happen. The 
accelerator driver should of course get a notification about this but 
not removed. Moreover, should not a link error being reported somehow 
through the RAS infrastructure and the driver reacting accordingly?


FWIW, there is a pending work by Vishal Aslot based on Type2 support for 
this RAS support what I have discussed with him for being a follow-up 
work of Type2.


>   A driver can re-attach
> to recover back to PCIe-only operation. Live recovery, i.e. without a
> ->remove()/->probe() cycle, is left as a future consideration.
>
> Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com> (✓ DKIM/intel.com)
> Tested-by: Alejandro Lucero <alucerop@amd.com> (✓ DKIM/amd.com)
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>   drivers/cxl/cxlmem.h         | 12 ++++++++++--
>   drivers/cxl/core/memdev.c    | 33 +++++++++++++++++++++++++++++----
>   drivers/cxl/mem.c            | 20 ++++++++++++++++----
>   drivers/cxl/pci.c            |  2 +-
>   tools/testing/cxl/test/mem.c |  2 +-
>   5 files changed, 57 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
> index 9db31c7993c4..ef202b34e5ea 100644
> --- a/drivers/cxl/cxlmem.h
> +++ b/drivers/cxl/cxlmem.h
> @@ -34,6 +34,10 @@
>   	(FIELD_GET(CXLMDEV_RESET_NEEDED_MASK, status) !=                       \
>   	 CXLMDEV_RESET_NEEDED_NOT)
>   
> +struct cxl_memdev_attach {
> +	int (*probe)(struct cxl_memdev *cxlmd);
> +};
> +
>   /**
>    * struct cxl_memdev - CXL bus object representing a Type-3 Memory Device
>    * @dev: driver core device object
> @@ -43,6 +47,7 @@
>    * @cxl_nvb: coordinate removal of @cxl_nvd if present
>    * @cxl_nvd: optional bridge to an nvdimm if the device supports pmem
>    * @endpoint: connection to the CXL port topology for this memory device
> + * @attach: creator of this memdev depends on CXL link attach to operate
>    * @id: id number of this memdev instance.
>    * @depth: endpoint port depth
>    * @scrub_cycle: current scrub cycle set for this device
> @@ -59,6 +64,7 @@ struct cxl_memdev {
>   	struct cxl_nvdimm_bridge *cxl_nvb;
>   	struct cxl_nvdimm *cxl_nvd;
>   	struct cxl_port *endpoint;
> +	const struct cxl_memdev_attach *attach;
>   	int id;
>   	int depth;
>   	u8 scrub_cycle;
> @@ -95,8 +101,10 @@ static inline bool is_cxl_endpoint(struct cxl_port *port)
>   	return is_cxl_memdev(port->uport_dev);
>   }
>   
> -struct cxl_memdev *__devm_cxl_add_memdev(struct cxl_dev_state *cxlds);
> -struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds);
> +struct cxl_memdev *__devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
> +					 const struct cxl_memdev_attach *attach);
> +struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
> +				       const struct cxl_memdev_attach *attach);
>   int devm_cxl_sanitize_setup_notifier(struct device *host,
>   				     struct cxl_memdev *cxlmd);
>   struct cxl_memdev_state;
> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 63da2bd4436e..3ab4cd8f19ed 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c
> @@ -641,14 +641,24 @@ static void detach_memdev(struct work_struct *work)
>   	struct cxl_memdev *cxlmd;
>   
>   	cxlmd = container_of(work, typeof(*cxlmd), detach_work);
> -	device_release_driver(&cxlmd->dev);
> +
> +	/*
> +	 * When the creator of @cxlmd sets ->attach it indicates CXL operation
> +	 * is required. In that case, @cxlmd detach escalates to parent device
> +	 * detach.
> +	 */
> +	if (cxlmd->attach)
> +		device_release_driver(cxlmd->dev.parent);
> +	else
> +		device_release_driver(&cxlmd->dev);
>   	put_device(&cxlmd->dev);
>   }
>   
>   static struct lock_class_key cxl_memdev_key;
>   
>   static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
> -					   const struct file_operations *fops)
> +					   const struct file_operations *fops,
> +					   const struct cxl_memdev_attach *attach)
>   {
>   	struct cxl_memdev *cxlmd;
>   	struct device *dev;
> @@ -664,6 +674,8 @@ static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>   		goto err;
>   	cxlmd->id = rc;
>   	cxlmd->depth = -1;
> +	cxlmd->attach = attach;
> +	cxlmd->endpoint = ERR_PTR(-ENXIO);
>   
>   	dev = &cxlmd->dev;
>   	device_initialize(dev);
> @@ -1081,6 +1093,18 @@ static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
>   {
>   	int rc;
>   
> +	/*
> +	 * If @attach is provided fail if the driver is not attached upon
> +	 * return. Note that failure here could be the result of a race to
> +	 * teardown the CXL port topology. I.e. cxl_mem_probe() could have
> +	 * succeeded and then cxl_mem unbound before the lock is acquired.
> +	 */
> +	guard(device)(&cxlmd->dev);
> +	if (cxlmd->attach && !cxlmd->dev.driver) {
> +		cxl_memdev_unregister(cxlmd);
> +		return ERR_PTR(-ENXIO);
> +	}
> +
>   	rc = devm_add_action_or_reset(cxlmd->cxlds->dev, cxl_memdev_unregister,
>   				      cxlmd);
>   	if (rc)
> @@ -1093,13 +1117,14 @@ static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
>    * Core helper for devm_cxl_add_memdev() that wants to both create a device and
>    * assert to the caller that upon return cxl_mem::probe() has been invoked.
>    */
> -struct cxl_memdev *__devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
> +struct cxl_memdev *__devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
> +					 const struct cxl_memdev_attach *attach)
>   {
>   	struct device *dev;
>   	int rc;
>   
>   	struct cxl_memdev *cxlmd __free(put_cxlmd) =
> -		cxl_memdev_alloc(cxlds, &cxl_memdev_fops);
> +		cxl_memdev_alloc(cxlds, &cxl_memdev_fops, attach);
>   	if (IS_ERR(cxlmd))
>   		return cxlmd;
>   
> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
> index 677996c65272..333c366b69e7 100644
> --- a/drivers/cxl/mem.c
> +++ b/drivers/cxl/mem.c
> @@ -142,6 +142,12 @@ static int cxl_mem_probe(struct device *dev)
>   			return rc;
>   	}
>   
> +	if (cxlmd->attach) {
> +		rc = cxlmd->attach->probe(cxlmd);
> +		if (rc)
> +			return rc;
> +	}
> +
>   	rc = devm_cxl_memdev_edac_register(cxlmd);
>   	if (rc)
>   		dev_dbg(dev, "CXL memdev EDAC registration failed rc=%d\n", rc);
> @@ -166,17 +172,23 @@ static int cxl_mem_probe(struct device *dev)
>   /**
>    * devm_cxl_add_memdev - Add a CXL memory device
>    * @cxlds: CXL device state to associate with the memdev
> + * @attach: Caller depends on CXL topology attachment
>    *
>    * Upon return the device will have had a chance to attach to the
> - * cxl_mem driver, but may fail if the CXL topology is not ready
> - * (hardware CXL link down, or software platform CXL root not attached)
> + * cxl_mem driver, but may fail to attach if the CXL topology is not ready
> + * (hardware CXL link down, or software platform CXL root not attached).
> + *
> + * When @attach is NULL it indicates the caller wants the memdev to remain
> + * registered even if it does not immediately attach to the CXL hierarchy. When
> + * @attach is provided a cxl_mem_probe() failure leads to failure of this routine.
>    *
>    * The parent of the resulting device and the devm context for allocations is
>    * @cxlds->dev.
>    */
> -struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds)
> +struct cxl_memdev *devm_cxl_add_memdev(struct cxl_dev_state *cxlds,
> +				       const struct cxl_memdev_attach *attach)
>   {
> -	return __devm_cxl_add_memdev(cxlds);
> +	return __devm_cxl_add_memdev(cxlds, attach);
>   }
>   EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL");
>   
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 1c6fc5334806..549368a9c868 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1006,7 +1006,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>   	if (rc)
>   		dev_dbg(&pdev->dev, "No CXL Features discovered\n");
>   
> -	cxlmd = devm_cxl_add_memdev(cxlds);
> +	cxlmd = devm_cxl_add_memdev(cxlds, NULL);
>   	if (IS_ERR(cxlmd))
>   		return PTR_ERR(cxlmd);
>   
> diff --git a/tools/testing/cxl/test/mem.c b/tools/testing/cxl/test/mem.c
> index 8a22b7601627..cb87e8c0e63c 100644
> --- a/tools/testing/cxl/test/mem.c
> +++ b/tools/testing/cxl/test/mem.c
> @@ -1767,7 +1767,7 @@ static int cxl_mock_mem_probe(struct platform_device *pdev)
>   
>   	cxl_mock_add_event_logs(&mdata->mes);
>   
> -	cxlmd = devm_cxl_add_memdev(cxlds);
> +	cxlmd = devm_cxl_add_memdev(cxlds, NULL);
>   	if (IS_ERR(cxlmd))
>   		return PTR_ERR(cxlmd);
>   
Re: [PATCH v2 6/6] cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Posted by Jonathan Cameron 1 month, 3 weeks ago
On Mon, 15 Dec 2025 16:56:16 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> Unlike the cxl_pci class driver that opportunistically enables memory
> expansion with no other dependent functionality, CXL accelerator drivers
> have distinct PCIe-only and CXL-enhanced operation states. If CXL is
> available some additional coherent memory/cache operations can be enabled,
> otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.
> 
> Allow for a driver to pass a routine to be called in cxl_mem_probe()
> context. This ability is inspired by and mirrors the semantics of
> faux_device_create(). It allows for the caller to run CXL-topology
> attach-dependent logic on behalf of the caller.

This seems confusing.  The caller is running logic for the caller?
It can do that whenever it likes!  One of those is presumably callee



> 
> The probe callback runs after the port topology is successfully attached
> for the given memdev.
> 
> Additionally the presence of @cxlmd->attach indicates that the accelerator
> driver be detached when CXL operation ends. This conceptually makes a CXL
> link loss event mirror a PCIe link loss event which results in triggering
> the ->remove() callback of affected devices+drivers. A driver can re-attach
> to recover back to PCIe-only operation. Live recovery, i.e. without a
> ->remove()/->probe() cycle, is left as a future consideration.  
> 
> Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com> (✓ DKIM/intel.com)

Have we started adding DKIM stuff to tags?

> Tested-by: Alejandro Lucero <alucerop@amd.com> (✓ DKIM/amd.com)
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
One trivial thing on function naming inline.  Either way.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>

To me this looks good to start building the other stuff on top of.
Thanks for unblocking this stuff (hopefully)

Jonathan

> ---
>  drivers/cxl/cxlmem.h         | 12 ++++++++++--
>  drivers/cxl/core/memdev.c    | 33 +++++++++++++++++++++++++++++----
>  drivers/cxl/mem.c            | 20 ++++++++++++++++----
>  drivers/cxl/pci.c            |  2 +-
>  tools/testing/cxl/test/mem.c |  2 +-
>  5 files changed, 57 insertions(+), 12 deletions(-)

> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> index 63da2bd4436e..3ab4cd8f19ed 100644
> --- a/drivers/cxl/core/memdev.c
> +++ b/drivers/cxl/core/memdev.c


> @@ -1081,6 +1093,18 @@ static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
>  {
>  	int rc;
>  
> +	/*

The general approach is fine but is the function name appropriate for this
new stuff?  Naturally I'm not suggesting the bikeshed should be any particular
alternative color just maybe not the current blue.


> +	 * If @attach is provided fail if the driver is not attached upon
> +	 * return. Note that failure here could be the result of a race to
> +	 * teardown the CXL port topology. I.e. cxl_mem_probe() could have
> +	 * succeeded and then cxl_mem unbound before the lock is acquired.
> +	 */
> +	guard(device)(&cxlmd->dev);
> +	if (cxlmd->attach && !cxlmd->dev.driver) {
> +		cxl_memdev_unregister(cxlmd);
> +		return ERR_PTR(-ENXIO);
> +	}
> +
>  	rc = devm_add_action_or_reset(cxlmd->cxlds->dev, cxl_memdev_unregister,
>  				      cxlmd);
>  	if (rc)
Re: [PATCH v2 6/6] cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Posted by dan.j.williams@intel.com 1 month, 3 weeks ago
Jonathan Cameron wrote:
> On Mon, 15 Dec 2025 16:56:16 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > Unlike the cxl_pci class driver that opportunistically enables memory
> > expansion with no other dependent functionality, CXL accelerator drivers
> > have distinct PCIe-only and CXL-enhanced operation states. If CXL is
> > available some additional coherent memory/cache operations can be enabled,
> > otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.
> > 
> > Allow for a driver to pass a routine to be called in cxl_mem_probe()
> > context. This ability is inspired by and mirrors the semantics of
> > faux_device_create(). It allows for the caller to run CXL-topology
> > attach-dependent logic on behalf of the caller.
> 
> This seems confusing. 

Is faux_device_create() confusing?

> The caller is running logic for the caller?  It can do that whenever
> it likes!  One of those is presumably callee

No, it cannot execute CXL topology attach dependendent functionality in
the device's initial probe context synchronous with the device-attach
event "whenever it likes".

> > The probe callback runs after the port topology is successfully attached
> > for the given memdev.
> > 
> > Additionally the presence of @cxlmd->attach indicates that the accelerator
> > driver be detached when CXL operation ends. This conceptually makes a CXL
> > link loss event mirror a PCIe link loss event which results in triggering
> > the ->remove() callback of affected devices+drivers. A driver can re-attach
> > to recover back to PCIe-only operation. Live recovery, i.e. without a
> > ->remove()/->probe() cycle, is left as a future consideration.  
> > 
> > Cc: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> > Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
> > Reviewed-by: Dave Jiang <dave.jiang@intel.com> (✓ DKIM/intel.com)
> 
> Have we started adding DKIM stuff to tags?

No, just a copy/paste typo that I did not catch.

> > Tested-by: Alejandro Lucero <alucerop@amd.com> (✓ DKIM/amd.com)
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> One trivial thing on function naming inline.  Either way.
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> 
> To me this looks good to start building the other stuff on top of.
> Thanks for unblocking this stuff (hopefully)
> 
> Jonathan
> 
> > ---
> >  drivers/cxl/cxlmem.h         | 12 ++++++++++--
> >  drivers/cxl/core/memdev.c    | 33 +++++++++++++++++++++++++++++----
> >  drivers/cxl/mem.c            | 20 ++++++++++++++++----
> >  drivers/cxl/pci.c            |  2 +-
> >  tools/testing/cxl/test/mem.c |  2 +-
> >  5 files changed, 57 insertions(+), 12 deletions(-)
> 
> > diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
> > index 63da2bd4436e..3ab4cd8f19ed 100644
> > --- a/drivers/cxl/core/memdev.c
> > +++ b/drivers/cxl/core/memdev.c
> 
> 
> > @@ -1081,6 +1093,18 @@ static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
> >  {
> >  	int rc;
> >  
> > +	/*
> 
> The general approach is fine but is the function name appropriate for this
> new stuff?  Naturally I'm not suggesting the bikeshed should be any particular
> alternative color just maybe not the current blue.

The _autoremove() verb appears multiple times in the subsystem, not sure
why it is raising bikeshed concerns now. Please send a new proposal if
"autoremove" is not jibing.
Re: [PATCH v2 6/6] cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Posted by Jonathan Cameron 1 month, 3 weeks ago
On Wed, 17 Dec 2025 08:27:00 -0800
dan.j.williams@intel.com wrote:

> Jonathan Cameron wrote:
> > On Mon, 15 Dec 2025 16:56:16 -0800
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >   
> > > Unlike the cxl_pci class driver that opportunistically enables memory
> > > expansion with no other dependent functionality, CXL accelerator drivers
> > > have distinct PCIe-only and CXL-enhanced operation states. If CXL is
> > > available some additional coherent memory/cache operations can be enabled,
> > > otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.
> > > 
> > > Allow for a driver to pass a routine to be called in cxl_mem_probe()
> > > context. This ability is inspired by and mirrors the semantics of
> > > faux_device_create(). It allows for the caller to run CXL-topology
> > > attach-dependent logic on behalf of the caller.  
> > 
> > This seems confusing.   
> 
> Is faux_device_create() confusing?

Just to be clear this question is all about the word 'caller' being repeated
in that sentence. Not about the code itself or anything else in the explanation
or flow. 

This comment that reads very oddly to me and I think means something very
different from what is going on here.

> 
> > The caller is running logic for the caller?  It can do that whenever
> > it likes!  One of those is presumably callee  
> 
> No, it cannot execute CXL topology attach dependendent functionality in
> the device's initial probe context synchronous with the device-attach
> event "whenever it likes".

I'm still lost. Why 'caller to run' ... 'on behalf of the caller.'  In this case
caller is in both cases the function calling cxl_memdev_alloc()?

Maybe something like

"This arranges for CXL-topology attach-dependent logic to be run later, on behalf of
the caller."

Though that kind of repeats what follows, so maybe just drop the sentence.

> 
> > > The probe callback runs after the port topology is successfully attached
> > > for the given memdev.
> > > 
> > > Additionally the presence of @cxlmd->attach indicates that the accelerator
> > > driver be detached when CXL operation ends. This conceptually makes a CXL
> > > link loss event mirror a PCIe link loss event which results in triggering
> > > the ->remove() callback of affected devices+drivers. A driver can re-attach
> > > to recover back to PCIe-only operation. Live recovery, i.e. without a  
> > > ->remove()/->probe() cycle, is left as a future consideration.    

> > > @@ -1081,6 +1093,18 @@ static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
> > >  {
> > >  	int rc;
> > >  
> > > +	/*  
> > 
> > The general approach is fine but is the function name appropriate for this
> > new stuff?  Naturally I'm not suggesting the bikeshed should be any particular
> > alternative color just maybe not the current blue.  
> 
> The _autoremove() verb appears multiple times in the subsystem, not sure
> why it is raising bikeshed concerns now. Please send a new proposal if
> "autoremove" is not jibing.

It felt like a stretch given the additional code that is not about registering
autoremove for later, but doing it now under some circumstances.  Ah well I don't
care that much what it's called.

Jonathan
Re: [PATCH v2 6/6] cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Posted by dan.j.williams@intel.com 1 month, 3 weeks ago
Jonathan Cameron wrote:
> On Wed, 17 Dec 2025 08:27:00 -0800
> dan.j.williams@intel.com wrote:
> 
> > Jonathan Cameron wrote:
> > > On Mon, 15 Dec 2025 16:56:16 -0800
> > > Dan Williams <dan.j.williams@intel.com> wrote:
> > >   
> > > > Unlike the cxl_pci class driver that opportunistically enables memory
> > > > expansion with no other dependent functionality, CXL accelerator drivers
> > > > have distinct PCIe-only and CXL-enhanced operation states. If CXL is
> > > > available some additional coherent memory/cache operations can be enabled,
> > > > otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.
> > > > 
> > > > Allow for a driver to pass a routine to be called in cxl_mem_probe()
> > > > context. This ability is inspired by and mirrors the semantics of
> > > > faux_device_create(). It allows for the caller to run CXL-topology
> > > > attach-dependent logic on behalf of the caller.  
> > > 
> > > This seems confusing.   
> > 
> > Is faux_device_create() confusing?
> 
> Just to be clear this question is all about the word 'caller' being repeated
> in that sentence. Not about the code itself or anything else in the explanation
> or flow. 

Oh, sorry, I took it as the design was confusing.

> This comment that reads very oddly to me and I think means something very
> different from what is going on here.
> 
> > 
> > > The caller is running logic for the caller?  It can do that whenever
> > > it likes!  One of those is presumably callee  
> > 
> > No, it cannot execute CXL topology attach dependendent functionality in
> > the device's initial probe context synchronous with the device-attach
> > event "whenever it likes".
> 
> I'm still lost. Why 'caller to run' ... 'on behalf of the caller.'  In this case
> caller is in both cases the function calling cxl_memdev_alloc()?
> 
> Maybe something like
> 
> "This arranges for CXL-topology attach-dependent logic to be run later, on behalf of
> the caller."
> 
> Though that kind of repeats what follows, so maybe just drop the sentence.

How about this reworked changelog?

---

Unlike the cxl_pci class driver that opportunistically enables memory
expansion with no other dependent functionality, CXL accelerator drivers
have distinct PCIe-only and CXL-enhanced operation states. If CXL is
available some additional coherent memory/cache operations can be enabled,
otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.

This constitutes a new mode of operation where the caller of
devm_cxl_add_memdev() wants to make a "go/no-go" decision about running
in CXL accelerated mode or falling back to PCIe-only operation. Part of
that decision making process likely also includes additional
CXL-acceleration-specific resource setup. Encapsulate both of those
requirements into 'struct cxl_memdev_attach' that provides a ->probe()
callback. The probe callback runs in cxl_mem_probe() context, after the
port topology is successfully attached for the given memdev. It supports
a contract where, upon successful return from devm_cxl_add_memdev(),
everything needed for CXL accelerated operation has been enabled.

Additionally the presence of @cxlmd->attach indicates that the accelerator
driver be detached when CXL operation ends. This conceptually makes a CXL
link loss event mirror a PCIe link loss event which results in triggering
the ->remove() callback of affected devices+drivers. A driver can re-attach
to recover back to PCIe-only operation. Live recovery, i.e. without a
->remove()/->probe() cycle, is left as a future consideration.

---

> > > > @@ -1081,6 +1093,18 @@ static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
> > > >  {
> > > >  	int rc;
> > > >  
> > > > +	/*  
> > > 
> > > The general approach is fine but is the function name appropriate for this
> > > new stuff?  Naturally I'm not suggesting the bikeshed should be any particular
> > > alternative color just maybe not the current blue.  
> > 
> > The _autoremove() verb appears multiple times in the subsystem, not sure
> > why it is raising bikeshed concerns now. Please send a new proposal if
> > "autoremove" is not jibing.
> 
> It felt like a stretch given the additional code that is not about registering
> autoremove for later, but doing it now under some circumstances.  Ah well I don't
> care that much what it's called.

It is the same semantic as devm_add_action_or_reset() in that if the
"add_action" fails then the "reset" is triggered.

Yes, there is additional code that validates that the device to be
registered for removal is attached to its driver. That organization
supports having a single handoff from scoped-based cleanup to devm based
cleanup.

If you can think of a better organization and name I am open to hearing
options, but nothing better is immediately jumping out at me.
Re: [PATCH v2 6/6] cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation
Posted by Jonathan Cameron 1 month, 3 weeks ago
On Thu, 18 Dec 2025 11:50:10 -0800
<dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Wed, 17 Dec 2025 08:27:00 -0800
> > dan.j.williams@intel.com wrote:
> >   
> > > Jonathan Cameron wrote:  
> > > > On Mon, 15 Dec 2025 16:56:16 -0800
> > > > Dan Williams <dan.j.williams@intel.com> wrote:
> > > >     
> > > > > Unlike the cxl_pci class driver that opportunistically enables memory
> > > > > expansion with no other dependent functionality, CXL accelerator drivers
> > > > > have distinct PCIe-only and CXL-enhanced operation states. If CXL is
> > > > > available some additional coherent memory/cache operations can be enabled,
> > > > > otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.
> > > > > 
> > > > > Allow for a driver to pass a routine to be called in cxl_mem_probe()
> > > > > context. This ability is inspired by and mirrors the semantics of
> > > > > faux_device_create(). It allows for the caller to run CXL-topology
> > > > > attach-dependent logic on behalf of the caller.    
> > > > 
> > > > This seems confusing.     
> > > 
> > > Is faux_device_create() confusing?  
> > 
> > Just to be clear this question is all about the word 'caller' being repeated
> > in that sentence. Not about the code itself or anything else in the explanation
> > or flow.   
> 
> Oh, sorry, I took it as the design was confusing.

I should have been clearer!

> 
> > This comment that reads very oddly to me and I think means something very
> > different from what is going on here.
> >   
> > >   
> > > > The caller is running logic for the caller?  It can do that whenever
> > > > it likes!  One of those is presumably callee    
> > > 
> > > No, it cannot execute CXL topology attach dependendent functionality in
> > > the device's initial probe context synchronous with the device-attach
> > > event "whenever it likes".  
> > 
> > I'm still lost. Why 'caller to run' ... 'on behalf of the caller.'  In this case
> > caller is in both cases the function calling cxl_memdev_alloc()?
> > 
> > Maybe something like
> > 
> > "This arranges for CXL-topology attach-dependent logic to be run later, on behalf of
> > the caller."
> > 
> > Though that kind of repeats what follows, so maybe just drop the sentence.  
> 
> How about this reworked changelog?
> 
> ---
> 
> Unlike the cxl_pci class driver that opportunistically enables memory
> expansion with no other dependent functionality, CXL accelerator drivers
> have distinct PCIe-only and CXL-enhanced operation states. If CXL is
> available some additional coherent memory/cache operations can be enabled,
> otherwise traditional DMA+MMIO over PCIe/CXL.io is a fallback.
> 
> This constitutes a new mode of operation where the caller of
> devm_cxl_add_memdev() wants to make a "go/no-go" decision about running
> in CXL accelerated mode or falling back to PCIe-only operation. Part of
> that decision making process likely also includes additional
> CXL-acceleration-specific resource setup. Encapsulate both of those
> requirements into 'struct cxl_memdev_attach' that provides a ->probe()
> callback. The probe callback runs in cxl_mem_probe() context, after the
> port topology is successfully attached for the given memdev. It supports
> a contract where, upon successful return from devm_cxl_add_memdev(),
> everything needed for CXL accelerated operation has been enabled.
> 
> Additionally the presence of @cxlmd->attach indicates that the accelerator
> driver be detached when CXL operation ends. This conceptually makes a CXL
> link loss event mirror a PCIe link loss event which results in triggering
> the ->remove() callback of affected devices+drivers. A driver can re-attach
> to recover back to PCIe-only operation. Live recovery, i.e. without a
> ->remove()/->probe() cycle, is left as a future consideration.  
Nice.  Thanks for the rewrite.

> 
> ---
> 
> > > > > @@ -1081,6 +1093,18 @@ static struct cxl_memdev *cxl_memdev_autoremove(struct cxl_memdev *cxlmd)
> > > > >  {
> > > > >  	int rc;
> > > > >  
> > > > > +	/*    
> > > > 
> > > > The general approach is fine but is the function name appropriate for this
> > > > new stuff?  Naturally I'm not suggesting the bikeshed should be any particular
> > > > alternative color just maybe not the current blue.    
> > > 
> > > The _autoremove() verb appears multiple times in the subsystem, not sure
> > > why it is raising bikeshed concerns now. Please send a new proposal if
> > > "autoremove" is not jibing.  
> > 
> > It felt like a stretch given the additional code that is not about registering
> > autoremove for later, but doing it now under some circumstances.  Ah well I don't
> > care that much what it's called.  
> 
> It is the same semantic as devm_add_action_or_reset() in that if the
> "add_action" fails then the "reset" is triggered.
> 
> Yes, there is additional code that validates that the device to be
> registered for removal is attached to its driver. That organization
> supports having a single handoff from scoped-based cleanup to devm based
> cleanup.
> 
> If you can think of a better organization and name I am open to hearing
> options, but nothing better is immediately jumping out at me.
> 
Let's go with current naming.  I don't like the only options I can come up
with either.  If inspiration strikes we can always change it later.

Jonathan