[v11] Enable CXL PCIe Port Protocol Error handling and logging

[PATCH v11 18/23] PCI/AER: Dequeue forwarded CXL error

Posted by Terry Bowman 5 months, 2 weeks ago

The AER driver is now designed to forward CXL protocol errors to the CXL
driver. Update the CXL driver with functionality to dequeue the forwarded
CXL error from the kfifo. Also, update the CXL driver to begin the protocol
error handling processing using the work received from the FIFO.

Update function cxl_proto_err_work_fn() to dequeue work forwarded by the
AER service driver. This will begin the CXL protocol error processing with
a call to cxl_handle_proto_error().

Introduce logic to take the SBDF values from 'struct cxl_proto_error_info'
and use in discovering the erring PCI device. The call to pci_get_domain_bus_and_slot()
will return a reference counted 'struct pci_dev *'. This will serve as
reference count to prevent releasing the CXL Endpoint's mapped RAS while
handling the error. Use scope base __free() to put the reference count.
This will change when adding support for CXL port devices in the future.

Implement cxl_handle_proto_error() to differentiate between Restricted CXL
Host (RCH) protocol errors and CXL virtual host (VH) protocol errors.
Maintain the existing RCH handling. Export the AER driver's pcie_walk_rcec()
allowing the CXL driver to walk the RCEC's secondary bus.

VH correctable error (CE) processing will call the CXL CE handler. VH
uncorrectable errors (UCE) will call cxl_do_recovery(), implemented as a
stub for now and to be updated in future patch. Export pci_aer_clean_fatal_status()
and pci_clean_device_status() used to clean up AER status after handling.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

---
Changes in v10->v11:
- Reword patch commit message to remove RCiEP details (Jonathan)
- Add #include <linux/bitfield.h> (Terry)
- is_cxl_rcd() - Fix short comment message wrap  (Jonathan)
- is_cxl_rcd() - Combine return calls into 1  (Jonathan)
- cxl_handle_proto_error() - Move comment earlier  (Jonathan)
- Usse FIELD_GET() in discovering class code (Jonathan)
- Remove BDF from cxl_proto_err_work_data. Use 'struct pci_dev *' (Dan)
---
 drivers/cxl/core/ras.c  | 68 ++++++++++++++++++++++++++++++++++-------
 drivers/pci/pci.c       |  1 +
 drivers/pci/pci.h       |  7 -----
 drivers/pci/pcie/aer.c  |  1 +
 drivers/pci/pcie/rcec.c |  1 +
 include/linux/aer.h     |  2 ++
 include/linux/pci.h     | 10 ++++++
 7 files changed, 72 insertions(+), 18 deletions(-)

diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
index b285448c2d9c..a2e95c49f965 100644
--- a/drivers/cxl/core/ras.c
+++ b/drivers/cxl/core/ras.c
@@ -118,17 +118,6 @@ static void cxl_cper_prot_err_work_fn(struct work_struct *work)
 }
 static DECLARE_WORK(cxl_cper_prot_err_work, cxl_cper_prot_err_work_fn);
 
-int cxl_ras_init(void)
-{
-	return cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work);
-}
-
-void cxl_ras_exit(void)
-{
-	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
-	cancel_work_sync(&cxl_cper_prot_err_work);
-}
-
 static pci_ers_result_t cxl_handle_ras(struct device *dev, u64 serial, void __iomem *ras_base);
 static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base);
 
@@ -331,6 +320,10 @@ void cxl_endpoint_port_init_ras(struct cxl_port *ep)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_endpoint_port_init_ras, "CXL");
 
+static void cxl_do_recovery(struct device *dev)
+{
+}
+
 static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base)
 {
 	void __iomem *addr;
@@ -472,3 +465,56 @@ pci_ers_result_t pci_error_detected(struct pci_dev *pdev,
 	return rc;
 }
 EXPORT_SYMBOL_NS_GPL(pci_error_detected, "CXL");
+
+static void cxl_handle_proto_error(struct cxl_proto_err_work_data *err_info)
+{
+	struct pci_dev *pdev = err_info->pdev;
+	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
+	struct cxl_memdev *cxlmd = cxlds->cxlmd;
+	struct device *host_dev __free(put_device) = get_device(&cxlmd->dev);
+
+	if (err_info->severity == AER_CORRECTABLE) {
+		int aer = pdev->aer_cap;
+
+		if (aer)
+			pci_clear_and_set_config_dword(pdev,
+						       aer + PCI_ERR_COR_STATUS,
+						       0, PCI_ERR_COR_INTERNAL);
+
+		cxl_cor_error_detected(&cxlmd->dev);
+
+		pcie_clear_device_status(pdev);
+	} else {
+		cxl_do_recovery(&cxlmd->dev);
+	}
+}
+
+static void cxl_proto_err_work_fn(struct work_struct *work)
+{
+	struct cxl_proto_err_work_data wd;
+
+	while (cxl_proto_err_kfifo_get(&wd))
+		cxl_handle_proto_error(&wd);
+}
+
+static struct work_struct cxl_proto_err_work;
+static DECLARE_WORK(cxl_proto_err_work, cxl_proto_err_work_fn);
+
+int cxl_ras_init(void)
+{
+	if (cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work))
+		pr_err("Failed to initialize CXL RAS CPER\n");
+
+	cxl_register_proto_err_work(&cxl_proto_err_work);
+
+	return 0;
+}
+
+void cxl_ras_exit(void)
+{
+	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
+	cancel_work_sync(&cxl_cper_prot_err_work);
+
+	cxl_unregister_proto_err_work();
+	cancel_work_sync(&cxl_proto_err_work);
+}
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d775ed37a79b..2c9827690cb3 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2328,6 +2328,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
 	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
 	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
 }
+EXPORT_SYMBOL_NS_GPL(pcie_clear_device_status, "CXL");
 #endif
 
 /**
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index cfa75903dd3f..69ff7c2d214f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -671,16 +671,10 @@ static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; }
 void pci_rcec_init(struct pci_dev *dev);
 void pci_rcec_exit(struct pci_dev *dev);
 void pcie_link_rcec(struct pci_dev *rcec);
-void pcie_walk_rcec(struct pci_dev *rcec,
-		    int (*cb)(struct pci_dev *, void *),
-		    void *userdata);
 #else
 static inline void pci_rcec_init(struct pci_dev *dev) { }
 static inline void pci_rcec_exit(struct pci_dev *dev) { }
 static inline void pcie_link_rcec(struct pci_dev *rcec) { }
-static inline void pcie_walk_rcec(struct pci_dev *rcec,
-				  int (*cb)(struct pci_dev *, void *),
-				  void *userdata) { }
 #endif
 
 #ifdef CONFIG_PCI_ATS
@@ -1022,7 +1016,6 @@ void pci_restore_aer_state(struct pci_dev *dev);
 static inline void pci_no_aer(void) { }
 static inline void pci_aer_init(struct pci_dev *d) { }
 static inline void pci_aer_exit(struct pci_dev *d) { }
-static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
 static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
 static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
 static inline void pci_save_aer_state(struct pci_dev *dev) { }
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 627d89ccea9c..45abe1622316 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -288,6 +288,7 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
 	if (status)
 		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
 }
+EXPORT_SYMBOL_GPL(pci_aer_clear_fatal_status);
 
 /**
  * pci_aer_raw_clear_status - Clear AER error registers.
diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
index d0bcd141ac9c..fb6cf6449a1d 100644
--- a/drivers/pci/pcie/rcec.c
+++ b/drivers/pci/pcie/rcec.c
@@ -145,6 +145,7 @@ void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
 
 	walk_rcec(walk_rcec_helper, &rcec_data);
 }
+EXPORT_SYMBOL_NS_GPL(pcie_walk_rcec, "CXL");
 
 void pci_rcec_init(struct pci_dev *dev)
 {
diff --git a/include/linux/aer.h b/include/linux/aer.h
index f8eb32805957..1f79f0be4bf7 100644
--- a/include/linux/aer.h
+++ b/include/linux/aer.h
@@ -66,12 +66,14 @@ struct cxl_proto_err_work_data {
 
 #if defined(CONFIG_PCIEAER)
 int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
+void pci_aer_clear_fatal_status(struct pci_dev *dev);
 int pcie_aer_is_native(struct pci_dev *dev);
 #else
 static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
 {
 	return -EINVAL;
 }
+static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
 static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
 #endif
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 3dcab36c437f..3407d687459d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1804,6 +1804,9 @@ extern bool pcie_ports_native;
 
 int pcie_set_target_speed(struct pci_dev *port, enum pci_bus_speed speed_req,
 			  bool use_lt);
+void pcie_walk_rcec(struct pci_dev *rcec,
+		    int (*cb)(struct pci_dev *, void *),
+		    void *userdata);
 #else
 #define pcie_ports_disabled	true
 #define pcie_ports_native	false
@@ -1814,8 +1817,15 @@ static inline int pcie_set_target_speed(struct pci_dev *port,
 {
 	return -EOPNOTSUPP;
 }
+
+static inline void pcie_walk_rcec(struct pci_dev *rcec,
+				  int (*cb)(struct pci_dev *, void *),
+				  void *userdata) { }
+
 #endif
 
+void pcie_clear_device_status(struct pci_dev *dev);
+
 #define PCIE_LINK_STATE_L0S		(BIT(0) | BIT(1)) /* Upstr/dwnstr L0s */
 #define PCIE_LINK_STATE_L1		BIT(2)	/* L1 state */
 #define PCIE_LINK_STATE_L1_1		BIT(3)	/* ASPM L1.1 state */
-- 
2.51.0.rc2.21.ge5ab6b3e5a

Re: [PATCH v11 18/23] PCI/AER: Dequeue forwarded CXL error

Posted by Jonathan Cameron 5 months ago

On Tue, 26 Aug 2025 20:35:33 -0500
Terry Bowman <terry.bowman@amd.com> wrote:

> The AER driver is now designed to forward CXL protocol errors to the CXL
> driver. Update the CXL driver with functionality to dequeue the forwarded
> CXL error from the kfifo. Also, update the CXL driver to begin the protocol
> error handling processing using the work received from the FIFO.
> 
> Update function cxl_proto_err_work_fn() to dequeue work forwarded by the
> AER service driver. This will begin the CXL protocol error processing with
> a call to cxl_handle_proto_error().
> 
> Introduce logic to take the SBDF values from 'struct cxl_proto_error_info'
> and use in discovering the erring PCI device. The call to pci_get_domain_bus_and_slot()
> will return a reference counted 'struct pci_dev *'. This will serve as
> reference count to prevent releasing the CXL Endpoint's mapped RAS while
> handling the error. Use scope base __free() to put the reference count.
> This will change when adding support for CXL port devices in the future.
> 
> Implement cxl_handle_proto_error() to differentiate between Restricted CXL
> Host (RCH) protocol errors and CXL virtual host (VH) protocol errors.
> Maintain the existing RCH handling. Export the AER driver's pcie_walk_rcec()
> allowing the CXL driver to walk the RCEC's secondary bus.
> 
> VH correctable error (CE) processing will call the CXL CE handler. VH
> uncorrectable errors (UCE) will call cxl_do_recovery(), implemented as a
> stub for now and to be updated in future patch. Export pci_aer_clean_fatal_status()
> and pci_clean_device_status() used to clean up AER status after handling.
> 
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

One additional comment.

> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index d775ed37a79b..2c9827690cb3 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2328,6 +2328,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>  	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>  }
> +EXPORT_SYMBOL_NS_GPL(pcie_clear_device_status, "CXL");

Not seeing this as CXL specific. I don't think the namespace is appropriate.

>  #endif

Re: [PATCH v11 18/23] PCI/AER: Dequeue forwarded CXL error

Posted by Dave Jiang 5 months, 2 weeks ago


On 8/26/25 6:35 PM, Terry Bowman wrote:
> The AER driver is now designed to forward CXL protocol errors to the CXL

I would rephrase it to:
The AER driver enqueues the CXL protocol error info to the created kfifo for the CXL driver to consume.
 
> driver. Update the CXL driver with functionality to dequeue the forwarded
> CXL error from the kfifo. Also, update the CXL driver to begin the protocol
> error handling processing using the work received from the FIFO.
> 
> Update function cxl_proto_err_work_fn() to dequeue work forwarded by the
> AER service driver. This will begin the CXL protocol error processing with
> a call to cxl_handle_proto_error().
> 
> Introduce logic to take the SBDF values from 'struct cxl_proto_error_info'
> and use in discovering the erring PCI device. The call to pci_get_domain_bus_and_slot()
> will return a reference counted 'struct pci_dev *'. This will serve as
> reference count to prevent releasing the CXL Endpoint's mapped RAS while
> handling the error. Use scope base __free() to put the reference count.
> This will change when adding support for CXL port devices in the future.
> 
> Implement cxl_handle_proto_error() to differentiate between Restricted CXL
> Host (RCH) protocol errors and CXL virtual host (VH) protocol errors.
> Maintain the existing RCH handling. Export the AER driver's pcie_walk_rcec()
> allowing the CXL driver to walk the RCEC's secondary bus.
> 
> VH correctable error (CE) processing will call the CXL CE handler. VH
> uncorrectable errors (UCE) will call cxl_do_recovery(), implemented as a
> stub for now and to be updated in future patch. Export pci_aer_clean_fatal_status()
> and pci_clean_device_status() used to clean up AER status after handling.
> 
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> ---
> Changes in v10->v11:
> - Reword patch commit message to remove RCiEP details (Jonathan)
> - Add #include <linux/bitfield.h> (Terry)
> - is_cxl_rcd() - Fix short comment message wrap  (Jonathan)
> - is_cxl_rcd() - Combine return calls into 1  (Jonathan)
> - cxl_handle_proto_error() - Move comment earlier  (Jonathan)
> - Usse FIELD_GET() in discovering class code (Jonathan)
> - Remove BDF from cxl_proto_err_work_data. Use 'struct pci_dev *' (Dan)
> ---
>  drivers/cxl/core/ras.c  | 68 ++++++++++++++++++++++++++++++++++-------
>  drivers/pci/pci.c       |  1 +
>  drivers/pci/pci.h       |  7 -----
>  drivers/pci/pcie/aer.c  |  1 +
>  drivers/pci/pcie/rcec.c |  1 +
>  include/linux/aer.h     |  2 ++
>  include/linux/pci.h     | 10 ++++++
>  7 files changed, 72 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
> index b285448c2d9c..a2e95c49f965 100644
> --- a/drivers/cxl/core/ras.c
> +++ b/drivers/cxl/core/ras.c
> @@ -118,17 +118,6 @@ static void cxl_cper_prot_err_work_fn(struct work_struct *work)
>  }
>  static DECLARE_WORK(cxl_cper_prot_err_work, cxl_cper_prot_err_work_fn);
>  
> -int cxl_ras_init(void)
> -{
> -	return cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work);
> -}
> -
> -void cxl_ras_exit(void)
> -{
> -	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
> -	cancel_work_sync(&cxl_cper_prot_err_work);
> -}
> -
>  static pci_ers_result_t cxl_handle_ras(struct device *dev, u64 serial, void __iomem *ras_base);
>  static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base);
>  
> @@ -331,6 +320,10 @@ void cxl_endpoint_port_init_ras(struct cxl_port *ep)
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_endpoint_port_init_ras, "CXL");
>  
> +static void cxl_do_recovery(struct device *dev)
> +{
> +}
> +
>  static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base)
>  {
>  	void __iomem *addr;
> @@ -472,3 +465,56 @@ pci_ers_result_t pci_error_detected(struct pci_dev *pdev,
>  	return rc;
>  }
>  EXPORT_SYMBOL_NS_GPL(pci_error_detected, "CXL");
> +
> +static void cxl_handle_proto_error(struct cxl_proto_err_work_data *err_info)
> +{
> +	struct pci_dev *pdev = err_info->pdev;
> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);

So this function is called from the workqueue thread to consume data from the kfifo right? Do we need to take the device lock of the pdev to ensure that a driver is bound to the device before we attempt to retrieve the data? And do we also need to verify that the driver bound is the cxl_pci driver (and not something like vfio_pci)? Otherwise I think assuming the drv data is cxl_dev_state may cause crash.

DJ

> +	struct cxl_memdev *cxlmd = cxlds->cxlmd;
> +	struct device *host_dev __free(put_device) = get_device(&cxlmd->dev);
> +> +	if (err_info->severity == AER_CORRECTABLE) {
> +		int aer = pdev->aer_cap;
> +
> +		if (aer)
> +			pci_clear_and_set_config_dword(pdev,
> +						       aer + PCI_ERR_COR_STATUS,
> +						       0, PCI_ERR_COR_INTERNAL);
> +
> +		cxl_cor_error_detected(&cxlmd->dev);
> +
> +		pcie_clear_device_status(pdev);
> +	} else {
> +		cxl_do_recovery(&cxlmd->dev);
> +	}
> +}
> +
> +static void cxl_proto_err_work_fn(struct work_struct *work)
> +{
> +	struct cxl_proto_err_work_data wd;
> +
> +	while (cxl_proto_err_kfifo_get(&wd))
> +		cxl_handle_proto_error(&wd);
> +}
> +
> +static struct work_struct cxl_proto_err_work;
> +static DECLARE_WORK(cxl_proto_err_work, cxl_proto_err_work_fn);
> +
> +int cxl_ras_init(void)
> +{
> +	if (cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work))
> +		pr_err("Failed to initialize CXL RAS CPER\n");
> +
> +	cxl_register_proto_err_work(&cxl_proto_err_work);
> +
> +	return 0;
> +}
> +
> +void cxl_ras_exit(void)
> +{
> +	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
> +	cancel_work_sync(&cxl_cper_prot_err_work);
> +
> +	cxl_unregister_proto_err_work();
> +	cancel_work_sync(&cxl_proto_err_work);
> +}
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index d775ed37a79b..2c9827690cb3 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2328,6 +2328,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>  	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>  }
> +EXPORT_SYMBOL_NS_GPL(pcie_clear_device_status, "CXL");
>  #endif
>  
>  /**
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index cfa75903dd3f..69ff7c2d214f 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -671,16 +671,10 @@ static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; }
>  void pci_rcec_init(struct pci_dev *dev);
>  void pci_rcec_exit(struct pci_dev *dev);
>  void pcie_link_rcec(struct pci_dev *rcec);
> -void pcie_walk_rcec(struct pci_dev *rcec,
> -		    int (*cb)(struct pci_dev *, void *),
> -		    void *userdata);
>  #else
>  static inline void pci_rcec_init(struct pci_dev *dev) { }
>  static inline void pci_rcec_exit(struct pci_dev *dev) { }
>  static inline void pcie_link_rcec(struct pci_dev *rcec) { }
> -static inline void pcie_walk_rcec(struct pci_dev *rcec,
> -				  int (*cb)(struct pci_dev *, void *),
> -				  void *userdata) { }
>  #endif
>  
>  #ifdef CONFIG_PCI_ATS
> @@ -1022,7 +1016,6 @@ void pci_restore_aer_state(struct pci_dev *dev);
>  static inline void pci_no_aer(void) { }
>  static inline void pci_aer_init(struct pci_dev *d) { }
>  static inline void pci_aer_exit(struct pci_dev *d) { }
> -static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
>  static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
>  static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
>  static inline void pci_save_aer_state(struct pci_dev *dev) { }
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 627d89ccea9c..45abe1622316 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -288,6 +288,7 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>  	if (status)
>  		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
>  }
> +EXPORT_SYMBOL_GPL(pci_aer_clear_fatal_status);
>  
>  /**
>   * pci_aer_raw_clear_status - Clear AER error registers.
> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
> index d0bcd141ac9c..fb6cf6449a1d 100644
> --- a/drivers/pci/pcie/rcec.c
> +++ b/drivers/pci/pcie/rcec.c
> @@ -145,6 +145,7 @@ void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
>  
>  	walk_rcec(walk_rcec_helper, &rcec_data);
>  }
> +EXPORT_SYMBOL_NS_GPL(pcie_walk_rcec, "CXL");
>  
>  void pci_rcec_init(struct pci_dev *dev)
>  {
> diff --git a/include/linux/aer.h b/include/linux/aer.h
> index f8eb32805957..1f79f0be4bf7 100644
> --- a/include/linux/aer.h
> +++ b/include/linux/aer.h
> @@ -66,12 +66,14 @@ struct cxl_proto_err_work_data {
>  
>  #if defined(CONFIG_PCIEAER)
>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
> +void pci_aer_clear_fatal_status(struct pci_dev *dev);
>  int pcie_aer_is_native(struct pci_dev *dev);
>  #else
>  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>  {
>  	return -EINVAL;
>  }
> +static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
>  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
>  #endif
>  
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 3dcab36c437f..3407d687459d 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1804,6 +1804,9 @@ extern bool pcie_ports_native;
>  
>  int pcie_set_target_speed(struct pci_dev *port, enum pci_bus_speed speed_req,
>  			  bool use_lt);
> +void pcie_walk_rcec(struct pci_dev *rcec,
> +		    int (*cb)(struct pci_dev *, void *),
> +		    void *userdata);
>  #else
>  #define pcie_ports_disabled	true
>  #define pcie_ports_native	false
> @@ -1814,8 +1817,15 @@ static inline int pcie_set_target_speed(struct pci_dev *port,
>  {
>  	return -EOPNOTSUPP;
>  }
> +
> +static inline void pcie_walk_rcec(struct pci_dev *rcec,
> +				  int (*cb)(struct pci_dev *, void *),
> +				  void *userdata) { }
> +
>  #endif
>  
> +void pcie_clear_device_status(struct pci_dev *dev);
> +
>  #define PCIE_LINK_STATE_L0S		(BIT(0) | BIT(1)) /* Upstr/dwnstr L0s */
>  #define PCIE_LINK_STATE_L1		BIT(2)	/* L1 state */
>  #define PCIE_LINK_STATE_L1_1		BIT(3)	/* ASPM L1.1 state */

Re: [PATCH v11 18/23] PCI/AER: Dequeue forwarded CXL error

Posted by Bowman, Terry 5 months ago


On 8/28/2025 7:43 PM, Dave Jiang wrote:
>
> On 8/26/25 6:35 PM, Terry Bowman wrote:
>> The AER driver is now designed to forward CXL protocol errors to the CXL
> I would rephrase it to:
> The AER driver enqueues the CXL protocol error info to the created kfifo for the CXL driver to consume.
>  
>> driver. Update the CXL driver with functionality to dequeue the forwarded
>> CXL error from the kfifo. Also, update the CXL driver to begin the protocol
>> error handling processing using the work received from the FIFO.
>>
>> Update function cxl_proto_err_work_fn() to dequeue work forwarded by the
>> AER service driver. This will begin the CXL protocol error processing with
>> a call to cxl_handle_proto_error().
>>
>> Introduce logic to take the SBDF values from 'struct cxl_proto_error_info'
>> and use in discovering the erring PCI device. The call to pci_get_domain_bus_and_slot()
>> will return a reference counted 'struct pci_dev *'. This will serve as
>> reference count to prevent releasing the CXL Endpoint's mapped RAS while
>> handling the error. Use scope base __free() to put the reference count.
>> This will change when adding support for CXL port devices in the future.
>>
>> Implement cxl_handle_proto_error() to differentiate between Restricted CXL
>> Host (RCH) protocol errors and CXL virtual host (VH) protocol errors.
>> Maintain the existing RCH handling. Export the AER driver's pcie_walk_rcec()
>> allowing the CXL driver to walk the RCEC's secondary bus.
>>
>> VH correctable error (CE) processing will call the CXL CE handler. VH
>> uncorrectable errors (UCE) will call cxl_do_recovery(), implemented as a
>> stub for now and to be updated in future patch. Export pci_aer_clean_fatal_status()
>> and pci_clean_device_status() used to clean up AER status after handling.
>>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>
>> ---
>> Changes in v10->v11:
>> - Reword patch commit message to remove RCiEP details (Jonathan)
>> - Add #include <linux/bitfield.h> (Terry)
>> - is_cxl_rcd() - Fix short comment message wrap  (Jonathan)
>> - is_cxl_rcd() - Combine return calls into 1  (Jonathan)
>> - cxl_handle_proto_error() - Move comment earlier  (Jonathan)
>> - Usse FIELD_GET() in discovering class code (Jonathan)
>> - Remove BDF from cxl_proto_err_work_data. Use 'struct pci_dev *' (Dan)
>> ---
>>  drivers/cxl/core/ras.c  | 68 ++++++++++++++++++++++++++++++++++-------
>>  drivers/pci/pci.c       |  1 +
>>  drivers/pci/pci.h       |  7 -----
>>  drivers/pci/pcie/aer.c  |  1 +
>>  drivers/pci/pcie/rcec.c |  1 +
>>  include/linux/aer.h     |  2 ++
>>  include/linux/pci.h     | 10 ++++++
>>  7 files changed, 72 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
>> index b285448c2d9c..a2e95c49f965 100644
>> --- a/drivers/cxl/core/ras.c
>> +++ b/drivers/cxl/core/ras.c
>> @@ -118,17 +118,6 @@ static void cxl_cper_prot_err_work_fn(struct work_struct *work)
>>  }
>>  static DECLARE_WORK(cxl_cper_prot_err_work, cxl_cper_prot_err_work_fn);
>>  
>> -int cxl_ras_init(void)
>> -{
>> -	return cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work);
>> -}
>> -
>> -void cxl_ras_exit(void)
>> -{
>> -	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
>> -	cancel_work_sync(&cxl_cper_prot_err_work);
>> -}
>> -
>>  static pci_ers_result_t cxl_handle_ras(struct device *dev, u64 serial, void __iomem *ras_base);
>>  static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base);
>>  
>> @@ -331,6 +320,10 @@ void cxl_endpoint_port_init_ras(struct cxl_port *ep)
>>  }
>>  EXPORT_SYMBOL_NS_GPL(cxl_endpoint_port_init_ras, "CXL");
>>  
>> +static void cxl_do_recovery(struct device *dev)
>> +{
>> +}
>> +
>>  static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base)
>>  {
>>  	void __iomem *addr;
>> @@ -472,3 +465,56 @@ pci_ers_result_t pci_error_detected(struct pci_dev *pdev,
>>  	return rc;
>>  }
>>  EXPORT_SYMBOL_NS_GPL(pci_error_detected, "CXL");
>> +
>> +static void cxl_handle_proto_error(struct cxl_proto_err_work_data *err_info)
>> +{
>> +	struct pci_dev *pdev = err_info->pdev;
>> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> So this function is called from the workqueue thread to consume data from the kfifo right? Do we need to take the device lock of the pdev to ensure that a driver is bound to the device before we attempt to retrieve the data? And do we also need to verify that the driver bound is the cxl_pci driver (and not something like vfio_pci)? Otherwise I think assuming the drv data is cxl_dev_state may cause crash.
>
> DJ

Yes, this is called in the worker thread context. I added the pdev device locks
later in cxl_report_error_detected() for the UCE case. I found it necessary to 
put in this function and not in cxl_handle_proto_error() (here) because of the 
traversing logic in the UCE handling flow where it needs to be locked but only
exactly once. I didn't add for the CE because I wasn't certain a CE error was 
enough reason to add a device lock. 

The UCE flow is:

cxl_handle_proto_error()
--> cxl_do_recovery()
----> cxl_handle_proto_error() <--- Added device lock here because of topo traversing/iteration


I tried adding a function checking for cxl_pci driver but ran into circular dependency 
because the driver is defined in cxl_pci but called from cxl_core. I will revisit
this again but need some ideas how to make that work as I expect it will require 
some code moving. 

Terry

>> +	struct cxl_memdev *cxlmd = cxlds->cxlmd;
>> +	struct device *host_dev __free(put_device) = get_device(&cxlmd->dev);
>> +> +	if (err_info->severity == AER_CORRECTABLE) {
>> +		int aer = pdev->aer_cap;
>> +
>> +		if (aer)
>> +			pci_clear_and_set_config_dword(pdev,
>> +						       aer + PCI_ERR_COR_STATUS,
>> +						       0, PCI_ERR_COR_INTERNAL);
>> +
>> +		cxl_cor_error_detected(&cxlmd->dev);
>> +
>> +		pcie_clear_device_status(pdev);
>> +	} else {
>> +		cxl_do_recovery(&cxlmd->dev);
>> +	}
>> +}
>> +
>> +static void cxl_proto_err_work_fn(struct work_struct *work)
>> +{
>> +	struct cxl_proto_err_work_data wd;
>> +
>> +	while (cxl_proto_err_kfifo_get(&wd))
>> +		cxl_handle_proto_error(&wd);
>> +}
>> +
>> +static struct work_struct cxl_proto_err_work;
>> +static DECLARE_WORK(cxl_proto_err_work, cxl_proto_err_work_fn);
>> +
>> +int cxl_ras_init(void)
>> +{
>> +	if (cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work))
>> +		pr_err("Failed to initialize CXL RAS CPER\n");
>> +
>> +	cxl_register_proto_err_work(&cxl_proto_err_work);
>> +
>> +	return 0;
>> +}
>> +
>> +void cxl_ras_exit(void)
>> +{
>> +	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
>> +	cancel_work_sync(&cxl_cper_prot_err_work);
>> +
>> +	cxl_unregister_proto_err_work();
>> +	cancel_work_sync(&cxl_proto_err_work);
>> +}
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index d775ed37a79b..2c9827690cb3 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -2328,6 +2328,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>>  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>>  	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>>  }
>> +EXPORT_SYMBOL_NS_GPL(pcie_clear_device_status, "CXL");
>>  #endif
>>  
>>  /**
>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>> index cfa75903dd3f..69ff7c2d214f 100644
>> --- a/drivers/pci/pci.h
>> +++ b/drivers/pci/pci.h
>> @@ -671,16 +671,10 @@ static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; }
>>  void pci_rcec_init(struct pci_dev *dev);
>>  void pci_rcec_exit(struct pci_dev *dev);
>>  void pcie_link_rcec(struct pci_dev *rcec);
>> -void pcie_walk_rcec(struct pci_dev *rcec,
>> -		    int (*cb)(struct pci_dev *, void *),
>> -		    void *userdata);
>>  #else
>>  static inline void pci_rcec_init(struct pci_dev *dev) { }
>>  static inline void pci_rcec_exit(struct pci_dev *dev) { }
>>  static inline void pcie_link_rcec(struct pci_dev *rcec) { }
>> -static inline void pcie_walk_rcec(struct pci_dev *rcec,
>> -				  int (*cb)(struct pci_dev *, void *),
>> -				  void *userdata) { }
>>  #endif
>>  
>>  #ifdef CONFIG_PCI_ATS
>> @@ -1022,7 +1016,6 @@ void pci_restore_aer_state(struct pci_dev *dev);
>>  static inline void pci_no_aer(void) { }
>>  static inline void pci_aer_init(struct pci_dev *d) { }
>>  static inline void pci_aer_exit(struct pci_dev *d) { }
>> -static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
>>  static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
>>  static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
>>  static inline void pci_save_aer_state(struct pci_dev *dev) { }
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index 627d89ccea9c..45abe1622316 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -288,6 +288,7 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>>  	if (status)
>>  		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
>>  }
>> +EXPORT_SYMBOL_GPL(pci_aer_clear_fatal_status);
>>  
>>  /**
>>   * pci_aer_raw_clear_status - Clear AER error registers.
>> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
>> index d0bcd141ac9c..fb6cf6449a1d 100644
>> --- a/drivers/pci/pcie/rcec.c
>> +++ b/drivers/pci/pcie/rcec.c
>> @@ -145,6 +145,7 @@ void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
>>  
>>  	walk_rcec(walk_rcec_helper, &rcec_data);
>>  }
>> +EXPORT_SYMBOL_NS_GPL(pcie_walk_rcec, "CXL");
>>  
>>  void pci_rcec_init(struct pci_dev *dev)
>>  {
>> diff --git a/include/linux/aer.h b/include/linux/aer.h
>> index f8eb32805957..1f79f0be4bf7 100644
>> --- a/include/linux/aer.h
>> +++ b/include/linux/aer.h
>> @@ -66,12 +66,14 @@ struct cxl_proto_err_work_data {
>>  
>>  #if defined(CONFIG_PCIEAER)
>>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>> +void pci_aer_clear_fatal_status(struct pci_dev *dev);
>>  int pcie_aer_is_native(struct pci_dev *dev);
>>  #else
>>  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>>  {
>>  	return -EINVAL;
>>  }
>> +static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
>>  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
>>  #endif
>>  
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 3dcab36c437f..3407d687459d 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -1804,6 +1804,9 @@ extern bool pcie_ports_native;
>>  
>>  int pcie_set_target_speed(struct pci_dev *port, enum pci_bus_speed speed_req,
>>  			  bool use_lt);
>> +void pcie_walk_rcec(struct pci_dev *rcec,
>> +		    int (*cb)(struct pci_dev *, void *),
>> +		    void *userdata);
>>  #else
>>  #define pcie_ports_disabled	true
>>  #define pcie_ports_native	false
>> @@ -1814,8 +1817,15 @@ static inline int pcie_set_target_speed(struct pci_dev *port,
>>  {
>>  	return -EOPNOTSUPP;
>>  }
>> +
>> +static inline void pcie_walk_rcec(struct pci_dev *rcec,
>> +				  int (*cb)(struct pci_dev *, void *),
>> +				  void *userdata) { }
>> +
>>  #endif
>>  
>> +void pcie_clear_device_status(struct pci_dev *dev);
>> +
>>  #define PCIE_LINK_STATE_L0S		(BIT(0) | BIT(1)) /* Upstr/dwnstr L0s */
>>  #define PCIE_LINK_STATE_L1		BIT(2)	/* L1 state */
>>  #define PCIE_LINK_STATE_L1_1		BIT(3)	/* ASPM L1.1 state */

Re: [PATCH v11 18/23] PCI/AER: Dequeue forwarded CXL error

Posted by Dave Jiang 5 months ago


On 9/11/25 7:33 AM, Bowman, Terry wrote:
> 
> 
> On 8/28/2025 7:43 PM, Dave Jiang wrote:
>>
>> On 8/26/25 6:35 PM, Terry Bowman wrote:
>>> The AER driver is now designed to forward CXL protocol errors to the CXL
>> I would rephrase it to:
>> The AER driver enqueues the CXL protocol error info to the created kfifo for the CXL driver to consume.
>>  
>>> driver. Update the CXL driver with functionality to dequeue the forwarded
>>> CXL error from the kfifo. Also, update the CXL driver to begin the protocol
>>> error handling processing using the work received from the FIFO.
>>>
>>> Update function cxl_proto_err_work_fn() to dequeue work forwarded by the
>>> AER service driver. This will begin the CXL protocol error processing with
>>> a call to cxl_handle_proto_error().
>>>
>>> Introduce logic to take the SBDF values from 'struct cxl_proto_error_info'
>>> and use in discovering the erring PCI device. The call to pci_get_domain_bus_and_slot()
>>> will return a reference counted 'struct pci_dev *'. This will serve as
>>> reference count to prevent releasing the CXL Endpoint's mapped RAS while
>>> handling the error. Use scope base __free() to put the reference count.
>>> This will change when adding support for CXL port devices in the future.
>>>
>>> Implement cxl_handle_proto_error() to differentiate between Restricted CXL
>>> Host (RCH) protocol errors and CXL virtual host (VH) protocol errors.
>>> Maintain the existing RCH handling. Export the AER driver's pcie_walk_rcec()
>>> allowing the CXL driver to walk the RCEC's secondary bus.
>>>
>>> VH correctable error (CE) processing will call the CXL CE handler. VH
>>> uncorrectable errors (UCE) will call cxl_do_recovery(), implemented as a
>>> stub for now and to be updated in future patch. Export pci_aer_clean_fatal_status()
>>> and pci_clean_device_status() used to clean up AER status after handling.
>>>
>>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>>> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>
>>> ---
>>> Changes in v10->v11:
>>> - Reword patch commit message to remove RCiEP details (Jonathan)
>>> - Add #include <linux/bitfield.h> (Terry)
>>> - is_cxl_rcd() - Fix short comment message wrap  (Jonathan)
>>> - is_cxl_rcd() - Combine return calls into 1  (Jonathan)
>>> - cxl_handle_proto_error() - Move comment earlier  (Jonathan)
>>> - Usse FIELD_GET() in discovering class code (Jonathan)
>>> - Remove BDF from cxl_proto_err_work_data. Use 'struct pci_dev *' (Dan)
>>> ---
>>>  drivers/cxl/core/ras.c  | 68 ++++++++++++++++++++++++++++++++++-------
>>>  drivers/pci/pci.c       |  1 +
>>>  drivers/pci/pci.h       |  7 -----
>>>  drivers/pci/pcie/aer.c  |  1 +
>>>  drivers/pci/pcie/rcec.c |  1 +
>>>  include/linux/aer.h     |  2 ++
>>>  include/linux/pci.h     | 10 ++++++
>>>  7 files changed, 72 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
>>> index b285448c2d9c..a2e95c49f965 100644
>>> --- a/drivers/cxl/core/ras.c
>>> +++ b/drivers/cxl/core/ras.c
>>> @@ -118,17 +118,6 @@ static void cxl_cper_prot_err_work_fn(struct work_struct *work)
>>>  }
>>>  static DECLARE_WORK(cxl_cper_prot_err_work, cxl_cper_prot_err_work_fn);
>>>  
>>> -int cxl_ras_init(void)
>>> -{
>>> -	return cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work);
>>> -}
>>> -
>>> -void cxl_ras_exit(void)
>>> -{
>>> -	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
>>> -	cancel_work_sync(&cxl_cper_prot_err_work);
>>> -}
>>> -
>>>  static pci_ers_result_t cxl_handle_ras(struct device *dev, u64 serial, void __iomem *ras_base);
>>>  static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base);
>>>  
>>> @@ -331,6 +320,10 @@ void cxl_endpoint_port_init_ras(struct cxl_port *ep)
>>>  }
>>>  EXPORT_SYMBOL_NS_GPL(cxl_endpoint_port_init_ras, "CXL");
>>>  
>>> +static void cxl_do_recovery(struct device *dev)
>>> +{
>>> +}
>>> +
>>>  static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base)
>>>  {
>>>  	void __iomem *addr;
>>> @@ -472,3 +465,56 @@ pci_ers_result_t pci_error_detected(struct pci_dev *pdev,
>>>  	return rc;
>>>  }
>>>  EXPORT_SYMBOL_NS_GPL(pci_error_detected, "CXL");
>>> +
>>> +static void cxl_handle_proto_error(struct cxl_proto_err_work_data *err_info)
>>> +{
>>> +	struct pci_dev *pdev = err_info->pdev;
>>> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
>> So this function is called from the workqueue thread to consume data from the kfifo right? Do we need to take the device lock of the pdev to ensure that a driver is bound to the device before we attempt to retrieve the data? And do we also need to verify that the driver bound is the cxl_pci driver (and not something like vfio_pci)? Otherwise I think assuming the drv data is cxl_dev_state may cause crash.
>>
>> DJ
> 
> Yes, this is called in the worker thread context. I added the pdev device locks
> later in cxl_report_error_detected() for the UCE case. I found it necessary to 
> put in this function and not in cxl_handle_proto_error() (here) because of the 
> traversing logic in the UCE handling flow where it needs to be locked but only
> exactly once. I didn't add for the CE because I wasn't certain a CE error was 
> enough reason to add a device lock. 
> 
> The UCE flow is:
> 
> cxl_handle_proto_error()
> --> cxl_do_recovery()
> ----> cxl_handle_proto_error() <--- Added device lock here because of topo traversing/iteration
> 
> 
> I tried adding a function checking for cxl_pci driver but ran into circular dependency 
> because the driver is defined in cxl_pci but called from cxl_core. I will revisit
> this again but need some ideas how to make that work as I expect it will require 
> some code moving. 

Is there a chance that the endpoint errors can just be handled via the standard AER flow via AER callback? Otherwise we may need to move the cxl_pci driver definition to core/pci.c in order to do this....

DJ

> 
> Terry
> 
>>> +	struct cxl_memdev *cxlmd = cxlds->cxlmd;
>>> +	struct device *host_dev __free(put_device) = get_device(&cxlmd->dev);
>>> +> +	if (err_info->severity == AER_CORRECTABLE) {
>>> +		int aer = pdev->aer_cap;
>>> +
>>> +		if (aer)
>>> +			pci_clear_and_set_config_dword(pdev,
>>> +						       aer + PCI_ERR_COR_STATUS,
>>> +						       0, PCI_ERR_COR_INTERNAL);
>>> +
>>> +		cxl_cor_error_detected(&cxlmd->dev);
>>> +
>>> +		pcie_clear_device_status(pdev);
>>> +	} else {
>>> +		cxl_do_recovery(&cxlmd->dev);
>>> +	}
>>> +}
>>> +
>>> +static void cxl_proto_err_work_fn(struct work_struct *work)
>>> +{
>>> +	struct cxl_proto_err_work_data wd;
>>> +
>>> +	while (cxl_proto_err_kfifo_get(&wd))
>>> +		cxl_handle_proto_error(&wd);
>>> +}
>>> +
>>> +static struct work_struct cxl_proto_err_work;
>>> +static DECLARE_WORK(cxl_proto_err_work, cxl_proto_err_work_fn);
>>> +
>>> +int cxl_ras_init(void)
>>> +{
>>> +	if (cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work))
>>> +		pr_err("Failed to initialize CXL RAS CPER\n");
>>> +
>>> +	cxl_register_proto_err_work(&cxl_proto_err_work);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +void cxl_ras_exit(void)
>>> +{
>>> +	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
>>> +	cancel_work_sync(&cxl_cper_prot_err_work);
>>> +
>>> +	cxl_unregister_proto_err_work();
>>> +	cancel_work_sync(&cxl_proto_err_work);
>>> +}
>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>> index d775ed37a79b..2c9827690cb3 100644
>>> --- a/drivers/pci/pci.c
>>> +++ b/drivers/pci/pci.c
>>> @@ -2328,6 +2328,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>>>  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>>>  	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>>>  }
>>> +EXPORT_SYMBOL_NS_GPL(pcie_clear_device_status, "CXL");
>>>  #endif
>>>  
>>>  /**
>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>> index cfa75903dd3f..69ff7c2d214f 100644
>>> --- a/drivers/pci/pci.h
>>> +++ b/drivers/pci/pci.h
>>> @@ -671,16 +671,10 @@ static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; }
>>>  void pci_rcec_init(struct pci_dev *dev);
>>>  void pci_rcec_exit(struct pci_dev *dev);
>>>  void pcie_link_rcec(struct pci_dev *rcec);
>>> -void pcie_walk_rcec(struct pci_dev *rcec,
>>> -		    int (*cb)(struct pci_dev *, void *),
>>> -		    void *userdata);
>>>  #else
>>>  static inline void pci_rcec_init(struct pci_dev *dev) { }
>>>  static inline void pci_rcec_exit(struct pci_dev *dev) { }
>>>  static inline void pcie_link_rcec(struct pci_dev *rcec) { }
>>> -static inline void pcie_walk_rcec(struct pci_dev *rcec,
>>> -				  int (*cb)(struct pci_dev *, void *),
>>> -				  void *userdata) { }
>>>  #endif
>>>  
>>>  #ifdef CONFIG_PCI_ATS
>>> @@ -1022,7 +1016,6 @@ void pci_restore_aer_state(struct pci_dev *dev);
>>>  static inline void pci_no_aer(void) { }
>>>  static inline void pci_aer_init(struct pci_dev *d) { }
>>>  static inline void pci_aer_exit(struct pci_dev *d) { }
>>> -static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
>>>  static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
>>>  static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
>>>  static inline void pci_save_aer_state(struct pci_dev *dev) { }
>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>> index 627d89ccea9c..45abe1622316 100644
>>> --- a/drivers/pci/pcie/aer.c
>>> +++ b/drivers/pci/pcie/aer.c
>>> @@ -288,6 +288,7 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>>>  	if (status)
>>>  		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
>>>  }
>>> +EXPORT_SYMBOL_GPL(pci_aer_clear_fatal_status);
>>>  
>>>  /**
>>>   * pci_aer_raw_clear_status - Clear AER error registers.
>>> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
>>> index d0bcd141ac9c..fb6cf6449a1d 100644
>>> --- a/drivers/pci/pcie/rcec.c
>>> +++ b/drivers/pci/pcie/rcec.c
>>> @@ -145,6 +145,7 @@ void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
>>>  
>>>  	walk_rcec(walk_rcec_helper, &rcec_data);
>>>  }
>>> +EXPORT_SYMBOL_NS_GPL(pcie_walk_rcec, "CXL");
>>>  
>>>  void pci_rcec_init(struct pci_dev *dev)
>>>  {
>>> diff --git a/include/linux/aer.h b/include/linux/aer.h
>>> index f8eb32805957..1f79f0be4bf7 100644
>>> --- a/include/linux/aer.h
>>> +++ b/include/linux/aer.h
>>> @@ -66,12 +66,14 @@ struct cxl_proto_err_work_data {
>>>  
>>>  #if defined(CONFIG_PCIEAER)
>>>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>>> +void pci_aer_clear_fatal_status(struct pci_dev *dev);
>>>  int pcie_aer_is_native(struct pci_dev *dev);
>>>  #else
>>>  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>>>  {
>>>  	return -EINVAL;
>>>  }
>>> +static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
>>>  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
>>>  #endif
>>>  
>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>> index 3dcab36c437f..3407d687459d 100644
>>> --- a/include/linux/pci.h
>>> +++ b/include/linux/pci.h
>>> @@ -1804,6 +1804,9 @@ extern bool pcie_ports_native;
>>>  
>>>  int pcie_set_target_speed(struct pci_dev *port, enum pci_bus_speed speed_req,
>>>  			  bool use_lt);
>>> +void pcie_walk_rcec(struct pci_dev *rcec,
>>> +		    int (*cb)(struct pci_dev *, void *),
>>> +		    void *userdata);
>>>  #else
>>>  #define pcie_ports_disabled	true
>>>  #define pcie_ports_native	false
>>> @@ -1814,8 +1817,15 @@ static inline int pcie_set_target_speed(struct pci_dev *port,
>>>  {
>>>  	return -EOPNOTSUPP;
>>>  }
>>> +
>>> +static inline void pcie_walk_rcec(struct pci_dev *rcec,
>>> +				  int (*cb)(struct pci_dev *, void *),
>>> +				  void *userdata) { }
>>> +
>>>  #endif
>>>  
>>> +void pcie_clear_device_status(struct pci_dev *dev);
>>> +
>>>  #define PCIE_LINK_STATE_L0S		(BIT(0) | BIT(1)) /* Upstr/dwnstr L0s */
>>>  #define PCIE_LINK_STATE_L1		BIT(2)	/* L1 state */
>>>  #define PCIE_LINK_STATE_L1_1		BIT(3)	/* ASPM L1.1 state */
>

Re: [PATCH v11 18/23] PCI/AER: Dequeue forwarded CXL error

Posted by Bowman, Terry 5 months ago


On 9/11/2025 10:41 AM, Dave Jiang wrote:
>
> On 9/11/25 7:33 AM, Bowman, Terry wrote:
>>
>> On 8/28/2025 7:43 PM, Dave Jiang wrote:
>>> On 8/26/25 6:35 PM, Terry Bowman wrote:
>>>> The AER driver is now designed to forward CXL protocol errors to the CXL
>>> I would rephrase it to:
>>> The AER driver enqueues the CXL protocol error info to the created kfifo for the CXL driver to consume.
>>>  
>>>> driver. Update the CXL driver with functionality to dequeue the forwarded
>>>> CXL error from the kfifo. Also, update the CXL driver to begin the protocol
>>>> error handling processing using the work received from the FIFO.
>>>>
>>>> Update function cxl_proto_err_work_fn() to dequeue work forwarded by the
>>>> AER service driver. This will begin the CXL protocol error processing with
>>>> a call to cxl_handle_proto_error().
>>>>
>>>> Introduce logic to take the SBDF values from 'struct cxl_proto_error_info'
>>>> and use in discovering the erring PCI device. The call to pci_get_domain_bus_and_slot()
>>>> will return a reference counted 'struct pci_dev *'. This will serve as
>>>> reference count to prevent releasing the CXL Endpoint's mapped RAS while
>>>> handling the error. Use scope base __free() to put the reference count.
>>>> This will change when adding support for CXL port devices in the future.
>>>>
>>>> Implement cxl_handle_proto_error() to differentiate between Restricted CXL
>>>> Host (RCH) protocol errors and CXL virtual host (VH) protocol errors.
>>>> Maintain the existing RCH handling. Export the AER driver's pcie_walk_rcec()
>>>> allowing the CXL driver to walk the RCEC's secondary bus.
>>>>
>>>> VH correctable error (CE) processing will call the CXL CE handler. VH
>>>> uncorrectable errors (UCE) will call cxl_do_recovery(), implemented as a
>>>> stub for now and to be updated in future patch. Export pci_aer_clean_fatal_status()
>>>> and pci_clean_device_status() used to clean up AER status after handling.
>>>>
>>>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>>>> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>>
>>>> ---
>>>> Changes in v10->v11:
>>>> - Reword patch commit message to remove RCiEP details (Jonathan)
>>>> - Add #include <linux/bitfield.h> (Terry)
>>>> - is_cxl_rcd() - Fix short comment message wrap  (Jonathan)
>>>> - is_cxl_rcd() - Combine return calls into 1  (Jonathan)
>>>> - cxl_handle_proto_error() - Move comment earlier  (Jonathan)
>>>> - Usse FIELD_GET() in discovering class code (Jonathan)
>>>> - Remove BDF from cxl_proto_err_work_data. Use 'struct pci_dev *' (Dan)
>>>> ---
>>>>  drivers/cxl/core/ras.c  | 68 ++++++++++++++++++++++++++++++++++-------
>>>>  drivers/pci/pci.c       |  1 +
>>>>  drivers/pci/pci.h       |  7 -----
>>>>  drivers/pci/pcie/aer.c  |  1 +
>>>>  drivers/pci/pcie/rcec.c |  1 +
>>>>  include/linux/aer.h     |  2 ++
>>>>  include/linux/pci.h     | 10 ++++++
>>>>  7 files changed, 72 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
>>>> index b285448c2d9c..a2e95c49f965 100644
>>>> --- a/drivers/cxl/core/ras.c
>>>> +++ b/drivers/cxl/core/ras.c
>>>> @@ -118,17 +118,6 @@ static void cxl_cper_prot_err_work_fn(struct work_struct *work)
>>>>  }
>>>>  static DECLARE_WORK(cxl_cper_prot_err_work, cxl_cper_prot_err_work_fn);
>>>>  
>>>> -int cxl_ras_init(void)
>>>> -{
>>>> -	return cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work);
>>>> -}
>>>> -
>>>> -void cxl_ras_exit(void)
>>>> -{
>>>> -	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
>>>> -	cancel_work_sync(&cxl_cper_prot_err_work);
>>>> -}
>>>> -
>>>>  static pci_ers_result_t cxl_handle_ras(struct device *dev, u64 serial, void __iomem *ras_base);
>>>>  static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base);
>>>>  
>>>> @@ -331,6 +320,10 @@ void cxl_endpoint_port_init_ras(struct cxl_port *ep)
>>>>  }
>>>>  EXPORT_SYMBOL_NS_GPL(cxl_endpoint_port_init_ras, "CXL");
>>>>  
>>>> +static void cxl_do_recovery(struct device *dev)
>>>> +{
>>>> +}
>>>> +
>>>>  static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base)
>>>>  {
>>>>  	void __iomem *addr;
>>>> @@ -472,3 +465,56 @@ pci_ers_result_t pci_error_detected(struct pci_dev *pdev,
>>>>  	return rc;
>>>>  }
>>>>  EXPORT_SYMBOL_NS_GPL(pci_error_detected, "CXL");
>>>> +
>>>> +static void cxl_handle_proto_error(struct cxl_proto_err_work_data *err_info)
>>>> +{
>>>> +	struct pci_dev *pdev = err_info->pdev;
>>>> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
>>> So this function is called from the workqueue thread to consume data from the kfifo right? Do we need to take the device lock of the pdev to ensure that a driver is bound to the device before we attempt to retrieve the data? And do we also need to verify that the driver bound is the cxl_pci driver (and not something like vfio_pci)? Otherwise I think assuming the drv data is cxl_dev_state may cause crash.
>>>
>>> DJ
>> Yes, this is called in the worker thread context. I added the pdev device locks
>> later in cxl_report_error_detected() for the UCE case. I found it necessary to 
>> put in this function and not in cxl_handle_proto_error() (here) because of the 
>> traversing logic in the UCE handling flow where it needs to be locked but only
>> exactly once. I didn't add for the CE because I wasn't certain a CE error was 
>> enough reason to add a device lock. 
>>
>> The UCE flow is:
>>
>> cxl_handle_proto_error()
>> --> cxl_do_recovery()
>> ----> cxl_handle_proto_error() <--- Added device lock here because of topo traversing/iteration
>>
>>
>> I tried adding a function checking for cxl_pci driver but ran into circular dependency 
>> because the driver is defined in cxl_pci but called from cxl_core. I will revisit
>> this again but need some ideas how to make that work as I expect it will require 
>> some code moving. 
> Is there a chance that the endpoint errors can just be handled via the standard AER flow via AER callback? Otherwise we may need to move the cxl_pci driver definition to core/pci.c in order to do this....
>
> DJ

I understand AER callbacks in your comment are the PCIe CE and UCE callback handlers. Calling 
the AER handlers for the EPs would technically work because they are initialized to CXL 'PCIe' 
callbacks already. The AER callbacks are used for handling EP fatal UCE errors.

One problem is this approach isn't consistent with the plan to use CXL device based error 
handling because it will now be using PCIe error handling for EP CE and UCE errors. 
Also, CXL device handling would be used for for CXL port devices which would be implemented
differently.

Terry 

>> Terry
>>
>>>> +	struct cxl_memdev *cxlmd = cxlds->cxlmd;
>>>> +	struct device *host_dev __free(put_device) = get_device(&cxlmd->dev);
>>>> +> +	if (err_info->severity == AER_CORRECTABLE) {
>>>> +		int aer = pdev->aer_cap;
>>>> +
>>>> +		if (aer)
>>>> +			pci_clear_and_set_config_dword(pdev,
>>>> +						       aer + PCI_ERR_COR_STATUS,
>>>> +						       0, PCI_ERR_COR_INTERNAL);
>>>> +
>>>> +		cxl_cor_error_detected(&cxlmd->dev);
>>>> +
>>>> +		pcie_clear_device_status(pdev);
>>>> +	} else {
>>>> +		cxl_do_recovery(&cxlmd->dev);
>>>> +	}
>>>> +}
>>>> +
>>>> +static void cxl_proto_err_work_fn(struct work_struct *work)
>>>> +{
>>>> +	struct cxl_proto_err_work_data wd;
>>>> +
>>>> +	while (cxl_proto_err_kfifo_get(&wd))
>>>> +		cxl_handle_proto_error(&wd);
>>>> +}
>>>> +
>>>> +static struct work_struct cxl_proto_err_work;
>>>> +static DECLARE_WORK(cxl_proto_err_work, cxl_proto_err_work_fn);
>>>> +
>>>> +int cxl_ras_init(void)
>>>> +{
>>>> +	if (cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work))
>>>> +		pr_err("Failed to initialize CXL RAS CPER\n");
>>>> +
>>>> +	cxl_register_proto_err_work(&cxl_proto_err_work);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +void cxl_ras_exit(void)
>>>> +{
>>>> +	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
>>>> +	cancel_work_sync(&cxl_cper_prot_err_work);
>>>> +
>>>> +	cxl_unregister_proto_err_work();
>>>> +	cancel_work_sync(&cxl_proto_err_work);
>>>> +}
>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>> index d775ed37a79b..2c9827690cb3 100644
>>>> --- a/drivers/pci/pci.c
>>>> +++ b/drivers/pci/pci.c
>>>> @@ -2328,6 +2328,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>>>>  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>>>>  	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>>>>  }
>>>> +EXPORT_SYMBOL_NS_GPL(pcie_clear_device_status, "CXL");
>>>>  #endif
>>>>  
>>>>  /**
>>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>>> index cfa75903dd3f..69ff7c2d214f 100644
>>>> --- a/drivers/pci/pci.h
>>>> +++ b/drivers/pci/pci.h
>>>> @@ -671,16 +671,10 @@ static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; }
>>>>  void pci_rcec_init(struct pci_dev *dev);
>>>>  void pci_rcec_exit(struct pci_dev *dev);
>>>>  void pcie_link_rcec(struct pci_dev *rcec);
>>>> -void pcie_walk_rcec(struct pci_dev *rcec,
>>>> -		    int (*cb)(struct pci_dev *, void *),
>>>> -		    void *userdata);
>>>>  #else
>>>>  static inline void pci_rcec_init(struct pci_dev *dev) { }
>>>>  static inline void pci_rcec_exit(struct pci_dev *dev) { }
>>>>  static inline void pcie_link_rcec(struct pci_dev *rcec) { }
>>>> -static inline void pcie_walk_rcec(struct pci_dev *rcec,
>>>> -				  int (*cb)(struct pci_dev *, void *),
>>>> -				  void *userdata) { }
>>>>  #endif
>>>>  
>>>>  #ifdef CONFIG_PCI_ATS
>>>> @@ -1022,7 +1016,6 @@ void pci_restore_aer_state(struct pci_dev *dev);
>>>>  static inline void pci_no_aer(void) { }
>>>>  static inline void pci_aer_init(struct pci_dev *d) { }
>>>>  static inline void pci_aer_exit(struct pci_dev *d) { }
>>>> -static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
>>>>  static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
>>>>  static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
>>>>  static inline void pci_save_aer_state(struct pci_dev *dev) { }
>>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>>> index 627d89ccea9c..45abe1622316 100644
>>>> --- a/drivers/pci/pcie/aer.c
>>>> +++ b/drivers/pci/pcie/aer.c
>>>> @@ -288,6 +288,7 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>>>>  	if (status)
>>>>  		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
>>>>  }
>>>> +EXPORT_SYMBOL_GPL(pci_aer_clear_fatal_status);
>>>>  
>>>>  /**
>>>>   * pci_aer_raw_clear_status - Clear AER error registers.
>>>> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
>>>> index d0bcd141ac9c..fb6cf6449a1d 100644
>>>> --- a/drivers/pci/pcie/rcec.c
>>>> +++ b/drivers/pci/pcie/rcec.c
>>>> @@ -145,6 +145,7 @@ void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
>>>>  
>>>>  	walk_rcec(walk_rcec_helper, &rcec_data);
>>>>  }
>>>> +EXPORT_SYMBOL_NS_GPL(pcie_walk_rcec, "CXL");
>>>>  
>>>>  void pci_rcec_init(struct pci_dev *dev)
>>>>  {
>>>> diff --git a/include/linux/aer.h b/include/linux/aer.h
>>>> index f8eb32805957..1f79f0be4bf7 100644
>>>> --- a/include/linux/aer.h
>>>> +++ b/include/linux/aer.h
>>>> @@ -66,12 +66,14 @@ struct cxl_proto_err_work_data {
>>>>  
>>>>  #if defined(CONFIG_PCIEAER)
>>>>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>>>> +void pci_aer_clear_fatal_status(struct pci_dev *dev);
>>>>  int pcie_aer_is_native(struct pci_dev *dev);
>>>>  #else
>>>>  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>>>>  {
>>>>  	return -EINVAL;
>>>>  }
>>>> +static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
>>>>  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
>>>>  #endif
>>>>  
>>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>>> index 3dcab36c437f..3407d687459d 100644
>>>> --- a/include/linux/pci.h
>>>> +++ b/include/linux/pci.h
>>>> @@ -1804,6 +1804,9 @@ extern bool pcie_ports_native;
>>>>  
>>>>  int pcie_set_target_speed(struct pci_dev *port, enum pci_bus_speed speed_req,
>>>>  			  bool use_lt);
>>>> +void pcie_walk_rcec(struct pci_dev *rcec,
>>>> +		    int (*cb)(struct pci_dev *, void *),
>>>> +		    void *userdata);
>>>>  #else
>>>>  #define pcie_ports_disabled	true
>>>>  #define pcie_ports_native	false
>>>> @@ -1814,8 +1817,15 @@ static inline int pcie_set_target_speed(struct pci_dev *port,
>>>>  {
>>>>  	return -EOPNOTSUPP;
>>>>  }
>>>> +
>>>> +static inline void pcie_walk_rcec(struct pci_dev *rcec,
>>>> +				  int (*cb)(struct pci_dev *, void *),
>>>> +				  void *userdata) { }
>>>> +
>>>>  #endif
>>>>  
>>>> +void pcie_clear_device_status(struct pci_dev *dev);
>>>> +
>>>>  #define PCIE_LINK_STATE_L0S		(BIT(0) | BIT(1)) /* Upstr/dwnstr L0s */
>>>>  #define PCIE_LINK_STATE_L1		BIT(2)	/* L1 state */
>>>>  #define PCIE_LINK_STATE_L1_1		BIT(3)	/* ASPM L1.1 state */

Re: [PATCH v11 18/23] PCI/AER: Dequeue forwarded CXL error

Posted by Dave Jiang 5 months ago


On 9/11/25 9:47 AM, Bowman, Terry wrote:
> 
> 
> On 9/11/2025 10:41 AM, Dave Jiang wrote:
>>
>> On 9/11/25 7:33 AM, Bowman, Terry wrote:
>>>
>>> On 8/28/2025 7:43 PM, Dave Jiang wrote:
>>>> On 8/26/25 6:35 PM, Terry Bowman wrote:
>>>>> The AER driver is now designed to forward CXL protocol errors to the CXL
>>>> I would rephrase it to:
>>>> The AER driver enqueues the CXL protocol error info to the created kfifo for the CXL driver to consume.
>>>>  
>>>>> driver. Update the CXL driver with functionality to dequeue the forwarded
>>>>> CXL error from the kfifo. Also, update the CXL driver to begin the protocol
>>>>> error handling processing using the work received from the FIFO.
>>>>>
>>>>> Update function cxl_proto_err_work_fn() to dequeue work forwarded by the
>>>>> AER service driver. This will begin the CXL protocol error processing with
>>>>> a call to cxl_handle_proto_error().
>>>>>
>>>>> Introduce logic to take the SBDF values from 'struct cxl_proto_error_info'
>>>>> and use in discovering the erring PCI device. The call to pci_get_domain_bus_and_slot()
>>>>> will return a reference counted 'struct pci_dev *'. This will serve as
>>>>> reference count to prevent releasing the CXL Endpoint's mapped RAS while
>>>>> handling the error. Use scope base __free() to put the reference count.
>>>>> This will change when adding support for CXL port devices in the future.
>>>>>
>>>>> Implement cxl_handle_proto_error() to differentiate between Restricted CXL
>>>>> Host (RCH) protocol errors and CXL virtual host (VH) protocol errors.
>>>>> Maintain the existing RCH handling. Export the AER driver's pcie_walk_rcec()
>>>>> allowing the CXL driver to walk the RCEC's secondary bus.
>>>>>
>>>>> VH correctable error (CE) processing will call the CXL CE handler. VH
>>>>> uncorrectable errors (UCE) will call cxl_do_recovery(), implemented as a
>>>>> stub for now and to be updated in future patch. Export pci_aer_clean_fatal_status()
>>>>> and pci_clean_device_status() used to clean up AER status after handling.
>>>>>
>>>>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>>>>> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>>>
>>>>> ---
>>>>> Changes in v10->v11:
>>>>> - Reword patch commit message to remove RCiEP details (Jonathan)
>>>>> - Add #include <linux/bitfield.h> (Terry)
>>>>> - is_cxl_rcd() - Fix short comment message wrap  (Jonathan)
>>>>> - is_cxl_rcd() - Combine return calls into 1  (Jonathan)
>>>>> - cxl_handle_proto_error() - Move comment earlier  (Jonathan)
>>>>> - Usse FIELD_GET() in discovering class code (Jonathan)
>>>>> - Remove BDF from cxl_proto_err_work_data. Use 'struct pci_dev *' (Dan)
>>>>> ---
>>>>>  drivers/cxl/core/ras.c  | 68 ++++++++++++++++++++++++++++++++++-------
>>>>>  drivers/pci/pci.c       |  1 +
>>>>>  drivers/pci/pci.h       |  7 -----
>>>>>  drivers/pci/pcie/aer.c  |  1 +
>>>>>  drivers/pci/pcie/rcec.c |  1 +
>>>>>  include/linux/aer.h     |  2 ++
>>>>>  include/linux/pci.h     | 10 ++++++
>>>>>  7 files changed, 72 insertions(+), 18 deletions(-)
>>>>>
>>>>> diff --git a/drivers/cxl/core/ras.c b/drivers/cxl/core/ras.c
>>>>> index b285448c2d9c..a2e95c49f965 100644
>>>>> --- a/drivers/cxl/core/ras.c
>>>>> +++ b/drivers/cxl/core/ras.c
>>>>> @@ -118,17 +118,6 @@ static void cxl_cper_prot_err_work_fn(struct work_struct *work)
>>>>>  }
>>>>>  static DECLARE_WORK(cxl_cper_prot_err_work, cxl_cper_prot_err_work_fn);
>>>>>  
>>>>> -int cxl_ras_init(void)
>>>>> -{
>>>>> -	return cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work);
>>>>> -}
>>>>> -
>>>>> -void cxl_ras_exit(void)
>>>>> -{
>>>>> -	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
>>>>> -	cancel_work_sync(&cxl_cper_prot_err_work);
>>>>> -}
>>>>> -
>>>>>  static pci_ers_result_t cxl_handle_ras(struct device *dev, u64 serial, void __iomem *ras_base);
>>>>>  static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base);
>>>>>  
>>>>> @@ -331,6 +320,10 @@ void cxl_endpoint_port_init_ras(struct cxl_port *ep)
>>>>>  }
>>>>>  EXPORT_SYMBOL_NS_GPL(cxl_endpoint_port_init_ras, "CXL");
>>>>>  
>>>>> +static void cxl_do_recovery(struct device *dev)
>>>>> +{
>>>>> +}
>>>>> +
>>>>>  static void cxl_handle_cor_ras(struct device *dev, u64 serial, void __iomem *ras_base)
>>>>>  {
>>>>>  	void __iomem *addr;
>>>>> @@ -472,3 +465,56 @@ pci_ers_result_t pci_error_detected(struct pci_dev *pdev,
>>>>>  	return rc;
>>>>>  }
>>>>>  EXPORT_SYMBOL_NS_GPL(pci_error_detected, "CXL");
>>>>> +
>>>>> +static void cxl_handle_proto_error(struct cxl_proto_err_work_data *err_info)
>>>>> +{
>>>>> +	struct pci_dev *pdev = err_info->pdev;
>>>>> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
>>>> So this function is called from the workqueue thread to consume data from the kfifo right? Do we need to take the device lock of the pdev to ensure that a driver is bound to the device before we attempt to retrieve the data? And do we also need to verify that the driver bound is the cxl_pci driver (and not something like vfio_pci)? Otherwise I think assuming the drv data is cxl_dev_state may cause crash.
>>>>
>>>> DJ
>>> Yes, this is called in the worker thread context. I added the pdev device locks
>>> later in cxl_report_error_detected() for the UCE case. I found it necessary to 
>>> put in this function and not in cxl_handle_proto_error() (here) because of the 
>>> traversing logic in the UCE handling flow where it needs to be locked but only
>>> exactly once. I didn't add for the CE because I wasn't certain a CE error was 
>>> enough reason to add a device lock. 
>>>
>>> The UCE flow is:
>>>
>>> cxl_handle_proto_error()
>>> --> cxl_do_recovery()
>>> ----> cxl_handle_proto_error() <--- Added device lock here because of topo traversing/iteration
>>>
>>>
>>> I tried adding a function checking for cxl_pci driver but ran into circular dependency 
>>> because the driver is defined in cxl_pci but called from cxl_core. I will revisit
>>> this again but need some ideas how to make that work as I expect it will require 
>>> some code moving. 
>> Is there a chance that the endpoint errors can just be handled via the standard AER flow via AER callback? Otherwise we may need to move the cxl_pci driver definition to core/pci.c in order to do this....
>>
>> DJ
> 
> I understand AER callbacks in your comment are the PCIe CE and UCE callback handlers. Calling 
> the AER handlers for the EPs would technically work because they are initialized to CXL 'PCIe' 
> callbacks already. The AER callbacks are used for handling EP fatal UCE errors.
> 
> One problem is this approach isn't consistent with the plan to use CXL device based error 
> handling because it will now be using PCIe error handling for EP CE and UCE errors. 
> Also, CXL device handling would be used for for CXL port devices which would be implemented
> differently.

Maybe Dan has an opinion on this. To me it seems we are at the point where we need to move some of the PCI driver code to cxl/core in order to allow the pci driver checking in the error handling. Of course the region driver is in cxl/core already so there's precedent.

DJ

> 
> Terry 
> 
>>> Terry
>>>
>>>>> +	struct cxl_memdev *cxlmd = cxlds->cxlmd;
>>>>> +	struct device *host_dev __free(put_device) = get_device(&cxlmd->dev);
>>>>> +> +	if (err_info->severity == AER_CORRECTABLE) {
>>>>> +		int aer = pdev->aer_cap;
>>>>> +
>>>>> +		if (aer)
>>>>> +			pci_clear_and_set_config_dword(pdev,
>>>>> +						       aer + PCI_ERR_COR_STATUS,
>>>>> +						       0, PCI_ERR_COR_INTERNAL);
>>>>> +
>>>>> +		cxl_cor_error_detected(&cxlmd->dev);
>>>>> +
>>>>> +		pcie_clear_device_status(pdev);
>>>>> +	} else {
>>>>> +		cxl_do_recovery(&cxlmd->dev);
>>>>> +	}
>>>>> +}
>>>>> +
>>>>> +static void cxl_proto_err_work_fn(struct work_struct *work)
>>>>> +{
>>>>> +	struct cxl_proto_err_work_data wd;
>>>>> +
>>>>> +	while (cxl_proto_err_kfifo_get(&wd))
>>>>> +		cxl_handle_proto_error(&wd);
>>>>> +}
>>>>> +
>>>>> +static struct work_struct cxl_proto_err_work;
>>>>> +static DECLARE_WORK(cxl_proto_err_work, cxl_proto_err_work_fn);
>>>>> +
>>>>> +int cxl_ras_init(void)
>>>>> +{
>>>>> +	if (cxl_cper_register_prot_err_work(&cxl_cper_prot_err_work))
>>>>> +		pr_err("Failed to initialize CXL RAS CPER\n");
>>>>> +
>>>>> +	cxl_register_proto_err_work(&cxl_proto_err_work);
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +void cxl_ras_exit(void)
>>>>> +{
>>>>> +	cxl_cper_unregister_prot_err_work(&cxl_cper_prot_err_work);
>>>>> +	cancel_work_sync(&cxl_cper_prot_err_work);
>>>>> +
>>>>> +	cxl_unregister_proto_err_work();
>>>>> +	cancel_work_sync(&cxl_proto_err_work);
>>>>> +}
>>>>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>>>>> index d775ed37a79b..2c9827690cb3 100644
>>>>> --- a/drivers/pci/pci.c
>>>>> +++ b/drivers/pci/pci.c
>>>>> @@ -2328,6 +2328,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>>>>>  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>>>>>  	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>>>>>  }
>>>>> +EXPORT_SYMBOL_NS_GPL(pcie_clear_device_status, "CXL");
>>>>>  #endif
>>>>>  
>>>>>  /**
>>>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>>>> index cfa75903dd3f..69ff7c2d214f 100644
>>>>> --- a/drivers/pci/pci.h
>>>>> +++ b/drivers/pci/pci.h
>>>>> @@ -671,16 +671,10 @@ static inline bool pci_dpc_recovered(struct pci_dev *pdev) { return false; }
>>>>>  void pci_rcec_init(struct pci_dev *dev);
>>>>>  void pci_rcec_exit(struct pci_dev *dev);
>>>>>  void pcie_link_rcec(struct pci_dev *rcec);
>>>>> -void pcie_walk_rcec(struct pci_dev *rcec,
>>>>> -		    int (*cb)(struct pci_dev *, void *),
>>>>> -		    void *userdata);
>>>>>  #else
>>>>>  static inline void pci_rcec_init(struct pci_dev *dev) { }
>>>>>  static inline void pci_rcec_exit(struct pci_dev *dev) { }
>>>>>  static inline void pcie_link_rcec(struct pci_dev *rcec) { }
>>>>> -static inline void pcie_walk_rcec(struct pci_dev *rcec,
>>>>> -				  int (*cb)(struct pci_dev *, void *),
>>>>> -				  void *userdata) { }
>>>>>  #endif
>>>>>  
>>>>>  #ifdef CONFIG_PCI_ATS
>>>>> @@ -1022,7 +1016,6 @@ void pci_restore_aer_state(struct pci_dev *dev);
>>>>>  static inline void pci_no_aer(void) { }
>>>>>  static inline void pci_aer_init(struct pci_dev *d) { }
>>>>>  static inline void pci_aer_exit(struct pci_dev *d) { }
>>>>> -static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
>>>>>  static inline int pci_aer_clear_status(struct pci_dev *dev) { return -EINVAL; }
>>>>>  static inline int pci_aer_raw_clear_status(struct pci_dev *dev) { return -EINVAL; }
>>>>>  static inline void pci_save_aer_state(struct pci_dev *dev) { }
>>>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>>>> index 627d89ccea9c..45abe1622316 100644
>>>>> --- a/drivers/pci/pcie/aer.c
>>>>> +++ b/drivers/pci/pcie/aer.c
>>>>> @@ -288,6 +288,7 @@ void pci_aer_clear_fatal_status(struct pci_dev *dev)
>>>>>  	if (status)
>>>>>  		pci_write_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, status);
>>>>>  }
>>>>> +EXPORT_SYMBOL_GPL(pci_aer_clear_fatal_status);
>>>>>  
>>>>>  /**
>>>>>   * pci_aer_raw_clear_status - Clear AER error registers.
>>>>> diff --git a/drivers/pci/pcie/rcec.c b/drivers/pci/pcie/rcec.c
>>>>> index d0bcd141ac9c..fb6cf6449a1d 100644
>>>>> --- a/drivers/pci/pcie/rcec.c
>>>>> +++ b/drivers/pci/pcie/rcec.c
>>>>> @@ -145,6 +145,7 @@ void pcie_walk_rcec(struct pci_dev *rcec, int (*cb)(struct pci_dev *, void *),
>>>>>  
>>>>>  	walk_rcec(walk_rcec_helper, &rcec_data);
>>>>>  }
>>>>> +EXPORT_SYMBOL_NS_GPL(pcie_walk_rcec, "CXL");
>>>>>  
>>>>>  void pci_rcec_init(struct pci_dev *dev)
>>>>>  {
>>>>> diff --git a/include/linux/aer.h b/include/linux/aer.h
>>>>> index f8eb32805957..1f79f0be4bf7 100644
>>>>> --- a/include/linux/aer.h
>>>>> +++ b/include/linux/aer.h
>>>>> @@ -66,12 +66,14 @@ struct cxl_proto_err_work_data {
>>>>>  
>>>>>  #if defined(CONFIG_PCIEAER)
>>>>>  int pci_aer_clear_nonfatal_status(struct pci_dev *dev);
>>>>> +void pci_aer_clear_fatal_status(struct pci_dev *dev);
>>>>>  int pcie_aer_is_native(struct pci_dev *dev);
>>>>>  #else
>>>>>  static inline int pci_aer_clear_nonfatal_status(struct pci_dev *dev)
>>>>>  {
>>>>>  	return -EINVAL;
>>>>>  }
>>>>> +static inline void pci_aer_clear_fatal_status(struct pci_dev *dev) { }
>>>>>  static inline int pcie_aer_is_native(struct pci_dev *dev) { return 0; }
>>>>>  #endif
>>>>>  
>>>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>>>> index 3dcab36c437f..3407d687459d 100644
>>>>> --- a/include/linux/pci.h
>>>>> +++ b/include/linux/pci.h
>>>>> @@ -1804,6 +1804,9 @@ extern bool pcie_ports_native;
>>>>>  
>>>>>  int pcie_set_target_speed(struct pci_dev *port, enum pci_bus_speed speed_req,
>>>>>  			  bool use_lt);
>>>>> +void pcie_walk_rcec(struct pci_dev *rcec,
>>>>> +		    int (*cb)(struct pci_dev *, void *),
>>>>> +		    void *userdata);
>>>>>  #else
>>>>>  #define pcie_ports_disabled	true
>>>>>  #define pcie_ports_native	false
>>>>> @@ -1814,8 +1817,15 @@ static inline int pcie_set_target_speed(struct pci_dev *port,
>>>>>  {
>>>>>  	return -EOPNOTSUPP;
>>>>>  }
>>>>> +
>>>>> +static inline void pcie_walk_rcec(struct pci_dev *rcec,
>>>>> +				  int (*cb)(struct pci_dev *, void *),
>>>>> +				  void *userdata) { }
>>>>> +
>>>>>  #endif
>>>>>  
>>>>> +void pcie_clear_device_status(struct pci_dev *dev);
>>>>> +
>>>>>  #define PCIE_LINK_STATE_L0S		(BIT(0) | BIT(1)) /* Upstr/dwnstr L0s */
>>>>>  #define PCIE_LINK_STATE_L1		BIT(2)	/* L1 state */
>>>>>  #define PCIE_LINK_STATE_L1_1		BIT(3)	/* ASPM L1.1 state */
>

Re: [PATCH v11 18/23] PCI/AER: Dequeue forwarded CXL error

Posted by Lukas Wunner 5 months, 2 weeks ago

On Thu, Aug 28, 2025 at 05:43:31PM -0700, Dave Jiang wrote:
> On 8/26/25 6:35 PM, Terry Bowman wrote:
> > +static void cxl_handle_proto_error(struct cxl_proto_err_work_data *err_info)
> > +{
> > +	struct pci_dev *pdev = err_info->pdev;
> > +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> 
> So this function is called from the workqueue thread to consume data
> from the kfifo right? Do we need to take the device lock of the pdev
> to ensure that a driver is bound to the device before we attempt to
> retrieve the data? And do we also need to verify that the driver bound
> is the cxl_pci driver (and not something like vfio_pci)? Otherwise I
> think assuming the drv data is cxl_dev_state may cause crash.

In v10 of this series, there used to be a cxl_pci_drv_bound() function
to verify that the cxl_pci_driver is bound and not some other driver.
That function was called from cxl_rch_handle_error_iter().

It seems this is gone in v11?

Thanks,

Lukas

Re: [PATCH v11 18/23] PCI/AER: Dequeue forwarded CXL error

Posted by Bowman, Terry 4 months, 3 weeks ago

On 8/29/2025 2:10 AM, Lukas Wunner wrote:
> On Thu, Aug 28, 2025 at 05:43:31PM -0700, Dave Jiang wrote:
>> On 8/26/25 6:35 PM, Terry Bowman wrote:
>>> +static void cxl_handle_proto_error(struct cxl_proto_err_work_data *err_info)
>>> +{
>>> +	struct pci_dev *pdev = err_info->pdev;
>>> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
>> So this function is called from the workqueue thread to consume data
>> from the kfifo right? Do we need to take the device lock of the pdev
>> to ensure that a driver is bound to the device before we attempt to
>> retrieve the data? And do we also need to verify that the driver bound
>> is the cxl_pci driver (and not something like vfio_pci)? Otherwise I
>> think assuming the drv data is cxl_dev_state may cause crash.
> In v10 of this series, there used to be a cxl_pci_drv_bound() function
> to verify that the cxl_pci_driver is bound and not some other driver.
> That function was called from cxl_rch_handle_error_iter().
>
> It seems this is gone in v11?
>
> Thanks,
>
> Lukas
Hi Lukas,

Yes, this was removed due to a build issue. I am adding back with the fix.

You mentioned cxl_rch_handle_error_iter() above and I want to clarify my 
understanding that this is not needed in the RCH case. RCH handling includes 
traversal to find the first downstream PCI EP and calls the EP's 
pci_driver::err_handlers callbacks. These are part of pci_driver and therefore 
don't need the driver check as they will work for any bound pci_driver. 

The check is needed for VH EP CE and non-fatal UCE because they are handled 
by CXL error callbacks defined in the cxl_core/cxl_pci modules and not in the 
pci_driver::err_handlers.

Terry