[PATCH 15/15] cxl/pci: Enable internal CE/UCE interrupts for CXL PCIe port devices

Terry Bowman posted 15 patches 1 month, 2 weeks ago
There is a newer version of this series
Posted by Terry Bowman 1 month, 2 weeks ago
The AER service drivers and CXL drivers have been updated to handle
PCIe port protocol errors. However, the PCIe AER correctable and
uncorrectable internal errors are still masked for PCIe port devices.

Enable the AER internal errors for CXL PCIe port devices.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
---
 drivers/cxl/core/pci.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index 4706113d2582..1d84a7022c4d 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -908,6 +908,7 @@ EXPORT_SYMBOL_NS_GPL(cxl_port_err_detected, CXL);
 
 void cxl_uport_init_aer(struct cxl_port *port)
 {
+	struct pci_dev *pdev = to_pci_dev(port->uport_dev);
 	/* uport may have more than 1 downstream EP. Check if already mapped. */
 	if (port->uport_regs.ras) {
 		dev_warn(&port->dev, "RAS is already mapped\n");
@@ -920,12 +921,14 @@ void cxl_uport_init_aer(struct cxl_port *port)
 		dev_err(&port->dev, "Failed to map RAS capability.\n");
 		return;
 	}
+	pci_aer_unmask_internal_errors(pdev);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_uport_init_aer, CXL);
 
 void cxl_dport_init_aer(struct cxl_dport *dport)
 {
 	struct device *dport_dev = dport->dport_dev;
+	struct pci_dev *pdev = to_pci_dev(dport_dev);
 
 	if (dport->rch) {
 		struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport_dev);
@@ -949,6 +952,7 @@ void cxl_dport_init_aer(struct cxl_dport *dport)
 		dev_err(dport_dev, "Failed to map RAS capability.\n");
 		return;
 	}
+	pci_aer_unmask_internal_errors(pdev);
 }
 EXPORT_SYMBOL_NS_GPL(cxl_dport_init_aer, CXL);
 
-- 
2.34.1
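
For readers unfamiliar with what "unmasking" involves, the following is a
minimal userspace sketch, not the kernel implementation. It assumes that
unmasking internal errors amounts to clearing the Uncorrectable Internal
Error bit in the AER Uncorrectable Error Mask register and the Corrected
Internal Error bit in the AER Correctable Error Mask register. The register
offsets and bit values mirror the definitions in
include/uapi/linux/pci_regs.h. The fake_aer_cap structure and the fake_*
helpers are hypothetical stand-ins for a device's AER capability and for
config-space accessors.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offsets within the AER capability and bit values, as in pci_regs.h. */
#define PCI_ERR_UNCOR_MASK   0x08        /* Uncorrectable Error Mask */
#define PCI_ERR_UNC_INTN     0x00400000u /* Uncorrectable Internal Error */
#define PCI_ERR_COR_MASK     0x14        /* Correctable Error Mask */
#define PCI_ERR_COR_INTERNAL 0x00004000u /* Corrected Internal Error */

/* Hypothetical stand-in for the AER capability registers of one device. */
struct fake_aer_cap {
	uint8_t regs[0x40];
};

/* Hypothetical 32-bit register read from the simulated capability. */
static uint32_t fake_read32(const struct fake_aer_cap *cap, unsigned int off)
{
	uint32_t val;

	assert(off + sizeof(val) <= sizeof(cap->regs));
	memcpy(&val, &cap->regs[off], sizeof(val));
	return val;
}

/* Hypothetical 32-bit register write to the simulated capability. */
static void fake_write32(struct fake_aer_cap *cap, unsigned int off,
			 uint32_t val)
{
	assert(off + sizeof(val) <= sizeof(cap->regs));
	memcpy(&cap->regs[off], &val, sizeof(val));
}

/*
 * Clear only the internal-error bits in both mask registers.
 * Every other mask bit is left exactly as it was.
 */
static void fake_unmask_internal_errors(struct fake_aer_cap *cap)
{
	uint32_t mask;

	mask = fake_read32(cap, PCI_ERR_UNCOR_MASK);
	mask &= ~PCI_ERR_UNC_INTN;
	fake_write32(cap, PCI_ERR_UNCOR_MASK, mask);

	mask = fake_read32(cap, PCI_ERR_COR_MASK);
	mask &= ~PCI_ERR_COR_INTERNAL;
	fake_write32(cap, PCI_ERR_COR_MASK, mask);
}
```

The read-modify-write form matters here: other mask bits may have been set
by firmware or by the AER core, so the sketch clears only the two
internal-error bits and leaves the rest of each register as it found it.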
Re: [PATCH 15/15] cxl/pci: Enable internal CE/UCE interrupts for CXL PCIe port devices
Posted by Jonathan Cameron 1 month, 1 week ago
On Tue, 8 Oct 2024 17:16:57 -0500
Terry Bowman <terry.bowman@amd.com> wrote:

> The AER service drivers and CXL drivers are updated to handle PCIe
> port protocol errors. But, the PCIe AER correctable and uncorrectable
> internal errors are mask disabled for the PCIe port devices.
> 
> Enable the AER internal errors for CXL PCIe port devices.
> 
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>

A while back I thought we had a discussion about just enabling these
for all devices and seeing if anyone screamed?

I'd love to do that rather than carefully enabling them for CXL devices
only ;)

If not, this looks fine to me.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Re: [PATCH 15/15] cxl/pci: Enable internal CE/UCE interrupts for CXL PCIe port devices
Posted by Terry Bowman 1 month, 1 week ago
Hi Jonathan,

On 10/16/24 12:21, Jonathan Cameron wrote:
> On Tue, 8 Oct 2024 17:16:57 -0500
> Terry Bowman <terry.bowman@amd.com> wrote:
> 
>> The AER service drivers and CXL drivers are updated to handle PCIe
>> port protocol errors. But, the PCIe AER correctable and uncorrectable
>> internal errors are mask disabled for the PCIe port devices.
>>
>> Enable the AER internal errors for CXL PCIe port devices.
>>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> 
> A while back I thought we had a discussion about just enabling these
> for all devices and seeing if anyone screamed?
> 
> I'd love to do that rather than carefully enabling them for CXL devices
> only ;)
> 
> If not, this looks fine to me.
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 

These last 2 patches will be removed in v2 because they are not necessary.
Internal AER error handling for root ports and RCECs is already enabled
by the AER driver.

Regards,
Terry