[PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018

Atharva Tiwari posted 1 patch 1 month ago
There is a newer version of this series
[PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Atharva Tiwari 1 month ago
Disable AER for Intel Titan Ridge 4C 2018
(used in T2 iMacs, where the warnings appear),
which generates continuous pcieport warnings such as:

pcieport 0000:00:1c.4: AER: Correctable error message received from 0000:07:00.0
pcieport 0000:07:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
pcieport 0000:07:00.0:   device [8086:15ea] error status/mask=00000080/00002000
pcieport 0000:07:00.0:    [ 7] BadDLLP

(see: https://bugzilla.kernel.org/show_bug.cgi?id=220651)
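
For reference, the status/mask pair in the log above can be decoded against
the Correctable Error Status bit layout of the AER capability. This is a
minimal userspace sketch, not kernel code: the table and the helper name
decode_cor_bits() are made up for illustration, and the bit values and
short names are assumed to mirror the PCI_ERR_COR_* definitions and the
strings the kernel's AER driver prints.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Correctable Error Status/Mask bits of the PCIe AER capability. */
struct cor_bit {
	uint32_t bit;
	const char *name;
};

static const struct cor_bit cor_bits[] = {
	{ 0x00000001, "RxErr" },	/* bit 0: Receiver Error */
	{ 0x00000040, "BadTLP" },	/* bit 6 */
	{ 0x00000080, "BadDLLP" },	/* bit 7 */
	{ 0x00000100, "Rollover" },	/* bit 8: REPLAY_NUM Rollover */
	{ 0x00001000, "Timeout" },	/* bit 12: Replay Timer Timeout */
	{ 0x00002000, "NonFatalErr" },	/* bit 13: Advisory Non-Fatal */
	{ 0x00004000, "CorrIntErr" },	/* bit 14: Corrected Internal */
	{ 0x00008000, "HeaderOF" },	/* bit 15: Header Log Overflow */
};

/*
 * Hypothetical helper: write the names of all known bits set in 'value'
 * into 'buf', separated by spaces. Returns the number of bits decoded.
 */
static int decode_cor_bits(uint32_t value, char *buf, size_t len)
{
	int found = 0;
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(cor_bits) / sizeof(cor_bits[0]); i++) {
		if (!(value & cor_bits[i].bit))
			continue;
		if (found)
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, cor_bits[i].name, len - strlen(buf) - 1);
		found++;
	}
	return found;
}
```

With the values from the log, a status of 0x00000080 decodes to BadDLLP
(matching the "[ 7] BadDLLP" line) and a mask of 0x00002000 decodes to
NonFatalErr, so BadDLLP itself is not masked on this port.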

macOS also disables AER for Thunderbolt devices and controllers in its
drivers.

Signed-off-by: Atharva Tiwari <atharvatiwarilinuxdev@gmail.com>
---
 drivers/pci/pcie/portdrv.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
index 38a41ccf79b9..5330a679fcff 100644
--- a/drivers/pci/pcie/portdrv.c
+++ b/drivers/pci/pcie/portdrv.c
@@ -240,7 +240,9 @@ static int get_port_device_capability(struct pci_dev *dev)
 	if ((pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
              pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC) &&
 	    dev->aer_cap && pci_aer_available() &&
-	    (pcie_ports_native || host->native_aer))
+	    (pcie_ports_native || host->native_aer) &&
+	    !(dev->vendor == PCI_VENDOR_ID_INTEL &&
+		    (dev->device >= 0x15EA && dev->device <= 0x15EC)))
 		services |= PCIE_PORT_SERVICE_AER;
 #endif
 
-- 
2.43.0
Re: [PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Kuppuswamy Sathyanarayanan 1 month ago

On 1/6/2026 10:20 AM, Atharva Tiwari wrote:
> Disable AER for Intel Titan Ridge 4C 2018
> (used in T2 iMacs, where the warnings appear)
> that generates continuous pcieport warnings. such as:
> 
> pcieport 0000:00:1c.4: AER: Correctable error message received from 0000:07:00.0
> pcieport 0000:07:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> pcieport 0000:07:00.0:   device [8086:15ea] error status/mask=00000080/00002000
> pcieport 0000:07:00.0:    [ 7] BadDLLP
> 
> (see: https://bugzilla.kernel.org/show_bug.cgi?id=220651)
> 
> macOS also disables AER for Thunderbolt devices and controllers in their drivers.
> 

Why not disable it in BIOS or use noaer command line option?

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
Re: [PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Bjorn Helgaas 1 month ago
[+cc Thunderbolt folks]

On Tue, Jan 06, 2026 at 11:00:52AM -0800, Kuppuswamy Sathyanarayanan wrote:
> On 1/6/2026 10:20 AM, Atharva Tiwari wrote:
> > Disable AER for Intel Titan Ridge 4C 2018
> > (used in T2 iMacs, where the warnings appear)
> > that generates continuous pcieport warnings. such as:
> > 
> > pcieport 0000:00:1c.4: AER: Correctable error message received from 0000:07:00.0
> > pcieport 0000:07:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> > pcieport 0000:07:00.0:   device [8086:15ea] error status/mask=00000080/00002000
> > pcieport 0000:07:00.0:    [ 7] BadDLLP
> > 
> > (see: https://bugzilla.kernel.org/show_bug.cgi?id=220651)
> > 
> > macOS also disables AER for Thunderbolt devices and controllers in
> > their drivers.
> 
> Why not disable it in BIOS or use noaer command line option?

If the kernel can figure this out by itself, we should do that so
users don't have to debug issues and figure out how to disable in BIOS
or use a command line option.

But if this is really a hardware issue, I would expect to see some
reports on the web, and I can't find AER reports that mention these
devices except this problem report.

Adding Thunderbolt folks in case they know about any errata.

Re: [PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Mika Westerberg 1 month ago
On Tue, Jan 06, 2026 at 02:48:01PM -0600, Bjorn Helgaas wrote:
> [+cc Thunderbolt folks]
> 
> On Tue, Jan 06, 2026 at 11:00:52AM -0800, Kuppuswamy Sathyanarayanan wrote:
> > On 1/6/2026 10:20 AM, Atharva Tiwari wrote:
> > > Disable AER for Intel Titan Ridge 4C 2018
> > > (used in T2 iMacs, where the warnings appear)
> > > that generates continuous pcieport warnings. such as:
> > > 
> > > pcieport 0000:00:1c.4: AER: Correctable error message received from 0000:07:00.0
> > > pcieport 0000:07:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> > > pcieport 0000:07:00.0:   device [8086:15ea] error status/mask=00000080/00002000
> > > pcieport 0000:07:00.0:    [ 7] BadDLLP
> > > 
> > > (see: https://bugzilla.kernel.org/show_bug.cgi?id=220651)
> > > 
> > > macOS also disables AER for Thunderbolt devices and controllers in
> > > their drivers.
> > 
> > Why not disable it in BIOS or use noaer command line option?
> 
> If the kernel can figure this out by itself, we should do that so
> users don't have to debug issues and figure out how to disable in BIOS
> or use a command line option.
> 
> But if this is really a hardware issue, I would expect to see some
> reports on the web, and I can't find AER reports that mention these
> devices except this problem report.
> 
> Adding Thunderbolt folks in case they know about any errata.

I wonder if these AER messages are caused by PTM too?

Can you try the latest mainline, which has this commit:

  044b9f1a7f4f ("PCI/PTM: Enable only if device advertises relevant role")

and see if that changes anything?

Re: [PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Atharva Tiwari 1 month ago
I’ve been using the mainline kernel
(which I compiled about two weeks ago),
and the problem still isn’t fixed,
so PTM is most likely not the root cause.
Re: [PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Mika Westerberg 1 month ago
On Wed, Jan 07, 2026 at 09:54:33AM +0000, Atharva Tiwari wrote:
> I’ve been using the mainline kernel
> (which I compiled about two weeks ago),
> and the problem still isn’t fixed,
> so PTM is most likely not the root cause.

Okay, what device do you have connected there?

Can you provide the full dmesg and the output of 'sudo lspci -vv' when that
device is connected?
Re: [PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Lukas Wunner 1 month ago
On Wed, Jan 07, 2026 at 09:54:33AM +0000, Atharva Tiwari wrote:
> I've been using the mainline kernel
> (which I compiled about two weeks ago),
> and the problem still isn't fixed,
> so PTM is most likely not the root cause.

The commit Mika called out went into v6.19-rc1, we're now at v6.19-rc4,
so unless you're closely tracking Linus' tree, I'm afraid you may not
have been using a recent enough kernel to verify Mika's suggestion.

Thanks,

Lukas
Re: [PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Kuppuswamy Sathyanarayanan 1 month ago

On 1/6/2026 11:00 AM, Kuppuswamy Sathyanarayanan wrote:
> 
> 
> On 1/6/2026 10:20 AM, Atharva Tiwari wrote:
>> Disable AER for Intel Titan Ridge 4C 2018
>> (used in T2 iMacs, where the warnings appear)
>> that generates continuous pcieport warnings. such as:
>>
>> pcieport 0000:00:1c.4: AER: Correctable error message received from 0000:07:00.0
>> pcieport 0000:07:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
>> pcieport 0000:07:00.0:   device [8086:15ea] error status/mask=00000080/00002000
>> pcieport 0000:07:00.0:    [ 7] BadDLLP
>>
>> (see: https://bugzilla.kernel.org/show_bug.cgi?id=220651)
>>
>> macOS also disables AER for Thunderbolt devices and controllers in their drivers.
>>
> 
> Why not disable it in BIOS or use noaer command line option?

As per the bugzilla report, this looks like a regression. Did you bisect
to find which commit introduced this warning?

Before disabling AER, please investigate the root cause:

Does this occur on all T2 iMacs or specific configurations?
Have you tested different PCIe link speeds?
Is this a cable/connection issue or firmware problem?

The device range (0x15EA-0x15EC) needs justification. Which specific
Titan Ridge variants have this issue?

Before fixing it, try to identify the root cause.

If it needs a kernel fix, you need to use a quirk (since a BIOS
change does not work for you).

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
Re: [PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Atharva Tiwari 1 month ago
This is not a regression. By ‘happening since 6.1’, I meant that I have
been using Linux on my iMac since kernel 6.1, and the warnings have been
present for as long as I have been running Linux.

These warnings occur on all T2 iMacs. Different link speeds don't help,
and this is most likely a firmware problem, as connection problems
wouldn't occur on all T2 iMacs.
Re: [PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Atharva Tiwari 1 month ago
Macs don't allow normal users to change settings in the BIOS, and I want
to use AER on other devices.
Re: [PATCH] PCI/portdev: Disable AER for Titan Ridge 4C 2018
Posted by Dave Jiang 1 month ago

On 1/6/26 11:20 AM, Atharva Tiwari wrote:
> Disable AER for Intel Titan Ridge 4C 2018
> (used in T2 iMacs, where the warnings appear)
> that generates continuous pcieport warnings. such as:
> 
> pcieport 0000:00:1c.4: AER: Correctable error message received from 0000:07:00.0
> pcieport 0000:07:00.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
> pcieport 0000:07:00.0:   device [8086:15ea] error status/mask=00000080/00002000
> pcieport 0000:07:00.0:    [ 7] BadDLLP
> 
> (see: https://bugzilla.kernel.org/show_bug.cgi?id=220651)
> 
> macOS also disables AER for Thunderbolt devices and controllers in their drivers.
> 
> Signed-off-by: Atharva Tiwari <atharvatiwarilinuxdev@gmail.com>
> ---
>  drivers/pci/pcie/portdrv.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
> index 38a41ccf79b9..5330a679fcff 100644
> --- a/drivers/pci/pcie/portdrv.c
> +++ b/drivers/pci/pcie/portdrv.c
> @@ -240,7 +240,9 @@ static int get_port_device_capability(struct pci_dev *dev)
>  	if ((pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
>               pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC) &&
>  	    dev->aer_cap && pci_aer_available() &&
> -	    (pcie_ports_native || host->native_aer))
> +	    (pcie_ports_native || host->native_aer) &&
> +	    !(dev->vendor == PCI_VENDOR_ID_INTEL &&
> +		    (dev->device >= 0x15EA && dev->device <= 0x15EC)))

You probably want to do this as a PCI quirk rather than adding a
vendor-specific line in a generic function. See drivers/pci/quirks.c.

DJ
>  		services |= PCIE_PORT_SERVICE_AER;
>  #endif
>
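
The quirk approach suggested above keeps device-specific matches out of
generic code. The following is only a userspace model of that idea, not
kernel code. In the kernel the match would be declared with one of the
DECLARE_PCI_FIXUP_* macros in drivers/pci/quirks.c, and the per-device flag
it would set ('no_aer' below) is hypothetical. struct fake_pci_dev,
quirk_no_aer_ids and apply_no_aer_quirk() are likewise invented for
illustration. Only 8086:15ea is listed, since that is the only ID that
appears in the reported log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_ID_INTEL 0x8086

/* Minimal stand-in for struct pci_dev; 'no_aer' is a hypothetical flag. */
struct fake_pci_dev {
	uint16_t vendor;
	uint16_t device;
	bool no_aer;
};

struct fake_pci_id {
	uint16_t vendor;
	uint16_t device;
};

/* Exact IDs rather than a numeric range, so each entry can be justified. */
static const struct fake_pci_id quirk_no_aer_ids[] = {
	{ PCI_VENDOR_ID_INTEL, 0x15ea },	/* ID seen in the AER log */
};

/* Models a fixup hook: mark matching devices, leave all others alone. */
static void apply_no_aer_quirk(struct fake_pci_dev *dev)
{
	size_t i;

	assert(dev != NULL);
	for (i = 0; i < sizeof(quirk_no_aer_ids) / sizeof(quirk_no_aer_ids[0]); i++) {
		if (dev->vendor == quirk_no_aer_ids[i].vendor &&
		    dev->device == quirk_no_aer_ids[i].device)
			dev->no_aer = true;
	}
}

/* Generic code then only tests the flag, with no vendor checks of its own. */
static bool aer_service_allowed(const struct fake_pci_dev *dev, bool aer_available)
{
	return aer_available && !dev->no_aer;
}
```

In this model a device with ID 8086:15ea gets no_aer set and is refused the
AER service, while any other device is unaffected. That is the difference
from a global switch such as pci=noaer, which turns AER off for every device.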