[PATCH v7 5/5] PCI/AER: Only clear error bits in pcie_clear_device_status()

Shuai Xue posted 5 patches 2 weeks, 2 days ago
[PATCH v7 5/5] PCI/AER: Only clear error bits in pcie_clear_device_status()
Posted by Shuai Xue 2 weeks, 2 days ago
Currently, pcie_clear_device_status() clears the entire PCIe Device
Status register (PCI_EXP_DEVSTA), which includes both error status bits
and other status bits such as AUX Power Detected (AUXPD) and
Transactions Pending (TRPND).

Clearing non-error status bits can interfere with other drivers or
subsystems that may rely on these bits. To fix it, only clear the error
bits (0xf) while preserving other status bits.

Fixes: ec752f5d54d7 ("PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL")
Cc: stable@vger.kernel.org
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 drivers/pci/pci.c             | 2 +-
 include/uapi/linux/pci_regs.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 13dbb405dc31..0b947f90c333 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2246,7 +2246,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
 	u16 sta;
 
 	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
-	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
+	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta & PCI_EXP_DEVSTA_ERR);
 }
 #endif
 
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index 3add74ae2594..f4b68203bc4e 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -534,6 +534,7 @@
 #define  PCI_EXP_DEVSTA_NFED	0x0002	/* Non-Fatal Error Detected */
 #define  PCI_EXP_DEVSTA_FED	0x0004	/* Fatal Error Detected */
 #define  PCI_EXP_DEVSTA_URD	0x0008	/* Unsupported Request Detected */
+#define  PCI_EXP_DEVSTA_ERR	0xf	/* Error bits */
 #define  PCI_EXP_DEVSTA_AUXPD	0x0010	/* AUX Power Detected */
 #define  PCI_EXP_DEVSTA_TRPND	0x0020	/* Transactions Pending */
 #define PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V1	12	/* v1 endpoints without link end here */
-- 
2.39.3
Re: [PATCH v7 5/5] PCI/AER: Only clear error bits in pcie_clear_device_status()
Posted by Lukas Wunner 6 days, 5 hours ago
On Sat, Jan 24, 2026 at 03:45:57PM +0800, Shuai Xue wrote:
> +++ b/drivers/pci/pci.c
> @@ -2246,7 +2246,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>  	u16 sta;
>  
>  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
> -	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
> +	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta & PCI_EXP_DEVSTA_ERR);
>  }

I don't think there's any harm to write error bits which are currently 0,
so I'd just get rid of the pcie_capability_read_word() and directly write
the error bits.

> +++ b/include/uapi/linux/pci_regs.h
> @@ -534,6 +534,7 @@
>  #define  PCI_EXP_DEVSTA_NFED	0x0002	/* Non-Fatal Error Detected */
>  #define  PCI_EXP_DEVSTA_FED	0x0004	/* Fatal Error Detected */
>  #define  PCI_EXP_DEVSTA_URD	0x0008	/* Unsupported Request Detected */
> +#define  PCI_EXP_DEVSTA_ERR	0xf	/* Error bits */

There's only one user of PCI_EXP_DEVSTA_ERR and it feels a little
awkward to define a macro in a uapi header which does not correspond
to an "official" bit definition but is just there for convenience.

So maybe it's better to simply use the macros for the four bits in
pcie_clear_device_status()?  Might also be slightly clearer.

This patch could be submitted individually instead of being part
of this series.

Thanks,

Lukas
Re: [PATCH v7 5/5] PCI/AER: Only clear error bits in pcie_clear_device_status()
Posted by Shuai Xue 3 days, 5 hours ago

On 2/3/26 3:53 PM, Lukas Wunner wrote:
> On Sat, Jan 24, 2026 at 03:45:57PM +0800, Shuai Xue wrote:
>> +++ b/drivers/pci/pci.c
>> @@ -2246,7 +2246,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>>   	u16 sta;
>>   
>>   	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>> -	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>> +	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta & PCI_EXP_DEVSTA_ERR);
>>   }
> 
> I don't think there's any harm to write error bits which are currently 0,
> so I'd just get rid of the pcie_capability_read_word() and directly write
> the error bits.

Good point. I will remove the read step.

> 
>> +++ b/include/uapi/linux/pci_regs.h
>> @@ -534,6 +534,7 @@
>>   #define  PCI_EXP_DEVSTA_NFED	0x0002	/* Non-Fatal Error Detected */
>>   #define  PCI_EXP_DEVSTA_FED	0x0004	/* Fatal Error Detected */
>>   #define  PCI_EXP_DEVSTA_URD	0x0008	/* Unsupported Request Detected */
>> +#define  PCI_EXP_DEVSTA_ERR	0xf	/* Error bits */
> 
> There's only one user of PCI_EXP_DEVSTA_ERR and it feels a little
> awkward to define a macro in a uapi header which does not correspond
> to an "official" bit definition but is just there for convenience.
> 
> So maybe it's better to simply use the macros for the four bits in
> pcie_clear_device_status()?  Might also be slightly clearer.

Agreed, will move PCI_EXP_DEVSTA_ERR to drivers/pci/pci.c.

> 
> This patch could be submitted individually instead of being part
> of this series.


Got it. Will send a individual patch.

> 
> Thanks,
> 
> Lukas

Thanks for valuable comments.
Shuai
Re: [PATCH v7 5/5] PCI/AER: Only clear error bits in pcie_clear_device_status()
Posted by Kuppuswamy Sathyanarayanan 1 week, 4 days ago

On 1/23/2026 11:45 PM, Shuai Xue wrote:
> Currently, pcie_clear_device_status() clears the entire PCIe Device
> Status register (PCI_EXP_DEVSTA), which includes both error status bits
> and other status bits such as AUX Power Detected (AUXPD) and
> Transactions Pending (TRPND).
> 
> Clearing non-error status bits can interfere with other drivers or
> subsystems that may rely on these bits. To fix it, only clear the error
> bits (0xf) while preserving other status bits.
> 
> Fixes: ec752f5d54d7 ("PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL")
> Cc: stable@vger.kernel.org
> Suggested-by: Lukas Wunner <lukas@wunner.de>
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> ---

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>


>  drivers/pci/pci.c             | 2 +-
>  include/uapi/linux/pci_regs.h | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 13dbb405dc31..0b947f90c333 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2246,7 +2246,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>  	u16 sta;
>  
>  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
> -	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
> +	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta & PCI_EXP_DEVSTA_ERR);
>  }
>  #endif
>  
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index 3add74ae2594..f4b68203bc4e 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -534,6 +534,7 @@
>  #define  PCI_EXP_DEVSTA_NFED	0x0002	/* Non-Fatal Error Detected */
>  #define  PCI_EXP_DEVSTA_FED	0x0004	/* Fatal Error Detected */
>  #define  PCI_EXP_DEVSTA_URD	0x0008	/* Unsupported Request Detected */
> +#define  PCI_EXP_DEVSTA_ERR	0xf	/* Error bits */

To align with other macros, use 0x000F?

>  #define  PCI_EXP_DEVSTA_AUXPD	0x0010	/* AUX Power Detected */
>  #define  PCI_EXP_DEVSTA_TRPND	0x0020	/* Transactions Pending */
>  #define PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V1	12	/* v1 endpoints without link end here */

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
Re: [PATCH v7 5/5] PCI/AER: Only clear error bits in pcie_clear_device_status()
Posted by Shuai Xue 1 week, 4 days ago

On 1/29/26 1:01 AM, Kuppuswamy Sathyanarayanan wrote:
> 
> 
> On 1/23/2026 11:45 PM, Shuai Xue wrote:
>> Currently, pcie_clear_device_status() clears the entire PCIe Device
>> Status register (PCI_EXP_DEVSTA), which includes both error status bits
>> and other status bits such as AUX Power Detected (AUXPD) and
>> Transactions Pending (TRPND).
>>
>> Clearing non-error status bits can interfere with other drivers or
>> subsystems that may rely on these bits. To fix it, only clear the error
>> bits (0xf) while preserving other status bits.
>>
>> Fixes: ec752f5d54d7 ("PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL")
>> Cc: stable@vger.kernel.org
>> Suggested-by: Lukas Wunner <lukas@wunner.de>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
>> ---
> 
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> 
>>   drivers/pci/pci.c             | 2 +-
>>   include/uapi/linux/pci_regs.h | 1 +
>>   2 files changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 13dbb405dc31..0b947f90c333 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -2246,7 +2246,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>>   	u16 sta;
>>   
>>   	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
>> -	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
>> +	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta & PCI_EXP_DEVSTA_ERR);
>>   }
>>   #endif
>>   
>> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
>> index 3add74ae2594..f4b68203bc4e 100644
>> --- a/include/uapi/linux/pci_regs.h
>> +++ b/include/uapi/linux/pci_regs.h
>> @@ -534,6 +534,7 @@
>>   #define  PCI_EXP_DEVSTA_NFED	0x0002	/* Non-Fatal Error Detected */
>>   #define  PCI_EXP_DEVSTA_FED	0x0004	/* Fatal Error Detected */
>>   #define  PCI_EXP_DEVSTA_URD	0x0008	/* Unsupported Request Detected */
>> +#define  PCI_EXP_DEVSTA_ERR	0xf	/* Error bits */
> 
> To align with other macros, use 0x000F?
> 
Sure, will fix it.

Thanks.

Best Regards,
Shuai
Re: [PATCH v7 5/5] PCI/AER: Only clear error bits in pcie_clear_device_status()
Posted by Jonathan Cameron 1 week, 6 days ago
On Sat, 24 Jan 2026 15:45:57 +0800
Shuai Xue <xueshuai@linux.alibaba.com> wrote:

> Currently, pcie_clear_device_status() clears the entire PCIe Device
> Status register (PCI_EXP_DEVSTA), which includes both error status bits
> and other status bits such as AUX Power Detected (AUXPD) and
> Transactions Pending (TRPND).
> 
> Clearing non-error status bits can interfere with other drivers or
> subsystems that may rely on these bits. To fix it, only clear the error
> bits (0xf) while preserving other status bits.
> 
> Fixes: ec752f5d54d7 ("PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL")
> Cc: stable@vger.kernel.org
> Suggested-by: Lukas Wunner <lukas@wunner.de>
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Similar to previous. Drag to start of series to make backports easier if
we think this is a fix that affects real cases.  For stuff that's defined
in the PCI 6.2 spec, AUX power and Transactions Pending are RO, but
the interesting one is Emergency Power Reduction Detected which is RW1C
and hence reason this fix is potentially needed  + future features using
remaining bits.

I'd be more explicit in the commit message for that.  Talk about it not
mattering for AUXPD or TRPND because they are RO, vs the ones we
aren't using yet in Linux with the emergency power reduction detected
as an example that clearly shows we need to mask this!

J


> ---
>  drivers/pci/pci.c             | 2 +-
>  include/uapi/linux/pci_regs.h | 1 +
>  2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 13dbb405dc31..0b947f90c333 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2246,7 +2246,7 @@ void pcie_clear_device_status(struct pci_dev *dev)
>  	u16 sta;
>  
>  	pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &sta);
> -	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta);
> +	pcie_capability_write_word(dev, PCI_EXP_DEVSTA, sta & PCI_EXP_DEVSTA_ERR);
>  }
>  #endif
>  
> diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
> index 3add74ae2594..f4b68203bc4e 100644
> --- a/include/uapi/linux/pci_regs.h
> +++ b/include/uapi/linux/pci_regs.h
> @@ -534,6 +534,7 @@
>  #define  PCI_EXP_DEVSTA_NFED	0x0002	/* Non-Fatal Error Detected */
>  #define  PCI_EXP_DEVSTA_FED	0x0004	/* Fatal Error Detected */
>  #define  PCI_EXP_DEVSTA_URD	0x0008	/* Unsupported Request Detected */
> +#define  PCI_EXP_DEVSTA_ERR	0xf	/* Error bits */
>  #define  PCI_EXP_DEVSTA_AUXPD	0x0010	/* AUX Power Detected */
>  #define  PCI_EXP_DEVSTA_TRPND	0x0020	/* Transactions Pending */
>  #define PCI_CAP_EXP_RC_ENDPOINT_SIZEOF_V1	12	/* v1 endpoints without link end here */
Re: [PATCH v7 5/5] PCI/AER: Only clear error bits in pcie_clear_device_status()
Posted by Shuai Xue 1 week, 5 days ago

On 1/27/26 6:45 PM, Jonathan Cameron wrote:
> On Sat, 24 Jan 2026 15:45:57 +0800
> Shuai Xue <xueshuai@linux.alibaba.com> wrote:
> 
>> Currently, pcie_clear_device_status() clears the entire PCIe Device
>> Status register (PCI_EXP_DEVSTA), which includes both error status bits
>> and other status bits such as AUX Power Detected (AUXPD) and
>> Transactions Pending (TRPND).
>>
>> Clearing non-error status bits can interfere with other drivers or
>> subsystems that may rely on these bits. To fix it, only clear the error
>> bits (0xf) while preserving other status bits.
>>
>> Fixes: ec752f5d54d7 ("PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL")
>> Cc: stable@vger.kernel.org
>> Suggested-by: Lukas Wunner <lukas@wunner.de>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> Similar to previous. Drag to start of series to make backports easier if
> we think this is a fix that affects real cases.  

Thank you for the detailed feedback.

I'll move this patch to the start of the series for easier backporting.

> For stuff that's defined
> in the PCI 6.2 spec, AUX power and Transactions Pending are RO, but
> the interesting one is Emergency Power Reduction Detected which is RW1C
> and hence reason this fix is potentially needed  + future features using
> remaining bits.
> 
> I'd be more explicit in the commit message for that.  Talk about it not
> mattering for AUXPD or TRPND because they are RO, vs the ones we
> aren't using yet in Linux with the emergency power reduction detected
> as an example that clearly shows we need to mask this!
> 

You're absolutely right - the commit
message should be more explicit about the different bit types and their
implications.

The revised the commit message is:

PCI/AER: Only clear error bits in PCIe Device Status register

Currently, pcie_clear_device_status() clears the entire PCIe Device
Status register (PCI_EXP_DEVSTA), which includes both error status bits
and other status bits.

According to PCIe Base Spec r6.0 sec 7.5.3.5, the Device Status register
contains different types of status bits:

- Read-only bits (AUXPD, TRPND): Writing to these has no effect, but
   clearing them is unnecessary.

- RW1C (read/write 1 to clear) non-error bits: For example, Emergency
   Power Reduction Detected (bit 6). Unconditionally clearing these bits
   can interfere with other drivers or subsystems that rely on them.

- Reserved bits: May be used for future features and should be preserved.

To prevent unintended side effects, modify pcie_clear_device_status() to
only clear the four error status bits (CED, NFED, FED, URD) while
preserving all other status bits.

Fixes: ec752f5d54d7 ("PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL")
Cc: stable@vger.kernel.org
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>

> J

Thanks.

Best Regards,
Shuai
Re: [PATCH v7 5/5] PCI/AER: Only clear error bits in pcie_clear_device_status()
Posted by Lukas Wunner 6 days, 5 hours ago
On Wed, Jan 28, 2026 at 08:45:36PM +0800, Shuai Xue wrote:
> The revised the commit message is:
> 
> PCI/AER: Only clear error bits in PCIe Device Status register
> 
> Currently, pcie_clear_device_status() clears the entire PCIe Device
> Status register (PCI_EXP_DEVSTA), which includes both error status bits
> and other status bits.
> 
> According to PCIe Base Spec r6.0 sec 7.5.3.5, the Device Status register
> contains different types of status bits:

Always cite the latest spec revision, i.e. PCIe r7.0 sec 7.5.3.5.

> - RW1C (read/write 1 to clear) non-error bits: For example, Emergency
>   Power Reduction Detected (bit 6). Unconditionally clearing these bits
>   can interfere with other drivers or subsystems that rely on them.

It would be good to explicitly call out that this bit was introduced with
PCIe r5.0 in 2019 and that it's currently the only writable bit in the
register besides the error bits.

> - Reserved bits: May be used for future features and should be preserved.

Wrong, they're marked "RsvdZ" (not "RsvdP"), so they're supposed to be
written as zero (not preserved).

Thanks,

Lukas
Re: [PATCH v7 5/5] PCI/AER: Only clear error bits in pcie_clear_device_status()
Posted by Shuai Xue 3 days, 4 hours ago

On 2/3/26 3:44 PM, Lukas Wunner wrote:
> On Wed, Jan 28, 2026 at 08:45:36PM +0800, Shuai Xue wrote:
>> The revised the commit message is:
>>
>> PCI/AER: Only clear error bits in PCIe Device Status register
>>
>> Currently, pcie_clear_device_status() clears the entire PCIe Device
>> Status register (PCI_EXP_DEVSTA), which includes both error status bits
>> and other status bits.
>>
>> According to PCIe Base Spec r6.0 sec 7.5.3.5, the Device Status register
>> contains different types of status bits:
> 
> Always cite the latest spec revision, i.e. PCIe r7.0 sec 7.5.3.5.

Sure, I will update the cite version.

> 
>> - RW1C (read/write 1 to clear) non-error bits: For example, Emergency
>>    Power Reduction Detected (bit 6). Unconditionally clearing these bits
>>    can interfere with other drivers or subsystems that rely on them.
> 
> It would be good to explicitly call out that this bit was introduced with
> PCIe r5.0 in 2019 and that it's currently the only writable bit in the
> register besides the error bits.

Sure, will add it.

> 
>> - Reserved bits: May be used for future features and should be preserved.
> 
> Wrong, they're marked "RsvdZ" (not "RsvdP"), so they're supposed to be
> written as zero (not preserved).

Thanks for correcting me. Will fix it.

> 
> Thanks,
> 
> Lukas


Thanks for valuable comments.
Shuai