Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
does not rate limit, given this is fatal.
This prevents a kernel crash triggered by dereferencing a NULL pointer
in aer_ratelimit(), ensuring safer handling of PCI devices that lack
AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
which already performs this NULL check.
Signed-off-by: Breno Leitao <leitao@debian.org>
Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
---
drivers/pci/pcie/aer.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 70ac661883672..b5f96fde4dcda 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,

 static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
 {
+	if (!dev->aer_info)
+		return 1;
+
 	switch (severity) {
 	case AER_NONFATAL:
 		return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
---
base-commit: 89748acdf226fd1a8775ff6fa2703f8412b286c8
change-id: 20250801-aer_crash_2-b21cc2ef0d00
Best regards,
--
Breno Leitao <leitao@debian.org>
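
For readers who want to see the guard in isolation, here is a minimal userspace sketch of the pattern the patch adds. All `sketch_` names, the toy limiter, and the trimmed structures are assumptions made for illustration; they are not the kernel's definitions, and the real `__ratelimit()` is time based rather than a simple counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum sketch_severity { SKETCH_CORRECTABLE, SKETCH_NONFATAL, SKETCH_FATAL };

struct sketch_ratelimit {
	int burst;	/* messages still allowed */
};

struct sketch_aer_info {
	struct sketch_ratelimit correctable_ratelimit;
	struct sketch_ratelimit nonfatal_ratelimit;
};

struct sketch_dev {
	struct sketch_aer_info *aer_info;	/* may be NULL */
};

/* Toy limiter: returns 1 while budget remains, 0 once it is exhausted. */
static int sketch_ratelimit_ok(struct sketch_ratelimit *rs)
{
	if (rs->burst <= 0)
		return 0;
	rs->burst--;
	return 1;
}

/* Returns 1 if the message should be printed, 0 if it is suppressed. */
static int sketch_aer_ratelimit(struct sketch_dev *dev,
				enum sketch_severity severity)
{
	assert(dev != NULL);

	/* The guard under discussion: without AER info, do not rate limit. */
	if (!dev->aer_info)
		return 1;

	switch (severity) {
	case SKETCH_NONFATAL:
		return sketch_ratelimit_ok(&dev->aer_info->nonfatal_ratelimit);
	case SKETCH_CORRECTABLE:
		return sketch_ratelimit_ok(&dev->aer_info->correctable_ratelimit);
	default:
		return 1;	/* anything else is treated as fatal and printed */
	}
}
```

With a NULL `aer_info` the function returns 1 without touching the limiter fields, which is the behaviour the commit message describes.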
On 8/4/25 2:17 AM, Breno Leitao wrote:
> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> does not rate limit, given this is fatal.
Why not add it to pci_print_aer() ?
>
> This prevents a kernel crash triggered by dereferencing a NULL pointer
> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> which already performs this NULL check.
Is this happening during kernel boot? How often does it occur, and what are the
steps to reproduce? I am curious why pci_print_aer() is called for a PCI device
without aer_info. No aer_info means that the device has already been released
or is in the process of being released (pci_release_dev()). Is this triggered by
using a stale pci_dev pointer?
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
> ---
> drivers/pci/pcie/aer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 70ac661883672..b5f96fde4dcda 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
>
> static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
> {
> + if (!dev->aer_info)
> + return 1;
> +
> switch (severity) {
> case AER_NONFATAL:
> return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
>
> ---
> base-commit: 89748acdf226fd1a8775ff6fa2703f8412b286c8
> change-id: 20250801-aer_crash_2-b21cc2ef0d00
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
Hello Sathyanarayanan,

On Mon, Aug 04, 2025 at 06:50:30AM -0700, Sathyanarayanan Kuppuswamy wrote:
>
> On 8/4/25 2:17 AM, Breno Leitao wrote:
> > Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> > when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> > calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> > does not rate limit, given this is fatal.
>
> Why not add it to pci_print_aer() ?
>
> >
> > This prevents a kernel crash triggered by dereferencing a NULL pointer
> > in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> > AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> > which already performs this NULL check.
>
> Is this happening during the kernel boot ? What is the frequency and steps
> to reproduce? I am curious about why pci_print_aer() is called for a PCI device
> without aer_info. Not aer_info means, that particular device is already released
> or in the process of release (pci_release_dev()). Is this triggered by using a stale
> pci_dev pointer?

I've reported some of these investigations in here:

https://lore.kernel.org/all/buduna6darbvwfg3aogl5kimyxkggu3n4romnmq6sozut6axeu@clnx7sfsy457/
On 8/4/25 8:35 AM, Breno Leitao wrote:
> Hello Sathyanarayanan,
>
> On Mon, Aug 04, 2025 at 06:50:30AM -0700, Sathyanarayanan Kuppuswamy wrote:
>> On 8/4/25 2:17 AM, Breno Leitao wrote:
>>> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
>>> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
>>> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
>>> does not rate limit, given this is fatal.
>> Why not add it to pci_print_aer() ?
>>
>>> This prevents a kernel crash triggered by dereferencing a NULL pointer
>>> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
>>> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
>>> which already performs this NULL check.
>> Is this happening during the kernel boot ? What is the frequency and steps
>> to reproduce? I am curious about why pci_print_aer() is called for a PCI device
>> without aer_info. Not aer_info means, that particular device is already released
>> or in the process of release (pci_release_dev()). Is this triggered by using a stale
>> pci_dev pointer?
> I've reported some of these investigations in here:
>
> https://lore.kernel.org/all/buduna6darbvwfg3aogl5kimyxkggu3n4romnmq6sozut6axeu@clnx7sfsy457/
It has some details. But you did not mention details like your environment, steps to
reproduce and how often you see it. I just want to understand in what scenario
pci_print_aer() is triggered when releasing the device. I am wondering whether we
are missing proper locking somewhere.
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
Hello Sathyanarayanan,

On Mon, Aug 04, 2025 at 09:11:27AM -0700, Sathyanarayanan Kuppuswamy wrote:
>
> On 8/4/25 8:35 AM, Breno Leitao wrote:
> > Hello Sathyanarayanan,
> >
> > On Mon, Aug 04, 2025 at 06:50:30AM -0700, Sathyanarayanan Kuppuswamy wrote:
> > > On 8/4/25 2:17 AM, Breno Leitao wrote:
> > > > Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> > > > when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> > > > calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> > > > does not rate limit, given this is fatal.
> > > Why not add it to pci_print_aer() ?
> > >
> > > > This prevents a kernel crash triggered by dereferencing a NULL pointer
> > > > in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> > > > AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> > > > which already performs this NULL check.
> > > Is this happening during the kernel boot ? What is the frequency and steps
> > > to reproduce? I am curious about why pci_print_aer() is called for a PCI device
> > > without aer_info. Not aer_info means, that particular device is already released
> > > or in the process of release (pci_release_dev()). Is this triggered by using a stale
> > > pci_dev pointer?
> > I've reported some of these investigations in here:
> >
> > https://lore.kernel.org/all/buduna6darbvwfg3aogl5kimyxkggu3n4romnmq6sozut6axeu@clnx7sfsy457/
>
> It has some details. But you did not mention details like your environment, steps to
> reproduce and how often you see it. I just want to understand in what scenario
> pci_print_aer() is triggered, when releasing the device. I am wondering whether we
> are missing proper locking some where.

Oh, unfortunately I don't have these details. I have a bunch of machines
in "prod" running 6.16, and they crash from time to time, and then
I have the crashdumps.

I can get anything that a crashdump provides, but I don't have
a reproducer or the exact steps that are triggering it.

If I can get this information from a crashdump, I am more than happy to
investigate. Can we get this information from a crashdump?

Thanks,
--breno
On 8/4/2025 5:17 PM, Breno Leitao wrote:
> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> does not rate limit, given this is fatal.
>
> This prevents a kernel crash triggered by dereferencing a NULL pointer
> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> which already performs this NULL check.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
> ---
> drivers/pci/pcie/aer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 70ac661883672..b5f96fde4dcda 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
>
> static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
> {
> + if (!dev->aer_info)
> + return 1;
> +
> switch (severity) {
> case AER_NONFATAL:
> return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
>
> ---
Seems you are using arm64 platform default config item
arch/arm64/configs/defconfig:CONFIG_ACPI_APEI_PCIEAER=y
So the issue wouldn't be triggered on X86_64 with default config.
Thanks,
Ethan
> base-commit: 89748acdf226fd1a8775ff6fa2703f8412b286c8
> change-id: 20250801-aer_crash_2-b21cc2ef0d00
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>
>
On Tue, Aug 05, 2025 at 10:25:11PM +0800, Ethan Zhao wrote:
>
> Seems you are using arm64 platform default config item
> arch/arm64/configs/defconfig:CONFIG_ACPI_APEI_PCIEAER=y
> So the issue wouldn't be triggered on X86_64 with default config.

Not really, I am running on x86 hosts. There are the AER part of my
.config.

# cat .config | grep AER
CONFIG_ACPI_APEI_PCIEAER=y
CONFIG_PCIEAER=y
# CONFIG_PCIEAER_INJECT is not set
CONFIG_PCIEAER_CXL=y
On 8/5/2025 11:18 PM, Breno Leitao wrote:
> On Tue, Aug 05, 2025 at 10:25:11PM +0800, Ethan Zhao wrote:
>>
>> Seems you are using arm64 platform default config item
>> arch/arm64/configs/defconfig:CONFIG_ACPI_APEI_PCIEAER=y
>> So the issue wouldn't be triggered on X86_64 with default config.
>
> Not really, I am running on x86 hosts. There are the AER part of my
> .config.
>
> # cat .config | grep AER
> CONFIG_ACPI_APEI_PCIEAER=y
> CONFIG_PCIEAER=y
> # CONFIG_PCIEAER_INJECT is not set
> CONFIG_PCIEAER_CXL=y
Okay, If so, I would suggest to check and validate the
struct aer_capability_regs *aer_regs before/in enqueue function
aer_recover_queue().
e.g.
static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
{
...
memcpy(aer_info, pcie_err->aer_info, sizeof(struct aer_capability_regs));
//validate the aer_info here
aer_recover_queue(pcie_err->device_id.segment
}
or
void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
int severity, struct aer_capability_regs *aer_regs)
{
//check and validate aer_regs first here
}
Would be better than dequeue side aer_recover_work_func() ?
BTW, the cause seems you are using a buggy BIOS.
Thanks,
Ethan
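
The producer-side check suggested above can be pictured with a small userspace sketch. The thread does not say what "validate" should test, so the rule used here (reject a NULL record or one with no status bits set) is purely an assumption, as are the `sketch_` names, the trimmed register layout, and the array-backed queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed stand-in for struct aer_capability_regs. */
struct sketch_aer_regs {
	unsigned int uncor_status;
	unsigned int cor_status;
};

#define SKETCH_QUEUE_LEN 8

struct sketch_queue {
	struct sketch_aer_regs entries[SKETCH_QUEUE_LEN];
	size_t count;
};

/*
 * Assumed validity rule, for illustration only: a record must exist and
 * must report at least one error status bit.
 */
static bool sketch_regs_valid(const struct sketch_aer_regs *regs)
{
	return regs && (regs->uncor_status || regs->cor_status);
}

/* Returns true if the record was queued, false if it was dropped. */
static bool sketch_recover_queue(struct sketch_queue *q,
				 const struct sketch_aer_regs *regs)
{
	assert(q != NULL);

	if (!sketch_regs_valid(regs))
		return false;	/* rejected before it reaches the consumer */
	if (q->count == SKETCH_QUEUE_LEN)
		return false;	/* queue full */

	q->entries[q->count++] = *regs;
	return true;
}
```

The design point is only that a bad record is dropped by the producer, so the dequeue side never sees it. Whether that addresses a NULL `dev->aer_info` is a separate question that the thread leaves open.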
On 8/4/2025 5:17 PM, Breno Leitao wrote:
> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> does not rate limit, given this is fatal.
>
> This prevents a kernel crash triggered by dereferencing a NULL pointer
> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> which already performs this NULL check.
>
The enqueue side has lock to protect the ring, but the dequeue side no
lock held.
The kfifo_get in
static void aer_recover_work_func(struct work_struct *work)
{
...
while (kfifo_get(&aer_recover_ring, &entry)) {
...
}
should be replaced by
kfifo_out_spinlocked()
as
static void aer_recover_work_func(struct work_struct *work)
{
...
while (kfifo_out_spinlocked(&aer_recover_ring,
&entry, 1, &aer_recover_ring_lock)) {
...
}
Thanks,
Ethan
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
> ---
> drivers/pci/pcie/aer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 70ac661883672..b5f96fde4dcda 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
>
> static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
> {
> + if (!dev->aer_info)
> + return 1;
> +
> switch (severity) {
> case AER_NONFATAL:
> return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
>
> ---
> base-commit: 89748acdf226fd1a8775ff6fa2703f8412b286c8
> change-id: 20250801-aer_crash_2-b21cc2ef0d00
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>
>
Hello Ethan,
On Wed, Aug 06, 2025 at 09:55:05AM +0800, Ethan Zhao wrote:
> On 8/4/2025 5:17 PM, Breno Leitao wrote:
> > Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> > when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> > calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> > does not rate limit, given this is fatal.
> >
> > This prevents a kernel crash triggered by dereferencing a NULL pointer
> > in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> > AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> > which already performs this NULL check.
> >
> The enqueue side has lock to protect the ring, but the dequeue side no lock
> held.
>
> The kfifo_get in
> static void aer_recover_work_func(struct work_struct *work)
> {
> ...
> while (kfifo_get(&aer_recover_ring, &entry)) {
> ...
> }
> should be replaced by
> kfifo_out_spinlocked()
The design seems not to need the lock on the reader side. There is just
one reader, which is the aer_recover_work. aer_recover_work runs
aer_recover_work_func(). So, if we just have one reader, we do not need
to protect the kfifo by spinlock, right?
In fact, the code documents it in the aer_recover_ring_lock.
/*
* Mutual exclusion for writers of aer_recover_ring, reader side don't
* need lock, because there is only one reader and lock is not needed
* between reader and writer.
*/
static DEFINE_SPINLOCK(aer_recover_ring_lock);
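
The locking scheme that comment describes (many writers serialized by a lock, one reader with no lock) can be modelled in userspace roughly as follows. This is a sketch using C11 atomics, not the kernel's kfifo implementation; every `sketch_` name and the ring size are assumptions. The reader and the writers never share a lock; they are ordered only by the acquire/release pairs on the two indices.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 4u
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "size must be a power of two");

struct sketch_entry {
	int bus;
	int devfn;
};

struct sketch_ring {
	struct sketch_entry slots[RING_SIZE];
	atomic_uint in;			/* advanced only by writers */
	atomic_uint out;		/* advanced only by the single reader */
	atomic_flag writer_lock;	/* serializes writers against each other */
};

static void sketch_ring_init(struct sketch_ring *r)
{
	atomic_init(&r->in, 0);
	atomic_init(&r->out, 0);
	atomic_flag_clear(&r->writer_lock);
}

/* Writer side: several producers may call this. Returns false if full. */
static bool sketch_ring_put(struct sketch_ring *r, struct sketch_entry e)
{
	bool ok = false;
	unsigned int in, out;

	while (atomic_flag_test_and_set_explicit(&r->writer_lock,
						 memory_order_acquire))
		;	/* spin until the other writer is done */

	in = atomic_load_explicit(&r->in, memory_order_relaxed);
	out = atomic_load_explicit(&r->out, memory_order_acquire);
	if (in - out < RING_SIZE) {
		r->slots[in & (RING_SIZE - 1)] = e;
		/* Publish the slot before the reader can see the new index. */
		atomic_store_explicit(&r->in, in + 1, memory_order_release);
		ok = true;
	}

	atomic_flag_clear_explicit(&r->writer_lock, memory_order_release);
	return ok;
}

/* Reader side: exactly one consumer, so no lock. Returns false if empty. */
static bool sketch_ring_get(struct sketch_ring *r, struct sketch_entry *e)
{
	unsigned int out = atomic_load_explicit(&r->out, memory_order_relaxed);
	unsigned int in = atomic_load_explicit(&r->in, memory_order_acquire);

	if (in == out)
		return false;

	*e = r->slots[out & (RING_SIZE - 1)];
	/* Hand the slot back only after the copy is complete. */
	atomic_store_explicit(&r->out, out + 1, memory_order_release);
	return true;
}
```

The sketch is only safe as long as `sketch_ring_get()` has a single caller at a time. A second concurrent reader would need its own serialization, which is the case `kfifo_out_spinlocked()` covers.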
On 8/6/2025 4:45 PM, Breno Leitao wrote:
> Hello Ethan,
>
> On Wed, Aug 06, 2025 at 09:55:05AM +0800, Ethan Zhao wrote:
>> On 8/4/2025 5:17 PM, Breno Leitao wrote:
>>> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
>>> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
>>> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
>>> does not rate limit, given this is fatal.
>>>
>>> This prevents a kernel crash triggered by dereferencing a NULL pointer
>>> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
>>> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
>>> which already performs this NULL check.
>>>
>> The enqueue side has lock to protect the ring, but the dequeue side no lock
>> held.
>>
>> The kfifo_get in
>> static void aer_recover_work_func(struct work_struct *work)
>> {
>> ...
>> while (kfifo_get(&aer_recover_ring, &entry)) {
>> ...
>> }
>> should be replaced by
>> kfifo_out_spinlocked()
>
> The design seems not to need the lock on the reader side. There is just
> one reader, which is the aer_recover_work. aer_recover_work runs
> aer_recover_work_func(). So, if we just have one reader, we do not need
> to protect the kfifo by spinlock, right?
Not exactly,
If the writer and reader are serialized, no lock is needed. However,
here the writer kfifo_in_spinlocked() and the system work queue task
aer_recover_work() cannot guarantee serialized execution.
@Bjorn, help to check it out.
Thanks,
Ethan
>
> In fact, the code documents it in the aer_recover_ring_lock.
>
> /*
> * Mutual exclusion for writers of aer_recover_ring, reader side don't
> * need lock, because there is only one reader and lock is not needed
> * between reader and writer.
> */
> static DEFINE_SPINLOCK(aer_recover_ring_lock);