Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
does not rate limit, given this is fatal.
This prevents a kernel crash triggered by dereferencing a NULL pointer
in aer_ratelimit(), ensuring safer handling of PCI devices that lack
AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
which already performs this NULL check.
Signed-off-by: Breno Leitao <leitao@debian.org>
Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
---
drivers/pci/pcie/aer.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 70ac661883672..b5f96fde4dcda 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
 
 static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
 {
+	if (!dev->aer_info)
+		return 1;
+
 	switch (severity) {
 	case AER_NONFATAL:
 		return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
---
base-commit: 89748acdf226fd1a8775ff6fa2703f8412b286c8
change-id: 20250801-aer_crash_2-b21cc2ef0d00
Best regards,
--
Breno Leitao <leitao@debian.org>
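
For readers following the thread, the guard in the hunk above can be modeled outside the kernel. This is a minimal user-space sketch, not the kernel code: `struct device`, `struct info`, `struct budget`, `budget_take()` and `may_log()` are hypothetical stand-ins for `struct pci_dev`, the AER info structure, `struct ratelimit_state`, `__ratelimit()` and `aer_ratelimit()`. The correctable and default branches are assumptions about the part of the function the hunk does not show.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical severities; they only mirror the idea of AER severities. */
enum sev { SEV_CORRECTABLE, SEV_NONFATAL, SEV_FATAL };

/* Toy replacement for a rate-limit state: a simple countdown. */
struct budget {
	int remaining;
};

/* Hypothetical per-device state, analogous in spirit to aer_info. */
struct info {
	struct budget correctable;
	struct budget nonfatal;
};

/* A NULL info pointer models a device without AER info. */
struct device {
	struct info *info;
};

/* Returns 1 if the caller may log, 0 if the message is suppressed. */
static int budget_take(struct budget *b)
{
	if (b->remaining <= 0)
		return 0;
	b->remaining--;
	return 1;
}

/* Same shape as the patched function: check the pointer before the switch. */
int may_log(struct device *dev, enum sev severity)
{
	assert(dev != NULL);

	/* No per-device state: do not rate limit, and do not dereference. */
	if (!dev->info)
		return 1;

	switch (severity) {
	case SEV_NONFATAL:
		return budget_take(&dev->info->nonfatal);
	case SEV_CORRECTABLE:
		return budget_take(&dev->info->correctable);
	default:
		return 1;	/* assumed: other severities are never limited */
	}
}
```

With the guard in place, a device whose `info` is NULL always gets 1 back, so the caller logs the error instead of crashing. The guard does not explain why the pointer is NULL in the first place.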

On 8/4/25 2:17 AM, Breno Leitao wrote:
> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> does not rate limit, given this is fatal.

Why not add it to pci_print_aer()?

>
> This prevents a kernel crash triggered by dereferencing a NULL pointer
> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> which already performs this NULL check.

Is this happening during the kernel boot? What is the frequency and what are
the steps to reproduce? I am curious about why pci_print_aer() is called for a
PCI device without aer_info. No aer_info means that the particular device is
already released or in the process of release (pci_release_dev()). Is this
triggered by using a stale pci_dev pointer?

>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
> ---
>  drivers/pci/pcie/aer.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 70ac661883672..b5f96fde4dcda 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
>
>  static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
>  {
> +	if (!dev->aer_info)
> +		return 1;
> +
>  	switch (severity) {
>  	case AER_NONFATAL:
>  		return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
>
> ---
> base-commit: 89748acdf226fd1a8775ff6fa2703f8412b286c8
> change-id: 20250801-aer_crash_2-b21cc2ef0d00
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

Hello Sathyanarayanan,

On Mon, Aug 04, 2025 at 06:50:30AM -0700, Sathyanarayanan Kuppuswamy wrote:
>
> On 8/4/25 2:17 AM, Breno Leitao wrote:
> > Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> > when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> > calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> > does not rate limit, given this is fatal.
>
> Why not add it to pci_print_aer()?
>
> >
> > This prevents a kernel crash triggered by dereferencing a NULL pointer
> > in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> > AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> > which already performs this NULL check.
>
> Is this happening during the kernel boot? What is the frequency and what are
> the steps to reproduce? I am curious about why pci_print_aer() is called for a
> PCI device without aer_info. No aer_info means that the particular device is
> already released or in the process of release (pci_release_dev()). Is this
> triggered by using a stale pci_dev pointer?

I've reported some of these investigations here:

https://lore.kernel.org/all/buduna6darbvwfg3aogl5kimyxkggu3n4romnmq6sozut6axeu@clnx7sfsy457/

On 8/4/25 8:35 AM, Breno Leitao wrote:
> Hello Sathyanarayanan,
>
> On Mon, Aug 04, 2025 at 06:50:30AM -0700, Sathyanarayanan Kuppuswamy wrote:
>> On 8/4/25 2:17 AM, Breno Leitao wrote:
>>> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
>>> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
>>> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
>>> does not rate limit, given this is fatal.
>> Why not add it to pci_print_aer()?
>>
>>> This prevents a kernel crash triggered by dereferencing a NULL pointer
>>> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
>>> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
>>> which already performs this NULL check.
>> Is this happening during the kernel boot? What is the frequency and what are
>> the steps to reproduce? I am curious about why pci_print_aer() is called for a
>> PCI device without aer_info. No aer_info means that the particular device is
>> already released or in the process of release (pci_release_dev()). Is this
>> triggered by using a stale pci_dev pointer?
> I've reported some of these investigations in here:
>
> https://lore.kernel.org/all/buduna6darbvwfg3aogl5kimyxkggu3n4romnmq6sozut6axeu@clnx7sfsy457/

It has some details. But you did not mention details like your environment,
the steps to reproduce, and how often you see it. I just want to understand in
what scenario pci_print_aer() is triggered while the device is being released.
I am wondering whether we are missing proper locking somewhere.

--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

Hello Sathyanarayanan,

On Mon, Aug 04, 2025 at 09:11:27AM -0700, Sathyanarayanan Kuppuswamy wrote:
>
> On 8/4/25 8:35 AM, Breno Leitao wrote:
> > Hello Sathyanarayanan,
> >
> > On Mon, Aug 04, 2025 at 06:50:30AM -0700, Sathyanarayanan Kuppuswamy wrote:
> > > On 8/4/25 2:17 AM, Breno Leitao wrote:
> > > > Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> > > > when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> > > > calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> > > > does not rate limit, given this is fatal.
> > > Why not add it to pci_print_aer()?
> > >
> > > > This prevents a kernel crash triggered by dereferencing a NULL pointer
> > > > in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> > > > AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> > > > which already performs this NULL check.
> > > Is this happening during the kernel boot? What is the frequency and what are
> > > the steps to reproduce? I am curious about why pci_print_aer() is called for a
> > > PCI device without aer_info. No aer_info means that the particular device is
> > > already released or in the process of release (pci_release_dev()). Is this
> > > triggered by using a stale pci_dev pointer?
> > I've reported some of these investigations in here:
> >
> > https://lore.kernel.org/all/buduna6darbvwfg3aogl5kimyxkggu3n4romnmq6sozut6axeu@clnx7sfsy457/
>
> It has some details. But you did not mention details like your environment,
> the steps to reproduce, and how often you see it. I just want to understand in
> what scenario pci_print_aer() is triggered while the device is being released.
> I am wondering whether we are missing proper locking somewhere.

Oh, unfortunately I don't have these details. I have a bunch of machines in
"prod" running 6.16, and they crash from time to time, and then I have the
crashdumps.

I can get anything that a crashdump provides, but I don't have a reproducer
or the exact steps that are triggering it.

If I can get this information from a crashdump, I am more than happy to
investigate. Can we get this information from a crashdump?

Thanks,
--breno

On 8/4/2025 5:17 PM, Breno Leitao wrote:
> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> does not rate limit, given this is fatal.
>
> This prevents a kernel crash triggered by dereferencing a NULL pointer
> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> which already performs this NULL check.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
> ---
>  drivers/pci/pcie/aer.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 70ac661883672..b5f96fde4dcda 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
>
>  static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
>  {
> +	if (!dev->aer_info)
> +		return 1;
> +
>  	switch (severity) {
>  	case AER_NONFATAL:
>  		return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
>
> ---

Seems you are using arm64 platform default config item

arch/arm64/configs/defconfig:CONFIG_ACPI_APEI_PCIEAER=y

So the issue wouldn't be triggered on X86_64 with default config.

Thanks,
Ethan

> base-commit: 89748acdf226fd1a8775ff6fa2703f8412b286c8
> change-id: 20250801-aer_crash_2-b21cc2ef0d00
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>
>

On Tue, Aug 05, 2025 at 10:25:11PM +0800, Ethan Zhao wrote:
>
> Seems you are using arm64 platform default config item
> arch/arm64/configs/defconfig:CONFIG_ACPI_APEI_PCIEAER=y
> So the issue wouldn't be triggered on X86_64 with default config.

Not really, I am running on x86 hosts. Here is the AER part of my
.config.

	# cat .config | grep AER
	CONFIG_ACPI_APEI_PCIEAER=y
	CONFIG_PCIEAER=y
	# CONFIG_PCIEAER_INJECT is not set
	CONFIG_PCIEAER_CXL=y

On 8/5/2025 11:18 PM, Breno Leitao wrote:
> On Tue, Aug 05, 2025 at 10:25:11PM +0800, Ethan Zhao wrote:
>>
>> Seems you are using arm64 platform default config item
>> arch/arm64/configs/defconfig:CONFIG_ACPI_APEI_PCIEAER=y
>> So the issue wouldn't be triggered on X86_64 with default config.
>
> Not really, I am running on x86 hosts. Here is the AER part of my
> .config.
>
> 	# cat .config | grep AER
> 	CONFIG_ACPI_APEI_PCIEAER=y
> 	CONFIG_PCIEAER=y
> 	# CONFIG_PCIEAER_INJECT is not set
> 	CONFIG_PCIEAER_CXL=y

Okay. If so, I would suggest checking and validating the
struct aer_capability_regs *aer_regs before or in the enqueue function
aer_recover_queue(), e.g.

static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
{
...
	memcpy(aer_info, pcie_err->aer_info, sizeof(struct aer_capability_regs));
	//validate the aer_info here
	aer_recover_queue(pcie_err->device_id.segment
}

or

void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
		       int severity, struct aer_capability_regs *aer_regs)
{
	//check and validate aer_regs first here
}

Would that be better than checking on the dequeue side, in
aer_recover_work_func()?

BTW, the cause seems to be a buggy BIOS.

Thanks,
Ethan

On 8/4/2025 5:17 PM, Breno Leitao wrote:
> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> does not rate limit, given this is fatal.
>
> This prevents a kernel crash triggered by dereferencing a NULL pointer
> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> which already performs this NULL check.
>

The enqueue side has a lock to protect the ring, but the dequeue side
holds no lock.

The kfifo_get in

static void aer_recover_work_func(struct work_struct *work)
{
...
	while (kfifo_get(&aer_recover_ring, &entry)) {
...
}

should be replaced by kfifo_out_spinlocked(), as

static void aer_recover_work_func(struct work_struct *work)
{
...
	while (kfifo_out_spinlocked(&aer_recover_ring, &entry, 1,
				    &aer_recover_ring_lock)) {
...
}

Thanks,
Ethan

> Signed-off-by: Breno Leitao <leitao@debian.org>
> Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
> ---
>  drivers/pci/pcie/aer.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 70ac661883672..b5f96fde4dcda 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
>
>  static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
>  {
> +	if (!dev->aer_info)
> +		return 1;
> +
>  	switch (severity) {
>  	case AER_NONFATAL:
>  		return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
>
> ---
> base-commit: 89748acdf226fd1a8775ff6fa2703f8412b286c8
> change-id: 20250801-aer_crash_2-b21cc2ef0d00
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>
>

Hello Ethan,

On Wed, Aug 06, 2025 at 09:55:05AM +0800, Ethan Zhao wrote:
> On 8/4/2025 5:17 PM, Breno Leitao wrote:
> > Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
> > when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
> > calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
> > does not rate limit, given this is fatal.
> >
> > This prevents a kernel crash triggered by dereferencing a NULL pointer
> > in aer_ratelimit(), ensuring safer handling of PCI devices that lack
> > AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
> > which already performs this NULL check.
> >
> The enqueue side has a lock to protect the ring, but the dequeue side
> holds no lock.
>
> The kfifo_get in
> static void aer_recover_work_func(struct work_struct *work)
> {
> ...
> 	while (kfifo_get(&aer_recover_ring, &entry)) {
> ...
> }
> should be replaced by
> kfifo_out_spinlocked()

The design seems not to need the lock on the reader side. There is just
one reader, which is the aer_recover_work. aer_recover_work runs
aer_recover_work_func(). So, if we just have one reader, we do not need
to protect the kfifo by spinlock, right?

In fact, the code documents it in the aer_recover_ring_lock.

	/*
	 * Mutual exclusion for writers of aer_recover_ring, reader side don't
	 * need lock, because there is only one reader and lock is not needed
	 * between reader and writer.
	 */
	static DEFINE_SPINLOCK(aer_recover_ring_lock);
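
The single-reader argument above rests on a general property of ring buffers: when exactly one context advances the write index and exactly one context advances the read index, the two sides do not need a lock between them, only ordered index updates. The following is a minimal user-space sketch of that idea using C11 atomics. It is not the kfifo implementation, and `struct spsc_ring`, `ring_init()`, `ring_put()`, `ring_get()` and `RING_SIZE` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u	/* hypothetical capacity; must be a power of two */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0,
	      "RING_SIZE must be a power of two");

struct spsc_ring {
	int slots[RING_SIZE];
	atomic_uint in;		/* advanced only by the single producer */
	atomic_uint out;	/* advanced only by the single consumer */
};

void ring_init(struct spsc_ring *r)
{
	atomic_init(&r->in, 0);
	atomic_init(&r->out, 0);
}

/* Producer side. Several producers would have to serialize among themselves. */
bool ring_put(struct spsc_ring *r, int v)
{
	unsigned int in = atomic_load_explicit(&r->in, memory_order_relaxed);
	unsigned int out = atomic_load_explicit(&r->out, memory_order_acquire);

	if (in - out == RING_SIZE)
		return false;	/* full */

	r->slots[in & (RING_SIZE - 1)] = v;
	/* Publish the slot contents before the new index becomes visible. */
	atomic_store_explicit(&r->in, in + 1, memory_order_release);
	return true;
}

/* Consumer side. Only one consumer may call this. */
bool ring_get(struct spsc_ring *r, int *v)
{
	unsigned int out = atomic_load_explicit(&r->out, memory_order_relaxed);
	unsigned int in = atomic_load_explicit(&r->in, memory_order_acquire);

	if (in == out)
		return false;	/* empty */

	*v = r->slots[out & (RING_SIZE - 1)];
	/* Release the slot only after its contents have been read. */
	atomic_store_explicit(&r->out, out + 1, memory_order_release);
	return true;
}
```

In this model a lock would only be needed to serialize multiple producers against each other, which is the role the quoted comment assigns to aer_recover_ring_lock. Whether the real reader and writers meet the one-reader assumption is the question the thread leaves open.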

On 8/6/2025 4:45 PM, Breno Leitao wrote:
> Hello Ethan,
>
> On Wed, Aug 06, 2025 at 09:55:05AM +0800, Ethan Zhao wrote:
>> On 8/4/2025 5:17 PM, Breno Leitao wrote:
>>> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
>>> when dev->aer_info is NULL. Add a NULL check before proceeding to avoid
>>> calling aer_ratelimit() with a NULL aer_info pointer, returning 1, which
>>> does not rate limit, given this is fatal.
>>>
>>> This prevents a kernel crash triggered by dereferencing a NULL pointer
>>> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
>>> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
>>> which already performs this NULL check.
>>>
>> The enqueue side has a lock to protect the ring, but the dequeue side
>> holds no lock.
>>
>> The kfifo_get in
>> static void aer_recover_work_func(struct work_struct *work)
>> {
>> ...
>> 	while (kfifo_get(&aer_recover_ring, &entry)) {
>> ...
>> }
>> should be replaced by
>> kfifo_out_spinlocked()
>
> The design seems not to need the lock on the reader side. There is just
> one reader, which is the aer_recover_work. aer_recover_work runs
> aer_recover_work_func(). So, if we just have one reader, we do not need
> to protect the kfifo by spinlock, right?

Not exactly. If the writer and reader are serialized, no lock is needed.
However, here the writer kfifo_in_spinlocked() and the system work queue
task aer_recover_work() cannot guarantee serialized execution.

@Bjorn, please help check this.

Thanks,
Ethan

>
> In fact, the code documents it in the aer_recover_ring_lock.
>
> 	/*
> 	 * Mutual exclusion for writers of aer_recover_ring, reader side don't
> 	 * need lock, because there is only one reader and lock is not needed
> 	 * between reader and writer.
> 	 */
> 	static DEFINE_SPINLOCK(aer_recover_ring_lock);