The function ghes_estatus_pool_region_free() is exported and is called
by the PCIe AER recovery path, which unconditionally invokes it to free
aer_capability_regs memory.
Although current AER usage assumes memory comes from the GHES pool,
robustness requires guarding against pool unavailability. Add a NULL check
before calling gen_pool_free() to prevent crashes when the pool is not
initialized. This also makes the API safer for potential future use by
non-GHES callers.
Fixes: e2abc47a5a1a ("ACPI: APEI: Fix AER info corruption when error status data has multiple sections")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
---
drivers/acpi/apei/ghes.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 0dc767392a6c..e81c007464a9 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -236,7 +236,8 @@ int ghes_estatus_pool_init(unsigned int num_ghes)
  */
 void ghes_estatus_pool_region_free(unsigned long addr, u32 size)
 {
-	gen_pool_free(ghes_estatus_pool, addr, size);
+	if (ghes_estatus_pool)
+		gen_pool_free(ghes_estatus_pool, addr, size);
 }
 EXPORT_SYMBOL_GPL(ghes_estatus_pool_region_free);
--
2.48.1
On Tue, Feb 03, 2026 at 10:12:32AM +0800, Jiawen Wu wrote:
> The function ghes_estatus_pool_region_free() is exported and is called
> by the PCIe AER recovery path, which unconditionally invokes it to free
> aer_capability_regs memory.
>
> Although current AER usage assumes memory comes from the GHES pool,
> robustness requires guarding against pool unavailability. Add a NULL check
> before calling gen_pool_free() to prevent crashes when the pool is not
> initialized. This also makes the API safer for potential future use by
> non-GHES callers.
I'm not sure what you mean by "pool unavailability." I think getting
here with ghes_estatus_pool==NULL means we have a logic error
somewhere, and I don't think we should silently hide that error.
I'm generally in favor of *not* checking so we find out if the caller
forgot to keep track of the pointer correctly.
> Fixes: e2abc47a5a1a ("ACPI: APEI: Fix AER info corruption when error status data has multiple sections")
> Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
> ---
> drivers/acpi/apei/ghes.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 0dc767392a6c..e81c007464a9 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -236,7 +236,8 @@ int ghes_estatus_pool_init(unsigned int num_ghes)
>   */
>  void ghes_estatus_pool_region_free(unsigned long addr, u32 size)
>  {
> -	gen_pool_free(ghes_estatus_pool, addr, size);
> +	if (ghes_estatus_pool)
> +		gen_pool_free(ghes_estatus_pool, addr, size);
>  }
>  EXPORT_SYMBOL_GPL(ghes_estatus_pool_region_free);
>
> --
> 2.48.1
>
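The objection in the reply above is that a bare NULL check hides a caller bug. As a rough userspace illustration of that trade-off (not kernel code, and not something proposed in this thread), the sketch below contrasts a silent guard with one that reports the problem once, loosely in the spirit of the kernel's WARN_ON_ONCE(). Every name here (demo_pool, demo_free_silent, demo_free_loud, demo_warn_count) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the GHES pool; NULL means "never initialized". */
struct demo_pool {
	size_t bytes_in_use;
};

static struct demo_pool *demo_pool;	/* plays the role of ghes_estatus_pool */
static unsigned int demo_warn_count;	/* how often the loud guard has fired */

/*
 * Variant A: the silent guard, as in the patch. A caller that frees into
 * a missing pool gets no indication that anything went wrong.
 */
static bool demo_free_silent(size_t size)
{
	if (!demo_pool)
		return false;
	demo_pool->bytes_in_use -= size;
	return true;
}

/*
 * Variant B: still avoids the crash, but counts the logic error and
 * prints a message the first time it is seen.
 */
static bool demo_free_loud(size_t size)
{
	if (!demo_pool) {
		if (demo_warn_count++ == 0)
			fprintf(stderr, "demo: free of %zu bytes with no pool\n", size);
		return false;
	}
	demo_pool->bytes_in_use -= size;
	return true;
}
```

With no pool set up, both variants return false, but only the second leaves any trace that a caller handed back memory the pool never owned. Dropping the check entirely, as the reply prefers, makes the same mistake visible as a crash instead.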
On Wed, Feb 4, 2026 6:55 AM, Bjorn Helgaas wrote:
> I'm not sure what you mean by "pool unavailability." I think getting
> here with ghes_estatus_pool==NULL means we have a logic error
> somewhere, and I don't think we should silently hide that error.
>
> I'm generally in favor of *not* checking so we find out if the caller
> forgot to keep track of the pointer correctly.

"pool unavailability" means that when I attempt to call
aer_recover_queue() in an ethernet driver, which does not create
ghes_estatus_pool, it leads to a NULL pointer dereference.
On Wed, Feb 04, 2026 at 10:03:34AM +0800, Jiawen Wu wrote:
> "pool unavailability" means that when I attempt to call
> aer_recover_queue() in an ethernet driver, which does not create
> ghes_estatus_pool, it leads to a NULL pointer dereference.

I guess that means you contemplate having an ethernet driver allocate
and manage its own struct aer_capability_regs to pass to
aer_recover_queue(). But I don't understand why such a driver would
be involved in this part of the AER processing.

Normally a device like a NIC that detects an error logs something in
its local AER Capability, then sends an ERR_* message upstream. The
Root Port that receives that ERR_* message generates an interrupt. In
the native AER case, the Linux AER driver handles that interrupt,
reads the error logs from the AER Capability of the device that sent
the ERR_* message, and logs it. In the firmware-first case used by
GHES, platform firmware handles the interrupt, reads the error logs,
packages them up, and sends them to the Linux AER driver via GHES and
aer_recover_queue().

What's the PCIe hardware flow that would lead to an ethernet driver
calling aer_recover_queue()? An Endpoint driver wouldn't receive the
AER interrupt generated by the Root Port.

I suppose a NIC could generate its own device-specific interrupt when
it logs an error in its local AER Capability, but if it conforms to
the PCIe spec, it should also send an ERR_* message, which would feed
into the existing AER path. I don't think we'd want the existing AER
path racing with a parallel AER path in the Endpoint driver.

Bjorn
On Thu, Feb 5, 2026 5:46 AM, Bjorn Helgaas wrote:
> I guess that means you contemplate having an ethernet driver allocate
> and manage its own struct aer_capability_regs to pass to
> aer_recover_queue(). But I don't understand why such a driver would
> be involved in this part of the AER processing.
>
> Normally a device like a NIC that detects an error logs something in
> its local AER Capability, then sends an ERR_* message upstream. The
> Root Port that receives that ERR_* message generates an interrupt. In
> the native AER case, the Linux AER driver handles that interrupt,
> reads the error logs from the AER Capability of the device that sent
> the ERR_* message, and logs it. In the firmware-first case used by
> GHES, platform firmware handles the interrupt, reads the error logs,
> packages them up, and sends them to the Linux AER driver via GHES and
> aer_recover_queue().
>
> What's the PCIe hardware flow that would lead to an ethernet driver
> calling aer_recover_queue()? An Endpoint driver wouldn't receive the
> AER interrupt generated by the Root Port.
>
> I suppose a NIC could generate its own device-specific interrupt when
> it logs an error in its local AER Capability, but if it conforms to
> the PCIe spec, it should also send an ERR_* message, which would feed
> into the existing AER path. I don't think we'd want the existing AER
> path racing with a parallel AER path in the Endpoint driver.

Thank you for your detailed explanation.

I fully agree that aer_recover_queue() is intended for firmware-first
error reporting via GHES, and an endpoint driver should not normally
invoke it directly.

However, in practice, we've encountered platforms where AER interrupts
are not delivered reliably, for example due to BIOS misconfiguration,
disabled AER in firmware, or hardware that fails to generate ERR_*
messages correctly. On such systems, when a PCIe error occurs, the
standard AER path is never triggered, and the device remains in a
stuck state.

To verify this, I simulated a PCIe error by injecting it into the NIC
register. On many platforms, the Linux AER driver did not respond at
all.

As a device driver, we'd like to ensure best-effort recovery
regardless of platform AER support. Since pcie_do_recovery()
encapsulates the complete and correct recovery sequence, it's exactly
what we need, but it's not exported.

Given this, could you advise on the proper way for an endpoint driver
to initiate full PCIe error recovery when AER is unavailable? Is there
a recommended pattern that safely achieves the same effect as
pcie_do_recovery() without duplicating its logic?

Thank you again for your guidance.
On Thu, Feb 05, 2026 at 11:11:02AM +0800, Jiawen Wu wrote:
> Thank you for your detailed explanation.
>
> I fully agree that aer_recover_queue() is intended for
> firmware-first error reporting via GHES, and an endpoint driver
> should not normally invoke it directly.
>
> However, in practice, we've encountered platforms where AER
> interrupts are not delivered reliably, for example due to BIOS
> misconfiguration, disabled AER in firmware, or hardware that fails
> to generate ERR_* messages correctly. On such systems, when a PCIe
> error occurs, the standard AER path is never triggered, and the
> device remains in a stuck state.
>
> To verify this, I simulated a PCIe error by injecting it into the
> NIC register. On many platforms, the Linux AER driver did not
> respond at all.
>
> As a device driver, we'd like to ensure best-effort recovery
> regardless of platform AER support. Since pcie_do_recovery()
> encapsulates the complete and correct recovery sequence, it's
> exactly what we need, but it's not exported.
>
> Given this, could you advise on the proper way for an endpoint
> driver to initiate full PCIe error recovery when AER is unavailable?
> Is there a recommended pattern that safely achieves the same effect
> as pcie_do_recovery() without duplicating its logic?

It makes sense to try to work around broken hardware, and I think we
should try to identify exactly what is broken and address it directly.

If the NIC itself is broken, the problem should happen on every
platform, and a quirk or the driver might be the best place to deal
with it.

If the platform is broken, we should see problems with many devices,
and it would be better to deal with it more centrally instead of a
single endpoint driver.

I know about several platforms that don't support the architected AER
interrupt, e.g.,
https://lore.kernel.org/all/20250702223841.GA1905230@bhelgaas/t/#u
There is some work in progress to address this particular problem.

Do you have any specifics about the devices and platforms where you're
seeing issues?

Bjorn
On Thu, Feb 5, 2026 11:39 PM, Bjorn Helgaas wrote:
> It makes sense to try to work around broken hardware, and I think we
> should try to identify exactly what is broken and address it directly.
>
> If the NIC itself is broken, the problem should happen on every
> platform, and a quirk or the driver might be the best place to deal
> with it.
>
> If the platform is broken, we should see problems with many devices,
> and it would be better to deal with it more centrally instead of a
> single endpoint driver.

Thank you for the thoughtful response.

We are the NIC vendor, and our hardware (like many high-speed PCIe
devices) can occasionally encounter PCIe errors due to real-world
factors such as signal integrity issues or marginal link training.
These are not necessarily design flaws in the NIC itself, but rather
transient conditions that can occur in field deployments.

While we agree that platforms should properly deliver AER interrupts,
in practice we see many customer environments (especially embedded or
custom server platforms) where:

* AER is disabled in BIOS
* The root port does not generate the architected interrupt
* Firmware simply fails to report the error via GHES

As a driver vendor, we have no ability to fix or even influence these
platform-level issues. Yet from the user's perspective, the result is
the same: the NIC becomes unusable (config space reads return
0xFFFFFFFF), and the network interface hangs indefinitely.

Our goal is not to bypass the AER architecture, but to provide a
last-resort recovery mechanism when the standard path is broken
through no fault of our own. Since pcie_do_recovery() already
implements the correct sequence, it would be ideal if endpoint drivers
could safely invoke a similar flow when they detect a local failure
(e.g., via MMIO timeout or Tx stall).

I understand the concern about layering, but without any way to
trigger recovery, the device remains dead. I think the only thing the
driver can do is copy the code of pcie_do_recovery() to restore the
device. Would it be reasonable to consider exporting a recovery helper
for use by endpoint drivers?

> I know about several platforms that don't support the architected AER
> interrupt, e.g.,
> https://lore.kernel.org/all/20250702223841.GA1905230@bhelgaas/t/#u
> There is some work in progress to address this particular problem.
>
> Do you have any specifics about the devices and platforms where you're
> seeing issues?

The test platform I'm currently using:

* CPU: AMD Ryzen 9 7950X 16-Core Processor
* BIOS version: E7E16AMS.190
* OS: Ubuntu 25.04
* Kernel: Linux 6.19.0-rc7+

The device is our NIC; its driver is in drivers/net/ethernet/wangxun/.

If you need more detailed information, please let me know.

Thanks again for your time and support.
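The message above mentions config space reads returning 0xFFFFFFFF as the visible symptom of a dead device. As a minimal userspace sketch of how that kind of local detection might look (detection only, not recovery), the code below treats a run of all-ones reads as "unreachable". It uses no kernel API; demo_read_fn, demo_device_unreachable, and the two sample read helpers are hypothetical names invented for illustration, and the number of samples to require is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value seen when a read is never answered by the device. */
#define DEMO_ALL_ONES 0xFFFFFFFFu

/* Hypothetical read callback: stands in for a config-space or MMIO read. */
typedef uint32_t (*demo_read_fn)(void *ctx);

/*
 * Return true when `samples` consecutive reads all come back as all-ones.
 * A single all-ones value can be legitimate register content, so one read
 * alone is not treated as proof that the device is gone.
 */
static bool demo_device_unreachable(demo_read_fn rd, void *ctx,
				    unsigned int samples)
{
	if (samples == 0)
		return false;

	for (unsigned int i = 0; i < samples; i++) {
		if (rd(ctx) != DEMO_ALL_ONES)
			return false;
	}
	return true;
}

/* Sample callbacks: one models a dead device, one a responsive device. */
static uint32_t demo_read_dead(void *ctx)
{
	(void)ctx;
	return DEMO_ALL_ONES;
}

static uint32_t demo_read_alive(void *ctx)
{
	(void)ctx;
	return 0x12345678u;
}
```

For example, demo_device_unreachable(demo_read_dead, NULL, 3) reports the device as unreachable, while the same call with demo_read_alive does not. What a driver should do after such a detection is exactly the open question in this thread.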
On Tue, Feb 3, 2026 at 3:14 AM Jiawen Wu <jiawenwu@trustnetic.com> wrote:
>
> The function ghes_estatus_pool_region_free() is exported and is called
> by the PCIe AER recovery path, which unconditionally invokes it to free
> aer_capability_regs memory.
>
> Although current AER usage assumes memory comes from the GHES pool,
> robustness requires guarding against pool unavailability. Add a NULL check
> before calling gen_pool_free() to prevent crashes when the pool is not
> initialized. This also makes the API safer for potential future use by
> non-GHES callers.
Are any such callers going to be added any time soon?
> Fixes: e2abc47a5a1a ("ACPI: APEI: Fix AER info corruption when error status data has multiple sections")
It doesn't fix anything, the lack of the check is not an error
currently, AFAICS.
> Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
> ---
> drivers/acpi/apei/ghes.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 0dc767392a6c..e81c007464a9 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -236,7 +236,8 @@ int ghes_estatus_pool_init(unsigned int num_ghes)
>   */
>  void ghes_estatus_pool_region_free(unsigned long addr, u32 size)
>  {
> -	gen_pool_free(ghes_estatus_pool, addr, size);
> +	if (ghes_estatus_pool)
> +		gen_pool_free(ghes_estatus_pool, addr, size);
>  }
>  EXPORT_SYMBOL_GPL(ghes_estatus_pool_region_free);
>
> --
On Tue, Feb 3, 2026 8:57 PM, Rafael J. Wysocki wrote:
> On Tue, Feb 3, 2026 at 3:14 AM Jiawen Wu <jiawenwu@trustnetic.com> wrote:
> >
> > The function ghes_estatus_pool_region_free() is exported and is called
> > by the PCIe AER recovery path, which unconditionally invokes it to free
> > aer_capability_regs memory.
> >
> > Although current AER usage assumes memory comes from the GHES pool,
> > robustness requires guarding against pool unavailability. Add a NULL check
> > before calling gen_pool_free() to prevent crashes when the pool is not
> > initialized. This also makes the API safer for potential future use by
> > non-GHES callers.
>
> Are any such callers going to be added any time soon?
Yes, I want an ethernet driver to call aer_recover_queue().
>
> > Fixes: e2abc47a5a1a ("ACPI: APEI: Fix AER info corruption when error status data has multiple sections")
>
> It doesn't fix anything, the lack of the check is not an error
> currently, AFAICS.
So far, it seems.