[PATCH v6 15/16] PCI/AER: Add ratelimits to PCI AER Documentation

Bjorn Helgaas posted 16 patches 7 months ago
There is a newer version of this series
Posted by Bjorn Helgaas 7 months ago
From: Jon Pan-Doh <pandoh@google.com>

Add ratelimits section for rationale and defaults.

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
---
 Documentation/PCI/pcieaer-howto.rst | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/PCI/pcieaer-howto.rst b/Documentation/PCI/pcieaer-howto.rst
index f013f3b27c82..896d2a232a90 100644
--- a/Documentation/PCI/pcieaer-howto.rst
+++ b/Documentation/PCI/pcieaer-howto.rst
@@ -85,6 +85,17 @@ In the example, 'Requester ID' means the ID of the device that sent
 the error message to the Root Port. Please refer to PCIe specs for other
 fields.
 
+AER Ratelimits
+--------------
+
+Since error messages can be generated for each transaction, we may see
+large volumes of errors reported. To prevent spammy devices from flooding
+the console/stalling execution, messages are throttled by device and error
+type (correctable vs. uncorrectable).
+
+AER uses the default ratelimit of DEFAULT_RATELIMIT_BURST (10 events) over
+DEFAULT_RATELIMIT_INTERVAL (5 seconds).
+
 AER Statistics / Counters
 -------------------------
 
-- 
2.43.0
Re: [PATCH v6 15/16] PCI/AER: Add ratelimits to PCI AER Documentation
Posted by Sathyanarayanan Kuppuswamy 7 months ago
On 5/19/25 2:35 PM, Bjorn Helgaas wrote:
> From: Jon Pan-Doh <pandoh@google.com>
>
> Add ratelimits section for rationale and defaults.
>
> Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
> Signed-off-by: Jon Pan-Doh <pandoh@google.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Acked-by: Paul E. McKenney <paulmck@kernel.org>
> ---
>   Documentation/PCI/pcieaer-howto.rst | 11 +++++++++++
>   1 file changed, 11 insertions(+)
>
> diff --git a/Documentation/PCI/pcieaer-howto.rst b/Documentation/PCI/pcieaer-howto.rst
> index f013f3b27c82..896d2a232a90 100644
> --- a/Documentation/PCI/pcieaer-howto.rst
> +++ b/Documentation/PCI/pcieaer-howto.rst
> @@ -85,6 +85,17 @@ In the example, 'Requester ID' means the ID of the device that sent
>   the error message to the Root Port. Please refer to PCIe specs for other
>   fields.
>   
> +AER Ratelimits
> +--------------
> +
> +Since error messages can be generated for each transaction, we may see
> +large volumes of errors reported. To prevent spammy devices from flooding
> +the console/stalling execution, messages are throttled by device and error
> +type (correctable vs. uncorrectable).

Can we list exceptions like DPC and FATAL errors (if added)?

> +
> +AER uses the default ratelimit of DEFAULT_RATELIMIT_BURST (10 events) over
> +DEFAULT_RATELIMIT_INTERVAL (5 seconds).
> +
>   AER Statistics / Counters
>   -------------------------
>   

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer
Re: [PATCH v6 15/16] PCI/AER: Add ratelimits to PCI AER Documentation
Posted by Bjorn Helgaas 7 months ago
On Mon, May 19, 2025 at 10:01:09PM -0700, Sathyanarayanan Kuppuswamy wrote:
> 
> On 5/19/25 2:35 PM, Bjorn Helgaas wrote:
> > From: Jon Pan-Doh <pandoh@google.com>
> > 
> > Add ratelimits section for rationale and defaults.

> > +AER Ratelimits
> > +--------------
> > +
> > +Since error messages can be generated for each transaction, we may see
> > +large volumes of errors reported. To prevent spammy devices from flooding
> > +the console/stalling execution, messages are throttled by device and error
> > +type (correctable vs. uncorrectable).
> 
> > Can we list exceptions like DPC and FATAL errors (if added)?

Like this?

  +... messages are throttled by device and error
  +type (correctable vs. non-fatal uncorrectable).  Fatal errors, including
  +DPC errors, are not ratelimited.

DPC is currently only triggered for fatal errors.
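
For illustration, the policy discussed above (throttle correctable and
non-fatal uncorrectable messages per device, never throttle fatal ones)
could be sketched in userspace C roughly as follows. This is only a
simplified fixed-window model under stated assumptions, not the code from
this series: every sketch_* name is hypothetical, time is passed in as
whole seconds, and the burst and interval values are copied from the
figures quoted in the patch text.

```c
#include <stdbool.h>

/* Figures quoted in the patch text: 10 events per 5-second interval. */
#define SKETCH_RATELIMIT_BURST    10
#define SKETCH_RATELIMIT_INTERVAL 5	/* seconds */

/* Hypothetical severity classes used only by this sketch. */
enum sketch_severity {
	SKETCH_CORRECTABLE,
	SKETCH_NONFATAL,
	SKETCH_FATAL,
};

/* State for one fixed-window ratelimit (hypothetical layout). */
struct sketch_ratelimit {
	long begin;	/* start of the current window, in seconds */
	int printed;	/* messages allowed in the current window */
	int missed;	/* messages suppressed so far */
	bool started;	/* false until the first event arrives */
};

/* One ratelimit per device and per throttled error type. */
struct sketch_device {
	struct sketch_ratelimit correctable_rs;
	struct sketch_ratelimit nonfatal_rs;
};

/* Return true if a message may be logged at time 'now'. */
static bool sketch_ratelimit_ok(struct sketch_ratelimit *rs, long now)
{
	/* Open a new window on first use or once the interval has elapsed. */
	if (!rs->started || now - rs->begin >= SKETCH_RATELIMIT_INTERVAL) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}

	if (rs->printed < SKETCH_RATELIMIT_BURST) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/* Fatal errors bypass the ratelimit; the other types are throttled. */
static bool sketch_should_log(struct sketch_device *dev,
			      enum sketch_severity sev, long now)
{
	switch (sev) {
	case SKETCH_CORRECTABLE:
		return sketch_ratelimit_ok(&dev->correctable_rs, now);
	case SKETCH_NONFATAL:
		return sketch_ratelimit_ok(&dev->nonfatal_rs, now);
	case SKETCH_FATAL:
		return true;
	}
	return true;
}
```

In this model, 25 correctable errors from one device within the same
second would yield 10 logged messages and 15 suppressed ones, while a
non-fatal or fatal error from that device would still be logged because
each type keeps its own state and fatal errors are never throttled.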
Re: [PATCH v6 15/16] PCI/AER: Add ratelimits to PCI AER Documentation
Posted by Sathyanarayanan Kuppuswamy 7 months ago
On 5/20/25 12:48 PM, Bjorn Helgaas wrote:
> On Mon, May 19, 2025 at 10:01:09PM -0700, Sathyanarayanan Kuppuswamy wrote:
>> On 5/19/25 2:35 PM, Bjorn Helgaas wrote:
>>> From: Jon Pan-Doh <pandoh@google.com>
>>>
>>> Add ratelimits section for rationale and defaults.
>>> +AER Ratelimits
>>> +--------------
>>> +
>>> +Since error messages can be generated for each transaction, we may see
>>> +large volumes of errors reported. To prevent spammy devices from flooding
>>> +the console/stalling execution, messages are throttled by device and error
>>> +type (correctable vs. uncorrectable).
>> Can we list exceptions like DPC and FATAL errors (if added)?
> Like this?
>
>    +... messages are throttled by device and error
>    +type (correctable vs. non-fatal uncorrectable).  Fatal errors, including
>    +DPC errors, are not ratelimited.
>
> DPC is currently only triggered for fatal errors.

Yes.  I think it is good enough.


-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer