The chained ternary conditional operator in the perf event format for
ras:aer_event was causing a misrepresentation of the severity of the event
when used with "perf script". Rather than building our own hand-rolled
formatting, just use the __print_symbolic helper to format it.
Specifically, all corrected errors were being formatted as non-fatal,
uncorrected errors, as shown below with the BAD_TLP errors, which is
correctable. This is due to a bug in libtraceevent, where chained
ternary conditions are not parsed correctly.
The before / after are as follows (also tested to make sure
uncorrectable events still show up as uncorrectable).
aer-inject was used with the following AER event injection script:
AER
PCI_ID 00:05.0
COR_STATUS BAD_TLP
HEADER_LOG 0 1 2 3
dmesg (unchanged between runs):
pcieport 0000:00:05.0: aer_inject: Injecting errors 00000040/00000000 into device 0000:00:05.0
pcieport 0000:00:05.0: AER: Correctable error message received from 0000:00:05.0
pcieport 0000:00:05.0: PCIe Bus Error: severity=Correctable, type=Data Link Layer, (Receiver ID)
pcieport 0000:00:05.0: device [1b36:000c] error status/mask=00000040/0000e000
pcieport 0000:00:05.0: [ 6] BadTLP
Before:
virtme-ng:/# perf script |cat
irq/24-aerdrv 424 [002] 392.240255: ras:aer_event: 0000:00:05.0 PCIe Bus Error: severity=Uncorrected, non-fatal, Bad TLP, TLP Header=Not available
After:
irq/24-aerdrv 424 [002] 29.198383: ras:aer_event: 0000:00:05.0 PCIe Bus Error: severity=Corrected, Bad TLP, TLP Header=Not available
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
---
include/ras/ras_event.h | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index e5f7ee0864e7..9312007096d7 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -327,9 +327,10 @@ TRACE_EVENT(aer_event,
TP_printk("%s PCIe Bus Error: severity=%s, %s, TLP Header=%s\n",
__get_str(dev_name),
- __entry->severity == AER_CORRECTABLE ? "Corrected" :
- __entry->severity == AER_FATAL ?
- "Fatal" : "Uncorrected, non-fatal",
+ __print_symbolic(__entry->severity,
+ {AER_NONFATAL, "Uncorrected, non-fatal"},
+ {AER_FATAL, "Fatal"},
+ {AER_CORRECTABLE, "Corrected"}),
__entry->severity == AER_CORRECTABLE ?
__print_flags(__entry->status, "|", aer_correctable_errors) :
__print_flags(__entry->status, "|", aer_uncorrectable_errors),
--
2.47.1
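For readers unfamiliar with the two formatting styles in the patch above, here is a minimal userspace sketch in plain C. It is not kernel code: `struct symbol`, `severity_symbolic()` and `severity_ternary()` are hypothetical names, and the numeric severity values are assumed to match the kernel's AER definitions. The lookup table mirrors the {value, string} pairs passed to __print_symbolic, while the second function reproduces the chained ternary that the patch removes.

```c
#include <stddef.h>

/* Assumed to match the kernel's AER severity values. */
#define AER_NONFATAL    0
#define AER_FATAL       1
#define AER_CORRECTABLE 2

/* Hypothetical {value, name} pair, shaped like the patch's table entries. */
struct symbol {
	int value;
	const char *name;
};

static const struct symbol aer_severity_names[] = {
	{ AER_NONFATAL,    "Uncorrected, non-fatal" },
	{ AER_FATAL,       "Fatal" },
	{ AER_CORRECTABLE, "Corrected" },
};

/*
 * Table lookup: return the name paired with the severity value.
 * Simplification: returns NULL for an unknown value; the real helper
 * handles unmatched values in its own way.
 */
const char *severity_symbolic(int severity)
{
	size_t i;
	size_t n = sizeof(aer_severity_names) / sizeof(aer_severity_names[0]);

	for (i = 0; i < n; i++) {
		if (aer_severity_names[i].value == severity)
			return aer_severity_names[i].name;
	}
	return NULL;
}

/* The chained ternary from the old TP_printk, as compiled C evaluates it. */
const char *severity_ternary(int severity)
{
	return severity == AER_CORRECTABLE ? "Corrected" :
	       severity == AER_FATAL ?
	       "Fatal" : "Uncorrected, non-fatal";
}
```

Compiled as C, both functions return the same string for the three known severities. The difference reported in this thread only appears when a userspace tool re-parses the format expression, where a flat value-to-string table leaves no operator nesting to get wrong.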
+ Rostedt.
On Mon, Apr 14, 2025 at 08:38:34AM -0700, Sargun Dhillon wrote:
> The chained ternary conditional operator in the perf event format for
> ras:aer_event was causing a misrepresentation of the severity of the event
> when used with "perf script". Rather than building our own hand-rolled
> formatting, just use the __print_symbolic helper to format it.
>
> Specifically, all corrected errors were being formatted as non-fatal,
> uncorrected errors, as shown below with the BAD_TLP errors, which is
> correctable. This is due to a bug in libtraceevent, where chained
> ternary conditions are not parsed correctly.
So because *some* libtraceevent has a bug, we're wagging the dog, not the
tail?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Mon, 14 Apr 2025 18:33:47 +0200
Borislav Petkov <bp@alien8.de> wrote:
> On Mon, Apr 14, 2025 at 08:38:34AM -0700, Sargun Dhillon wrote:
> > The chained ternary conditional operator in the perf event format for
> > ras:aer_event was causing a misrepresentation of the severity of the event
> > when used with "perf script". Rather than building our own hand-rolled
> > formatting, just use the __print_symbolic helper to format it.
> >
> > Specifically, all corrected errors were being formatted as non-fatal,
> > uncorrected errors, as shown below with the BAD_TLP errors, which is
> > correctable. This is due to a bug in libtraceevent, where chained
> > ternary conditions are not parsed correctly.
>
> So because *some* libtraceevent has a bug, we're wagging the dog, not the
> tail?
Agreed.
If something isn't parsed correctly in libtraceevent, please let me know!
Can you apply this to libtraceevent and see if it fixes your issue:
diff --git a/src/event-parse.c b/src/event-parse.c
index 6317ff6..4a09fcc 100644
--- a/src/event-parse.c
+++ b/src/event-parse.c
@@ -2083,6 +2083,16 @@ process_cond(struct tep_event *event, struct tep_print_arg *top, char **tok)
type = process_arg(event, right, &token);
+ againagain:
+ if (type == TEP_EVENT_ERROR)
+ goto out_free;
+
+ /* Handle other operations in the results */
+ if (type == TEP_EVENT_OP) {
+ type = process_op(event, right, &token);
+ goto againagain;
+ }
+
top->op.right = arg;
*tok = token;
I'm getting ready to post a new version of libtraceevent, and if this fixes
the parsing then I'll include this too.
-- Steve
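
For reference, a chained ternary is expected to nest to the right: in `a ? b : c ? d : e`, the false branch is itself the complete expression `c ? d : e`, including any operators such as `==` inside it. The toy evaluator below sketches that expectation for integer expressions. It is a hypothetical stand-alone parser, unrelated to libtraceevent's actual data structures or to the patch above; `struct cursor`, `parse_cond()` and `eval_expr()` are names made up for illustration.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical toy parser state: a position in the expression text. */
struct cursor {
	const char *p;
};

static void skip_ws(struct cursor *c)
{
	while (isspace((unsigned char)*c->p))
		c->p++;
}

static long parse_cond(struct cursor *c);

/* primary := integer | '(' cond ')' */
static long parse_primary(struct cursor *c)
{
	char *end;
	long v;

	skip_ws(c);
	if (*c->p == '(') {
		c->p++;
		v = parse_cond(c);
		skip_ws(c);
		if (*c->p == ')')
			c->p++;
		return v;
	}
	v = strtol(c->p, &end, 10);
	c->p = end;
	return v;
}

/* equality := primary ('==' primary)* */
static long parse_equality(struct cursor *c)
{
	long lhs = parse_primary(c);

	for (;;) {
		skip_ws(c);
		if (c->p[0] != '=' || c->p[1] != '=')
			return lhs;
		c->p += 2;
		lhs = (lhs == parse_primary(c));
	}
}

/*
 * cond := equality ('?' cond ':' cond)?
 * Both branches recurse into parse_cond(), so the false branch consumes
 * a whole nested conditional rather than a single operand.
 */
static long parse_cond(struct cursor *c)
{
	long test, if_true, if_false;

	test = parse_equality(c);
	skip_ws(c);
	if (*c->p != '?')
		return test;
	c->p++;
	if_true = parse_cond(c);
	skip_ws(c);
	if (*c->p == ':')
		c->p++;
	if_false = parse_cond(c);
	return test ? if_true : if_false;
}

long eval_expr(const char *text)
{
	struct cursor c = { text };

	return parse_cond(&c);
}
```

With the severity written as a literal, `eval_expr("1 == 2 ? 10 : 1 == 1 ? 20 : 30")` selects the middle value: the outer test fails, and the false branch is evaluated as the full nested conditional. A parser that stopped after the first operand of the false branch would lose the remaining comparison.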