[PATCH v2] perf stat: Fix crash on arm64

Breno Leitao posted 1 patch 1 week, 1 day ago
tools/perf/builtin-stat.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)
[PATCH v2] perf stat: Fix crash on arm64
Posted by Breno Leitao 1 week, 1 day ago
Perf stat is crashing on arm64 hosts with the following issue:

	# make -C tools/perf DEBUG=1
	# perf stat sleep 1
	perf: util/evsel.c:2034: get_group_fd: Assertion `!(!leader->core.fd)' failed.
	[1]    1220794 IOT instruction (core dumped)  ./perf stat

The sorting function introduced by commit a745c0831c15c ("perf stat:
Sort default events/metrics") compares events based on their individual
properties. This can cause events from different groups to be
interleaved, resulting in group members appearing before their leaders
in the sorted evlist.

When the iterator opens events in list order, a group member may be
processed before its leader has been opened.

For example, CPU_CYCLES (idx=32) with leader STALL_SLOT_BACKEND (idx=37)
could be sorted before its leader, causing the crash when CPU_CYCLES
tries to get its group fd from the not-yet-opened leader.

Fix this by comparing events based on their leader's attributes instead
of their own attributes when the events are in different groups. This
ensures all members of a group share the same sort key as their leader,
keeping groups together and guaranteeing leaders are opened before their
members.

Reported-by: Denis Yaroshevskiy <dyaroshev@meta.com>
Fixes: a745c0831c15c ("perf stat: Sort default events/metrics")
Tested-by: Dmitry Ilvokhin <d@ilvokhin.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
Cc; linux-arm-kernel@lists.infradead.org
---
Changes in v2:
- No changes from V1, just resending the exact same patch.
- Link to v1: https://patch.msgid.link/20260205-perf_stat-v1-1-e433b0c918af@debian.org
---
 tools/perf/builtin-stat.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 73c2ba7e30760..a97b9d0de3f58 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1917,25 +1917,33 @@ static int default_evlist_evsel_cmp(void *priv __maybe_unused,
 	const struct evsel *lhs = container_of(lhs_core, struct evsel, core);
 	const struct perf_evsel *rhs_core = container_of(r, struct perf_evsel, node);
 	const struct evsel *rhs = container_of(rhs_core, struct evsel, core);
+	const struct evsel *lhs_leader = evsel__leader(lhs);
+	const struct evsel *rhs_leader = evsel__leader(rhs);
 
-	if (evsel__leader(lhs) == evsel__leader(rhs)) {
+	if (lhs_leader == rhs_leader) {
 		/* Within the same group, respect the original order. */
 		return lhs_core->idx - rhs_core->idx;
 	}
 
+	/*
+	 * Compare using leader's attributes so that all members of a group
+	 * stay together. This ensures leaders are opened before their members.
+	 */
+
 	/* Sort default metrics evsels first, and default show events before those. */
-	if (lhs->default_metricgroup != rhs->default_metricgroup)
-		return lhs->default_metricgroup ? -1 : 1;
+	if (lhs_leader->default_metricgroup != rhs_leader->default_metricgroup)
+		return lhs_leader->default_metricgroup ? -1 : 1;
 
-	if (lhs->default_show_events != rhs->default_show_events)
-		return lhs->default_show_events ? -1 : 1;
+	if (lhs_leader->default_show_events != rhs_leader->default_show_events)
+		return lhs_leader->default_show_events ? -1 : 1;
 
 	/* Sort by PMU type (prefers legacy types first). */
-	if (lhs->pmu != rhs->pmu)
-		return lhs->pmu->type - rhs->pmu->type;
+	if (lhs_leader->pmu != rhs_leader->pmu)
+		return lhs_leader->pmu->type - rhs_leader->pmu->type;
 
-	/* Sort by name. */
-	return strcmp(evsel__name((struct evsel *)lhs), evsel__name((struct evsel *)rhs));
+	/* Sort by leader's name. */
+	return strcmp(evsel__name((struct evsel *)lhs_leader),
+		      evsel__name((struct evsel *)rhs_leader));
 }
 
 /*

---
base-commit: 85964cdcad0fac9a0eb7b87a0f9d88cc074b854c
change-id: 20260205-perf_stat-a0a2a37e21c5

Best regards,
--  
Breno Leitao <leitao@debian.org>
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Namhyung Kim an hour ago
On Wed, 25 Mar 2026 03:24:30 -0700, Breno Leitao wrote:
> Perf stat is crashing on arm64 hosts with the following issue:
> 
> 	# make -C tools/perf DEBUG=1
> 	# perf stat sleep 1
> 	perf: util/evsel.c:2034: get_group_fd: Assertion `!(!leader->core.fd)' failed.
> 	[1]    1220794 IOT instruction (core dumped)  ./perf stat
> 
> [...]
Applied to perf-tools-next, thanks!

Best regards,
Namhyung
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Leo Yan 1 week, 1 day ago
On Wed, Mar 25, 2026 at 03:24:30AM -0700, Breno Leitao wrote:
> Perf stat is crashing on arm64 hosts with the following issue:
> 
> 	# make -C tools/perf DEBUG=1
> 	# perf stat sleep 1
> 	perf: util/evsel.c:2034: get_group_fd: Assertion `!(!leader->core.fd)' failed.
> 	[1]    1220794 IOT instruction (core dumped)  ./perf stat
> 
> The sorting function introduced by commit a745c0831c15c ("perf stat:
> Sort default events/metrics") compares events based on their individual
> properties. This can cause events from different groups to be
> interleaved, resulting in group members appearing before their leaders
> in the sorted evlist.
> 
> When the iterator opens events in list order, a group member may be
> processed before its leader has been opened.
> 
> For example, CPU_CYCLES (idx=32) with leader STALL_SLOT_BACKEND (idx=37)
> could be sorted before its leader, causing the crash when CPU_CYCLES
> tries to get its group fd from the not-yet-opened leader.
> 
> Fix this by comparing events based on their leader's attributes instead
> of their own attributes when the events are in different groups. This
> ensures all members of a group share the same sort key as their leader,
> keeping groups together and guaranteeing leaders are opened before their
> members.
> 
> Reported-by: Denis Yaroshevskiy <dyaroshev@meta.com>
> Fixes: a745c0831c15c ("perf stat: Sort default events/metrics")
> Tested-by: Dmitry Ilvokhin <d@ilvokhin.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>

As Arnaldo mentioned in v1, I also found Segmentation fault when
testing this patch:

 Program received signal SIGSEGV, Segmentation fault.
 metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
 1662                    evsel = evlist__find_evsel(evlist, old_me->evsel->core.idx);
 (gdb) bt
 #0  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
 #1  0x0000aaaaaab05870 in add_default_events () at builtin-stat.c:2110
 #2  0x0000aaaaaab08300 in cmd_stat (argc=0, argv=0xfffffffffaa0) at builtin-stat.c:2838
 #3  0x0000aaaaaab40998 in run_builtin (p=0xaaaaaaf9d428 <commands+360>, argc=4, argv=0xfffffffffaa0) at perf.c:348
 #4  0x0000aaaaaab40c14 in handle_internal_command (argc=4, argv=0xfffffffffaa0) at perf.c:398
 #5  0x0000aaaaaab40ddc in run_argv (argcp=0xfffffffff8bc, argv=0xfffffffff8b0) at perf.c:442
 #6  0x0000aaaaaab41110 in main (argc=4, argv=0xfffffffffaa0) at perf.c:549

Last week I tested v1 and confirmed the issue was gone with the change,
I will dig a bit in tomorrow and share back if any finding.

Apologies for my lazy, as I should double check once Arnaldo
pointed out in v1.

Thanks,
Leo
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Ian Rogers 1 week, 1 day ago
On Wed, Mar 25, 2026 at 1:59 PM Leo Yan <leo.yan@arm.com> wrote:
>
> On Wed, Mar 25, 2026 at 03:24:30AM -0700, Breno Leitao wrote:
> > Perf stat is crashing on arm64 hosts with the following issue:
> >
> >       # make -C tools/perf DEBUG=1
> >       # perf stat sleep 1
> >       perf: util/evsel.c:2034: get_group_fd: Assertion `!(!leader->core.fd)' failed.
> >       [1]    1220794 IOT instruction (core dumped)  ./perf stat
> >
> > The sorting function introduced by commit a745c0831c15c ("perf stat:
> > Sort default events/metrics") compares events based on their individual
> > properties. This can cause events from different groups to be
> > interleaved, resulting in group members appearing before their leaders
> > in the sorted evlist.
> >
> > When the iterator opens events in list order, a group member may be
> > processed before its leader has been opened.
> >
> > For example, CPU_CYCLES (idx=32) with leader STALL_SLOT_BACKEND (idx=37)
> > could be sorted before its leader, causing the crash when CPU_CYCLES
> > tries to get its group fd from the not-yet-opened leader.
> >
> > Fix this by comparing events based on their leader's attributes instead
> > of their own attributes when the events are in different groups. This
> > ensures all members of a group share the same sort key as their leader,
> > keeping groups together and guaranteeing leaders are opened before their
> > members.
> >
> > Reported-by: Denis Yaroshevskiy <dyaroshev@meta.com>
> > Fixes: a745c0831c15c ("perf stat: Sort default events/metrics")
> > Tested-by: Dmitry Ilvokhin <d@ilvokhin.com>
> > Signed-off-by: Breno Leitao <leitao@debian.org>
>
> As Arnaldo mentioned in v1, I also found Segmentation fault when
> testing this patch:
>
>  Program received signal SIGSEGV, Segmentation fault.
>  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
>  1662                    evsel = evlist__find_evsel(evlist, old_me->evsel->core.idx);
>  (gdb) bt
>  #0  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
>  #1  0x0000aaaaaab05870 in add_default_events () at builtin-stat.c:2110
>  #2  0x0000aaaaaab08300 in cmd_stat (argc=0, argv=0xfffffffffaa0) at builtin-stat.c:2838
>  #3  0x0000aaaaaab40998 in run_builtin (p=0xaaaaaaf9d428 <commands+360>, argc=4, argv=0xfffffffffaa0) at perf.c:348
>  #4  0x0000aaaaaab40c14 in handle_internal_command (argc=4, argv=0xfffffffffaa0) at perf.c:398
>  #5  0x0000aaaaaab40ddc in run_argv (argcp=0xfffffffff8bc, argv=0xfffffffff8b0) at perf.c:442
>  #6  0x0000aaaaaab41110 in main (argc=4, argv=0xfffffffffaa0) at perf.c:549
>
> Last week I tested v1 and confirmed the issue was gone with the change,
> I will dig a bit in tomorrow and share back if any finding.
>
> Apologies for my lazy, as I should double check once Arnaldo
> pointed out in v1.


From the stack trace I don't see anything that should allow this.
Either the list of metrics is broken, or the struct metric_event's
evsel is NULL. But that should never happen as we wouldn't have a
metric at that point. The sorting shouldn't affect that. If you can
reproduce the issue, verbose logs may help.

Thanks,
Ian

> Thanks,
> Leo
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Leo Yan 1 week ago
On Wed, Mar 25, 2026 at 03:27:05PM -0700, Ian Rogers wrote:

[...]

> > As Arnaldo mentioned in v1, I also found Segmentation fault when
> > testing this patch:
> >
> >  Program received signal SIGSEGV, Segmentation fault.
> >  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
> >  1662                    evsel = evlist__find_evsel(evlist, old_me->evsel->core.idx);
> >  (gdb) bt
> >  #0  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
> >  #1  0x0000aaaaaab05870 in add_default_events () at builtin-stat.c:2110
> >  #2  0x0000aaaaaab08300 in cmd_stat (argc=0, argv=0xfffffffffaa0) at builtin-stat.c:2838
> >  #3  0x0000aaaaaab40998 in run_builtin (p=0xaaaaaaf9d428 <commands+360>, argc=4, argv=0xfffffffffaa0) at perf.c:348
> >  #4  0x0000aaaaaab40c14 in handle_internal_command (argc=4, argv=0xfffffffffaa0) at perf.c:398
> >  #5  0x0000aaaaaab40ddc in run_argv (argcp=0xfffffffff8bc, argv=0xfffffffff8b0) at perf.c:442
> >  #6  0x0000aaaaaab41110 in main (argc=4, argv=0xfffffffffaa0) at perf.c:549
> >
> > Last week I tested v1 and confirmed the issue was gone with the change,
> > I will dig a bit in tomorrow and share back if any finding.
> >
> > Apologies for my lazy, as I should double check once Arnaldo
> > pointed out in v1.
> 
> 
> From the stack trace I don't see anything that should allow this.
> Either the list of metrics is broken, or the struct metric_event's
> evsel is NULL. But that should never happen as we wouldn't have a
> metric at that point. The sorting shouldn't affect that. If you can
> reproduce the issue, verbose logs may help.

I believe I encountered a different issue, which is irrelevant to this
patch.  So Breno's patch is fine for me.

On one of my board, I can see the log:

  Events in 'frontend_bound' fully contained within 'retiring'
  Events in 'bad_speculation' fully contained within 'retiring'
  Events in 'backend_bound' fully contained within 'retiring'

Then, in parse_groups(), it selects fully contained list:

  list_for_each_entry(n, &metric_list, nd) {
      ...
      if (expr__subset_of_ids(n->pctx, m->pctx)) {
          metric_evlist = n->evlist;
          break;
      }
  }

  // As metric_evlist has been set, so below does not run
  if (!metric_evlist) {
        ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
                        m->group_events, tool_events, &m->evlist);
        if (ret)
                goto out;

        metric_evlist = m->evlist;
  }

The problem is it skips to call parse_ids() and m->pctx is not
initialized.  This leads to pick_display_evsel() returns NULL and
pass NULL pointer to metricgroup__lookup().   In the end, the NULL
pointer is consumed in metricgroup__copy_metric_events().

How about below change ?  I tested it which can dismiss the issue.

Subject: [PATCH] perf metricgroup: Fix segment fault in
 metricgroup__copy_metric_events()

In parse_groups(), when find a event is fully contained by a previous
event, it skips to call parse_ids(), as a result, m->pctx->ids is not
initialized.  Then, setup_metric_events() returns an empty metric
events, pick_display_evsel() consumes the returned metric_events and
feeds to metricgroup__lookup() with passing "evsel = NULL".

metricgroup__copy_metric_events() access old_me->evsel->core.idx, due
to the evsel is set to NULL in metricgroup__lookup(), it causes
segment fault.

This commit corrects the fully contained case to allow it to parse IDs,
thus this can effectively avoid setup NULL pointer in
metricgroup__lookup().

Fixes: 5ecd5a0c7d1c ("perf metrics: Modify setup and deduplication")
Signed-off-by: Leo Yan <leo.yan@arm.com>
---
 tools/perf/util/metricgroup.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 7e39d469111b..528a6462876b 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -1431,7 +1431,7 @@ static int parse_groups(struct evlist *perf_evlist,
 	list_for_each_entry(m, &metric_list, nd) {
 		struct metric_event *me;
 		struct evsel **metric_events;
-		struct evlist *metric_evlist = NULL;
+		struct evlist *metric_evlist = NULL, *contained = NULL;
 		struct metric *n;
 		struct metric_expr *expr;
 
@@ -1463,19 +1463,18 @@ static int parse_groups(struct evlist *perf_evlist,
 				if (expr__subset_of_ids(n->pctx, m->pctx)) {
 					pr_debug("Events in '%s' fully contained within '%s'\n",
 						 m->metric_name, n->metric_name);
-					metric_evlist = n->evlist;
+					contained = n->evlist;
 					break;
 				}
-
 			}
 		}
 		if (!metric_evlist) {
+			metric_evlist = contained ? contained : m->evlist;
+
 			ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
-					m->group_events, tool_events, &m->evlist);
+					m->group_events, tool_events, &metric_evlist);
 			if (ret)
 				goto out;
-
-			metric_evlist = m->evlist;
 		}
 		ret = setup_metric_events(fake_pmu ? "all" : m->pmu, m->pctx->ids,
 					  metric_evlist, &metric_events);
-- 
2.34.1
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Breno Leitao 6 days, 13 hours ago
On Thu, Mar 26, 2026 at 04:39:26PM +0000, Leo Yan wrote:
> On Wed, Mar 25, 2026 at 03:27:05PM -0700, Ian Rogers wrote:
>
> > From the stack trace I don't see anything that should allow this.
> > Either the list of metrics is broken, or the struct metric_event's
> > evsel is NULL. But that should never happen as we wouldn't have a
> > metric at that point. The sorting shouldn't affect that. If you can
> > reproduce the issue, verbose logs may help.
>
> I believe I encountered a different issue, which is irrelevant to this
> patch.  So Breno's patch is fine for me.

That said, are there any concerns about merging this patch in its
current form?
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Arnaldo Melo 6 days, 10 hours ago

On March 27, 2026 7:35:19 AM GMT-03:00, Breno Leitao <leitao@debian.org> wrote:
>On Thu, Mar 26, 2026 at 04:39:26PM +0000, Leo Yan wrote:
>> On Wed, Mar 25, 2026 at 03:27:05PM -0700, Ian Rogers wrote:
>>
>> > From the stack trace I don't see anything that should allow this.
>> > Either the list of metrics is broken, or the struct metric_event's
>> > evsel is NULL. But that should never happen as we wouldn't have a
>> > metric at that point. The sorting shouldn't affect that. If you can
>> > reproduce the issue, verbose logs may help.
>>
>> I believe I encountered a different issue, which is irrelevant to this
>> patch.  So Breno's patch is fine for me.
>
>That said, are there any concerns about merging this patch in its
>current form?

I guess not, will try and merge it later today for 7
0.

Thanks,

- Arnaldo
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Ian Rogers 1 day, 3 hours ago
On Fri, Mar 27, 2026 at 6:48 AM Arnaldo Melo <arnaldo.melo@gmail.com> wrote:
>
>
>
> On March 27, 2026 7:35:19 AM GMT-03:00, Breno Leitao <leitao@debian.org> wrote:
> >On Thu, Mar 26, 2026 at 04:39:26PM +0000, Leo Yan wrote:
> >> On Wed, Mar 25, 2026 at 03:27:05PM -0700, Ian Rogers wrote:
> >>
> >> > From the stack trace I don't see anything that should allow this.
> >> > Either the list of metrics is broken, or the struct metric_event's
> >> > evsel is NULL. But that should never happen as we wouldn't have a
> >> > metric at that point. The sorting shouldn't affect that. If you can
> >> > reproduce the issue, verbose logs may help.
> >>
> >> I believe I encountered a different issue, which is irrelevant to this
> >> patch.  So Breno's patch is fine for me.
> >
> >That said, are there any concerns about merging this patch in its
> >current form?
>
> I guess not, will try and merge it later today for 7
> 0.

Looks like this one is still missing. Thanks,

Ian

> Thanks,
>
> - Arnaldo
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Arnaldo Carvalho de Melo 1 day, 2 hours ago
On Wed, Apr 01, 2026 at 01:16:49PM -0700, Ian Rogers wrote:
> On Fri, Mar 27, 2026 at 6:48 AM Arnaldo Melo <arnaldo.melo@gmail.com> wrote:
> >
> >
> >
> > On March 27, 2026 7:35:19 AM GMT-03:00, Breno Leitao <leitao@debian.org> wrote:
> > >On Thu, Mar 26, 2026 at 04:39:26PM +0000, Leo Yan wrote:
> > >> On Wed, Mar 25, 2026 at 03:27:05PM -0700, Ian Rogers wrote:
> > >>
> > >> > From the stack trace I don't see anything that should allow this.
> > >> > Either the list of metrics is broken, or the struct metric_event's
> > >> > evsel is NULL. But that should never happen as we wouldn't have a
> > >> > metric at that point. The sorting shouldn't affect that. If you can
> > >> > reproduce the issue, verbose logs may help.
> > >>
> > >> I believe I encountered a different issue, which is irrelevant to this
> > >> patch.  So Breno's patch is fine for me.
>
> > >That said, are there any concerns about merging this patch in its
> > >current form?

> > I guess not, will try and merge it later today for 7
> > 0.

> Looks like this one is still missing. Thanks,

So I think its too late for 7.0, I guess Namhyung should get it for
perf-tools-next, the fixes tag will make it get into the upcoming stable
series.

- Arnaldo
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Ian Rogers 1 week ago
On Thu, Mar 26, 2026 at 9:39 AM Leo Yan <leo.yan@arm.com> wrote:
>
> On Wed, Mar 25, 2026 at 03:27:05PM -0700, Ian Rogers wrote:
>
> [...]
>
> > > As Arnaldo mentioned in v1, I also found Segmentation fault when
> > > testing this patch:
> > >
> > >  Program received signal SIGSEGV, Segmentation fault.
> > >  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
> > >  1662                    evsel = evlist__find_evsel(evlist, old_me->evsel->core.idx);
> > >  (gdb) bt
> > >  #0  metricgroup__copy_metric_events (evlist=0xaaaaab037750, cgrp=0x0, new_metric_events=0xaaaaab038210, old_metric_events=0xaaaaab038d20) at util/metricgroup.c:1662
> > >  #1  0x0000aaaaaab05870 in add_default_events () at builtin-stat.c:2110
> > >  #2  0x0000aaaaaab08300 in cmd_stat (argc=0, argv=0xfffffffffaa0) at builtin-stat.c:2838
> > >  #3  0x0000aaaaaab40998 in run_builtin (p=0xaaaaaaf9d428 <commands+360>, argc=4, argv=0xfffffffffaa0) at perf.c:348
> > >  #4  0x0000aaaaaab40c14 in handle_internal_command (argc=4, argv=0xfffffffffaa0) at perf.c:398
> > >  #5  0x0000aaaaaab40ddc in run_argv (argcp=0xfffffffff8bc, argv=0xfffffffff8b0) at perf.c:442
> > >  #6  0x0000aaaaaab41110 in main (argc=4, argv=0xfffffffffaa0) at perf.c:549
> > >
> > > Last week I tested v1 and confirmed the issue was gone with the change,
> > > I will dig a bit in tomorrow and share back if any finding.
> > >
> > > Apologies for my lazy, as I should double check once Arnaldo
> > > pointed out in v1.
> >
> >
> > From the stack trace I don't see anything that should allow this.
> > Either the list of metrics is broken, or the struct metric_event's
> > evsel is NULL. But that should never happen as we wouldn't have a
> > metric at that point. The sorting shouldn't affect that. If you can
> > reproduce the issue, verbose logs may help.
>
> I believe I encountered a different issue, which is irrelevant to this
> patch.  So Breno's patch is fine for me.
>
> On one of my board, I can see the log:
>
>   Events in 'frontend_bound' fully contained within 'retiring'
>   Events in 'bad_speculation' fully contained within 'retiring'
>   Events in 'backend_bound' fully contained within 'retiring'

I looked at nvidia/t410/metrics.json but I'm not clear on where the
issue is coming from.

>
> Then, in parse_groups(), it selects fully contained list:
>
>   list_for_each_entry(n, &metric_list, nd) {
>       ...
>       if (expr__subset_of_ids(n->pctx, m->pctx)) {
>           metric_evlist = n->evlist;
>           break;
>       }
>   }
>
>   // As metric_evlist has been set, so below does not run
>   if (!metric_evlist) {
>         ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
>                         m->group_events, tool_events, &m->evlist);
>         if (ret)
>                 goto out;
>
>         metric_evlist = m->evlist;
>   }
>
> The problem is it skips to call parse_ids() and m->pctx is not
> initialized.  This leads to pick_display_evsel() returns NULL and
> pass NULL pointer to metricgroup__lookup().   In the end, the NULL
> pointer is consumed in metricgroup__copy_metric_events().
>
> How about below change ?  I tested it which can dismiss the issue.
>
> Subject: [PATCH] perf metricgroup: Fix segment fault in
>  metricgroup__copy_metric_events()
>
> In parse_groups(), when find a event is fully contained by a previous
> event, it skips to call parse_ids(), as a result, m->pctx->ids is not
> initialized.  Then, setup_metric_events() returns an empty metric
> events, pick_display_evsel() consumes the returned metric_events and
> feeds to metricgroup__lookup() with passing "evsel = NULL".

Fully contained groups exist on x86, why isn't this problem breaking
whenever this happens? Stepping through "contained" examples, I see
that the ids aren't initialized. I think something else must be
happening.

> metricgroup__copy_metric_events() access old_me->evsel->core.idx, due
> to the evsel is set to NULL in metricgroup__lookup(), it causes
> segment fault.
>
> This commit corrects the fully contained case to allow it to parse IDs,
> thus this can effectively avoid setup NULL pointer in
> metricgroup__lookup().
>
> Fixes: 5ecd5a0c7d1c ("perf metrics: Modify setup and deduplication")
> Signed-off-by: Leo Yan <leo.yan@arm.com>
> ---
>  tools/perf/util/metricgroup.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index 7e39d469111b..528a6462876b 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -1431,7 +1431,7 @@ static int parse_groups(struct evlist *perf_evlist,
>         list_for_each_entry(m, &metric_list, nd) {
>                 struct metric_event *me;
>                 struct evsel **metric_events;
> -               struct evlist *metric_evlist = NULL;
> +               struct evlist *metric_evlist = NULL, *contained = NULL;
>                 struct metric *n;
>                 struct metric_expr *expr;
>
> @@ -1463,19 +1463,18 @@ static int parse_groups(struct evlist *perf_evlist,
>                                 if (expr__subset_of_ids(n->pctx, m->pctx)) {
>                                         pr_debug("Events in '%s' fully contained within '%s'\n",
>                                                  m->metric_name, n->metric_name);
> -                                       metric_evlist = n->evlist;
> +                                       contained = n->evlist;
>                                         break;
>                                 }
> -
>                         }
>                 }
>                 if (!metric_evlist) {
> +                       metric_evlist = contained ? contained : m->evlist;
> +
>                         ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
> -                                       m->group_events, tool_events, &m->evlist);
> +                                       m->group_events, tool_events, &metric_evlist);

Won't this match the behavior of metric_no_merge/--metric-no-merge,
since for every metric the events for that metric are being appended
to the evlist?

Thanks,
Ian


>                         if (ret)
>                                 goto out;
> -
> -                       metric_evlist = m->evlist;
>                 }
>                 ret = setup_metric_events(fake_pmu ? "all" : m->pmu, m->pctx->ids,
>                                           metric_evlist, &metric_events);
> --
> 2.34.1
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Leo Yan 6 days, 5 hours ago
On Thu, Mar 26, 2026 at 02:21:00PM -0700, Ian Rogers wrote:

[...]

> > On one of my board, I can see the log:
> >
> >   Events in 'frontend_bound' fully contained within 'retiring'
> >   Events in 'bad_speculation' fully contained within 'retiring'
> >   Events in 'backend_bound' fully contained within 'retiring'
> 
> I looked at nvidia/t410/metrics.json but I'm not clear on where the
> issue is coming from.

Digging a bit, all these metrics have syntax errors because #slots is
zero when returned from tool_pmu__cpu_slots_per_cycle().
See the log: https://termbin.com/e9ue

After that, executes parse_groups() treats all these problematic metrics
as being contained by "retiring".

I have sent a patch to make __add_metrics() respect the returned syntax
error, so the program can exit early.


> > In parse_groups(), when find a event is fully contained by a previous
> > event, it skips to call parse_ids(), as a result, m->pctx->ids is not
> > initialized.  Then, setup_metric_events() returns an empty metric
> > events, pick_display_evsel() consumes the returned metric_events and
> > feeds to metricgroup__lookup() with passing "evsel = NULL".
> 
> Fully contained groups exist on x86, why isn't this problem breaking
> whenever this happens? Stepping through "contained" examples, I see
> that the ids aren't initialized. I think something else must be
> happening.

If you change tool_pmu__cpu_slots_per_cycle() to always return zero,
this might can reproduce the issue.

[...]

> > @@ -1463,19 +1463,18 @@ static int parse_groups(struct evlist *perf_evlist,
> >                                 if (expr__subset_of_ids(n->pctx, m->pctx)) {
> >                                         pr_debug("Events in '%s' fully contained within '%s'\n",
> >                                                  m->metric_name, n->metric_name);
> > -                                       metric_evlist = n->evlist;
> > +                                       contained = n->evlist;
> >                                         break;
> >                                 }
> > -
> >                         }
> >                 }
> >                 if (!metric_evlist) {
> > +                       metric_evlist = contained ? contained : m->evlist;
> > +
> >                         ret = parse_ids(metric_no_merge, fake_pmu, m->pctx, m->modifier,
> > -                                       m->group_events, tool_events, &m->evlist);
> > +                                       m->group_events, tool_events, &metric_evlist);
> 
> Won't this match the behavior of metric_no_merge/--metric-no-merge,
> since for every metric the events for that metric are being appended
> to the evlist?

TBH, I still cannot understand well parse_groups().

The change above passes &metric_evlist to parse_ids(), as this will only
change the value of metric_evlist itself but not update m->evlist or
n->evlist, this is not right for metric_no_merge case ?

Here I am not sure how to use parse_ids() to parse "m->pctx" but avoid
to overwrite "n->evlist" for the fully contained case.

Thanks,
Leo
Re: [PATCH v2] perf stat: Fix crash on arm64
Posted by Ian Rogers 1 week, 1 day ago
On Wed, Mar 25, 2026 at 3:25 AM Breno Leitao <leitao@debian.org> wrote:
>
> Perf stat is crashing on arm64 hosts with the following issue:
>
>         # make -C tools/perf DEBUG=1
>         # perf stat sleep 1
>         perf: util/evsel.c:2034: get_group_fd: Assertion `!(!leader->core.fd)' failed.
>         [1]    1220794 IOT instruction (core dumped)  ./perf stat
>
> The sorting function introduced by commit a745c0831c15c ("perf stat:
> Sort default events/metrics") compares events based on their individual
> properties. This can cause events from different groups to be
> interleaved, resulting in group members appearing before their leaders
> in the sorted evlist.
>
> When the iterator opens events in list order, a group member may be
> processed before its leader has been opened.
>
> For example, CPU_CYCLES (idx=32) with leader STALL_SLOT_BACKEND (idx=37)
> could be sorted before its leader, causing the crash when CPU_CYCLES
> tries to get its group fd from the not-yet-opened leader.
>
> Fix this by comparing events based on their leader's attributes instead
> of their own attributes when the events are in different groups. This
> ensures all members of a group share the same sort key as their leader,
> keeping groups together and guaranteeing leaders are opened before their
> members.
>
> Reported-by: Denis Yaroshevskiy <dyaroshev@meta.com>
> Fixes: a745c0831c15c ("perf stat: Sort default events/metrics")
> Tested-by: Dmitry Ilvokhin <d@ilvokhin.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>

Ah, this is the separate sorting for perf stat output and not the
parse-events sorting. This sorting happens after we've regrouped for
PMUs, etc. and so the invariants I expect shouldn't be broken by the
change. Fwiw, the Intel hybrid sorting isn't impacted:

Before:
```
$ perf stat -a sleep 1

 Performance counter stats for 'system wide':

            22,355      context-switches                 #    792.6
cs/sec  cs_per_second
         28,205.79 msec cpu-clock                        #     27.6
CPUs  CPUs_utilized
               477      cpu-migrations                   #     16.9
migrations/sec  migrations_per_second
               294      page-faults                      #     10.4
faults/sec  page_faults_per_second
         1,796,969      cpu_core/branch-misses/          #      2.0 %
branch_miss_rate
        90,819,614      cpu_core/branches/               #      3.2
M/sec  branch_frequency
       465,688,306      cpu_core/cpu-cycles/             #      0.0
GHz  cycles_frequency
       435,799,882      cpu_core/instructions/           #      0.9
instructions  insn_per_cycle
         2,239,373      cpu_atom/branch-misses/          #      4.6 %
branch_miss_rate         (49.78%)
        48,356,802      cpu_atom/branches/               #      1.7
M/sec  branch_frequency     (50.17%)
       477,205,378      cpu_atom/cpu-cycles/             #      0.0
GHz  cycles_frequency       (50.37%)
       236,677,090      cpu_atom/instructions/           #      0.5
instructions  insn_per_cycle  (50.35%)
             TopdownL1 (cpu_core)                        #      7.1 %
tma_bad_speculation
                                                         #     40.2 %
tma_frontend_bound
                                                         #     35.7 %
tma_backend_bound
                                                         #     17.1 %
tma_retiring
             TopdownL1 (cpu_atom)                        #     32.7 %
tma_backend_bound        (59.74%)
                                                         #     37.8 %
tma_frontend_bound       (59.54%)
                                                         #     17.2 %
tma_bad_speculation
                                                         #     12.3 %
tma_retiring             (59.62%)

       1.006767726 seconds time elapsed
```

After:
```
$ perf stat -a sleep 1

 Performance counter stats for 'system wide':

            21,329      context-switches                 #    758.5
cs/sec  cs_per_second
         28,120.76 msec cpu-clock                        #     27.7
CPUs  CPUs_utilized
               482      cpu-migrations                   #     17.1
migrations/sec  migrations_per_second
               217      page-faults                      #      7.7
faults/sec  page_faults_per_second
         1,606,877      cpu_core/branch-misses/          #      1.8 %
branch_miss_rate
        90,472,412      cpu_core/branches/               #      3.2
M/sec  branch_frequency
       459,566,033      cpu_core/cpu-cycles/             #      0.0
GHz  cycles_frequency
       482,577,042      cpu_core/instructions/           #      1.1
instructions  insn_per_cycle
         2,430,046      cpu_atom/branch-misses/          #      7.0 %
branch_miss_rate         (49.85%)
        35,031,345      cpu_atom/branches/               #      1.2
M/sec  branch_frequency     (50.20%)
       494,493,558      cpu_atom/cpu-cycles/             #      0.0
GHz  cycles_frequency       (50.22%)
       190,397,854      cpu_atom/instructions/           #      0.4
instructions  insn_per_cycle  (50.24%)
             TopdownL1 (cpu_core)                        #      6.6 %
tma_bad_speculation
                                                         #     35.9 %
tma_frontend_bound
                                                         #     38.2 %
tma_backend_bound
                                                         #     19.3 %
tma_retiring
             TopdownL1 (cpu_atom)                        #     30.2 %
tma_backend_bound        (59.74%)
                                                         #     38.8 %
tma_frontend_bound       (59.71%)
                                                         #     21.7 %
tma_bad_speculation
                                                         #      9.3 %
tma_retiring             (59.68%)

       1.005096844 seconds time elapsed
```

Tested-by: Ian Rogers <irogers@google.com>

Thanks,
Ian


> ---
> Cc; linux-arm-kernel@lists.infradead.org
> ---
> Changes in v2:
> - No changes from V1, just resending the exact same patch.
> - Link to v1: https://patch.msgid.link/20260205-perf_stat-v1-1-e433b0c918af@debian.org
> ---
>  tools/perf/builtin-stat.c | 26 +++++++++++++++++---------
>  1 file changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 73c2ba7e30760..a97b9d0de3f58 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1917,25 +1917,33 @@ static int default_evlist_evsel_cmp(void *priv __maybe_unused,
>         const struct evsel *lhs = container_of(lhs_core, struct evsel, core);
>         const struct perf_evsel *rhs_core = container_of(r, struct perf_evsel, node);
>         const struct evsel *rhs = container_of(rhs_core, struct evsel, core);
> +       const struct evsel *lhs_leader = evsel__leader(lhs);
> +       const struct evsel *rhs_leader = evsel__leader(rhs);
>
> -       if (evsel__leader(lhs) == evsel__leader(rhs)) {
> +       if (lhs_leader == rhs_leader) {
>                 /* Within the same group, respect the original order. */
>                 return lhs_core->idx - rhs_core->idx;
>         }
>
> +       /*
> +        * Compare using leader's attributes so that all members of a group
> +        * stay together. This ensures leaders are opened before their members.
> +        */
> +
>         /* Sort default metrics evsels first, and default show events before those. */
> -       if (lhs->default_metricgroup != rhs->default_metricgroup)
> -               return lhs->default_metricgroup ? -1 : 1;
> +       if (lhs_leader->default_metricgroup != rhs_leader->default_metricgroup)
> +               return lhs_leader->default_metricgroup ? -1 : 1;
>
> -       if (lhs->default_show_events != rhs->default_show_events)
> -               return lhs->default_show_events ? -1 : 1;
> +       if (lhs_leader->default_show_events != rhs_leader->default_show_events)
> +               return lhs_leader->default_show_events ? -1 : 1;
>
>         /* Sort by PMU type (prefers legacy types first). */
> -       if (lhs->pmu != rhs->pmu)
> -               return lhs->pmu->type - rhs->pmu->type;
> +       if (lhs_leader->pmu != rhs_leader->pmu)
> +               return lhs_leader->pmu->type - rhs_leader->pmu->type;
>
> -       /* Sort by name. */
> -       return strcmp(evsel__name((struct evsel *)lhs), evsel__name((struct evsel *)rhs));
> +       /* Sort by leader's name. */
> +       return strcmp(evsel__name((struct evsel *)lhs_leader),
> +                     evsel__name((struct evsel *)rhs_leader));
>  }
>
>  /*
>
> ---
> base-commit: 85964cdcad0fac9a0eb7b87a0f9d88cc074b854c
> change-id: 20260205-perf_stat-a0a2a37e21c5
>
> Best regards,
> --
> Breno Leitao <leitao@debian.org>
>