[PATCH v1] perf x86 test: Update hybrid expectations

Ian Rogers posted 1 patch 1 year, 11 months ago
[PATCH v1] perf x86 test: Update hybrid expectations
Posted by Ian Rogers 1 year, 11 months ago
The legacy events cpu-cycles and instructions have sysfs event
equivalents on x86 (see /sys/devices/cpu_core/events). As sysfs/JSON
events now have higher priority than legacy events, the hybrid test
expectations are no longer met. To fix this, switch to legacy events
that don't have sysfs versions: cpu-cycles becomes cycles and
instructions becomes branches.

Fixes: a24d9d9dc096 ("perf parse-events: Make legacy events lower priority than sysfs/JSON")
Signed-off-by: Ian Rogers <irogers@google.com>
---
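For illustration only, not part of the change: a minimal sketch of the
lookup order described above. It assumes the cpu_core sysfs events
directory provides cpu-cycles and instructions but not cycles or
branches, as the commit message states. The resolve() helper and both
tables are hypothetical and are not perf's parse-events code; the
numeric values are those of enum perf_hw_id in the perf_event.h uapi
header.

```c
#include <stddef.h>
#include <string.h>

/* Values match enum perf_hw_id in include/uapi/linux/perf_event.h. */
enum { HW_CPU_CYCLES = 0, HW_INSTRUCTIONS = 1, HW_BRANCH_INSTRUCTIONS = 4 };

enum source { SRC_NONE, SRC_SYSFS, SRC_LEGACY };

struct legacy { const char *name; int config; };

/* Illustrative subset of the legacy hardware event names. */
static const struct legacy legacy_tbl[] = {
	{ "cpu-cycles",   HW_CPU_CYCLES },
	{ "cycles",       HW_CPU_CYCLES },
	{ "instructions", HW_INSTRUCTIONS },
	{ "branches",     HW_BRANCH_INSTRUCTIONS },
};

/* Assumed contents of /sys/devices/cpu_core/events, cut down to two names. */
static const char *const sysfs_tbl[] = { "cpu-cycles", "instructions" };

/* Hypothetical helper: sysfs/JSON names win, legacy names are the fallback. */
static enum source resolve(const char *name, int *legacy_config)
{
	size_t i;

	for (i = 0; i < sizeof(sysfs_tbl) / sizeof(sysfs_tbl[0]); i++)
		if (!strcmp(name, sysfs_tbl[i]))
			return SRC_SYSFS;
	for (i = 0; i < sizeof(legacy_tbl) / sizeof(legacy_tbl[0]); i++)
		if (!strcmp(name, legacy_tbl[i].name)) {
			/* Only the legacy path yields a PERF_TYPE_HARDWARE config. */
			*legacy_config = legacy_tbl[i].config;
			return SRC_LEGACY;
		}
	return SRC_NONE;
}
```

With these tables, cpu-cycles and instructions resolve through sysfs,
so a test asserting PERF_TYPE_HARDWARE no longer sees the config it
expects, while cycles and branches still take the legacy path.
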
 tools/perf/arch/x86/tests/hybrid.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tools/perf/arch/x86/tests/hybrid.c b/tools/perf/arch/x86/tests/hybrid.c
index eb152770f148..05a5f81e8167 100644
--- a/tools/perf/arch/x86/tests/hybrid.c
+++ b/tools/perf/arch/x86/tests/hybrid.c
@@ -47,7 +47,7 @@ static int test__hybrid_hw_group_event(struct evlist *evlist)
 	evsel = evsel__next(evsel);
 	TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
 	TEST_ASSERT_VAL("wrong hybrid type", test_hybrid_type(evsel, PERF_TYPE_RAW));
-	TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_INSTRUCTIONS));
+	TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_BRANCH_INSTRUCTIONS));
 	TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader));
 	return TEST_OK;
 }
@@ -102,7 +102,7 @@ static int test__hybrid_group_modifier1(struct evlist *evlist)
 	evsel = evsel__next(evsel);
 	TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
 	TEST_ASSERT_VAL("wrong hybrid type", test_hybrid_type(evsel, PERF_TYPE_RAW));
-	TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_INSTRUCTIONS));
+	TEST_ASSERT_VAL("wrong config", test_config(evsel, PERF_COUNT_HW_BRANCH_INSTRUCTIONS));
 	TEST_ASSERT_VAL("wrong leader", evsel__has_leader(evsel, leader));
 	TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
 	TEST_ASSERT_VAL("wrong exclude_kernel", evsel->core.attr.exclude_kernel);
@@ -171,27 +171,27 @@ struct evlist_test {
 
 static const struct evlist_test test__hybrid_events[] = {
 	{
-		.name  = "cpu_core/cpu-cycles/",
+		.name  = "cpu_core/cycles/",
 		.check = test__hybrid_hw_event_with_pmu,
 		/* 0 */
 	},
 	{
-		.name  = "{cpu_core/cpu-cycles/,cpu_core/instructions/}",
+		.name  = "{cpu_core/cycles/,cpu_core/branches/}",
 		.check = test__hybrid_hw_group_event,
 		/* 1 */
 	},
 	{
-		.name  = "{cpu-clock,cpu_core/cpu-cycles/}",
+		.name  = "{cpu-clock,cpu_core/cycles/}",
 		.check = test__hybrid_sw_hw_group_event,
 		/* 2 */
 	},
 	{
-		.name  = "{cpu_core/cpu-cycles/,cpu-clock}",
+		.name  = "{cpu_core/cycles/,cpu-clock}",
 		.check = test__hybrid_hw_sw_group_event,
 		/* 3 */
 	},
 	{
-		.name  = "{cpu_core/cpu-cycles/k,cpu_core/instructions/u}",
+		.name  = "{cpu_core/cycles/k,cpu_core/branches/u}",
 		.check = test__hybrid_group_modifier1,
 		/* 4 */
 	},
-- 
2.43.0.472.g3155946c3a-goog
Re: [PATCH v1] perf x86 test: Update hybrid expectations
Posted by Arnaldo Carvalho de Melo 1 year, 11 months ago
On Tue, Jan 02, 2024 at 01:57:32PM -0800, Ian Rogers wrote:
> The legacy events cpu-cycles and instructions have sysfs event
> equivalents on x86 (see /sys/devices/cpu_core/events). As sysfs/JSON
> events are now higher in priority than legacy events this causes the
> hybrid test expectations not to be met. To fix this switch to legacy
> events that don't have sysfs versions, namely cpu-cycles becomes
> cycles and instructions becomes branches.
> 
> Fixes: a24d9d9dc096 ("perf parse-events: Make legacy events lower priority than sysfs/JSON")
> Signed-off-by: Ian Rogers <irogers@google.com>

With it:

root@number:/home/acme# perf test hybrid
 71: Intel PT                                                        :
 71.2: Intel PT hybrid CPU compatibility                             : Ok
 75: x86 hybrid                                                      : Ok
root@number:/home/acme#

Applied.

Next, to look at these failures on this hybrid system (14700K):

101: perf all metricgroups test                                      : FAILED!

Testing Mem
event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu_core!3UNC_ARB_DAT_OCCUPANCY.RD!0cmask!21!3/,UNC_ARB_DAT_OCCUPANCY.RD/metric-id=UNC_ARB_DAT_OCCUPANCY.RD/}:W,du..'
                               \___ Bad event or PMU

Unable to find PMU or event on a PMU of 'cpu_core'

Initial error:
event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu_core!3UNC_ARB_DAT_OCCUPANCY.RD!0cmask!21!3/,UNC_ARB_DAT_OCCUPANCY.RD/metric-id=UNC_ARB_DAT_OCCUPANCY.RD/}:W,du..'
                               \___ unknown term 'UNC_ARB_DAT_OCCUPANCY.RD' for pmu 'cpu_core'

valid terms: event,pc,edge,offcore_rsp,ldlat,inv,umask,frontend,cmask,config,config1,config2,config3,name,period,percore,metric-id
test child finished with -1
---- end ----
perf all metricgroups test: FAILED!
root@number:/home/acme# grep -m1 "model name" /proc/cpuinfo 
model name	: Intel(R) Core(TM) i7-14700K
root@number:/home/acme# 


root@number:/home/acme# ls -la /sys/devices/uncore_
uncore_arb_0/              uncore_cbox_1/             uncore_cbox_2/             uncore_cbox_5/             uncore_cbox_8/             uncore_imc_0/              uncore_imc_free_running_1/
uncore_arb_1/              uncore_cbox_10/            uncore_cbox_3/             uncore_cbox_6/             uncore_cbox_9/             uncore_imc_1/              
uncore_cbox_0/             uncore_cbox_11/            uncore_cbox_4/             uncore_cbox_7/             uncore_clock/              uncore_imc_free_running_0/ 
root@number:/home/acme# ls -la /sys/devices/uncore_


102: perf all metrics test                                           : FAILED!

event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu..'
                               \___ Bad event or PMU

Unable to find PMU or event on a PMU of 'cpu_core'

Initial error:
event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu..'
                               \___ unknown term 'UNC_ARB_DAT_OCCUPANCY.RD' for pmu 'cpu_core'

valid terms: event,pc,edge,offcore_rsp,ldlat,inv,umask,frontend,cmask,config,config1,config2,config3,name,period,percore,metric-id


Testing UNCORE_FREQ
Metric 'UNCORE_FREQ' not printed in:
event syntax error: '{tma_info_system_socket_clks/metric-id=tma_info_system_s..'
                      \___ Bad event or PMU

Unable to find PMU or event on a PMU of 'tma_info_system_socket_clks'

Initial error:
event syntax error: '{tma_info_system_socket_clks/metric-id=tma_info_system_s..'
                      \___ Cannot find PMU `tma_info_system_socket_clks'. Missing kernel support?
Testing tma_info_system_socket_clks

- Arnaldo
Re: [PATCH v1] perf x86 test: Update hybrid expectations
Posted by Ian Rogers 1 year, 11 months ago
On Wed, Jan 3, 2024 at 8:42 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Tue, Jan 02, 2024 at 01:57:32PM -0800, Ian Rogers wrote:
> > The legacy events cpu-cycles and instructions have sysfs event
> > equivalents on x86 (see /sys/devices/cpu_core/events). As sysfs/JSON
> > events are now higher in priority than legacy events this causes the
> > hybrid test expectations not to be met. To fix this switch to legacy
> > events that don't have sysfs versions, namely cpu-cycles becomes
> > cycles and instructions becomes branches.
> >
> > Fixes: a24d9d9dc096 ("perf parse-events: Make legacy events lower priority than sysfs/JSON")
> > Signed-off-by: Ian Rogers <irogers@google.com>
>
> With it:
>
> root@number:/home/acme# perf test hybrid
>  71: Intel PT                                                        :
>  71.2: Intel PT hybrid CPU compatibility                             : Ok
>  75: x86 hybrid                                                      : Ok
> root@number:/home/acme#
>
> Applied.
>
> Now to look at this on this hybrid system (14700K):
>
> 101: perf all metricgroups test                                      : FAILED!
>
> Testing Mem
> event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu_core!3UNC_ARB_DAT_OCCUPANCY.RD!0cmask!21!3/,UNC_ARB_DAT_OCCUPANCY.RD/metric-id=UNC_ARB_DAT_OCCUPANCY.RD/}:W,du..'
>                                \___ Bad event or PMU
>
> Unable to find PMU or event on a PMU of 'cpu_core'
>
> Initial error:
> event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu_core!3UNC_ARB_DAT_OCCUPANCY.RD!0cmask!21!3/,UNC_ARB_DAT_OCCUPANCY.RD/metric-id=UNC_ARB_DAT_OCCUPANCY.RD/}:W,du..'
>                                \___ unknown term 'UNC_ARB_DAT_OCCUPANCY.RD' for pmu 'cpu_core'
>
> valid terms: event,pc,edge,offcore_rsp,ldlat,inv,umask,frontend,cmask,config,config1,config2,config3,name,period,percore,metric-id
> test child finished with -1
> ---- end ----
> perf all metricgroups test: FAILED!
> root@number:/home/acme# grep -m1 "model name" /proc/cpuinfo
> model name      : Intel(R) Core(TM) i7-14700K
> root@number:/home/acme#
>
>
> root@number:/home/acme# ls -la /sys/devices/uncore_
> uncore_arb_0/              uncore_cbox_1/             uncore_cbox_2/             uncore_cbox_5/             uncore_cbox_8/             uncore_imc_0/              uncore_imc_free_running_1/
> uncore_arb_1/              uncore_cbox_10/            uncore_cbox_3/             uncore_cbox_6/             uncore_cbox_9/             uncore_imc_1/
> uncore_cbox_0/             uncore_cbox_11/            uncore_cbox_4/             uncore_cbox_7/             uncore_clock/              uncore_imc_free_running_0/
> root@number:/home/acme# ls -la /sys/devices/uncore_
>
>
> 102: perf all metrics test                                           : FAILED!
>
> event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu..'
>                                \___ Bad event or PMU
>
> Unable to find PMU or event on a PMU of 'cpu_core'
>
> Initial error:
> event syntax error: '{cpu_core/UNC_ARB_DAT_OCCUPANCY.RD,cmask=1,metric-id=cpu..'
>                                \___ unknown term 'UNC_ARB_DAT_OCCUPANCY.RD' for pmu 'cpu_core'
>
> valid terms: event,pc,edge,offcore_rsp,ldlat,inv,umask,frontend,cmask,config,config1,config2,config3,name,period,percore,metric-id

I'll take a look. UNC_ARB* events use the uncore_arb_* PMUs, so the
cpu_core PMU shouldn't be specified for them. This looks like a bug in
how the metric is generated.
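
To make the mismatch concrete, here is a small sketch of the kind of
name check that separates the uncore_arb_0/uncore_arb_1 directories in
the ls output above from cpu_core. pmu_name_matches() is a hypothetical
helper written for this mail, not perf's actual PMU matching code, and
it only handles a "<base>" or "<base>_<digits>" naming scheme.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true if the sysfs PMU name is "<base>" or
 * "<base>_<digits>", e.g. "uncore_arb_0" for the base "uncore_arb".
 */
static bool pmu_name_matches(const char *pmu, const char *base)
{
	size_t len = strlen(base);
	const char *p;

	if (strncmp(pmu, base, len))
		return false;
	p = pmu + len;
	if (*p == '\0')
		return true;
	/* Anything after the base must be '_' followed by digits only. */
	if (*p != '_' || p[1] == '\0')
		return false;
	for (p++; *p; p++)
		if (!isdigit((unsigned char)*p))
			return false;
	return true;
}
```

An UNC_ARB* event should end up on a PMU for which a check like
pmu_name_matches(name, "uncore_arb") holds, which is never the case for
cpu_core.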

> Testing UNCORE_FREQ
> Metric 'UNCORE_FREQ' not printed in:
> event syntax error: '{tma_info_system_socket_clks/metric-id=tma_info_system_s..'
>                       \___ Bad event or PMU
>
> Unable to find PMU or event on a PMU of 'tma_info_system_socket_clks'
>
> Initial error:
> event syntax error: '{tma_info_system_socket_clks/metric-id=tma_info_system_s..'
>                       \___ Cannot find PMU `tma_info_system_socket_clks'. Missing kernel support?
> Testing tma_info_system_socket_clks

This is a similar bug, but not the same one, as mismatched PMUs aren't involved here:
https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py#L1459

I also see what may be a PMU driver bug in:
```
...
Metric 'tma_slow_pause' not printed in:
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
 Average synthesis took: 11.657 usec (+- 0.039 usec)
 Average num. events: 4.000 (+- 0.000)
 Average time per event 2.914 usec
 Average data synthesis took: 11.832 usec (+- 0.037 usec)
 Average num. events: 13.000 (+- 0.000)
 Average time per event 0.910 usec

Performance counter stats for 'perf bench internals synthesize':

    <not counted>      cpu_core/TOPDOWN.SLOTS/                         (0.00%)
    <not counted>      cpu_core/topdown-retiring/                      (0.00%)
    <not counted>      cpu_core/topdown-mem-bound/                     (0.00%)
    <not counted>      cpu_core/topdown-bad-spec/                      (0.00%)
    <not counted>      cpu_core/topdown-fe-bound/                      (0.00%)
    <not counted>      cpu_core/topdown-be-bound/                      (0.00%)
    <not counted>      cpu_core/RESOURCE_STALLS.SCOREBOARD/            (0.00%)
    <not counted>      cpu_core/EXE_ACTIVITY.1_PORTS_UTIL/             (0.00%)
    <not counted>      cpu_core/EXE_ACTIVITY.BOUND_ON_LOADS/           (0.00%)
    <not counted>      cpu_core/CPU_CLK_UNHALTED.PAUSE/                (0.00%)
    <not counted>      cpu_core/CYCLE_ACTIVITY.STALLS_TOTAL/           (0.00%)
    <not counted>      cpu_core/CPU_CLK_UNHALTED.THREAD/               (0.00%)
    <not counted>      cpu_core/ARITH.DIV_ACTIVE/                      (0.00%)
    <not counted>      cpu_core/EXE_ACTIVITY.2_PORTS_UTIL,umask=0xc/   (0.00%)
    <not counted>      cpu_core/EXE_ACTIVITY.3_PORTS_UTIL,umask=0x80/  (0.00%)

      0.327060340 seconds time elapsed

      0.114906000 seconds user
      0.210001000 seconds sys
...
```

as adding --metric-no-group fixes the issue. Adding --metric-no-group
shouldn't be necessary, as perf_event_open should fail and so cause
the weak group to be broken up (hence the possible PMU driver bug).
Perhaps something is wrong with weak group breaking on hybrid.
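
For reference, a rough sketch of the weak group fallback I have in
mind. open_fn stands in for perf_event_open, and open_weak_group() and
fake_open() are hypothetical names used only for illustration; the real
logic in perf also has to close and reopen file descriptors, which this
leaves out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for perf_event_open(): returns true on success. */
typedef bool (*open_fn)(size_t idx, bool grouped, void *ctx);

/*
 * Try to open all events as one group; if any grouped open fails, break
 * the group and retry every event on its own. Returns true if the group
 * was kept. *opened is set to the number of events finally opened.
 */
static bool open_weak_group(size_t nr, open_fn do_open, void *ctx,
			    size_t *opened)
{
	size_t i;

	*opened = 0;
	for (i = 0; i < nr; i++)
		if (!do_open(i, true, ctx))
			break;
	if (i == nr) {
		*opened = nr;
		return true;
	}
	/* Break the group: real code would close the grouped fds first. */
	for (i = 0; i < nr; i++)
		if (do_open(i, false, ctx))
			(*opened)++;
	return false;
}

/*
 * Fake opener for demonstration: a grouped open fails once the event
 * index reaches the counter budget passed in through ctx.
 */
static bool fake_open(size_t idx, bool grouped, void *ctx)
{
	size_t budget = *(size_t *)ctx;

	return !grouped || idx < budget;
}
```

If the grouped opens report success for a group that can never be
scheduled, the fallback path is not taken, which would be consistent
with every event in the output above showing <not counted>.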

Thanks,
Ian

> - Arnaldo