For consistency with:
https://github.com/intel/perfmon-metrics
rename the topdown TMA metrics from Frontend_Bound to tma_frontend_bound.

The _SMT suffix metrics are dropped as #SMT_On and #EBS_Mode are
correctly expanded in the single main metric. Fix perf expr to
allow a double if to be correctly processed.

Add all 6 levels of TMA metrics. Child metrics are placed in a group
named after their parent, allowing the children of a metric to be
easily measured using the metric name with a _group suffix.

Don't drop TMA metrics if they contain topdown events.

The ## and ##? operators are correctly expanded.

The locate-with column is added to the long description describing a
sampling event.

Metrics are written in terms of other metrics to reduce the expression
size and increase readability.

Following this the pmu-events/arch/x86 directories match those created
by the script at:
https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py
with updates at:
https://github.com/captain5050/event-converter-for-linux-perf

v3. Fix a parse metrics test failure due to making metrics referring
    to other metrics case sensitive - make the cases in the test
    metric match.
v2. Fixes commit message wrt missing mapfile.csv updates as noted by
    Zhengjun Xing <zhengjun.xing@linux.intel.com>. ScaleUnit is added
    for TMA metrics. Metrics with topdown events have a missing
    slots event added if necessary. The latest metrics at:
    https://github.com/intel/perfmon-metrics are used, however, the
    event-converter-for-linux-perf scripts now prefer their own
    metrics in case of mismatched units when a metric is written in
    terms of another. Additional testing was performed on broadwell,
    broadwellde, cascadelakex, haswellx, sapphirerapids and tigerlake
    CPUs.
Ian Rogers (23):
  perf expr: Allow a double if expression
  perf test: Adjust case of test metrics
  perf expr: Remove jevents case workaround
  perf metrics: Don't scale counts going into metrics
  perf vendor events: Update Intel skylakex
  perf vendor events: Update Intel alderlake
  perf vendor events: Update Intel broadwell
  perf vendor events: Update Intel broadwellx
  perf vendor events: Update Intel cascadelakex
  perf vendor events: Update elkhartlake cpuids
  perf vendor events: Update Intel haswell
  perf vendor events: Update Intel haswellx
  perf vendor events: Update Intel icelake
  perf vendor events: Update Intel icelakex
  perf vendor events: Update Intel ivybridge
  perf vendor events: Update Intel ivytown
  perf vendor events: Update Intel jaketown
  perf vendor events: Update Intel sandybridge
  perf vendor events: Update Intel sapphirerapids
  perf vendor events: Update silvermont cpuids
  perf vendor events: Update Intel skylake
  perf vendor events: Update Intel tigerlake
  perf vendor events: Update Intel broadwellde

 .../arch/test/test_soc/cpu/metrics.json | 6 +-
 .../arch/x86/alderlake/adl-metrics.json | 1353 ++++++++++++++++-
 .../pmu-events/arch/x86/alderlake/cache.json | 129 +-
 .../arch/x86/alderlake/frontend.json | 12 +
 .../pmu-events/arch/x86/alderlake/memory.json | 22 +
 .../pmu-events/arch/x86/alderlake/other.json | 22 +
 .../arch/x86/alderlake/pipeline.json | 14 +-
 .../arch/x86/broadwell/bdw-metrics.json | 679 +++++++--
 .../arch/x86/broadwellde/bdwde-metrics.json | 711 +++++++--
 .../arch/x86/broadwellx/bdx-metrics.json | 965 +++++++-----
 .../arch/x86/broadwellx/uncore-cache.json | 10 +-
 .../x86/broadwellx/uncore-interconnect.json | 18 +-
 .../arch/x86/broadwellx/uncore-memory.json | 18 +-
 .../arch/x86/cascadelakex/clx-metrics.json | 1285 ++++++++++------
 .../arch/x86/cascadelakex/uncore-memory.json | 18 +-
 .../arch/x86/cascadelakex/uncore-other.json | 10 +-
 .../pmu-events/arch/x86/haswell/cache.json | 4 +-
 .../pmu-events/arch/x86/haswell/frontend.json | 12 +-
 .../arch/x86/haswell/hsw-metrics.json | 570 ++++++-
 .../pmu-events/arch/x86/haswellx/cache.json | 2 +-
 .../arch/x86/haswellx/frontend.json | 12 +-
 .../arch/x86/haswellx/hsx-metrics.json | 919 +++++++----
 .../x86/haswellx/uncore-interconnect.json | 18 +-
 .../arch/x86/haswellx/uncore-memory.json | 18 +-
 .../pmu-events/arch/x86/icelake/cache.json | 6 +-
 .../arch/x86/icelake/icl-metrics.json | 808 +++++++++-
 .../pmu-events/arch/x86/icelake/pipeline.json | 2 +-
 .../pmu-events/arch/x86/icelakex/cache.json | 6 +-
 .../arch/x86/icelakex/icx-metrics.json | 1155 ++++++++++----
 .../arch/x86/icelakex/pipeline.json | 2 +-
 .../arch/x86/icelakex/uncore-other.json | 2 +-
 .../arch/x86/ivybridge/ivb-metrics.json | 594 ++++++--
 .../pmu-events/arch/x86/ivytown/cache.json | 4 +-
 .../arch/x86/ivytown/floating-point.json | 2 +-
 .../pmu-events/arch/x86/ivytown/frontend.json | 18 +-
 .../arch/x86/ivytown/ivt-metrics.json | 630 ++++++--
 .../arch/x86/ivytown/uncore-cache.json | 58 +-
 .../arch/x86/ivytown/uncore-interconnect.json | 84 +-
 .../arch/x86/ivytown/uncore-memory.json | 2 +-
 .../arch/x86/ivytown/uncore-other.json | 6 +-
 .../arch/x86/ivytown/uncore-power.json | 8 +-
 .../arch/x86/jaketown/jkt-metrics.json | 327 +++-
 tools/perf/pmu-events/arch/x86/mapfile.csv | 18 +-
 .../arch/x86/sandybridge/snb-metrics.json | 315 +++-
 .../arch/x86/sapphirerapids/cache.json | 4 +-
 .../arch/x86/sapphirerapids/frontend.json | 11 +
 .../arch/x86/sapphirerapids/pipeline.json | 4 +-
 .../arch/x86/sapphirerapids/spr-metrics.json | 1249 ++++++++++-----
 .../arch/x86/skylake/skl-metrics.json | 861 ++++++++---
 .../arch/x86/skylakex/skx-metrics.json | 1262 +++++++++------
 .../arch/x86/skylakex/uncore-memory.json | 18 +-
 .../arch/x86/skylakex/uncore-other.json | 19 +-
 .../arch/x86/tigerlake/tgl-metrics.json | 810 +++++++++-
 tools/perf/pmu-events/empty-pmu-events.c | 6 +-
 tools/perf/tests/expr.c | 4 +
 tools/perf/util/expr.c | 11 +-
 tools/perf/util/expr.y | 2 +-
 tools/perf/util/stat-shadow.c | 9 +-
 58 files changed, 11514 insertions(+), 3630 deletions(-)

--
2.38.0.rc1.362.ged0d419d3c-goog
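The _group naming described in the cover letter can be sketched as follows. This is only an illustration and not part of the series: drill_down_command is a hypothetical helper, ./workload is a placeholder, and the code only builds the perf stat -M argument list rather than running perf.

```python
# Hypothetical helper (not part of perf): build the `perf stat -M` command
# line that measures the children of a TMA metric, following the "_group"
# suffix convention described in the cover letter.
from typing import List


def drill_down_command(metric: str, workload: List[str]) -> List[str]:
    """Return the argv that measures the children of `metric`."""
    group = f"{metric}_group"
    return ["perf", "stat", "-M", group, "--"] + workload


# Start at level 1, then drill into whichever metric dominates.
level1 = ["perf", "stat", "-M", "TopdownL1", "--", "./workload"]
level2 = drill_down_command("tma_backend_bound", ["./workload"])

print(" ".join(level1))
print(" ".join(level2))
```

A _group name can only be expected to resolve for a metric that has children on the CPU model in use, and only with a perf built from a tree that contains this series.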
On Mon, Oct 3, 2022 at 7:16 PM Ian Rogers <irogers@google.com> wrote:
>
> For consistency with:
> https://github.com/intel/perfmon-metrics
> rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound.
>
> [...]

I wrote up a little example of performing a top-down analysis for the
perf wiki here:
https://perf.wiki.kernel.org/index.php/Top-Down_Analysis

Thanks,
Ian
[cutting down cc list]

On 10/3/2022 8:43 PM, Ian Rogers wrote:
> On Mon, Oct 3, 2022 at 7:16 PM Ian Rogers <irogers@google.com> wrote:
>> For consistency with:
>> https://github.com/intel/perfmon-metrics
>> rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound.
>>
>> [...]
> I wrote up a little example of performing a top-down analysis for the
> perf wiki here:
> https://perf.wiki.kernel.org/index.php/Top-Down_Analysis

I did some quick testing.

On Skylake the output of L1 isn't scaled to percent:

$ ./perf stat -M TopdownL1 ~/pmu/pmu-tools/workloads/BC1s

 Performance counter stats for '/home/ak/pmu/pmu-tools/workloads/BC1s':

       608,066,701      INT_MISC.RECOVERY_CYCLES     #     0.32 Bad_Speculation   (50.02%)
     5,364,230,382      CPU_CLK_UNHALTED.THREAD      #     0.48 Retiring          (50.02%)
    10,194,062,626      UOPS_RETIRED.RETIRE_SLOTS                                 (50.02%)
    14,613,100,390      UOPS_ISSUED.ANY                                           (50.02%)
     2,928,793,077      IDQ_UOPS_NOT_DELIVERED.CORE  #     0.14 Frontend_Bound
                                                     #     0.07 Backend_Bound     (50.02%)
       604,850,703      INT_MISC.RECOVERY_CYCLES                                  (50.02%)
     5,357,291,185      CPU_CLK_UNHALTED.THREAD                                   (50.02%)
    14,618,285,580      UOPS_ISSUED.ANY                                           (50.02%)

Then if I follow the wiki example here I would expect I need to do

$ ./perf stat -M tma_backend_bound_group ~/pmu/pmu-tools/workloads/BC1s

Cannot find metric or group `tma_backend_bound_group'

but tma_retiring_group doesn't exist. So it seems the methodology isn't
fully consistent everywhere? Perhaps the wiki needs to document the
supported CPUs and also what part of the hierarchy is supported.

Another problem I noticed in the example is that the sample event didn't
specify PEBS, even though it probably should at least on Icelake+ where
every event can be used with less overhead with PEBS.

Also with all these groups that need to be specified by hand some bash
completion support for groups would be really useful.

-Andi
On Tue, Oct 4, 2022 at 10:29 AM Andi Kleen <ak@linux.intel.com> wrote:
>
> [cutting down cc list]
>
> On 10/3/2022 8:43 PM, Ian Rogers wrote:
> > I wrote up a little example of performing a top-down analysis for the
> > perf wiki here:
> > https://perf.wiki.kernel.org/index.php/Top-Down_Analysis
>
> I did some quick testing.
>
> On Skylake the output of L1 isn't scaled to percent:
>
> $ ./perf stat -M TopdownL1 ~/pmu/pmu-tools/workloads/BC1s
>
>  Performance counter stats for '/home/ak/pmu/pmu-tools/workloads/BC1s':
>
>        608,066,701      INT_MISC.RECOVERY_CYCLES     #     0.32 Bad_Speculation   (50.02%)
>      5,364,230,382      CPU_CLK_UNHALTED.THREAD      #     0.48 Retiring          (50.02%)
>     10,194,062,626      UOPS_RETIRED.RETIRE_SLOTS                                 (50.02%)
>     14,613,100,390      UOPS_ISSUED.ANY                                           (50.02%)
>      2,928,793,077      IDQ_UOPS_NOT_DELIVERED.CORE  #     0.14 Frontend_Bound
>                                                      #     0.07 Backend_Bound     (50.02%)
>        604,850,703      INT_MISC.RECOVERY_CYCLES                                  (50.02%)
>      5,357,291,185      CPU_CLK_UNHALTED.THREAD                                   (50.02%)
>     14,618,285,580      UOPS_ISSUED.ANY                                           (50.02%)

Did you build Arnaldo's perf/core branch with the changes applied? The
metrics here should be tma_bad_speculation, tma_retiring,
tma_frontend_bound, tma_backend_bound. Looking at:
https://lore.kernel.org/lkml/20221004021612.325521-22-irogers@google.com/

+        "MetricExpr": "1 - tma_frontend_bound - (UOPS_ISSUED.ANY + 4 * ((INT_MISC.RECOVERY_CYCLES_ANY / 2) if #SMT_on else INT_MISC.RECOVERY_CYCLES)) / SLOTS",
+        "MetricGroup": "TopdownL1;tma_L1_group",
+        "MetricName": "tma_backend_bound",
+        "PublicDescription": "This category represents fraction of slots where no uops are being delivered due to a lack of required resources for accepting new uops in the Backend. Backend is the portion of the processor core where the out-of-order scheduler dispatches ready uops into their respective execution units; and once completed these uops get retired according to program order. For example; stalls due to data-cache misses or stalls due to the divider unit being overloaded are both categorized under Backend Bound. Backend Bound is further divided into two main categories: Memory Bound and Core Bound.",
+        "ScaleUnit": "100%"

So it wouldn't make sense to me that the scale was missing. Fwiw, I
did test on SkylakeX but used Tigerlake for the wiki due to potential
clock domain issues with SLOTS.

> Then if I follow the wiki example here I would expect I need to do
>
> $ ./perf stat -M tma_backend_bound_group ~/pmu/pmu-tools/workloads/BC1s
>
> Cannot find metric or group `tma_backend_bound_group'
>
> but tma_retiring_group doesn't exist. So it seems the methodology isn't
> fully consistent everywhere? Perhaps the wiki needs to document the
> supported CPUs and also what part of the hierarchy is supported.

So I think you've not got Arnaldo's branch with the changes applied.
Unfortunately the instructions around '_group' are only going to apply
to Linux 6.1.

> Another problem I noticed in the example is that the sample event didn't
> specify PEBS, even though it probably should at least on Icelake+ where
> every event can be used with less overhead with PEBS.

The 'Sample with' text is just text for a description. We can change
it or put something on the wiki, what would you suggest?

> Also with all these groups that need to be specified by hand some bash
> completion support for groups would be really useful.

Ack. My expectation is that everyone starts with TopdownL1 and goes
from there, adding '_group' to the metric they want to drill into.
There are 104 topdown metrics and I'm not sure how useful expanding
all of these would be. On Icelake+ this becomes muddy due to the
unconditional printing of topdown metrics in the midst of the
regularly computed metrics; this can be seen on the wiki:
https://perf.wiki.kernel.org/index.php/Top-Down_Analysis
For example, when the level 2 metric group tma_backend_bound_group is
given, the level 1 metrics Retiring, Frontend Bound, Backend Bound and
Bad Speculation are displayed.

Thanks,
Ian
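As a cross-check of the numbers discussed in this exchange, the tma_backend_bound expression can be evaluated by hand. The sketch below is only an illustration: it plugs in the first set of counts from the Skylake run, and it assumes SMT is off, SLOTS = 4 * CPU_CLK_UNHALTED.THREAD and tma_frontend_bound = IDQ_UOPS_NOT_DELIVERED.CORE / SLOTS. Those definitions are assumptions made for the example, not taken from the patch, and the function names are hypothetical.

```python
# Counter values copied from the Skylake `perf stat -M TopdownL1` output.
counts = {
    "INT_MISC.RECOVERY_CYCLES": 608_066_701,
    "CPU_CLK_UNHALTED.THREAD": 5_364_230_382,
    "UOPS_ISSUED.ANY": 14_613_100_390,
    "IDQ_UOPS_NOT_DELIVERED.CORE": 2_928_793_077,
}

SMT_ON = False  # assumption: SMT disabled on the test machine


def slots(c):
    # Assumption: 4 issue slots per unhalted thread cycle.
    return 4 * c["CPU_CLK_UNHALTED.THREAD"]


def tma_frontend_bound(c):
    # Assumption: frontend bound = undelivered uops / slots.
    return c["IDQ_UOPS_NOT_DELIVERED.CORE"] / slots(c)


def tma_backend_bound(c, smt_on=SMT_ON):
    # Mirrors the quoted MetricExpr, including its "if #SMT_on else" branch.
    recovery = (c["INT_MISC.RECOVERY_CYCLES_ANY"] / 2 if smt_on
                else c["INT_MISC.RECOVERY_CYCLES"])
    return (1 - tma_frontend_bound(c)
            - (c["UOPS_ISSUED.ANY"] + 4 * recovery) / slots(c))


def scaled(value, scale=100.0):
    # A ScaleUnit of "100%" multiplies the ratio by 100 for display.
    return value * scale


ratio = tma_backend_bound(counts)
print(f"tma_backend_bound = {ratio:.2f} ({scaled(ratio):.1f}%)")
```

Under these assumptions the ratio rounds to the 0.07 reported as Backend_Bound in the unscaled output, which a ScaleUnit of 100% would display as roughly 6.9%.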
On Tue, Oct 04, 2022 at 10:55:56AM -0700, Ian Rogers wrote:
> On Tue, Oct 4, 2022 at 10:29 AM Andi Kleen <ak@linux.intel.com> wrote:
> > Then if I follow the wiki example here I would expect I need to do
> >
> > $ ./perf stat -M tma_backend_bound_group ~/pmu/pmu-tools/workloads/BC1s
> >
> > Cannot find metric or group `tma_backend_bound_group'
> >
> > but tma_retiring_group doesn't exist. So it seems the methodology isn't
> > fully consistent everywhere? Perhaps the wiki needs to document the
> > supported CPUs and also what part of the hierarchy is supported.
>
> So I think you've not got Arnaldo's branch with the changes applied.
> Unfortunately the instructions around '_group' are only going to apply
> to Linux 6.1.

I just pushed perf/core with Ian's v3 series, please check with that
one.

- Arnaldo
>> So I think you've not got Arnaldo's branch with the changes applied.
>> Unfortunately the instructions around '_group' are only going to apply
>> to Linux 6.1.
> I just pushed perf/core with Ian's v3 series, please check with that
> one.

Yes works with the latest branch thanks.

-Andi
On Tue, Oct 04, 2022 at 11:15:34AM -0700, Andi Kleen wrote:
> > > So I think you've not got Arnaldo's branch with the changes applied.
> > > Unfortunately the instructions around '_group' are only going to apply
> > > to Linux 6.1.
> > I just pushed perf/core with Ian's v3 series, please check with that
> > one.
>
> Yes works with the latest branch thanks.

Thanks for checking,

- Arnaldo
On Mon, Oct 03, 2022 at 07:15:49PM -0700, Ian Rogers wrote:
> For consistency with:
> https://github.com/intel/perfmon-metrics
> rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound.
>
> [...]
>
> v3. Fix a parse metrics test failure due to making metrics referring
>     to other metrics case sensitive - make the cases in the test
>     metric match.

Thanks, applied.

- Arnaldo