[PATCH v4 00/22] Python generated Intel metrics

Posted by Ian Rogers 2 months ago
Generate twenty sets of additional metrics for Intel. The RAPL and idle
metrics aren't specific to Intel but are placed here for ease and
convenience. The smi and tsx metrics are added so they can be dropped
from the per-model json files. There are four uncore sets of metrics
and eleven core metrics. Add a CheckPmu function to metric.py to
simplify detecting the presence of hybrid PMUs in events. Metrics with
experimental events are flagged as experimental in their description.
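
For illustration, a minimal sketch (hypothetical, not the exact patch
contents; the CheckPmu signature, event names and scale unit are
assumptions) of how a CheckPmu guard might gate a hybrid-only metric in
intel_metrics.py:

  from metric import CheckPmu, Event, Metric, MetricGroup, d_ratio

  def HybridCoreIpc() -> MetricGroup:
    # Hypothetical metric: only emitted when the hybrid cpu_core PMU
    # appears in the json events loaded for this model.
    if not CheckPmu("cpu_core"):
      return MetricGroup("hybrid_example", [])
    cycles = Event("cpu_core/cycles/")
    insts = Event("cpu_core/instructions/")
    return MetricGroup("hybrid_example", [
        Metric("core_ipc",
               "Instructions per cycle on the performance cores",
               d_ratio(insts, cycles), "insn/cycle"),
    ])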

The patches should be applied on top of:
https://lore.kernel.org/lkml/20240926174101.406874-1-irogers@google.com/

v4. Experimental metric descriptions. Add mesh bandwidth metric. Rebase.
v3. Swap the tsx and CheckPmu patches that were in the wrong order. Some
    minor code cleanup changes. Drop reference to merged fix for
    umasks/occ_sel in PCU events and for cstate metrics.
v2. Drop the cycles breakdown in favor of having it as a common
    metric, spelling and other improvements suggested by Kan Liang
    <kan.liang@linux.intel.com>.

Ian Rogers (22):
  perf jevents: Add RAPL metrics for all Intel models
  perf jevents: Add idle metric for Intel models
  perf jevents: Add smi metric group for Intel models
  perf jevents: Add CheckPmu to see if a PMU is in loaded json events
  perf jevents: Mark metrics with experimental events as experimental
  perf jevents: Add tsx metric group for Intel models
  perf jevents: Add br metric group for branch statistics on Intel
  perf jevents: Add software prefetch (swpf) metric group for Intel
  perf jevents: Add ports metric group giving utilization on Intel
  perf jevents: Add L2 metrics for Intel
  perf jevents: Add load store breakdown metrics ldst for Intel
  perf jevents: Add ILP metrics for Intel
  perf jevents: Add context switch metrics for Intel
  perf jevents: Add FPU metrics for Intel
  perf jevents: Add Miss Level Parallelism (MLP) metric for Intel
  perf jevents: Add mem_bw metric for Intel
  perf jevents: Add local/remote "mem" breakdown metrics for Intel
  perf jevents: Add dir breakdown metrics for Intel
  perf jevents: Add C-State metrics from the PCU PMU for Intel
  perf jevents: Add local/remote miss latency metrics for Intel
  perf jevents: Add upi_bw metric for Intel
  perf jevents: Add mesh bandwidth saturation metric for Intel

 tools/perf/pmu-events/intel_metrics.py | 1046 +++++++++++++++++++++++-
 tools/perf/pmu-events/metric.py        |   52 ++
 2 files changed, 1095 insertions(+), 3 deletions(-)

-- 
2.46.1.824.gd892dcdcdd-goog
Re: [PATCH v4 00/22] Python generated Intel metrics
Posted by Liang, Kan 2 months ago

On 2024-09-26 1:50 p.m., Ian Rogers wrote:
> Generate twenty sets of additional metrics for Intel. The RAPL and idle
> metrics aren't specific to Intel but are placed here for ease and
> convenience. The smi and tsx metrics are added so they can be dropped
> from the per-model json files.

Are smi and tsx the only two metrics whose duplicates in the json files
will be dropped?

It sounds like there will be many duplicate metrics in perf list, right?

Also, is it an attempt to define some architectural metrics for perf?
How do you decide which metrics should be added here?

Thanks,
Kan

Re: [PATCH v4 00/22] Python generated Intel metrics
Posted by Ian Rogers 1 month, 2 weeks ago
On Fri, Sep 27, 2024 at 11:34 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>
>
> On 2024-09-26 1:50 p.m., Ian Rogers wrote:
> > Generate twenty sets of additional metrics for Intel. The RAPL and idle
> > metrics aren't specific to Intel but are placed here for ease and
> > convenience. The smi and tsx metrics are added so they can be dropped
> > from the per-model json files.
>
> Are smi and tsx the only two metrics whose duplicates in the json files
> will be dropped?

Yes. These metrics, with their runtime detection and use of sysfs event
names, feel to me to fit more naturally here than in the Intel perfmon
github converter script.

> It sounds like there will be many duplicate metrics in perf list, right?

That's not the goal. Memory bandwidth may be computed in different
ways, for example via TMA and via uncore counters, but that seems okay
as the metrics use different counters and so may say different things. I
think there is a standing action to watch the metrics and ensure
duplicates don't occur, but some duplication can be beneficial.

> Also, is it an attempt to define some architectural metrics for perf?

There are many advantages to using python to generate the metric json,
a few being:
1) we verify that the metrics use events from the event json,
2) the error-prone escaping of commas and slashes is handled by the python,
3) metric expressions can be spread over multiple lines and have comments.
It is also an advantage that we avoid copy-pasting a metric from one
architecture's metric json to another. This helps propagate fixes.
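
As an illustrative sketch of what those three points buy (a hypothetical
metric, not one from the series; the event names are real Intel events
but the exact helper usage is an assumption), using metric.py's Event
and Metric helpers:

  from metric import Event, Metric, d_ratio

  def ExampleBranchMissRate() -> Metric:
    # Misspelling either event name fails at json generation time (1),
    # rather than silently producing a broken expression at runtime.
    branches = Event("BR_INST_RETIRED.ALL_BRANCHES")
    mispred = Event("BR_MISP_RETIRED.ALL_BRANCHES")
    # The expression below can span lines and carry comments (3); the
    # escaping of commas and slashes in the emitted json is handled by
    # metric.py (2).
    return Metric("branch_miss_rate",
                  "Fraction of retired branches that were mispredicted",
                  d_ratio(mispred, branches), "100%")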

So it's not so much a goal to have architectural metrics, but it's nice
that we avoid copy-paste. One place where I've tried to set up common
events across all architectures is in making the tool events have their
own PMU. Rather than describing the tool PMU's events using custom code,
it just reuses the existing PMU json support:
https://github.com/googleprodkernel/linux-perf/blob/google_tools_master/tools/perf/pmu-events/arch/common/common/tool.json

> How do you decide which metrics should be added here?

The goal is to open source metrics that Google has internally. I've
set up a git repo for this here:
https://github.com/googleprodkernel/linux-perf
Often the source of a metric is Intel's documentation on things like
uncore events; it's just that such metrics aren't part of the perfmon
process, and so we're adding them here. Were all these metrics on the
Intel github, it would be reasonable to remove them from here. If Intel
would like to work on or contribute some metrics here, that's also
fine. I think the main thing is to give users useful metrics.

Thanks,
Ian

Re: [PATCH v4 00/22] Python generated Intel metrics
Posted by Liang, Kan 3 weeks, 1 day ago

On 2024-10-09 12:02 p.m., Ian Rogers wrote:
> On Fri, Sep 27, 2024 at 11:34 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>>
>>
>> On 2024-09-26 1:50 p.m., Ian Rogers wrote:
>>> Generate twenty sets of additional metrics for Intel. The RAPL and idle
>>> metrics aren't specific to Intel but are placed here for ease and
>>> convenience. The smi and tsx metrics are added so they can be dropped
>>> from the per-model json files.
>>
>> Are smi and tsx the only two metrics whose duplicates in the json files
>> will be dropped?
> 
> Yes. These metrics, with their runtime detection and use of sysfs event
> names, feel to me to fit more naturally here than in the Intel perfmon
> github converter script.
> 
>> It sounds like there will be many duplicate metrics in perf list, right?
> 
> That's not the goal. Memory bandwidth may be computed in different
> ways, for example via TMA and via uncore counters, but that seems okay
> as the metrics use different counters and so may say different things. I
> think there is a standing action to watch the metrics and ensure
> duplicates don't occur, but some duplication can be beneficial.


Can we give a common prefix to all the automatically generated metrics,
e.g., general_ or std_?
As you said, there may be different metrics that calculate the same thing.

With a common prefix, we can clearly understand where a metric is
from. In case any issues are found later for some metrics, I can
tell the end user to use either the TMA metrics or the automatically
generated metrics.
If they count the same thing, the main body of the metric name should be
the same.

Thanks,
Kan

Re: [PATCH v4 00/22] Python generated Intel metrics
Posted by Ian Rogers 2 weeks ago
On Wed, Nov 6, 2024 at 8:47 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>
>
> On 2024-10-09 12:02 p.m., Ian Rogers wrote:
> > On Fri, Sep 27, 2024 at 11:34 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
> >>
> >>
> >>
> >> On 2024-09-26 1:50 p.m., Ian Rogers wrote:
> >>> Generate twenty sets of additional metrics for Intel. The RAPL and idle
> >>> metrics aren't specific to Intel but are placed here for ease and
> >>> convenience. The smi and tsx metrics are added so they can be dropped
> >>> from the per-model json files.
> >>
> >> Are smi and tsx the only two metrics whose duplicates in the json files
> >> will be dropped?
> >
> > Yes. These metrics, with their runtime detection and use of sysfs event
> > names, feel to me to fit more naturally here than in the Intel perfmon
> > github converter script.
> >
> >> It sounds like there will be many duplicate metrics in perf list, right?
> >
> > That's not the goal. Memory bandwidth may be computed in different
> > ways, for example via TMA and via uncore counters, but that seems okay
> > as the metrics use different counters and so may say different things. I
> > think there is a standing action to watch the metrics and ensure
> > duplicates don't occur, but some duplication can be beneficial.
>
>
> Can we give a common prefix to all the automatically generated metrics,
> e.g., general_ or std_?
> As you said, there may be different metrics that calculate the same thing.
>
> With a common prefix, we can clearly understand where a metric is
> from. In case any issues are found later for some metrics, I can
> tell the end user to use either the TMA metrics or the automatically
> generated metrics.
> If they count the same thing, the main body of the metric name should be
> the same.

I'm reminded of the default events, where some of the set fail on AMD,
and of AMD calling their topdown-like metrics things like PipelineL1
and PipelineL2 rather than the Intel names of TopdownL1 and TopdownL2.
Like you, I have a desire for consistent naming; it just seems we
always get pulled away from it.

I'm going to post a v5 of these changes (we carry them in
https://github.com/googleprodkernel/linux-perf) but I'll not vary the
naming for now.

Thanks,
Ian