[PATCH v8 15/52] perf jevents: Add RAPL event metric for AMD zen models

Ian Rogers posted 52 patches 3 weeks, 4 days ago
There is a newer version of this series
[PATCH v8 15/52] perf jevents: Add RAPL event metric for AMD zen models
Posted by Ian Rogers 3 weeks, 4 days ago
Add power per second metrics based on RAPL.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/pmu-events/amd_metrics.py | 31 +++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
index bc91d9c120fa..b6cdeb4f09fe 100755
--- a/tools/perf/pmu-events/amd_metrics.py
+++ b/tools/perf/pmu-events/amd_metrics.py
@@ -1,13 +1,36 @@
 #!/usr/bin/env python3
 # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
 import argparse
+import math
 import os
-from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
-                    MetricGroup)
+from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
+                    LoadEvents, Metric, MetricGroup, Select)
 
 # Global command line arguments.
 _args = None
 
+interval_sec = Event("duration_time")
+
+
+def Rapl() -> MetricGroup:
+    """Processor socket power consumption estimate.
+
+    Use events from the running average power limit (RAPL) driver.
+    """
+    # Watts = joules/second
+    # Currently only energy-pkg is supported by AMD:
+    # https://lore.kernel.org/lkml/20220105185659.643355-1-eranian@google.com/
+    pkg = Event("power/energy\\-pkg/")
+    cond_pkg = Select(pkg, has_event(pkg), math.nan)
+    scale = 2.3283064365386962890625e-10
+    metrics = [
+        Metric("lpm_cpu_power_pkg", "",
+               d_ratio(cond_pkg * scale, interval_sec), "Watts"),
+    ]
+
+    return MetricGroup("lpm_cpu_power", metrics,
+                       description="Processor socket power consumption estimates")
+
 
 def main() -> None:
     global _args
@@ -33,7 +56,9 @@ def main() -> None:
     directory = f"{_args.events_path}/x86/{_args.model}/"
     LoadEvents(directory)
 
-    all_metrics = MetricGroup("", [])
+    all_metrics = MetricGroup("", [
+        Rapl(),
+    ])
 
     if _args.metricgroups:
         print(JsonEncodeMetricGroupDescriptions(all_metrics))
-- 
2.51.2.1041.gc1ab5b90ca-goog
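
For reference, the hard-coded scale in Rapl() is exactly 2^-32, so one raw
energy-pkg count corresponds to 2^-32 joules. Below is a minimal sketch, not
part of the patch, of the arithmetic the metric expresses. It assumes that
duration_time is reported in seconds (as the name interval_sec suggests) and
that d_ratio yields 0 for a zero denominator. The names package_watts and
RAPL_SCALE_JOULES are hypothetical and used only for illustration.

```python
import math
from typing import Optional

# Scale constant copied from the patch; the name is hypothetical.
RAPL_SCALE_JOULES = 2.3283064365386962890625e-10


def package_watts(raw_count: Optional[float], interval_sec: float) -> float:
    """Average package power in watts over one measurement interval.

    raw_count is the unscaled energy-pkg count, or None when the event is
    absent (modelling Select(pkg, has_event(pkg), math.nan) in the patch).
    """
    # Assumed d_ratio behaviour: a zero denominator gives 0 rather than an error.
    if interval_sec == 0:
        return 0.0
    if raw_count is None:
        return math.nan
    # Watts = joules / second.
    return raw_count * RAPL_SCALE_JOULES / interval_sec


# The decimal literal in the patch is the exact expansion of 2**-32.
assert RAPL_SCALE_JOULES == 2.0 ** -32
```

With this scale, a raw count of 2**32 accumulated over one second corresponds
to one watt.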
Re: [PATCH v8 15/52] perf jevents: Add RAPL event metric for AMD zen models
Posted by Sandipan Das 1 week, 5 days ago
On 11/13/2025 8:50 AM, Ian Rogers wrote:
> Add power per second metrics based on RAPL.
> 
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
>  tools/perf/pmu-events/amd_metrics.py | 31 +++++++++++++++++++++++++---
>  1 file changed, 28 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
> index bc91d9c120fa..b6cdeb4f09fe 100755
> --- a/tools/perf/pmu-events/amd_metrics.py
> +++ b/tools/perf/pmu-events/amd_metrics.py
> @@ -1,13 +1,36 @@
>  #!/usr/bin/env python3
>  # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
>  import argparse
> +import math
>  import os
> -from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
> -                    MetricGroup)
> +from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
> +                    LoadEvents, Metric, MetricGroup, Select)
>  
>  # Global command line arguments.
>  _args = None
>  
> +interval_sec = Event("duration_time")
> +
> +
> +def Rapl() -> MetricGroup:
> +    """Processor socket power consumption estimate.
> +
> +    Use events from the running average power limit (RAPL) driver.
> +    """
> +    # Watts = joules/second
> +    # Currently only energy-pkg is supported by AMD:
> +    # https://lore.kernel.org/lkml/20220105185659.643355-1-eranian@google.com/
> +    pkg = Event("power/energy\\-pkg/")
> +    cond_pkg = Select(pkg, has_event(pkg), math.nan)
> +    scale = 2.3283064365386962890625e-10

It is unlikely that the scale factor will change, but would it still be safer to read
it from /sys/bus/event_source/devices/power/events/energy-pkg.scale?

> +    metrics = [
> +        Metric("lpm_cpu_power_pkg", "",
> +               d_ratio(cond_pkg * scale, interval_sec), "Watts"),
> +    ]
> +
> +    return MetricGroup("lpm_cpu_power", metrics,
> +                       description="Processor socket power consumption estimates")
> +
>  
>  def main() -> None:
>      global _args
> @@ -33,7 +56,9 @@ def main() -> None:
>      directory = f"{_args.events_path}/x86/{_args.model}/"
>      LoadEvents(directory)
>  
> -    all_metrics = MetricGroup("", [])
> +    all_metrics = MetricGroup("", [
> +        Rapl(),
> +    ])
>  
>      if _args.metricgroups:
>          print(JsonEncodeMetricGroupDescriptions(all_metrics))
Re: [PATCH v8 15/52] perf jevents: Add RAPL event metric for AMD zen models
Posted by Ian Rogers 1 week, 3 days ago
On Tue, Nov 25, 2025 at 9:05 PM Sandipan Das <sandipan.das@amd.com> wrote:
>
> On 11/13/2025 8:50 AM, Ian Rogers wrote:
> > Add power per second metrics based on RAPL.
> >
> > Signed-off-by: Ian Rogers <irogers@google.com>
> > ---
> >  tools/perf/pmu-events/amd_metrics.py | 31 +++++++++++++++++++++++++---
> >  1 file changed, 28 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
> > index bc91d9c120fa..b6cdeb4f09fe 100755
> > --- a/tools/perf/pmu-events/amd_metrics.py
> > +++ b/tools/perf/pmu-events/amd_metrics.py
> > @@ -1,13 +1,36 @@
> >  #!/usr/bin/env python3
> >  # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> >  import argparse
> > +import math
> >  import os
> > -from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
> > -                    MetricGroup)
> > +from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
> > +                    LoadEvents, Metric, MetricGroup, Select)
> >
> >  # Global command line arguments.
> >  _args = None
> >
> > +interval_sec = Event("duration_time")
> > +
> > +
> > +def Rapl() -> MetricGroup:
> > +    """Processor socket power consumption estimate.
> > +
> > +    Use events from the running average power limit (RAPL) driver.
> > +    """
> > +    # Watts = joules/second
> > +    # Currently only energy-pkg is supported by AMD:
> > +    # https://lore.kernel.org/lkml/20220105185659.643355-1-eranian@google.com/
> > +    pkg = Event("power/energy\\-pkg/")
> > +    cond_pkg = Select(pkg, has_event(pkg), math.nan)
> > +    scale = 2.3283064365386962890625e-10
>
> It is unlikely that the scale factor will change, but would it still be safer to read
> it from /sys/bus/event_source/devices/power/events/energy-pkg.scale?

Thanks Sandipan, I agree with the feedback but this isn't something
the metrics currently support. I'll keep it in mind.

With regard to the other feedback, I'm wondering if we can get this
series landed and, for now, drop the patches that you have commented
on? I'll move them to a follow-up series. That'd make the AMD patches
here something like:

Keep: [PATCH v8 15/52] perf jevents: Add RAPL event metric for AMD zen models
Keep: [PATCH v8 16/52] perf jevents: Add idle metric for AMD zen models
Keep: [PATCH v8 17/52] perf jevents: Add upc metric for uops per cycle for AMD
Keep: [PATCH v8 18/52] perf jevents: Add br metric group for branch statistics on AMD
Drop: [PATCH v8 19/52] perf jevents: Add software prefetch (swpf) metric group for AMD
Drop: [PATCH v8 20/52] perf jevents: Add hardware prefetch (hwpf) metric group for AMD
Keep: [PATCH v8 21/52] perf jevents: Add itlb metric group for AMD
Keep: [PATCH v8 22/52] perf jevents: Add dtlb metric group for AMD
Drop (or perhaps keep and followup with miss latency): [PATCH v8 23/52] perf jevents: Add uncore l3 metric group for AMD
Keep: [PATCH v8 24/52] perf jevents: Add load store breakdown metrics ldst for AMD
Drop: [PATCH v8 25/52] perf jevents: Add ILP metrics for AMD
Keep: [PATCH v8 26/52] perf jevents: Add context switch metrics for AMD
Keep: [PATCH v8 27/52] perf jevents: Add uop cache hit/miss rates for AMD

Is it okay to use your reviewed-by tag on the kept patches? If I keep
patch 23 for uncore l3, with a follow up on miss latency, then are you
okay with that?

Thanks!
Ian


> > +    metrics = [
> > +        Metric("lpm_cpu_power_pkg", "",
> > +               d_ratio(cond_pkg * scale, interval_sec), "Watts"),
> > +    ]
> > +
> > +    return MetricGroup("lpm_cpu_power", metrics,
> > +                       description="Processor socket power consumption estimates")
> > +
> >
> >  def main() -> None:
> >      global _args
> > @@ -33,7 +56,9 @@ def main() -> None:
> >      directory = f"{_args.events_path}/x86/{_args.model}/"
> >      LoadEvents(directory)
> >
> > -    all_metrics = MetricGroup("", [])
> > +    all_metrics = MetricGroup("", [
> > +        Rapl(),
> > +    ])
> >
> >      if _args.metricgroups:
> >          print(JsonEncodeMetricGroupDescriptions(all_metrics))
>
Re: [PATCH v8 15/52] perf jevents: Add RAPL event metric for AMD zen models
Posted by Sandipan Das 1 week, 3 days ago
On 11/28/2025 2:50 PM, Ian Rogers wrote:
> On Tue, Nov 25, 2025 at 9:05 PM Sandipan Das <sandipan.das@amd.com> wrote:
>>
>> On 11/13/2025 8:50 AM, Ian Rogers wrote:
>>> Add power per second metrics based on RAPL.
>>>
>>> Signed-off-by: Ian Rogers <irogers@google.com>
>>> ---
>>>  tools/perf/pmu-events/amd_metrics.py | 31 +++++++++++++++++++++++++---
>>>  1 file changed, 28 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/tools/perf/pmu-events/amd_metrics.py b/tools/perf/pmu-events/amd_metrics.py
>>> index bc91d9c120fa..b6cdeb4f09fe 100755
>>> --- a/tools/perf/pmu-events/amd_metrics.py
>>> +++ b/tools/perf/pmu-events/amd_metrics.py
>>> @@ -1,13 +1,36 @@
>>>  #!/usr/bin/env python3
>>>  # SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
>>>  import argparse
>>> +import math
>>>  import os
>>> -from metric import (JsonEncodeMetric, JsonEncodeMetricGroupDescriptions, LoadEvents,
>>> -                    MetricGroup)
>>> +from metric import (d_ratio, has_event, Event, JsonEncodeMetric, JsonEncodeMetricGroupDescriptions,
>>> +                    LoadEvents, Metric, MetricGroup, Select)
>>>
>>>  # Global command line arguments.
>>>  _args = None
>>>
>>> +interval_sec = Event("duration_time")
>>> +
>>> +
>>> +def Rapl() -> MetricGroup:
>>> +    """Processor socket power consumption estimate.
>>> +
>>> +    Use events from the running average power limit (RAPL) driver.
>>> +    """
>>> +    # Watts = joules/second
>>> +    # Currently only energy-pkg is supported by AMD:
>>> +    # https://lore.kernel.org/lkml/20220105185659.643355-1-eranian@google.com/
>>> +    pkg = Event("power/energy\\-pkg/")
>>> +    cond_pkg = Select(pkg, has_event(pkg), math.nan)
>>> +    scale = 2.3283064365386962890625e-10
>>
>> It is unlikely that the scale factor will change, but would it still be safer to read
>> it from /sys/bus/event_source/devices/power/events/energy-pkg.scale?
> 
> Thanks Sandipan, I agree with the feedback but this isn't something
> the metrics currently support. I'll keep it in mind.
> 

Thanks Ian. Parsing the scale factor and unit will be a nice addition.
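
A minimal sketch of what reading the scale at run time could look like,
assuming the sysfs file holds a single floating-point number in text form.
This is a standalone illustration, not something the generated metrics
currently support. The names read_energy_pkg_scale and FALLBACK_SCALE are
hypothetical, and the fallback is the constant from the patch.

```python
from pathlib import Path

# Path quoted in the review comment above.
SCALE_PATH = Path("/sys/bus/event_source/devices/power/events/energy-pkg.scale")
# Constant from the patch, used when the file is missing or unparsable.
FALLBACK_SCALE = 2.3283064365386962890625e-10


def read_energy_pkg_scale(path: Path = SCALE_PATH,
                          fallback: float = FALLBACK_SCALE) -> float:
    """Return the joules-per-count scale from sysfs, or the fallback."""
    try:
        return float(path.read_text().strip())
    except (OSError, ValueError):
        # No RAPL PMU, no permission, or unexpected file contents.
        return fallback


if __name__ == "__main__":
    print(read_energy_pkg_scale())
```

Falling back to the constant keeps the behaviour identical to the patch on
systems where the file cannot be read.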

> With regard to the other feedback, I'm wondering if we can get this
> series landed and, for now, drop the patches that you have commented
> on? I'll move them to a follow-up series. That'd make the AMD patches
> here something like:
> 
> Keep: [PATCH v8 15/52] perf jevents: Add RAPL event metric for AMD zen models
> Keep: [PATCH v8 16/52] perf jevents: Add idle metric for AMD zen models
> Keep: [PATCH v8 17/52] perf jevents: Add upc metric for uops per cycle for AMD
> Keep: [PATCH v8 18/52] perf jevents: Add br metric group for branch statistics on AMD
> Drop: [PATCH v8 19/52] perf jevents: Add software prefetch (swpf) metric group for AMD
> Drop: [PATCH v8 20/52] perf jevents: Add hardware prefetch (hwpf) metric group for AMD
> Keep: [PATCH v8 21/52] perf jevents: Add itlb metric group for AMD
> Keep: [PATCH v8 22/52] perf jevents: Add dtlb metric group for AMD
> Drop (or perhaps keep and followup with miss latency): [PATCH v8 23/52] perf jevents: Add uncore l3 metric group for AMD
> Keep: [PATCH v8 24/52] perf jevents: Add load store breakdown metrics ldst for AMD
> Drop: [PATCH v8 25/52] perf jevents: Add ILP metrics for AMD
> Keep: [PATCH v8 26/52] perf jevents: Add context switch metrics for AMD
> Keep: [PATCH v8 27/52] perf jevents: Add uop cache hit/miss rates for AMD
> 
> Is it okay to use your reviewed-by tag on the kept patches? If I keep
> patch 23 for uncore l3, with a follow up on miss latency, then are you
> okay with that?
> 

I am fine with patch 23 currently not having a miss latency metric.

For all the patches tagged "Keep" except 27/52, I don't see any obvious
issues. However, to be sure that the interpretations are correct, I have
reached out to our internal hardware and performance teams for clarity
on the behaviour of some of the events. I would like to wait until I
have the answers.

> 
> 
>>> +    metrics = [
>>> +        Metric("lpm_cpu_power_pkg", "",
>>> +               d_ratio(cond_pkg * scale, interval_sec), "Watts"),
>>> +    ]
>>> +
>>> +    return MetricGroup("lpm_cpu_power", metrics,
>>> +                       description="Processor socket power consumption estimates")
>>> +
>>>
>>>  def main() -> None:
>>>      global _args
>>> @@ -33,7 +56,9 @@ def main() -> None:
>>>      directory = f"{_args.events_path}/x86/{_args.model}/"
>>>      LoadEvents(directory)
>>>
>>> -    all_metrics = MetricGroup("", [])
>>> +    all_metrics = MetricGroup("", [
>>> +        Rapl(),
>>> +    ])
>>>
>>>      if _args.metricgroups:
>>>          print(JsonEncodeMetricGroupDescriptions(all_metrics))
>>