[Patch v3 0/2] perf record: ratio-to-prev event term for auto counter reload

thomas.falcon@intel.com posted 2 patches 9 hours ago
From: Thomas Falcon <thomas.falcon@intel.com>

The Auto Counter Reload (ACR)[1] feature is used to track the
relative rates of two or more perf events, only sampling
when a given threshold is exceeded. This helps reduce overhead
and unnecessary samples. However, enabling this feature
currently requires setting two parameters:

 -- Event sampling period ("period")
 -- acr_mask, which determines which events get reloaded
    when the sample period is reached.

For example, consider the following command:

perf record -e "{cpu_atom/branch-misses,period=200000,\
acr_mask=0x2/ppu,cpu_atom/branch-instructions,period=1000000,\
acr_mask=0x3/u}" -- ./mispredict

In this command, the goal is to limit event sampling to cases
where the branch miss rate exceeds 20% (200000 / 1000000). If
the branch-instructions counter reaches its sample period first,
both counters are reloaded. If the branch-misses counter reaches
its sample period first, only the second counter is reloaded,
and a sample is taken.
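
The mask semantics in the example above can be modelled in a few
lines of Python. This is only an illustration, not code from perf
or the kernel. It assumes that bit i of acr_mask refers to the
i-th event in the group, which is inferred from the example
(0x2 reloads only the second counter, 0x3 reloads both). The
names reloaded_events() and threshold() are hypothetical.

```python
# Illustrative model only; this is not code from perf or the kernel.
GROUP = ["branch-misses", "branch-instructions"]  # order inside the {...} group


def reloaded_events(acr_mask, group=GROUP):
    """Return the group members whose counters acr_mask selects for reload.

    Assumption: bit i of acr_mask refers to the i-th event in the group.
    """
    return [name for i, name in enumerate(group) if acr_mask & (1 << i)]


def threshold(prev_period, period):
    """Relative rate of the first event to the second that triggers a sample."""
    return prev_period / period


print(reloaded_events(0x2))        # mask attached to branch-misses
print(reloaded_events(0x3))        # mask attached to branch-instructions
print(threshold(200000, 1000000))  # miss rate implied by the two periods
```

Under that assumption, the mask on branch-misses selects only
branch-instructions, and the mask on branch-instructions selects
both events, matching the behaviour described above.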

To simplify this, provide a new "ratio-to-prev" event term
that works alongside the period event term or the -c option.
This lets users specify the desired relative rate between an
event and the previous event in the group as a ratio, making
configuration more intuitive.

With this enhancement, the equivalent command would be:

perf record -e "{cpu_atom/branch-misses/ppu,\
cpu_atom/branch-instructions,period=1000000,ratio-to-prev=5/u}" \
-- ./mispredict

or

perf record -e "{cpu_atom/branch-misses/ppu,\
cpu_atom/branch-instructions,ratio-to-prev=5/u}" -c 1000000 \
-- ./mispredict
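
As a rough sketch of why these commands are equivalent to the
explicit period/acr_mask form, the Python below expands a
ratio-to-prev value into the two settings it replaces. It is not
the implementation in this series. The helper name
expand_ratio_to_prev() is hypothetical, and both the use of
integer division and the bit-per-group-index mask layout are
assumptions chosen to reproduce the numbers in the first example
(period=200000 with acr_mask=0x2, period=1000000 with
acr_mask=0x3).

```python
def expand_ratio_to_prev(period, ratio, index):
    """Hypothetical helper: expand ratio-to-prev into explicit settings.

    period -- sample period of the event carrying the term
    ratio  -- value given to ratio-to-prev
    index  -- position of that event in its group (must have a predecessor)
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if index < 1:
        raise ValueError("event needs a previous event in the group")

    this_bit = 1 << index
    prev_bit = 1 << (index - 1)
    return {
        # Previous event: shorter period, reloads only the event with the term.
        "prev": {"period": period // ratio, "acr_mask": this_bit},
        # Event with the term: keeps its period, reloads both counters.
        "this": {"period": period, "acr_mask": this_bit | prev_bit},
    }


cfg = expand_ratio_to_prev(period=1000000, ratio=5, index=1)
print(cfg["prev"])  # settings for cpu_atom/branch-misses
print(cfg["this"])  # settings for cpu_atom/branch-instructions
```

With period=1000000 and ratio-to-prev=5 this yields a period of
200000 for the previous event, which is the 20% threshold from
the original command.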

[1] https://lore.kernel.org/lkml/20250327195217.2683619-1-kan.liang@linux.intel.com/

v3: rebase to current perf-tools-next

v2: (changes below suggested by Ian Rogers):

-- Add documentation explaining acr_mask bitmask used by ACR
-- Move ACR specific implementation to arch/x86/
-- Provide test cases for event parsing and perf record tests

Thomas Falcon (2):
  perf record: Add ratio-to-prev term
  perf record: Add auto counter reload parse and regression tests

 tools/perf/Documentation/intel-acr.txt | 53 ++++++++++++++++++
 tools/perf/Documentation/perf-list.txt |  2 +
 tools/perf/arch/x86/util/evsel.c       | 52 ++++++++++++++++++
 tools/perf/tests/parse-events.c        | 54 ++++++++++++++++++
 tools/perf/tests/shell/record.sh       | 40 ++++++++++++++
 tools/perf/util/evsel.c                | 76 ++++++++++++++++++++++++++
 tools/perf/util/evsel.h                |  1 +
 tools/perf/util/evsel_config.h         |  1 +
 tools/perf/util/parse-events.c         | 22 ++++++++
 tools/perf/util/parse-events.h         |  3 +-
 tools/perf/util/parse-events.l         |  1 +
 tools/perf/util/pmu.c                  |  3 +-
 12 files changed, 306 insertions(+), 2 deletions(-)
 create mode 100644 tools/perf/Documentation/intel-acr.txt

-- 
2.50.1