[PATCH] perf doc: Add AMD IBS usage document

Ravi Bangoria posted 1 patch 1 year, 7 months ago
There is a newer version of this series
[PATCH] perf doc: Add AMD IBS usage document
Posted by Ravi Bangoria 1 year, 7 months ago
Add a perf man page that describes how to use AMD IBS with Linux perf.
A brief introduction to IBS and simple one-liner examples should help new
users get started. This is not meant to be an exhaustive IBS guide. Users
should refer to the latest AMD64 Architecture Programmer's Manual for a
detailed description of IBS.

Usage:

  $ man perf-amd-ibs

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
 tools/perf/Documentation/perf-amd-ibs.txt | 126 ++++++++++++++++++++++
 tools/perf/Documentation/perf.txt         |   3 +-
 2 files changed, 128 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/Documentation/perf-amd-ibs.txt

diff --git a/tools/perf/Documentation/perf-amd-ibs.txt b/tools/perf/Documentation/perf-amd-ibs.txt
new file mode 100644
index 000000000000..d3dfa71e320c
--- /dev/null
+++ b/tools/perf/Documentation/perf-amd-ibs.txt
@@ -0,0 +1,126 @@
+perf-amd-ibs(1)
+===============
+
+NAME
+----
+perf-amd-ibs - Support for AMD Instruction-Based Sampling with perf tool
+
+SYNOPSIS
+--------
+[verse]
+'perf record' -e ibs_op//
+'perf record' -e ibs_fetch//
+
+DESCRIPTION
+-----------
+
+Instruction-Based Sampling (IBS) provides precise Instruction Pointer (IP)
+profiling support on AMD platforms. IBS has two independent components: IBS
+Op and IBS Fetch. IBS Op sampling provides information about instruction
+execution (micro-op execution to be precise) with details like d-cache
+hit/miss, d-TLB hit/miss, cache miss latency, load/store data source, branch
+behavior etc. IBS Fetch sampling provides information about instruction fetch
+with details like i-cache hit/miss, i-TLB hit/miss, fetch latency etc. IBS is
+per-smt-thread i.e. each SMT hardware thread contains standalone IBS units.
+
+Both IBS Op and IBS Fetch are exposed as PMUs by Linux and can be used with
+the Linux perf utility. The following directories are created at boot time if
+IBS is supported by the hardware and the kernel.
+
+  /sys/bus/event_source/devices/ibs_op/
+  /sys/bus/event_source/devices/ibs_fetch/
+
+IBS Op PMU supports two events: cycles and micro ops. IBS Fetch PMU supports
+one event: fetch ops.
+
+IBS VS. REGULAR CORE PMU
+------------------------
+
+IBS gives samples with a precise IP, i.e. the IP recorded with an IBS sample
+has no skid, whereas the IP recorded by the regular core PMU has some skid
+(the sample was generated at IP X but perf records it at IP X+n). Hence, the
+regular core PMU might not help for profiling with instruction-level
+precision. Further, IBS provides additional information about the sample in
+question. On the other hand, the regular core PMU has its own advantages: a
+plethora of events, counting mode (less interference), up to 6 parallel
+counters, event grouping support, filtering capabilities etc.
+
+EXAMPLES
+--------
+
+IBS Op PMU
+~~~~~~~~~~
+
+System-wide profile, cycles event, sampling period: 100000
+
+	$ sudo perf record -e ibs_op// -c 100000 -a
+
+Per-cpu profile (cpu10), cycles event, sampling period: 100000
+
+	$ sudo perf record -e ibs_op// -c 100000 -C 10
+
+Per-cpu profile (cpu10), cycles event, sampling freq: 1000
+
+	$ sudo perf record -e ibs_op// -F 1000 -C 10
+
+System-wide profile, uOps event, sampling period: 100000
+
+	$ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a
+
+Same command, but also capture IBS register raw dump along with perf sample:
+
+	$ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a --raw-samples
+
+System-wide profile, uOps event, sampling period: 100000, L3MissOnly (Zen4 onward)
+
+	$ sudo perf record -e ibs_op/cnt_ctl=1,l3missonly=1/ -c 100000 -a
+
+Per-process profile of an existing PID (upstream v6.2 onward), uOps event, sampling period: 100000
+
+	$ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -p 1234
+
+Per-process profile of a new command (upstream v6.2 onward), uOps event, sampling period: 100000
+
+	$ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -- ls
+
+To analyse the recorded profile in aggregate mode
+
+	$ sudo perf report
+	/* Select a line and press 'a' to drill down at instruction level. */
+
+To go over each sample
+
+	$ sudo perf script
+
+Raw dump of IBS registers when profiled with --raw-samples
+
+	$ sudo perf report -D
+	/* Look for PERF_RECORD_SAMPLE */
+
+IBS applied in a real-world use case
+
+A ~90% regression was observed in tbench with a specific scheduler hint, which
+was counterintuitive. IBS profiles of the good and bad runs, captured using
+perf, helped in identifying the exact cause of the problem:
+
+	https://lore.kernel.org/r/20220921063638.2489-1-kprateek.nayak@amd.com
+
+IBS Fetch PMU
+~~~~~~~~~~~~~
+
+Similar commands can be used with Fetch PMU as well.
+
+System-wide profile, fetch ops event, sampling period: 100000
+
+	$ sudo perf record -e ibs_fetch// -c 100000 -a
+
+System-wide profile, fetch ops event, sampling period: 100000, Random enable
+
+	$ sudo perf record -e ibs_fetch/rand_en=1/ -c 100000 -a
+
+etc.
+
+SEE ALSO
+--------
+
+linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1]
diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt
index 09f516f3fdfb..cbcc2e4d557e 100644
--- a/tools/perf/Documentation/perf.txt
+++ b/tools/perf/Documentation/perf.txt
@@ -82,7 +82,8 @@ linkperf:perf-stat[1], linkperf:perf-top[1],
 linkperf:perf-record[1], linkperf:perf-report[1],
 linkperf:perf-list[1]
 
-linkperf:perf-annotate[1],linkperf:perf-archive[1],linkperf:perf-arm-spe[1],
+linkperf:perf-amd-ibs[1], linkperf:perf-annotate[1],
+linkperf:perf-archive[1], linkperf:perf-arm-spe[1],
 linkperf:perf-bench[1], linkperf:perf-buildid-cache[1],
 linkperf:perf-buildid-list[1], linkperf:perf-c2c[1],
 linkperf:perf-config[1], linkperf:perf-data[1], linkperf:perf-diff[1],
-- 
2.45.2
Re: [PATCH] perf doc: Add AMD IBS usage document
Posted by Namhyung Kim 1 year, 7 months ago
Hello,

Adding Stephane to CC.

On Wed, Jun 19, 2024 at 2:23 AM Ravi Bangoria <ravi.bangoria@amd.com> wrote:
>
> Add a perf man page document that describes how to exploit AMD IBS with
> Linux perf. Brief intro about IBS and simple one-liner examples will help
> naive users to get started. This is not meant to be an exhaustive IBS
> guide. User should refer latest AMD64 Architecture Programmer's Manual
> for detailed description of IBS.
>
> Usage:
>
>   $ man perf-amd-ibs
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>

Thanks a lot for adding this documentation!  A nitpick below..


> ---
>  tools/perf/Documentation/perf-amd-ibs.txt | 126 ++++++++++++++++++++++
>  tools/perf/Documentation/perf.txt         |   3 +-
>  2 files changed, 128 insertions(+), 1 deletion(-)
>  create mode 100644 tools/perf/Documentation/perf-amd-ibs.txt
>
> diff --git a/tools/perf/Documentation/perf-amd-ibs.txt b/tools/perf/Documentation/perf-amd-ibs.txt
> new file mode 100644
> index 000000000000..d3dfa71e320c
> --- /dev/null
> +++ b/tools/perf/Documentation/perf-amd-ibs.txt
> @@ -0,0 +1,126 @@
> +perf-amd-ibs(1)
> +===============
> +
> +NAME
> +----
> +perf-amd-ibs - Support for AMD Instruction-Based Sampling with perf tool
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'perf record' -e ibs_op//
> +'perf record' -e ibs_fetch//
> +
> +DESCRIPTION
> +-----------
> +
> +Instruction-Based Sampling (IBS) provides precise Instruction Pointer (IP)
> +profiling support on AMD platforms. IBS has two independent components: IBS
> +Op and IBS Fetch. IBS Op sampling provides information about instruction
> +execution (micro-op execution to be precise) with details like d-cache
> +hit/miss, d-TLB hit/miss, cache miss latency, load/store data source, branch
> +behavior etc. IBS Fetch sampling provides information about instruction fetch
> +with details like i-cache hit/miss, i-TLB hit/miss, fetch latency etc. IBS is
> +per-smt-thread i.e. each SMT hardware thread contains standalone IBS units.
> +
> +Both, IBS Op and IBS Fetch, are exposed as PMUs by Linux and can be exploited
> +using Linux perf utility. Following files will be created at boot time if IBS
> +is supported by the hardware and kernel.
> +
> +  /sys/bus/event_source/devices/ibs_op/
> +  /sys/bus/event_source/devices/ibs_fetch/
> +
> +IBS Op PMU supports two events: cycles and micro ops. IBS Fetch PMU supports
> +one event: fetch ops.
> +
> +IBS VS. REGULAR CORE PMU
> +------------------------
> +
> +IBS gives samples with precise IP, i.e. the IP recorded with IBS sample has
> +no skid. Whereas the IP recorded by regular core PMU will have some skid
> +(sample was generated at IP X but perf would record it at IP X+n). Hence,
> +regular core PMU might not help for profiling with instruction level
> +precision. Further, IBS provides additional information about the sample in
> +question. On the other hand, regular core PMU has it's own advantages like
> +plethora of events, counting mode (less interference), up to 6 parallel
> +counters, event grouping support, filtering capabilities etc.
> +
> +EXAMPLES
> +--------
> +
> +IBS Op PMU
> +~~~~~~~~~~
> +
> +System-wide profile, cycles event, sampling period: 100000
> +
> +       $ sudo perf record -e ibs_op// -c 100000 -a
> +
> +Per-cpu profile (cpu10), cycles event, sampling period: 100000
> +
> +       $ sudo perf record -e ibs_op// -c 100000 -C 10
> +
> +Per-cpu profile (cpu10), cycles event, sampling freq: 1000
> +
> +       $ sudo perf record -e ibs_op// -F 1000 -C 10
> +
> +System-wide profile, uOps event, sampling period: 100000
> +
> +       $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a
> +
> +Same command, but also capture IBS register raw dump along with perf sample:
> +
> +       $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a --raw-samples
> +
> +System-wide profile, uOps event, sampling period: 100000, L3MissOnly (Zen4 onward)
> +
> +       $ sudo perf record -e ibs_op/cnt_ctl=1,l3missonly=1/ -c 100000 -a
> +
> +Per process(upstream v6.2 onward), uOps event, sampling period: 100000
> +
> +       $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -p 1234
> +
> +Per process(upstream v6.2 onward), uOps event, sampling period: 100000
> +
> +       $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -- ls
> +
> +To analyse recorded profile in aggregate mode
> +
> +       $ sudo perf report
> +       /* Select a line and press 'a' to drill down at instruction level. */
> +
> +To go over each sample
> +
> +       $ sudo perf script
> +
> +Raw dump of IBS registers when profiled with --raw-samples
> +
> +       $ sudo perf report -D
> +       /* Look for PERF_RECORD_SAMPLE */
> +
> +IBS applied in a real world usecase
> +
> +~90% regression was observed in tbench with specific scheduler hint which
> +was counter intuitive. IBS profile of good and bad run captured using perf
> +helped in identifying exact cause of the problem:
> +
> +       https://lore.kernel.org/r/20220921063638.2489-1-kprateek.nayak@amd.com
> +
> +IBS Fetch PMU
> +~~~~~~~~~~~~~
> +
> +Similar commands can be used with Fetch PMU as well.
> +
> +System-wide profile, fetch ops event, sampling period: 100000
> +
> +       $ sudo perf record -e ibs_fetch// -c 100000 -a
> +
> +System-wide profile, fetch ops event, sampling period: 100000, Random enable

Can you please add a brief description of what 'random enable' means?

Thanks,
Namhyung


> +
> +       $ sudo perf record -e ibs_fetch/rand_en=1/ -c 100000 -a
> +
> +etc.
> +
> +SEE ALSO
> +--------
> +
> +linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1]
> diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt
> index 09f516f3fdfb..cbcc2e4d557e 100644
> --- a/tools/perf/Documentation/perf.txt
> +++ b/tools/perf/Documentation/perf.txt
> @@ -82,7 +82,8 @@ linkperf:perf-stat[1], linkperf:perf-top[1],
>  linkperf:perf-record[1], linkperf:perf-report[1],
>  linkperf:perf-list[1]
>
> -linkperf:perf-annotate[1],linkperf:perf-archive[1],linkperf:perf-arm-spe[1],
> +linkperf:perf-amd-ibs[1], linkperf:perf-annotate[1],
> +linkperf:perf-archive[1], linkperf:perf-arm-spe[1],
>  linkperf:perf-bench[1], linkperf:perf-buildid-cache[1],
>  linkperf:perf-buildid-list[1], linkperf:perf-c2c[1],
>  linkperf:perf-config[1], linkperf:perf-data[1], linkperf:perf-diff[1],
> --
> 2.45.2
>
Re: [PATCH] perf doc: Add AMD IBS usage document
Posted by Ravi Bangoria 1 year, 7 months ago
>> +IBS Fetch PMU
>> +~~~~~~~~~~~~~
>> +
>> +Similar commands can be used with Fetch PMU as well.
>> +
>> +System-wide profile, fetch ops event, sampling period: 100000
>> +
>> +       $ sudo perf record -e ibs_fetch// -c 100000 -a
>> +
>> +System-wide profile, fetch ops event, sampling period: 100000, Random enable
> 
> Can you please add a brief description of what 'random enable' means?

Sure, here is the detail about RandEn bit:

The sample period value in the IBS Fetch PMU must be a multiple of 16. When
the RandEn bit is set, the IBS hardware internally sets bits [3:0] to a
pseudo-random value. This variability helps in cases like long-running
loops, where the IBS Fetch PMU would otherwise tag the same instruction over
and over because of the fixed sample period.
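
A minimal sketch of the arithmetic only, not of the hardware itself. The
helper names are hypothetical, and treating the randomized bits as simply
replacing the low nibble of the period is an assumption made for
illustration; the AMD manual remains the reference for the exact behaviour.

```shell
# Hypothetical helper: round a requested period down to a multiple of 16,
# i.e. clear bits [3:0].
ibs_fetch_align_period() {
	echo $(( $1 & ~0xF ))
}

# Hypothetical helper: combine an aligned period with a 4-bit value (0..15).
# ASSUMPTION: this models RandEn as filling bits [3:0] of the period.
ibs_fetch_with_nibble() {
	echo $(( ($1 & ~0xF) | ($2 & 0xF) ))
}

ibs_fetch_align_period 100007    # prints 100000
ibs_fetch_with_nibble 100000 5   # prints 100005
```

Under this model a period of 100000 would vary between 100000 and 100015
from one sample to the next, which is what breaks the lockstep with a loop.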

Thanks for the review,
Ravi
Re: [PATCH] perf doc: Add AMD IBS usage document
Posted by Arnaldo Carvalho de Melo 1 year, 7 months ago
On Wed, Jun 19, 2024 at 09:22:34AM +0000, Ravi Bangoria wrote:
> Add a perf man page document that describes how to exploit AMD IBS with
> Linux perf. Brief intro about IBS and simple one-liner examples will help
> naive users to get started. This is not meant to be an exhaustive IBS
> guide. User should refer latest AMD64 Architecture Programmer's Manual
> for detailed description of IBS.
> 
> Usage:
> 
>   $ man perf-amd-ibs
> 
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
>  tools/perf/Documentation/perf-amd-ibs.txt | 126 ++++++++++++++++++++++
>  tools/perf/Documentation/perf.txt         |   3 +-
>  2 files changed, 128 insertions(+), 1 deletion(-)
>  create mode 100644 tools/perf/Documentation/perf-amd-ibs.txt
> 
> diff --git a/tools/perf/Documentation/perf-amd-ibs.txt b/tools/perf/Documentation/perf-amd-ibs.txt
> new file mode 100644
> index 000000000000..d3dfa71e320c
> --- /dev/null
> +++ b/tools/perf/Documentation/perf-amd-ibs.txt
> @@ -0,0 +1,126 @@
> +perf-amd-ibs(1)
> +===============
> +
> +NAME
> +----
> +perf-amd-ibs - Support for AMD Instruction-Based Sampling with perf tool
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'perf record' -e ibs_op//
> +'perf record' -e ibs_fetch//
> +
> +DESCRIPTION
> +-----------
> +
> +Instruction-Based Sampling (IBS) provides precise Instruction Pointer (IP)
> +profiling support on AMD platforms. IBS has two independent components: IBS
> +Op and IBS Fetch. IBS Op sampling provides information about instruction
> +execution (micro-op execution to be precise) with details like d-cache
> +hit/miss, d-TLB hit/miss, cache miss latency, load/store data source, branch
> +behavior etc. IBS Fetch sampling provides information about instruction fetch
> +with details like i-cache hit/miss, i-TLB hit/miss, fetch latency etc. IBS is
> +per-smt-thread i.e. each SMT hardware thread contains standalone IBS units.
> +
> +Both, IBS Op and IBS Fetch, are exposed as PMUs by Linux and can be exploited
> +using Linux perf utility. Following files will be created at boot time if IBS
s/using Linux perf utility. Following files/using the Linux perf utility. The following files/
> +is supported by the hardware and kernel.
> +
> +  /sys/bus/event_source/devices/ibs_op/
> +  /sys/bus/event_source/devices/ibs_fetch/
> +
> +IBS Op PMU supports two events: cycles and micro ops. IBS Fetch PMU supports
> +one event: fetch ops.
> +
> +IBS VS. REGULAR CORE PMU
> +------------------------
> +
> +IBS gives samples with precise IP, i.e. the IP recorded with IBS sample has
> +no skid. Whereas the IP recorded by regular core PMU will have some skid
> +(sample was generated at IP X but perf would record it at IP X+n). Hence,
> +regular core PMU might not help for profiling with instruction level
> +precision. Further, IBS provides additional information about the sample in
> +question. On the other hand, regular core PMU has it's own advantages like
> +plethora of events, counting mode (less interference), up to 6 parallel
> +counters, event grouping support, filtering capabilities etc.

IIRC if one does:

   perf record -e cycles:P

on AMD systems it maps it to 

   ibs_op//

No?

I don't have access right now to my 5950X, so this is from memory, about
"IBS invocation from core PMUs with precise_ip set":

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=78075d947534013b4575687d19ebcbbb6d3addcd
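
If that mapping holds, a script that wants IBS explicitly could still check
for the ibs_op PMU directory mentioned in the DESCRIPTION and fall back to
the precise cycles event otherwise. A rough sketch; the helper name is
hypothetical and the sysfs root is a parameter only so it can be tried
against a scratch directory:

```shell
# Hypothetical helper: pick an event string depending on whether the
# ibs_op PMU directory exists under the given sysfs root.
pick_precise_event() {
	root=${1:-/sys/bus/event_source/devices}
	if [ -d "$root/ibs_op" ]; then
		echo 'ibs_op//'
	else
		echo 'cycles:P'
	fi
}

# Example use (not run here):
#   perf record -e "$(pick_precise_event)" -a -- sleep 1
```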

One other thing to mention is 'perf mem record' that will use ibs_op//
as we can see in the cover letter for this perf-tools merge commit
upstream:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9d64bf433c53cab2f48a3fff7a1f2a696bc5229a

         # perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000'
         ^C[ perf record: Woken up 1 times to write data ]
         [ perf record: Captured and wrote 2.199 MB perf.data (2913 samples) ]
         #
         # ls -la perf.data
         -rw-------. 1 root root 2346486 Jan  9 18:36 perf.data
         # perf evlist
         ibs_op//
         dummy:u
         # perf evlist -v
         ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1
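
The sample_type field in that last line lists what each ibs_op sample
carries; ADDR, DATA_SRC and WEIGHT are among them. A small sketch for
splitting it out with sed and tr (the helper name is hypothetical, and the
input is the line copied from the output above):

```shell
# Line copied verbatim from the 'perf evlist -v' output quoted above.
line='ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1'

# Hypothetical helper: print each sample_type flag on its own line.
sample_type_flags() {
	printf '%s\n' "$1" |
		sed -n 's/.*sample_type: \([^,]*\),.*/\1/p' |
		tr '|' '\n'
}

sample_type_flags "$line"
```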

More examples are available in the merge commit from when ibs_op support
was added to 'perf c2c' and 'perf mem':

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d465bff130bf4ca17b6980abe51164ace1e0cba4

Showing how you can use 'perf report -D' to extract info about these
samples should be interesting as well:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0429796e45ec17eee26d7a59de92271c275d7666
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=291dcb98d7ee5cd719f4c5991d977794b1829c16

> +EXAMPLES
> +--------
> +
> +IBS Op PMU
> +~~~~~~~~~~
> +
> +System-wide profile, cycles event, sampling period: 100000
> +
> +	$ sudo perf record -e ibs_op// -c 100000 -a
> +
> +Per-cpu profile (cpu10), cycles event, sampling period: 100000
> +
> +	$ sudo perf record -e ibs_op// -c 100000 -C 10
> +
> +Per-cpu profile (cpu10), cycles event, sampling freq: 1000
> +
> +	$ sudo perf record -e ibs_op// -F 1000 -C 10
> +
> +System-wide profile, uOps event, sampling period: 100000
> +
> +	$ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a
> +
> +Same command, but also capture IBS register raw dump along with perf sample:
> +
> +	$ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a --raw-samples
> +
> +System-wide profile, uOps event, sampling period: 100000, L3MissOnly (Zen4 onward)
> +
> +	$ sudo perf record -e ibs_op/cnt_ctl=1,l3missonly=1/ -c 100000 -a
> +
> +Per process(upstream v6.2 onward), uOps event, sampling period: 100000
> +
> +	$ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -p 1234
> +
> +Per process(upstream v6.2 onward), uOps event, sampling period: 100000
> +
> +	$ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -- ls
> +
> +To analyse recorded profile in aggregate mode
> +
> +	$ sudo perf report
> +	/* Select a line and press 'a' to drill down at instruction level. */
> +
> +To go over each sample
> +
> +	$ sudo perf script

Here I think it would be good to have an example of such output.
> +
> +Raw dump of IBS registers when profiled with --raw-samples
> +
> +	$ sudo perf report -D
> +	/* Look for PERF_RECORD_SAMPLE */

Ditto

> +
> +IBS applied in a real world usecase
> +
> +~90% regression was observed in tbench with specific scheduler hint which
> +was counter intuitive. IBS profile of good and bad run captured using perf
> +helped in identifying exact cause of the problem:
> +
> +	https://lore.kernel.org/r/20220921063638.2489-1-kprateek.nayak@amd.com
> +
> +IBS Fetch PMU
> +~~~~~~~~~~~~~
> +
> +Similar commands can be used with Fetch PMU as well.
> +
> +System-wide profile, fetch ops event, sampling period: 100000
> +
> +	$ sudo perf record -e ibs_fetch// -c 100000 -a
> +
> +System-wide profile, fetch ops event, sampling period: 100000, Random enable
> +
> +	$ sudo perf record -e ibs_fetch/rand_en=1/ -c 100000 -a
> +
> +etc.
> +
> +SEE ALSO
> +--------
> +
> +linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1]

perf-mem, perf-c2c

> diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt
> index 09f516f3fdfb..cbcc2e4d557e 100644
> --- a/tools/perf/Documentation/perf.txt
> +++ b/tools/perf/Documentation/perf.txt
> @@ -82,7 +82,8 @@ linkperf:perf-stat[1], linkperf:perf-top[1],
>  linkperf:perf-record[1], linkperf:perf-report[1],
>  linkperf:perf-list[1]
>  
> -linkperf:perf-annotate[1],linkperf:perf-archive[1],linkperf:perf-arm-spe[1],
> +linkperf:perf-amd-ibs[1], linkperf:perf-annotate[1],
> +linkperf:perf-archive[1], linkperf:perf-arm-spe[1],
>  linkperf:perf-bench[1], linkperf:perf-buildid-cache[1],
>  linkperf:perf-buildid-list[1], linkperf:perf-c2c[1],
>  linkperf:perf-config[1], linkperf:perf-data[1], linkperf:perf-diff[1],
> -- 
> 2.45.2
Re: [PATCH] perf doc: Add AMD IBS usage document
Posted by Ravi Bangoria 1 year, 7 months ago
> IIRC if one does:
> 
>    perf record -e cycles:P
> 
> on AMD systems it maps it to 
> 
>    ibs_op//
> 
> No?

Correct. 'man perf-list' already covers that under the event modifier
section, but I will add a brief detail here as well.

> I don't have access right now to my 5950X, so its from memory, about
> "IBS invocation from core PMUs with precise_ip set"
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=78075d947534013b4575687d19ebcbbb6d3addcd
> 
> One other thing to mention is 'perf mem record' that will use ibs_op//
> as we can see in the cover letter for this perf-tools merge commit
> upstream:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9d64bf433c53cab2f48a3fff7a1f2a696bc5229a
> 
>          # perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000'
>          ^C[ perf record: Woken up 1 times to write data ]
>          [ perf record: Captured and wrote 2.199 MB perf.data (2913 samples) ]
>          #
>          # ls -la perf.data
>          -rw-------. 1 root root 2346486 Jan  9 18:36 perf.data
>          # perf evlist
>          ibs_op//
>          dummy:u
>          # perf evlist -v
>          ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1
> 
> Another examples available in the merge commit of when ibs_op support
> was added to 'perf c2c' and 'perf mem':

Correct. Will add brief detail about perf mem and c2c here.

> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d465bff130bf4ca17b6980abe51164ace1e0cba4
> 
> Showing how you can use 'perf report -D' to extract info about these
> samples should be interesting as well:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0429796e45ec17eee26d7a59de92271c275d7666
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=291dcb98d7ee5cd719f4c5991d977794b1829c16

Sure. Will add that in the example below.

>> +To go over each sample
>> +
>> +	$ sudo perf script
> 
> Here I think it would be to have an example of such output.

This would be normal perf script output, but the raw dump contains
IBS-specific raw values. I'll add sample output for the command below.

>> +Raw dump of IBS registers when profiled with --raw-samples
>> +
>> +	$ sudo perf report -D
>> +	/* Look for PERF_RECORD_SAMPLE */
> 
> Ditto

...

>> +SEE ALSO
>> +--------
>> +
>> +linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1]
> 
> perf-mem, perf-c2c

Ack.

Thanks for the review,
Ravi