[RFC PATCH v12 6/8] perf stat: Add command line option for enabling tpebs recording

weilin.wang@intel.com posted 8 patches 1 year, 8 months ago
There is a newer version of this series
[RFC PATCH v12 6/8] perf stat: Add command line option for enabling tpebs recording
Posted by weilin.wang@intel.com 1 year, 8 months ago
From: Weilin Wang <weilin.wang@intel.com>

With this command line option, tpebs recording is turned off in perf stat on
default. It will only be turned on when this option is given in perf stat
command.

Exampe with --enable-tpebs-recording:

perf stat -M tma_dtlb_store -a --enable-tpebs-recording sleep 1

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]

 Performance counter stats for 'system wide':

       181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %  tma_dtlb_store
         3,195,608      cpu_core/topdown-retiring/
        40,156,649      cpu_core/topdown-mem-bound/
         3,550,925      cpu_core/topdown-bad-spec/
       117,571,818      cpu_core/topdown-fe-bound/
        57,118,087      cpu_core/topdown-be-bound/
            69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
             4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
        30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
        30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
           168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
                 0      MEM_INST_RETIRED.STLB_HIT_STORES:R

       1.003105924 seconds time elapsed

Exampe without --enable-tpebs-recording:

perf stat -M tma_dtlb_store -a sleep 1

 Performance counter stats for 'system wide':

       181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %  tma_dtlb_store
         3,195,608      cpu_core/topdown-retiring/
        40,156,649      cpu_core/topdown-mem-bound/
         3,550,925      cpu_core/topdown-bad-spec/
       117,571,818      cpu_core/topdown-fe-bound/
        57,118,087      cpu_core/topdown-be-bound/
            69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
             4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
        30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
        30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
           168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
                 0      MEM_INST_RETIRED.STLB_HIT_STORES:R

       1.003105924 seconds time elapsed

Signed-off-by: Weilin Wang <weilin.wang@intel.com>
---
 tools/perf/builtin-stat.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 14be132f7b6f..055139e44763 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1235,6 +1235,8 @@ static struct option stat_options[] = {
 		       "disable adding events for the metric threshold calculation"),
 	OPT_BOOLEAN(0, "topdown", &topdown_run,
 			"measure top-down statistics"),
+	OPT_BOOLEAN(0, "enable-tpebs-recording", &tpebs_recording,
+			"enable recording for tpebs when retire_latency required"),
 	OPT_UINTEGER(0, "td-level", &stat_config.topdown_level,
 			"Set the metrics level for the top-down statistics (0: max level)"),
 	OPT_BOOLEAN(0, "smi-cost", &smi_cost,
-- 
2.43.0
Re: [RFC PATCH v12 6/8] perf stat: Add command line option for enabling tpebs recording
Posted by Namhyung Kim 1 year, 8 months ago
Hello,

On Mon, Jun 10, 2024 at 10:24 PM <weilin.wang@intel.com> wrote:
>
> From: Weilin Wang <weilin.wang@intel.com>
>
> With this command line option, tpebs recording is turned off in perf stat on
> default. It will only be turned on when this option is given in perf stat
> command.
>
> Exampe with --enable-tpebs-recording:
>
> perf stat -M tma_dtlb_store -a --enable-tpebs-recording sleep 1
>
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.000 MB - ]
>
>  Performance counter stats for 'system wide':
>
>        181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %  tma_dtlb_store
>          3,195,608      cpu_core/topdown-retiring/
>         40,156,649      cpu_core/topdown-mem-bound/
>          3,550,925      cpu_core/topdown-bad-spec/
>        117,571,818      cpu_core/topdown-fe-bound/
>         57,118,087      cpu_core/topdown-be-bound/
>             69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
>              4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
>         30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
>         30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
>            168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
>                  0      MEM_INST_RETIRED.STLB_HIT_STORES:R

Here I guess we can expect a non-zero value, right?

>
>        1.003105924 seconds time elapsed
>
> Exampe without --enable-tpebs-recording:
>
> perf stat -M tma_dtlb_store -a sleep 1
>
>  Performance counter stats for 'system wide':
>
>        181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %  tma_dtlb_store
>          3,195,608      cpu_core/topdown-retiring/
>         40,156,649      cpu_core/topdown-mem-bound/
>          3,550,925      cpu_core/topdown-bad-spec/
>        117,571,818      cpu_core/topdown-fe-bound/
>         57,118,087      cpu_core/topdown-be-bound/
>             69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
>              4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
>         30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
>         30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
>            168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
>                  0      MEM_INST_RETIRED.STLB_HIT_STORES:R
>
>        1.003105924 seconds time elapsed
>
> Signed-off-by: Weilin Wang <weilin.wang@intel.com>
> ---
>  tools/perf/builtin-stat.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 14be132f7b6f..055139e44763 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1235,6 +1235,8 @@ static struct option stat_options[] = {

What's the base of your patchset?  I think we moved this to cmd_stat().
Please make sure to rebase on the perf-tools-next.


>                        "disable adding events for the metric threshold calculation"),
>         OPT_BOOLEAN(0, "topdown", &topdown_run,
>                         "measure top-down statistics"),
> +       OPT_BOOLEAN(0, "enable-tpebs-recording", &tpebs_recording,
> +                       "enable recording for tpebs when retire_latency required"),

Please update the documentation (man page) for this option.

Also I'm afraid it'd fail to build on non-x86 because tpebes_recording
is defined in intel-tpebs.c.

Thanks,
Namhyung


>         OPT_UINTEGER(0, "td-level", &stat_config.topdown_level,
>                         "Set the metrics level for the top-down statistics (0: max level)"),
>         OPT_BOOLEAN(0, "smi-cost", &smi_cost,
> --
> 2.43.0
>
RE: [RFC PATCH v12 6/8] perf stat: Add command line option for enabling tpebs recording
Posted by Wang, Weilin 1 year, 8 months ago

> -----Original Message-----
> From: Namhyung Kim <namhyung@kernel.org>
> Sent: Tuesday, June 11, 2024 9:50 AM
> To: Wang, Weilin <weilin.wang@intel.com>
> Cc: Ian Rogers <irogers@google.com>; Arnaldo Carvalho de Melo
> <acme@kernel.org>; Peter Zijlstra <peterz@infradead.org>; Ingo Molnar
> <mingo@redhat.com>; Alexander Shishkin
> <alexander.shishkin@linux.intel.com>; Jiri Olsa <jolsa@kernel.org>; Hunter,
> Adrian <adrian.hunter@intel.com>; Kan Liang <kan.liang@linux.intel.com>;
> linux-perf-users@vger.kernel.org; linux-kernel@vger.kernel.org; Taylor, Perry
> <perry.taylor@intel.com>; Alt, Samantha <samantha.alt@intel.com>; Biggers,
> Caleb <caleb.biggers@intel.com>
> Subject: Re: [RFC PATCH v12 6/8] perf stat: Add command line option for
> enabling tpebs recording
> 
> Hello,
> 
> On Mon, Jun 10, 2024 at 10:24 PM <weilin.wang@intel.com> wrote:
> >
> > From: Weilin Wang <weilin.wang@intel.com>
> >
> > With this command line option, tpebs recording is turned off in perf stat on
> > default. It will only be turned on when this option is given in perf stat
> > command.
> >
> > Exampe with --enable-tpebs-recording:
> >
> > perf stat -M tma_dtlb_store -a --enable-tpebs-recording sleep 1
> >
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.000 MB - ]
> >
> >  Performance counter stats for 'system wide':
> >
> >        181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %
> tma_dtlb_store
> >          3,195,608      cpu_core/topdown-retiring/
> >         40,156,649      cpu_core/topdown-mem-bound/
> >          3,550,925      cpu_core/topdown-bad-spec/
> >        117,571,818      cpu_core/topdown-fe-bound/
> >         57,118,087      cpu_core/topdown-be-bound/
> >             69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
> >              4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
> >         30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
> >         30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
> >            168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
> >                  0      MEM_INST_RETIRED.STLB_HIT_STORES:R
> 
> Here I guess we can expect a non-zero value, right?

Hi Namhyung, 

Do you mean when we have the option, we should expect non-zero value? 
During the execution, it's possible that we don't capture any sample on this 
event. In this case, we would have a zero value. In the future, I think we will
give it the default value instead of zero.

> 
> >
> >        1.003105924 seconds time elapsed
> >
> > Exampe without --enable-tpebs-recording:
> >
> > perf stat -M tma_dtlb_store -a sleep 1
> >
> >  Performance counter stats for 'system wide':
> >
> >        181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %
> tma_dtlb_store
> >          3,195,608      cpu_core/topdown-retiring/
> >         40,156,649      cpu_core/topdown-mem-bound/
> >          3,550,925      cpu_core/topdown-bad-spec/
> >        117,571,818      cpu_core/topdown-fe-bound/
> >         57,118,087      cpu_core/topdown-be-bound/
> >             69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
> >              4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
> >         30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
> >         30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
> >            168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
> >                  0      MEM_INST_RETIRED.STLB_HIT_STORES:R
> >
> >        1.003105924 seconds time elapsed
> >
> > Signed-off-by: Weilin Wang <weilin.wang@intel.com>
> > ---
> >  tools/perf/builtin-stat.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > index 14be132f7b6f..055139e44763 100644
> > --- a/tools/perf/builtin-stat.c
> > +++ b/tools/perf/builtin-stat.c
> > @@ -1235,6 +1235,8 @@ static struct option stat_options[] = {
> 
> What's the base of your patchset?  I think we moved this to cmd_stat().
> Please make sure to rebase on the perf-tools-next.

This was rebased on top of Ian's change for the tool event and retire_latency parser 
patches. I think that was on tmp.perf-tools.next. I will rebase on perf-tools-next. 

> 
> 
> >                        "disable adding events for the metric threshold calculation"),
> >         OPT_BOOLEAN(0, "topdown", &topdown_run,
> >                         "measure top-down statistics"),
> > +       OPT_BOOLEAN(0, "enable-tpebs-recording", &tpebs_recording,
> > +                       "enable recording for tpebs when retire_latency required"),
> 
> Please update the documentation (man page) for this option.
> 
> Also I'm afraid it'd fail to build on non-x86 because tpebes_recording
> is defined in intel-tpebs.c.
> 

Yes, you are right. I forgot about non-x86. I will update this and also the 
Documentation.

Thanks,
Weilin

> Thanks,
> Namhyung
> 
> 
> >         OPT_UINTEGER(0, "td-level", &stat_config.topdown_level,
> >                         "Set the metrics level for the top-down statistics (0: max level)"),
> >         OPT_BOOLEAN(0, "smi-cost", &smi_cost,
> > --
> > 2.43.0
> >
Re: [RFC PATCH v12 6/8] perf stat: Add command line option for enabling tpebs recording
Posted by Namhyung Kim 1 year, 8 months ago
On Tue, Jun 11, 2024 at 10:49 AM Wang, Weilin <weilin.wang@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Namhyung Kim <namhyung@kernel.org>
> > Sent: Tuesday, June 11, 2024 9:50 AM
> > To: Wang, Weilin <weilin.wang@intel.com>
> > Cc: Ian Rogers <irogers@google.com>; Arnaldo Carvalho de Melo
> > <acme@kernel.org>; Peter Zijlstra <peterz@infradead.org>; Ingo Molnar
> > <mingo@redhat.com>; Alexander Shishkin
> > <alexander.shishkin@linux.intel.com>; Jiri Olsa <jolsa@kernel.org>; Hunter,
> > Adrian <adrian.hunter@intel.com>; Kan Liang <kan.liang@linux.intel.com>;
> > linux-perf-users@vger.kernel.org; linux-kernel@vger.kernel.org; Taylor, Perry
> > <perry.taylor@intel.com>; Alt, Samantha <samantha.alt@intel.com>; Biggers,
> > Caleb <caleb.biggers@intel.com>
> > Subject: Re: [RFC PATCH v12 6/8] perf stat: Add command line option for
> > enabling tpebs recording
> >
> > Hello,
> >
> > On Mon, Jun 10, 2024 at 10:24 PM <weilin.wang@intel.com> wrote:
> > >
> > > From: Weilin Wang <weilin.wang@intel.com>
> > >
> > > With this command line option, tpebs recording is turned off in perf stat on
> > > default. It will only be turned on when this option is given in perf stat
> > > command.
> > >
> > > Exampe with --enable-tpebs-recording:
> > >
> > > perf stat -M tma_dtlb_store -a --enable-tpebs-recording sleep 1
> > >
> > > [ perf record: Woken up 1 times to write data ]
> > > [ perf record: Captured and wrote 0.000 MB - ]
> > >
> > >  Performance counter stats for 'system wide':
> > >
> > >        181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %
> > tma_dtlb_store
> > >          3,195,608      cpu_core/topdown-retiring/
> > >         40,156,649      cpu_core/topdown-mem-bound/
> > >          3,550,925      cpu_core/topdown-bad-spec/
> > >        117,571,818      cpu_core/topdown-fe-bound/
> > >         57,118,087      cpu_core/topdown-be-bound/
> > >             69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
> > >              4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
> > >         30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
> > >         30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
> > >            168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
> > >                  0      MEM_INST_RETIRED.STLB_HIT_STORES:R
> >
> > Here I guess we can expect a non-zero value, right?
>
> Hi Namhyung,
>
> Do you mean when we have the option, we should expect non-zero value?
> During the execution, it's possible that we don't capture any sample on this
> event. In this case, we would have a zero value. In the future, I think we will
> give it the default value instead of zero.

I mean there's a chance you get non-zero, but it can be zero sometimes, right?
Then I think it's better to choose non-zero in this example otherwise it's not
clear what's the difference of using this option or not when you look at the
output in this commit message (they are both 0).

Thanks,
Namhyung

>
> >
> > >
> > >        1.003105924 seconds time elapsed
> > >
> > > Exampe without --enable-tpebs-recording:
> > >
> > > perf stat -M tma_dtlb_store -a sleep 1
> > >
> > >  Performance counter stats for 'system wide':
> > >
> > >        181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %
> > tma_dtlb_store
> > >          3,195,608      cpu_core/topdown-retiring/
> > >         40,156,649      cpu_core/topdown-mem-bound/
> > >          3,550,925      cpu_core/topdown-bad-spec/
> > >        117,571,818      cpu_core/topdown-fe-bound/
> > >         57,118,087      cpu_core/topdown-be-bound/
> > >             69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
> > >              4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
> > >         30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
> > >         30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
> > >            168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
> > >                  0      MEM_INST_RETIRED.STLB_HIT_STORES:R
> > >
> > >        1.003105924 seconds time elapsed
> > >
> > > Signed-off-by: Weilin Wang <weilin.wang@intel.com>
> > > ---
> > >  tools/perf/builtin-stat.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > >
> > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > > index 14be132f7b6f..055139e44763 100644
> > > --- a/tools/perf/builtin-stat.c
> > > +++ b/tools/perf/builtin-stat.c
> > > @@ -1235,6 +1235,8 @@ static struct option stat_options[] = {
> >
> > What's the base of your patchset?  I think we moved this to cmd_stat().
> > Please make sure to rebase on the perf-tools-next.
>
> This was rebased on top of Ian's change for the tool event and retire_latency parser
> patches. I think that was on tmp.perf-tools.next. I will rebase on perf-tools-next.
>
> >
> >
> > >                        "disable adding events for the metric threshold calculation"),
> > >         OPT_BOOLEAN(0, "topdown", &topdown_run,
> > >                         "measure top-down statistics"),
> > > +       OPT_BOOLEAN(0, "enable-tpebs-recording", &tpebs_recording,
> > > +                       "enable recording for tpebs when retire_latency required"),
> >
> > Please update the documentation (man page) for this option.
> >
> > Also I'm afraid it'd fail to build on non-x86 because tpebes_recording
> > is defined in intel-tpebs.c.
> >
>
> Yes, you are right. I forgot about non-x86. I will update this and also the
> Documentation.
>
> Thanks,
> Weilin
>
> > Thanks,
> > Namhyung
> >
> >
> > >         OPT_UINTEGER(0, "td-level", &stat_config.topdown_level,
> > >                         "Set the metrics level for the top-down statistics (0: max level)"),
> > >         OPT_BOOLEAN(0, "smi-cost", &smi_cost,
> > > --
> > > 2.43.0
> > >
RE: [RFC PATCH v12 6/8] perf stat: Add command line option for enabling tpebs recording
Posted by Wang, Weilin 1 year, 8 months ago

> -----Original Message-----
> From: Namhyung Kim <namhyung@kernel.org>
> Sent: Tuesday, June 11, 2024 11:49 AM
> To: Wang, Weilin <weilin.wang@intel.com>
> Cc: Ian Rogers <irogers@google.com>; Arnaldo Carvalho de Melo
> <acme@kernel.org>; Peter Zijlstra <peterz@infradead.org>; Ingo Molnar
> <mingo@redhat.com>; Alexander Shishkin
> <alexander.shishkin@linux.intel.com>; Jiri Olsa <jolsa@kernel.org>; Hunter,
> Adrian <adrian.hunter@intel.com>; Kan Liang <kan.liang@linux.intel.com>;
> linux-perf-users@vger.kernel.org; linux-kernel@vger.kernel.org; Taylor, Perry
> <perry.taylor@intel.com>; Alt, Samantha <samantha.alt@intel.com>; Biggers,
> Caleb <caleb.biggers@intel.com>
> Subject: Re: [RFC PATCH v12 6/8] perf stat: Add command line option for
> enabling tpebs recording
> 
> On Tue, Jun 11, 2024 at 10:49 AM Wang, Weilin <weilin.wang@intel.com>
> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Namhyung Kim <namhyung@kernel.org>
> > > Sent: Tuesday, June 11, 2024 9:50 AM
> > > To: Wang, Weilin <weilin.wang@intel.com>
> > > Cc: Ian Rogers <irogers@google.com>; Arnaldo Carvalho de Melo
> > > <acme@kernel.org>; Peter Zijlstra <peterz@infradead.org>; Ingo Molnar
> > > <mingo@redhat.com>; Alexander Shishkin
> > > <alexander.shishkin@linux.intel.com>; Jiri Olsa <jolsa@kernel.org>; Hunter,
> > > Adrian <adrian.hunter@intel.com>; Kan Liang <kan.liang@linux.intel.com>;
> > > linux-perf-users@vger.kernel.org; linux-kernel@vger.kernel.org; Taylor,
> Perry
> > > <perry.taylor@intel.com>; Alt, Samantha <samantha.alt@intel.com>;
> Biggers,
> > > Caleb <caleb.biggers@intel.com>
> > > Subject: Re: [RFC PATCH v12 6/8] perf stat: Add command line option for
> > > enabling tpebs recording
> > >
> > > Hello,
> > >
> > > On Mon, Jun 10, 2024 at 10:24 PM <weilin.wang@intel.com> wrote:
> > > >
> > > > From: Weilin Wang <weilin.wang@intel.com>
> > > >
> > > > With this command line option, tpebs recording is turned off in perf stat
> on
> > > > default. It will only be turned on when this option is given in perf stat
> > > > command.
> > > >
> > > > Exampe with --enable-tpebs-recording:
> > > >
> > > > perf stat -M tma_dtlb_store -a --enable-tpebs-recording sleep 1
> > > >
> > > > [ perf record: Woken up 1 times to write data ]
> > > > [ perf record: Captured and wrote 0.000 MB - ]
> > > >
> > > >  Performance counter stats for 'system wide':
> > > >
> > > >        181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %
> > > tma_dtlb_store
> > > >          3,195,608      cpu_core/topdown-retiring/
> > > >         40,156,649      cpu_core/topdown-mem-bound/
> > > >          3,550,925      cpu_core/topdown-bad-spec/
> > > >        117,571,818      cpu_core/topdown-fe-bound/
> > > >         57,118,087      cpu_core/topdown-be-bound/
> > > >             69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
> > > >              4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
> > > >         30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
> > > >         30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
> > > >            168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
> > > >                  0      MEM_INST_RETIRED.STLB_HIT_STORES:R
> > >
> > > Here I guess we can expect a non-zero value, right?
> >
> > Hi Namhyung,
> >
> > Do you mean when we have the option, we should expect non-zero value?
> > During the execution, it's possible that we don't capture any sample on this
> > event. In this case, we would have a zero value. In the future, I think we will
> > give it the default value instead of zero.
> 
> I mean there's a chance you get non-zero, but it can be zero sometimes, right?
> Then I think it's better to choose non-zero in this example otherwise it's not
> clear what's the difference of using this option or not when you look at the
> output in this commit message (they are both 0).

I see. I will create a non-zero example and update. 

Thanks,
Weilin
> 
> Thanks,
> Namhyung
> 
> >
> > >
> > > >
> > > >        1.003105924 seconds time elapsed
> > > >
> > > > Exampe without --enable-tpebs-recording:
> > > >
> > > > perf stat -M tma_dtlb_store -a sleep 1
> > > >
> > > >  Performance counter stats for 'system wide':
> > > >
> > > >        181,047,168      cpu_core/TOPDOWN.SLOTS/          #      0.6 %
> > > tma_dtlb_store
> > > >          3,195,608      cpu_core/topdown-retiring/
> > > >         40,156,649      cpu_core/topdown-mem-bound/
> > > >          3,550,925      cpu_core/topdown-bad-spec/
> > > >        117,571,818      cpu_core/topdown-fe-bound/
> > > >         57,118,087      cpu_core/topdown-be-bound/
> > > >             69,179      cpu_core/EXE_ACTIVITY.BOUND_ON_STORES/
> > > >              4,582      cpu_core/MEM_INST_RETIRED.STLB_HIT_STORES/
> > > >         30,183,104      cpu_core/CPU_CLK_UNHALTED.DISTRIBUTED/
> > > >         30,556,790      cpu_core/CPU_CLK_UNHALTED.THREAD/
> > > >            168,486      cpu_core/DTLB_STORE_MISSES.WALK_ACTIVE/
> > > >                  0      MEM_INST_RETIRED.STLB_HIT_STORES:R
> > > >
> > > >        1.003105924 seconds time elapsed
> > > >
> > > > Signed-off-by: Weilin Wang <weilin.wang@intel.com>
> > > > ---
> > > >  tools/perf/builtin-stat.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > > > index 14be132f7b6f..055139e44763 100644
> > > > --- a/tools/perf/builtin-stat.c
> > > > +++ b/tools/perf/builtin-stat.c
> > > > @@ -1235,6 +1235,8 @@ static struct option stat_options[] = {
> > >
> > > What's the base of your patchset?  I think we moved this to cmd_stat().
> > > Please make sure to rebase on the perf-tools-next.
> >
> > This was rebased on top of Ian's change for the tool event and retire_latency
> parser
> > patches. I think that was on tmp.perf-tools.next. I will rebase on perf-tools-
> next.
> >
> > >
> > >
> > > >                        "disable adding events for the metric threshold calculation"),
> > > >         OPT_BOOLEAN(0, "topdown", &topdown_run,
> > > >                         "measure top-down statistics"),
> > > > +       OPT_BOOLEAN(0, "enable-tpebs-recording", &tpebs_recording,
> > > > +                       "enable recording for tpebs when retire_latency required"),
> > >
> > > Please update the documentation (man page) for this option.
> > >
> > > Also I'm afraid it'd fail to build on non-x86 because tpebes_recording
> > > is defined in intel-tpebs.c.
> > >
> >
> > Yes, you are right. I forgot about non-x86. I will update this and also the
> > Documentation.
> >
> > Thanks,
> > Weilin
> >
> > > Thanks,
> > > Namhyung
> > >
> > >
> > > >         OPT_UINTEGER(0, "td-level", &stat_config.topdown_level,
> > > >                         "Set the metrics level for the top-down statistics (0: max
> level)"),
> > > >         OPT_BOOLEAN(0, "smi-cost", &smi_cost,
> > > > --
> > > > 2.43.0
> > > >