perf metrics: Support parsing metrics if platforms only have json table of system PMU

[PATCH v2] perf metrics: Support parsing metrics if platforms only have json table of system PMU

Posted by Junhao He 1 month, 2 weeks ago

System PMUs do not depend on a particular CPU, so no CPUID to metric table
mapping is needed to match their json events when generating the metric
table. For example, HiSilicon HIP09 only has a json events table for system
PMUs, in the "sys/" subdirectory.

Currently, in this case, the system metric table
"pmu_metrics__hisilicon_hip09_sys" is generated correctly and the metrics
are displayed as expected by `perf list`. But `perf stat` doesn't work
for such metrics.

  $ perf list metric
  Metrics:
    cpa_p0_avg_bw
         [Average bandwidth of CPA Port 0]
    cpa_p1_avg_bw
         [Average bandwidth of CPA Port 1]
  $ perf stat -M cpa_p0_avg_bw --timeout 1000    # no error message is printed
  $ echo $?
  234

metricgroup__parse_groups() expects to find a CPU metric table, but
hisilicon/hip09 doesn't use a CPUID to map json events and metrics, so
pmu_metrics_table__find() returns NULL and the command fails.

However, metricgroup__add_metric() also parses each sys metric and adds it
to metric_list, which yields a valid sys metric table. So we can ignore the
NULL result of pmu_metrics_table__find() and use the sys metric table
instead.

metricgroup__parse_groups
 -> parse_groups
     -> metricgroup__add_metric_list
         -> metricgroup__add_metric
             -> pmu_for_each_sys_metric   --> parses each sys metric

Testing:
  $ perf stat -M cpa_p0_avg_bw --timeout 1000

 Performance counter stats for 'system wide':

     4,004,863,602      cpa_cycles                #     0.00 cpa_p0_avg_bw
                 0      cpa_p0_wr_dat
                 0      cpa_p0_rd_dat_64b
                 0      cpa_p0_rd_dat_32b

       1.001306160 seconds time elapsed

Signed-off-by: Junhao He <hejunhao3@huawei.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
---
v1 --> v2:
 -Add comments explaining that the table may be NULL.
 -Reword the commit message.
 -Add Yicong's Tested-by.
v1:https://lore.kernel.org/all/20240807040002.47119-1-hejunhao3@huawei.com/
---
 tools/perf/util/metricgroup.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 69f6a46402c3..cb428eabd485 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -1123,7 +1123,7 @@ static int metricgroup__add_metric_sys_event_iter(const struct pmu_metric *pm,
 
 	ret = add_metric(d->metric_list, pm, d->modifier, d->metric_no_group,
 			 d->metric_no_threshold, d->user_requested_cpu_list,
-			 d->system_wide, d->root_metric, d->visited, d->table);
+			 d->system_wide, d->root_metric, d->visited, d->table ?: table);
 	if (ret)
 		goto out;
 
@@ -1226,7 +1226,8 @@ static int metricgroup__add_metric_callback(const struct pmu_metric *pm,
  * @system_wide: Are events for all processes recorded.
  * @metric_list: The list that the metric or metric group are added to.
  * @table: The table that is searched for metrics, most commonly the table for the
- *       architecture perf is running upon.
+ *       architecture perf is running upon. This value could be NULL if no core
+ *       metrics match the architecture, in which case the table of system PMUs is used.
  */
 static int metricgroup__add_metric(const char *pmu, const char *metric_name, const char *modifier,
 				   bool metric_no_group, bool metric_no_threshold,
@@ -1239,7 +1240,8 @@ static int metricgroup__add_metric(const char *pmu, const char *metric_name, con
 	int ret;
 	bool has_match = false;
 
-	{
+	/* Add core metrics to the metric list */
+	if (table) {
 		struct metricgroup__add_metric_data data = {
 			.list = &list,
 			.pmu = pmu,
@@ -1263,6 +1265,7 @@ static int metricgroup__add_metric(const char *pmu, const char *metric_name, con
 		has_match = data.has_match;
 	}
 	{
+		/* Parse metrics table of system PMUs */
 		struct metricgroup_iter_data data = {
 			.fn = metricgroup__add_metric_sys_event_iter,
 			.data = (void *) &(struct metricgroup_add_iter_data) {
@@ -1697,7 +1700,8 @@ int metricgroup__parse_groups(struct evlist *perf_evlist,
 	const struct pmu_metrics_table *table = pmu_metrics_table__find();
 
 	if (!table)
-		return -EINVAL;
+		pr_debug("Core metric table not found, continuing with the system metric table\n");
+
 	if (hardware_aware_grouping)
 		pr_debug("Use hardware aware grouping instead of traditional metric grouping method\n");
 
-- 
2.33.0

Re: [PATCH v2] perf metrics: Support parsing metrics if platforms only have json table of system PMU

Posted by Namhyung Kim 1 month ago

Hello,

On Thu, Oct 10, 2024 at 03:44:30PM +0800, Junhao He wrote:
> The system PMUs don't depend on the certain CPUs and we don't need a CPUID
> to metric table mapping to match the json event for generating the metric
> table. For example HiSilicon HIP09 only have json events table of system
> PMUs in the "sys/" subdirectory.
> 
> Currently for this case the struct of system metric table
> "pmu_metrics__hisilicon_hip09_sys" generating works fine and metrics
> display as expected by using `perf list`. But `perf stat` doesn't work
> for such metrics.
> 
>   $ perf list metric
>   Metrics:
>     cpa_p0_avg_bw
>          [Average bandwidth of CPA Port 0]
>     cpa_p1_avg_bw
>          [Average bandwidth of CPA Port 1]
>   $ perf stat -M cpa_p0_avg_bw --timeout 1000 --> No error messages output
>   $ echo $?
>   234
> 
> The metricgroup__parse_groups() expects to find an cpu metric table, but
> the hisilicon/hip09 doesn't uses CPUID to map json events and metrics, so
> pmu_metrics_table__find() will return NULL, than the cmd run failed.
> 
> But in metricgroup__add_metric(), the function parse for each sys metric
> and add it to metric_list, which also will get an valid sys metric table.
> So, we can ignore the NULL result of pmu_metrics_table__find() and to use
> the sys metric table.
> 
> metricgroup__parse_groups
>  -> parse_groups
>      -> metricgroup__add_metric_list
>          -> metricgroup__add_metric
> 	     -> pmu_for_each_sys_metric   --> parse for each sys metric
> 
> Testing:
>   $ perf stat -M cpa_p0_avg_bw --timeout 1000
> 
>  Performance counter stats for 'system wide':
> 
>      4,004,863,602      cpa_cycles                #     0.00 cpa_p0_avg_bw
>                  0      cpa_p0_wr_dat
>                  0      cpa_p0_rd_dat_64b
>                  0      cpa_p0_rd_dat_32b
> 
>        1.001306160 seconds time elapsed

Ian, can you please review this?

Thanks,
Namhyung
