Currently the while loop always either exits on the first iteration with
a core PMU, or exits with NULL on heterogeneous systems or when not all
CPUs are online.
Both of the latter behaviors are undesirable for platforms other than
Arm so simplify it to always return the first core PMU, or NULL if none
exist.
This behavior was depended on by the Arm version of
pmu_metrics_table__find(), so the logic has been moved there instead.
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
---
tools/perf/arch/arm64/util/pmu.c | 8 +++++++-
tools/perf/util/pmus.c | 14 +-------------
2 files changed, 8 insertions(+), 14 deletions(-)
diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
index 3d9330feebd2..3099f5f448ba 100644
--- a/tools/perf/arch/arm64/util/pmu.c
+++ b/tools/perf/arch/arm64/util/pmu.c
@@ -10,8 +10,14 @@
const struct pmu_metrics_table *pmu_metrics_table__find(void)
{
- struct perf_pmu *pmu = perf_pmus__find_core_pmu();
+ struct perf_pmu *pmu;
+
+ /* Metrics aren't currently supported on heterogeneous Arm systems */
+ if (perf_pmus__num_core_pmus() > 1)
+ return NULL;
+ /* Doesn't matter which one here because they'll all be the same */
+ pmu = perf_pmus__find_core_pmu();
if (pmu)
return perf_pmu__find_metrics_table(pmu);
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index cec869cbe163..64e798e68a2d 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -596,17 +596,5 @@ struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)
struct perf_pmu *perf_pmus__find_core_pmu(void)
{
- struct perf_pmu *pmu = NULL;
-
- while ((pmu = perf_pmus__scan_core(pmu))) {
- /*
- * The cpumap should cover all CPUs. Otherwise, some CPUs may
- * not support some events or have different event IDs.
- */
- if (RC_CHK_ACCESS(pmu->cpus)->nr != cpu__max_cpu().cpu)
- return NULL;
-
- return pmu;
- }
- return NULL;
+ return perf_pmus__scan_core(NULL);
}
--
2.34.1
On 13/09/2023 16:33, James Clark wrote:
> Currently the while loop always either exits on the first iteration with
> a core PMU, or exits with NULL on heterogeneous systems or when not all
> CPUs are online.
>
> Both of the latter behaviors are undesirable for platforms other than
> Arm so simplify it to always return the first core PMU, or NULL if none
> exist.
>
> This behavior was depended on by the Arm version of
> pmu_metrics_table__find(), so the logic has been moved there instead.
>
> Suggested-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Ian Rogers <irogers@google.com>
> Signed-off-by: James Clark <james.clark@arm.com>
Turns out the "Simple expression parser" test is failing on
heterogeneous arm systems without this patch. I didn't realise there was
a dependency and should have put the commits the other way round. I will
leave the error message here in case someone bumps into it, but no fix
is required apart from applying the remaining patches in this set:
$ perf test expr -v
4: Simple expression parser :
--- start ---
test child forked, pid 4902
Using CPUID 0x00000000410fd070
FAILED tests/expr.c:83 get_cpuid
test child finished with -1
---- end ----
Simple expression parser: FAILED!
> ---
> tools/perf/arch/arm64/util/pmu.c | 8 +++++++-
> tools/perf/util/pmus.c | 14 +-------------
> 2 files changed, 8 insertions(+), 14 deletions(-)
>
> diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
> index 3d9330feebd2..3099f5f448ba 100644
> --- a/tools/perf/arch/arm64/util/pmu.c
> +++ b/tools/perf/arch/arm64/util/pmu.c
> @@ -10,8 +10,14 @@
>
> const struct pmu_metrics_table *pmu_metrics_table__find(void)
> {
> - struct perf_pmu *pmu = perf_pmus__find_core_pmu();
> + struct perf_pmu *pmu;
> +
> + /* Metrics aren't currently supported on heterogeneous Arm systems */
> + if (perf_pmus__num_core_pmus() > 1)
> + return NULL;
>
> + /* Doesn't matter which one here because they'll all be the same */
> + pmu = perf_pmus__find_core_pmu();
> if (pmu)
> return perf_pmu__find_metrics_table(pmu);
>
> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index cec869cbe163..64e798e68a2d 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -596,17 +596,5 @@ struct perf_pmu *evsel__find_pmu(const struct evsel *evsel)
>
> struct perf_pmu *perf_pmus__find_core_pmu(void)
> {
> - struct perf_pmu *pmu = NULL;
> -
> - while ((pmu = perf_pmus__scan_core(pmu))) {
> - /*
> - * The cpumap should cover all CPUs. Otherwise, some CPUs may
> - * not support some events or have different event IDs.
> - */
> - if (RC_CHK_ACCESS(pmu->cpus)->nr != cpu__max_cpu().cpu)
> - return NULL;
> -
> - return pmu;
> - }
> - return NULL;
> + return perf_pmus__scan_core(NULL);
> }
On 13/09/2023 16:33, James Clark wrote: > Currently the while loop always either exits on the first iteration with > a core PMU, or exits with NULL on heterogeneous systems or when not all > CPUs are online. > > Both of the latter behaviors are undesirable for platforms other than > Arm so simplify it to always return the first core PMU, or NULL if none > exist. > > This behavior was depended on by the Arm version of > pmu_metrics_table__find(), so the logic has been moved there instead. > > Suggested-by: Ian Rogers <irogers@google.com> > Reviewed-by: Ian Rogers <irogers@google.com> > Signed-off-by: James Clark <james.clark@arm.com> Reviewed-by: John Garry <john.g.garry@oracle.com>
© 2016 - 2025 Red Hat, Inc.