From nobody Mon Feb 9 20:30:32 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5FA6DC7619A for ; Mon, 20 Mar 2023 15:29:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233004AbjCTP3M (ORCPT ); Mon, 20 Mar 2023 11:29:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48196 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232884AbjCTP2i (ORCPT ); Mon, 20 Mar 2023 11:28:38 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 058BD302B8; Mon, 20 Mar 2023 08:21:50 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 53D99FEC; Mon, 20 Mar 2023 08:16:05 -0700 (PDT) Received: from localhost.localdomain (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id E301C3F71E; Mon, 20 Mar 2023 08:15:18 -0700 (PDT) From: James Clark To: linux-perf-users@vger.kernel.org, Anshuman.Khandual@arm.com Cc: German Gomez , James Clark , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , John Garry , Will Deacon , Mike Leach , Leo Yan , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v2 1/4] perf event: Add simd_flags field to perf_sample Date: Mon, 20 Mar 2023 15:15:05 +0000 Message-Id: <20230320151509.1137462-2-james.clark@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230320151509.1137462-1-james.clark@arm.com> References: <20230320151509.1137462-1-james.clark@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: German Gomez Add new field to the struct perf_sample to store flags related to SIMD ops. It will be used to store SIMD information from SVE and NEON when profiling using ARM SPE. Signed-off-by: German Gomez Signed-off-by: James Clark Acked-by: Ian Rogers --- tools/perf/util/sample.h | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h index 33b08e0ac746..c92ad0f51ecd 100644 --- a/tools/perf/util/sample.h +++ b/tools/perf/util/sample.h @@ -66,6 +66,18 @@ struct aux_sample { void *data; }; =20 +struct simd_flags { + u64 arch:1, /* architecture (isa) */ + pred:2; /* predication */ +}; + +/* simd architecture flags */ +#define SIMD_OP_FLAGS_ARCH_SVE 0x01 /* ARM SVE */ + +/* simd predicate flags */ +#define SIMD_OP_FLAGS_PRED_PARTIAL 0x01 /* partial predicate */ +#define SIMD_OP_FLAGS_PRED_EMPTY 0x02 /* empty predicate */ + struct perf_sample { u64 ip; u32 pid, tid; @@ -106,6 +118,7 @@ struct perf_sample { struct stack_dump user_stack; struct sample_read read; struct aux_sample aux_sample; + struct simd_flags simd_flags; }; =20 /* --=20 2.34.1 From nobody Mon Feb 9 20:30:32 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C111C7618D for ; Mon, 20 Mar 2023 15:23:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232701AbjCTPXH (ORCPT ); Mon, 20 Mar 2023 11:23:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39134 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232682AbjCTPWi (ORCPT ); Mon, 20 Mar 2023 11:22:38 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 784332BF1E; Mon, 20 Mar 2023 08:16:10 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id BA3EE1042; Mon, 20 Mar 2023 08:16:08 -0700 (PDT) Received: from localhost.localdomain (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 60C8D3F71E; Mon, 20 Mar 2023 08:15:22 -0700 (PDT) From: James Clark To: linux-perf-users@vger.kernel.org, Anshuman.Khandual@arm.com Cc: German Gomez , Leo Yan , James Clark , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , John Garry , Will Deacon , Mike Leach , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v2 2/4] perf arm-spe: Refactor arm-spe to support operation packet type Date: Mon, 20 Mar 2023 15:15:06 +0000 Message-Id: <20230320151509.1137462-3-james.clark@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230320151509.1137462-1-james.clark@arm.com> References: <20230320151509.1137462-1-james.clark@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: German Gomez Extend the decoder of Arm SPE records to support more fields from the operation packet type. Not all fields are being decoded by this commit. Only those needed to support the use-case SVE load/store/other operations. Suggested-by: Leo Yan Signed-off-by: German Gomez Signed-off-by: James Clark Acked-by: Ian Rogers --- .../util/arm-spe-decoder/arm-spe-decoder.c | 30 ++++++++++-- .../util/arm-spe-decoder/arm-spe-decoder.h | 47 +++++++++++++++---- tools/perf/util/arm-spe.c | 8 ++-- 3 files changed, 67 insertions(+), 18 deletions(-) diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf= /util/arm-spe-decoder/arm-spe-decoder.c index 40dcedfd75cd..f3918f290df5 100644 --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c @@ -190,11 +190,27 @@ static int arm_spe_read_record(struct arm_spe_decoder= *decoder) decoder->record.context_id =3D payload; break; case ARM_SPE_OP_TYPE: - if (idx =3D=3D SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC) { - if (payload & 0x1) - decoder->record.op =3D ARM_SPE_ST; + switch (idx) { + case SPE_OP_PKT_HDR_CLASS_LD_ST_ATOMIC: + decoder->record.op |=3D ARM_SPE_OP_LDST; + if (payload & SPE_OP_PKT_ST) + decoder->record.op |=3D ARM_SPE_OP_ST; else - decoder->record.op =3D ARM_SPE_LD; + decoder->record.op |=3D ARM_SPE_OP_LD; + if (SPE_OP_PKT_IS_LDST_SVE(payload)) + decoder->record.op |=3D ARM_SPE_OP_SVE_LDST; + break; + case SPE_OP_PKT_HDR_CLASS_OTHER: + decoder->record.op |=3D ARM_SPE_OP_OTHER; + if (SPE_OP_PKT_IS_OTHER_SVE_OP(payload)) + decoder->record.op |=3D ARM_SPE_OP_SVE_OTHER; + break; + case SPE_OP_PKT_HDR_CLASS_BR_ERET: + decoder->record.op |=3D ARM_SPE_OP_BRANCH_ERET; + break; + default: + pr_err("Get packet error!\n"); + return -1; } break; case ARM_SPE_EVENTS: @@ -222,6 +238,12 @@ static int arm_spe_read_record(struct arm_spe_decoder = *decoder) if (payload & BIT(EV_MISPRED)) decoder->record.type |=3D ARM_SPE_BRANCH_MISS; =20 + if (payload & BIT(EV_PARTIAL_PREDICATE)) + decoder->record.type |=3D ARM_SPE_SVE_PARTIAL_PRED; + + if (payload & BIT(EV_EMPTY_PREDICATE)) + decoder->record.type |=3D ARM_SPE_SVE_EMPTY_PRED; + break; case ARM_SPE_DATA_SOURCE: decoder->record.source =3D payload; diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf= /util/arm-spe-decoder/arm-spe-decoder.h index 46a61df1145b..1443c28545a9 100644 --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h @@ -14,19 +14,46 @@ #include "arm-spe-pkt-decoder.h" =20 enum arm_spe_sample_type { - ARM_SPE_L1D_ACCESS =3D 1 << 0, - ARM_SPE_L1D_MISS =3D 1 << 1, - ARM_SPE_LLC_ACCESS =3D 1 << 2, - ARM_SPE_LLC_MISS =3D 1 << 3, - ARM_SPE_TLB_ACCESS =3D 1 << 4, - ARM_SPE_TLB_MISS =3D 1 << 5, - ARM_SPE_BRANCH_MISS =3D 1 << 6, - ARM_SPE_REMOTE_ACCESS =3D 1 << 7, + ARM_SPE_L1D_ACCESS =3D 1 << 0, + ARM_SPE_L1D_MISS =3D 1 << 1, + ARM_SPE_LLC_ACCESS =3D 1 << 2, + ARM_SPE_LLC_MISS =3D 1 << 3, + ARM_SPE_TLB_ACCESS =3D 1 << 4, + ARM_SPE_TLB_MISS =3D 1 << 5, + ARM_SPE_BRANCH_MISS =3D 1 << 6, + ARM_SPE_REMOTE_ACCESS =3D 1 << 7, + ARM_SPE_SVE_PARTIAL_PRED =3D 1 << 8, + ARM_SPE_SVE_EMPTY_PRED =3D 1 << 9, }; =20 enum arm_spe_op_type { - ARM_SPE_LD =3D 1 << 0, - ARM_SPE_ST =3D 1 << 1, + /* First level operation type */ + ARM_SPE_OP_OTHER =3D 1 << 0, + ARM_SPE_OP_LDST =3D 1 << 1, + ARM_SPE_OP_BRANCH_ERET =3D 1 << 2, + + /* Second level operation type for OTHER */ + ARM_SPE_OP_SVE_OTHER =3D 1 << 16, + ARM_SPE_OP_SVE_FP =3D 1 << 17, + ARM_SPE_OP_SVE_PRED_OTHER =3D 1 << 18, + + /* Second level operation type for LDST */ + ARM_SPE_OP_LD =3D 1 << 16, + ARM_SPE_OP_ST =3D 1 << 17, + ARM_SPE_OP_ATOMIC =3D 1 << 18, + ARM_SPE_OP_EXCL =3D 1 << 19, + ARM_SPE_OP_AR =3D 1 << 20, + ARM_SPE_OP_SIMD_FP =3D 1 << 21, + ARM_SPE_OP_GP_REG =3D 1 << 22, + ARM_SPE_OP_UNSPEC_REG =3D 1 << 23, + ARM_SPE_OP_NV_SYSREG =3D 1 << 24, + ARM_SPE_OP_SVE_LDST =3D 1 << 25, + ARM_SPE_OP_SVE_PRED_LDST =3D 1 << 26, + ARM_SPE_OP_SVE_SG =3D 1 << 27, + + /* Second level operation type for BRANCH_ERET */ + ARM_SPE_OP_BR_COND =3D 1 << 16, + ARM_SPE_OP_BR_INDIRECT =3D 1 << 17, }; =20 enum arm_spe_neoverse_data_source { diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 906476a839e1..bfae4731a47a 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -411,7 +411,7 @@ static void arm_spe__synth_data_source_neoverse(const s= truct arm_spe_record *rec * We have no data on the hit level or data source for stores in the * Neoverse SPE records. */ - if (record->op & ARM_SPE_ST) { + if (record->op & ARM_SPE_OP_ST) { data_src->mem_lvl =3D PERF_MEM_LVL_NA; data_src->mem_lvl_num =3D PERF_MEM_LVLNUM_NA; data_src->mem_snoop =3D PERF_MEM_SNOOP_NA; @@ -497,12 +497,12 @@ static void arm_spe__synth_data_source_generic(const = struct arm_spe_record *reco =20 static u64 arm_spe__synth_data_source(const struct arm_spe_record *record,= u64 midr) { - union perf_mem_data_src data_src =3D { 0 }; + union perf_mem_data_src data_src =3D { .mem_op =3D PERF_MEM_OP_NA }; bool is_neoverse =3D is_midr_in_range_list(midr, neoverse_spe); =20 - if (record->op =3D=3D ARM_SPE_LD) + if (record->op & ARM_SPE_OP_LD) data_src.mem_op =3D PERF_MEM_OP_LOAD; - else if (record->op =3D=3D ARM_SPE_ST) + else if (record->op & ARM_SPE_OP_ST) data_src.mem_op =3D PERF_MEM_OP_STORE; else return 0; --=20 2.34.1 From nobody Mon Feb 9 20:30:32 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 060F4C76195 for ; Mon, 20 Mar 2023 15:23:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232689AbjCTPXE (ORCPT ); Mon, 20 Mar 2023 11:23:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39128 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232727AbjCTPWi (ORCPT ); Mon, 20 Mar 2023 11:22:38 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 47AEC16884; Mon, 20 Mar 2023 08:16:10 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 2881C1063; Mon, 20 Mar 2023 08:16:12 -0700 (PDT) Received: from localhost.localdomain (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C26A63F71E; Mon, 20 Mar 2023 08:15:25 -0700 (PDT) From: James Clark To: linux-perf-users@vger.kernel.org, Anshuman.Khandual@arm.com Cc: German Gomez , James Clark , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , John Garry , Will Deacon , Mike Leach , Leo Yan , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v2 3/4] perf arm-spe: Add SVE flags to the SPE samples Date: Mon, 20 Mar 2023 15:15:07 +0000 Message-Id: <20230320151509.1137462-4-james.clark@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230320151509.1137462-1-james.clark@arm.com> References: <20230320151509.1137462-1-james.clark@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: German Gomez Add flags from the Scalable Vector Extension (SVE) to the SPE samples which are available from Armv8.3 (FEAT_SPEv1p1). These will be displayed in a new SIMD sort field in a later commit. Signed-off-by: German Gomez Signed-off-by: James Clark Acked-by: Ian Rogers --- tools/perf/util/arm-spe.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index bfae4731a47a..7b36ba6b4079 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -273,6 +273,25 @@ static int arm_spe_set_tid(struct arm_spe_queue *speq,= pid_t tid) return 0; } =20 +static struct simd_flags arm_spe__synth_simd_flags(const struct arm_spe_re= cord *record) +{ + struct simd_flags simd_flags =3D {}; + + if ((record->op & ARM_SPE_OP_LDST) && (record->op & ARM_SPE_OP_SVE_LDST)) + simd_flags.arch |=3D SIMD_OP_FLAGS_ARCH_SVE; + + if ((record->op & ARM_SPE_OP_OTHER) && (record->op & ARM_SPE_OP_SVE_OTHER= )) + simd_flags.arch |=3D SIMD_OP_FLAGS_ARCH_SVE; + + if (record->type & ARM_SPE_SVE_PARTIAL_PRED) + simd_flags.pred |=3D SIMD_OP_FLAGS_PRED_PARTIAL; + + if (record->type & ARM_SPE_SVE_EMPTY_PRED) + simd_flags.pred |=3D SIMD_OP_FLAGS_PRED_EMPTY; + + return simd_flags; +} + static void arm_spe_prep_sample(struct arm_spe *spe, struct arm_spe_queue *speq, union perf_event *event, @@ -289,6 +308,7 @@ static void arm_spe_prep_sample(struct arm_spe *spe, sample->tid =3D speq->tid; sample->period =3D 1; sample->cpu =3D speq->cpu; + sample->simd_flags =3D arm_spe__synth_simd_flags(record); =20 event->sample.header.type =3D PERF_RECORD_SAMPLE; event->sample.header.misc =3D sample->cpumode; --=20 2.34.1 From nobody Mon Feb 9 20:30:32 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0D1AC6FD1D for ; Mon, 20 Mar 2023 15:23:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232768AbjCTPXB (ORCPT ); Mon, 20 Mar 2023 11:23:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59296 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232726AbjCTPWi (ORCPT ); Mon, 20 Mar 2023 11:22:38 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 47F0A32E7E; Mon, 20 Mar 2023 08:16:10 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8CDC3106F; Mon, 20 Mar 2023 08:16:15 -0700 (PDT) Received: from localhost.localdomain (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 31DCC3F71E; Mon, 20 Mar 2023 08:15:29 -0700 (PDT) From: James Clark To: linux-perf-users@vger.kernel.org, Anshuman.Khandual@arm.com Cc: German Gomez , James Clark , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , Ian Rogers , Adrian Hunter , John Garry , Will Deacon , Mike Leach , Leo Yan , linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v2 4/4] perf report: Add 'simd' sort field Date: Mon, 20 Mar 2023 15:15:08 +0000 Message-Id: <20230320151509.1137462-5-james.clark@arm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230320151509.1137462-1-james.clark@arm.com> References: <20230320151509.1137462-1-james.clark@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: German Gomez Add 'simd' sort field to visualize SIMD ops in perf-report. Rows are labeled with the SIMD isa, and the type of predicate (if any): - [p] partial predicate - [e] empty predicate (no elements in the vector being used) Example with Arm SPE and SVE (Scalable Vector Extension): #include double src[1025], dst[1025]; int main(void) { svfloat64_t vc =3D svdup_f64(1); for(;;) for(int i =3D 0; i < 1025; i +=3D svcntd()) { svbool_t pg =3D svwhilelt_b64(i, 1025); svfloat64_t vsrc =3D svld1(pg, &src[i]); svfloat64_t vdst =3D svadd_x(pg, vsrc, vc); svst1(pg, &dst[i], vdst); } return 0; } ... compiled using "gcc-11 -march=3Darmv8-a+sve -O3" Profiling on a platform that implements FEAT_SVE and FEAT_SPEv1p1: $ perf record -e arm_spe_0// -- ./a.out $ perf report --itrace=3Di1i -s overhead,pid,simd,sym Overhead Pid:Command Simd Symbol ........ ................ ....... ...................... 53.76% 10758:program [.] main 46.14% 10758:program [.] SVE [.] main 0.09% 10758:program [p] SVE [.] main The report shows 0.09% of the sampled SVE operations use partial predicates due to src and dst arrays not being multiples of the vector register lengths. Signed-off-by: German Gomez Signed-off-by: James Clark Acked-by: Ian Rogers --- tools/perf/Documentation/perf-report.txt | 1 + tools/perf/util/hist.c | 1 + tools/perf/util/hist.h | 1 + tools/perf/util/sort.c | 47 ++++++++++++++++++++++++ tools/perf/util/sort.h | 2 + 5 files changed, 52 insertions(+) diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Document= ation/perf-report.txt index c242e8da6b1a..cfd502f7e6da 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -117,6 +117,7 @@ OPTIONS - addr: (Full) virtual address of the sampled instruction - retire_lat: On X86, this reports pipeline stall of this instruction com= pared to the previous instruction in cycles. And currently supported only on = X86 + - simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicat= e. "p" for partial Arm SVE predicate =20 By default, comm, dso and symbol keys are used. (i.e. --sort comm,dso,symbol) diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index 3670136a0074..0c11f50abfec 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -745,6 +745,7 @@ __hists__add_entry(struct hists *hists, .weight =3D sample->weight, .ins_lat =3D sample->ins_lat, .p_stage_cyc =3D sample->p_stage_cyc, + .simd_flags =3D sample->simd_flags, }, *he =3D hists__findnew_entry(hists, &entry, al, sample_self); =20 if (!hists->has_callchains && he && he->callchain_size !=3D 0) diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h index 86a677954279..afc9f1c7f4dc 100644 --- a/tools/perf/util/hist.h +++ b/tools/perf/util/hist.h @@ -81,6 +81,7 @@ enum hist_column { HISTC_ADDR_FROM, HISTC_ADDR_TO, HISTC_ADDR, + HISTC_SIMD, HISTC_NR_COLS, /* Last entry */ }; =20 diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index 093a0c8b2e3d..e11e68ecf0a2 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -139,6 +139,52 @@ struct sort_entry sort_thread =3D { .se_width_idx =3D HISTC_THREAD, }; =20 +/* --sort simd */ + +static int64_t +sort__simd_cmp(struct hist_entry *left, struct hist_entry *right) +{ + if (left->simd_flags.arch !=3D right->simd_flags.arch) + return (int64_t) left->simd_flags.arch - right->simd_flags.arch; + + return (int64_t) left->simd_flags.pred - right->simd_flags.pred; +} + +static const char *hist_entry__get_simd_name(struct simd_flags *simd_flags) +{ + u64 arch =3D simd_flags->arch; + + if (arch & SIMD_OP_FLAGS_ARCH_SVE) + return "SVE"; + else + return "n/a"; +} + +static int hist_entry__simd_snprintf(struct hist_entry *he, char *bf, + size_t size, unsigned int width __maybe_unused) +{ + const char *name; + + if (!he->simd_flags.arch) + return repsep_snprintf(bf, size, ""); + + name =3D hist_entry__get_simd_name(&he->simd_flags); + + if (he->simd_flags.pred & SIMD_OP_FLAGS_PRED_EMPTY) + return repsep_snprintf(bf, size, "[e] %s", name); + else if (he->simd_flags.pred & SIMD_OP_FLAGS_PRED_PARTIAL) + return repsep_snprintf(bf, size, "[p] %s", name); + + return repsep_snprintf(bf, size, "[.] %s", name); +} + +struct sort_entry sort_simd =3D { + .se_header =3D "Simd ", + .se_cmp =3D sort__simd_cmp, + .se_snprintf =3D hist_entry__simd_snprintf, + .se_width_idx =3D HISTC_SIMD, +}; + /* --sort comm */ =20 /* @@ -2142,6 +2188,7 @@ static struct sort_dimension common_sort_dimensions[]= =3D { DIM(SORT_ADDR, "addr", sort_addr), DIM(SORT_LOCAL_RETIRE_LAT, "local_retire_lat", sort_local_p_stage_cyc), DIM(SORT_GLOBAL_RETIRE_LAT, "retire_lat", sort_global_p_stage_cyc), + DIM(SORT_SIMD, "simd", sort_simd) }; =20 #undef DIM diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h index 22f437c3476f..ecfb7f1359d5 100644 --- a/tools/perf/util/sort.h +++ b/tools/perf/util/sort.h @@ -111,6 +111,7 @@ struct hist_entry { u64 p_stage_cyc; u8 cpumode; u8 depth; + struct simd_flags simd_flags; =20 /* We are added by hists__add_dummy_entry. */ bool dummy; @@ -241,6 +242,7 @@ enum sort_type { SORT_ADDR, SORT_LOCAL_RETIRE_LAT, SORT_GLOBAL_RETIRE_LAT, + SORT_SIMD, =20 /* branch stack specific sort keys */ __SORT_BRANCH_STACK, --=20 2.34.1