From nobody Sun Feb  8 18:15:08 2026
Received: from out203-205-221-231.mail.qq.com (out203-205-221-231.mail.qq.com
 [203.205.221.231])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id E4A2726B09E
	for <linux-kernel@vger.kernel.org>; Thu, 13 Feb 2025 15:12:45 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=203.205.221.231
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739459574; cv=none;
 b=kLjTnGM5oM9IEUSJlUS51zgxDv+4IkeNO/UJKwF5LBkJ+5/Mx0BDYWPaoYvnhye6gRQ6akG0KDTZMlKkRiTJeI+Y4ZupcpvrO420GqQ4oU5lW4aHTZcRz4V7mTn7RJ5P6K4VZhXIvOeUF/UHu6YjkR3SnzI1AMIWGFbDwtT//zk=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739459574; c=relaxed/simple;
	bh=2jcKfs8j2570hfqmJFobbtJr+YFiijx6fS5+6LD050o=;
	h=Message-ID:From:To:Cc:Subject:Date:In-Reply-To:References:
	 MIME-Version;
 b=RagN+g5Gcd93p2fJXSMi8AjTfQjF7jl8mp5LmF8d+dVhPnyIteGe/GCESaNtC9F9reJl6kVDT3OnUzSNx9kbXHNWuyvFoU36foBrKdtgYQHJwZIPiDsVOWUHuo+bGp0PhO/RCUS/VDSWlMHTWdMqMUlNU5cs3PMiRKvt8YAyrIo=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=none (p=none dis=none) header.from=cyyself.name;
 spf=pass smtp.mailfrom=cyyself.name;
 dkim=pass (1024-bit key) header.d=qq.com header.i=@qq.com header.b=HseCZg/Z;
 arc=none smtp.client-ip=203.205.221.231
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=none (p=none dis=none) header.from=cyyself.name
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=cyyself.name
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=qq.com header.i=@qq.com header.b="HseCZg/Z"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qq.com; s=s201512;
	t=1739459559; bh=U37SpgKCfAQXTOl9oGsbvq6Zo2b4aluskpRdxYnCskE=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References;
	b=HseCZg/ZrcNU8G8visZyvtZZdarZH3AWmmygTcA3ntfjv7Zxz1w+hue/z+1ByzW2I
	 7aXoEQajF8J5oB/DDW/O0UgjT05veR2AgdN1rukjKINrpYr3FcYozapVg3LgRmQVof
	 gUi3ytpFsSpb3weY2wvRoQ6yRkNlutii8Ntq8+Go=
Received: from cyy-pc.lan ([240e:379:2251:3600:f57b:26f9:9718:486c])
	by newxmesmtplogicsvrszb16-1.qq.com (NewEsmtp) with SMTP
	id 3212741D; Thu, 13 Feb 2025 23:12:33 +0800
X-QQ-mid: xmsmtpt1739459553t5bpdbvqy
Message-ID: <tencent_0434DA8D8E21B8C4318A213C8D6B7CC08208@qq.com>
X-QQ-XMAILINFO: NnIX2CK8LSsJ3nFwDszqQ65PatHZarqNE0mnYsZsqegbGZxPt75uhFbjNEJ9u4
	 Cox/zl30YWP1PXQZdh/tHMM1D+EmgORCQvbsbRYVXdHZYOLo3g80Jc1rckUcHPM54LMNBfJjxlvw
	 ecw6YIYZdlBaSBT/siR+8+ZEzv16r/rYQQIUgik86Jtdr0j4pNgedMfVpFDoM9OzfpH/cqnYH6Ft
	 iYtWU6p4ErA7f6KSTU0/+glN626I8JlEu046Pm4aMJA/KOkPa5uu3ZQUNWS5+/r3Fl3SDCLcTjNb
	 EKbcArcEHOu1/JbjWhhDI81/VzWsQJ7Ifd0s96PWOn08bnJIQeLvM/q+lud7RACo7xYLJ/CySKvX
	 IdItbCFSfDKhCYP4hCzQ1dmUlj9DTDnH7EJz3G/+VwFOklhD9f52pGdxhu3fwRSfmCw5ov+EstVE
	 vOQRJjnp1WIZIVh/wuinmeuG+adKoks2EZN6QAEUnLJdVqRqrxlU0b87vdipl0xnf1LRULxQIybV
	 Q+ivDM/QA1Xkq6ffVQZKJLyVdz6FWoFHw89iwJZj97lEFi9jT55kmENXsrajTgk7sqKFJg78vlsK
	 aduYgdl8kHveBgBCyJj8lrtzV+LPorL4oEzahKa6xsbTMDXVeu75h204ae7IFjX79GKrVJBATaqX
	 IiG/zhbxxKvgAwigMQcGBCC0mn+ISgSGYTDysjLblqpQLIrMXWC9LTqSGCc67QrtSwOQKbgU/wKe
	 5dzODMn1JxU+OfAj1Co5PvIS/Z44YAO6Y/4VlrLRcrMykVjPZStLmeY86tj/XASCvpxFmJlgyFpW
	 s72S9EnYdeDuKx3uS5sYqIvOa6KjeWFMeyPTYr2L1eluSYOXIpgzjcd5+wZupd1+paguIbbqCv2v
	 EE7SgDPV6MB4T04a2+62sCYUMP3BT7YtjJqKnj/RUlSAPeAk7zunUJi3/CZ5KKWgc6G4jlPFF8aw
	 YY15TJ/p+Lnv6IVCuHR0oYknbLl95Iv8/3RhXpkAvOHMNRdj+JWW3EhyXpLVwfrMTbXUPsFlArO+
	 wRWMwEXg==
X-QQ-XMRINFO: Mp0Kj//9VHAxr69bL5MkOOs=
From: Yangyu Chen <cyy@cyyself.name>
To: linux-perf-users@vger.kernel.org
Cc: John Garry <john.g.garry@oracle.com>,
	Will Deacon <will@kernel.org>,
	James Clark <james.clark@linaro.org>,
	Mike Leach <mike.leach@linaro.org>,
	Leo Yan <leo.yan@linux.dev>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Liang Kan <kan.liang@linux.intel.com>,
	Yoshihiro Furudera <fj5100bi@fujitsu.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	Yangyu Chen <cyy@cyyself.name>
Subject: [PATCH 1/2] perf vendor events arm64: Add Cortex-A720 events/metrics
Date: Thu, 13 Feb 2025 23:12:25 +0800
X-OQ-MSGID: <20250213151226.187205-1-cyy@cyyself.name>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <tencent_5360DA048EE5B8CF3104213F8D037C698208@qq.com>
References: <tencent_5360DA048EE5B8CF3104213F8D037C698208@qq.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add JSON files for Cortex-A720 events and metrics. Using the existing
Neoverse N3 JSON files as a template, I manually checked the missing and
extra events/metrics using my script [1] and modified them according to
the Arm Cortex-A720 Core Technical Reference Manual [2].

[1] https://github.com/cyyself/arm-pmu-check/tree/1075bebeb3f1441067448251a=
387df35af15bf16
[2] https://developer.arm.com/documentation/102530/0002/Performance-Monitor=
s-Extension-support-/Performance-monitors-events

Signed-off-by: Yangyu Chen <cyy@cyyself.name>
---
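Note for reviewers: the missing/extra comparison described above can be
approximated with a short sketch like the one below. This is not the
script from [1]; the helper names event_names and compare are
hypothetical, and the sketch assumes each JSON file is a list of objects
that name events through an "ArchStdEvent" key, as the files in this
patch do.

```python
import json


# Hypothetical helpers for illustration; not taken from the script in [1].
def event_names(text):
    """Return the set of ArchStdEvent names found in one JSON text."""
    return {
        entry["ArchStdEvent"]
        for entry in json.loads(text)
        if "ArchStdEvent" in entry
    }


def compare(template_text, candidate_text):
    """Return (missing, extra) names of the candidate versus the template."""
    # missing: named in the template but absent from the candidate
    # extra: named in the candidate but absent from the template
    return (
        sorted(event_names(template_text) - event_names(candidate_text)),
        sorted(event_names(candidate_text) - event_names(template_text)),
    )
```

Entries that only carry "MetricName" are skipped by this sketch, and
whether a listed event really exists on Cortex-A720 still has to be
confirmed against the TRM [2].
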
 .../arch/arm64/arm/cortex-a720/bus.json       |  18 +
 .../arch/arm64/arm/cortex-a720/exception.json |  62 +++
 .../arm64/arm/cortex-a720/fp_operation.json   |  22 +
 .../arch/arm64/arm/cortex-a720/general.json   |  10 +
 .../arch/arm64/arm/cortex-a720/l1d_cache.json |  50 ++
 .../arch/arm64/arm/cortex-a720/l1i_cache.json |  14 +
 .../arch/arm64/arm/cortex-a720/l2_cache.json  |  62 +++
 .../arch/arm64/arm/cortex-a720/l3_cache.json  |  22 +
 .../arch/arm64/arm/cortex-a720/ll_cache.json  |  10 +
 .../arch/arm64/arm/cortex-a720/memory.json    |  54 +++
 .../arch/arm64/arm/cortex-a720/metrics.json   | 436 ++++++++++++++++++
 .../arch/arm64/arm/cortex-a720/pmu.json       |   8 +
 .../arch/arm64/arm/cortex-a720/retired.json   |  90 ++++
 .../arch/arm64/arm/cortex-a720/spe.json       |  42 ++
 .../arm64/arm/cortex-a720/spec_operation.json |  90 ++++
 .../arch/arm64/arm/cortex-a720/stall.json     |  82 ++++
 .../arch/arm64/arm/cortex-a720/sve.json       |  50 ++
 .../arch/arm64/arm/cortex-a720/tlb.json       |  74 +++
 .../arch/arm64/arm/cortex-a720/trace.json     |  32 ++
 tools/perf/pmu-events/arch/arm64/mapfile.csv  |   1 +
 20 files changed, 1229 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/bus.js=
on
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/except=
ion.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/fp_ope=
ration.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/genera=
l.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l1d_ca=
che.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l1i_ca=
che.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l2_cac=
he.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l3_cac=
he.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/ll_cac=
he.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/memory=
.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/metric=
s.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/pmu.js=
on
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/retire=
d.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/spe.js=
on
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/spec_o=
peration.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/stall.=
json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/sve.js=
on
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/tlb.js=
on
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a720/trace.=
json

diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/bus.json b/to=
ols/perf/pmu-events/arch/arm64/arm/cortex-a720/bus.json
new file mode 100644
index 000000000000..2e11a8c4a484
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/bus.json
@@ -0,0 +1,18 @@
+[
+    {
+        "ArchStdEvent": "BUS_ACCESS",
+        "PublicDescription": "Counts memory transactions issued by the CPU=
 to the external bus, including snoop requests and snoop responses. Each be=
at of data is counted individually."
+    },
+    {
+        "ArchStdEvent": "BUS_CYCLES",
+        "PublicDescription": "Counts bus cycles in the CPU. Bus cycles rep=
resent a clock cycle in which a transaction could be sent or received on th=
e interface from the CPU to the external bus. Since that interface is drive=
n at the same clock speed as the CPU, this event is a duplicate of CPU_CYCL=
ES."
+    },
+    {
+        "ArchStdEvent": "BUS_ACCESS_RD",
+        "PublicDescription": "Counts memory read transactions seen on the =
external bus. Each beat of data is counted individually."
+    },
+    {
+        "ArchStdEvent": "BUS_ACCESS_WR",
+        "PublicDescription": "Counts memory write transactions seen on the=
 external bus. Each beat of data is counted individually."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/exception.jso=
n b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/exception.json
new file mode 100644
index 000000000000..7126fbf292e0
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/exception.json
@@ -0,0 +1,62 @@
+[
+    {
+        "ArchStdEvent": "EXC_TAKEN",
+        "PublicDescription": "Counts any taken architecturally visible exc=
eptions such as IRQ, FIQ, SError, and other synchronous exceptions. Excepti=
ons are counted whether or not they are taken locally."
+    },
+    {
+        "ArchStdEvent": "EXC_RETURN",
+        "PublicDescription": "Counts any architecturally executed exceptio=
n return instructions. For example: AArch64: ERET"
+    },
+    {
+        "ArchStdEvent": "EXC_UNDEF",
+        "PublicDescription": "Counts the number of synchronous exceptions =
which are taken locally that are due to attempting to execute an instructio=
n that is UNDEFINED. Attempting to execute instruction bit patterns that ha=
ve not been allocated. Attempting to execute instructions when they are dis=
abled. Attempting to execute instructions at an inappropriate Exception lev=
el. Attempting to execute an instruction when the value of PSTATE.IL is 1."
+    },
+    {
+        "ArchStdEvent": "EXC_SVC",
+        "PublicDescription": "Counts SVC exceptions taken locally."
+    },
+    {
+        "ArchStdEvent": "EXC_PABORT",
+        "PublicDescription": "Counts synchronous exceptions that are taken=
 locally and caused by Instruction Aborts."
+    },
+    {
+        "ArchStdEvent": "EXC_DABORT",
+        "PublicDescription": "Counts exceptions that are taken locally and=
 are caused by data aborts or SErrors. Conditions that could cause those ex=
ceptions are attempting to read or write memory where the MMU generates a f=
ault, attempting to read or write memory with a misaligned address, interru=
pts from the nSEI inputs and internally generated SErrors."
+    },
+    {
+        "ArchStdEvent": "EXC_IRQ",
+        "PublicDescription": "Counts IRQ exceptions including the virtual =
IRQs that are taken locally."
+    },
+    {
+        "ArchStdEvent": "EXC_FIQ",
+        "PublicDescription": "Counts FIQ exceptions including the virtual =
FIQs that are taken locally."
+    },
+    {
+        "ArchStdEvent": "EXC_SMC",
+        "PublicDescription": "Counts SMC exceptions taken to EL3."
+    },
+    {
+        "ArchStdEvent": "EXC_HVC",
+        "PublicDescription": "Counts HVC exceptions taken to EL2."
+    },
+    {
+        "ArchStdEvent": "EXC_TRAP_PABORT",
+        "PublicDescription": "Counts exceptions which are traps not taken =
locally and are caused by Instruction Aborts. For example, attempting to ex=
ecute an instruction with a misaligned PC."
+    },
+    {
+        "ArchStdEvent": "EXC_TRAP_DABORT",
+        "PublicDescription": "Counts exceptions which are traps not taken =
locally and are caused by Data Aborts or SError interrupts. Conditions that=
 could cause those exceptions are:\n\n1. Attempting to read or write memory=
 where the MMU generates a fault,\n2. Attempting to read or write memory wi=
th a misaligned address,\n3. Interrupts from the SEI input,\n4. Internally =
generated SErrors."
+    },
+    {
+        "ArchStdEvent": "EXC_TRAP_OTHER",
+        "PublicDescription": "Counts the number of synchronous trap except=
ions which are not taken locally and are not SVC, SMC, HVC, data aborts, In=
struction Aborts, or interrupts."
+    },
+    {
+        "ArchStdEvent": "EXC_TRAP_IRQ",
+        "PublicDescription": "Counts IRQ exceptions including the virtual =
IRQs that are not taken locally."
+    },
+    {
+        "ArchStdEvent": "EXC_TRAP_FIQ",
+        "PublicDescription": "Counts FIQs which are not taken locally but =
taken from EL0, EL1, or EL2 to EL3 (which would be the normal behavior fo=
r FIQs when not executing in EL3)."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/fp_operation.=
json b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/fp_operation.json
new file mode 100644
index 000000000000..cec3435ac766
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/fp_operation.json
@@ -0,0 +1,22 @@
+[
+    {
+        "ArchStdEvent": "FP_HP_SPEC",
+        "PublicDescription": "Counts speculatively executed half precision=
 floating point operations."
+    },
+    {
+        "ArchStdEvent": "FP_SP_SPEC",
+        "PublicDescription": "Counts speculatively executed single precisi=
on floating point operations."
+    },
+    {
+        "ArchStdEvent": "FP_DP_SPEC",
+        "PublicDescription": "Counts speculatively executed double precisi=
on floating point operations."
+    },
+    {
+        "ArchStdEvent": "FP_SCALE_OPS_SPEC",
+        "PublicDescription": "Counts speculatively executed scalable singl=
e precision floating point operations."
+    },
+    {
+        "ArchStdEvent": "FP_FIXED_OPS_SPEC",
+        "PublicDescription": "Counts speculatively executed non-scalable s=
ingle precision floating point operations."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/general.json =
b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/general.json
new file mode 100644
index 000000000000..c5dcdcf43c58
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/general.json
@@ -0,0 +1,10 @@
+[
+    {
+        "ArchStdEvent": "CPU_CYCLES",
+        "PublicDescription": "Counts CPU clock cycles (not timer cycles). =
The clock measured by this event is defined as the physical clock driving t=
he CPU logic."
+    },
+    {
+        "ArchStdEvent": "CNT_CYCLES",
+        "PublicDescription": "Increments at a constant frequency equal to =
the rate of increment of the System Counter, CNTPCT_EL0."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l1d_cache.jso=
n b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l1d_cache.json
new file mode 100644
index 000000000000..a6fee569f4c6
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l1d_cache.json
@@ -0,0 +1,50 @@
+[
+    {
+        "ArchStdEvent": "L1D_CACHE_REFILL",
+        "PublicDescription": "Counts level 1 data cache refills caused by =
speculatively executed load or store operations that missed in the level 1 =
data cache. This event only counts one event per cache line."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE",
+        "PublicDescription": "Counts level 1 data cache accesses from any =
load/store operations. Atomic operations that resolve in the CPUs caches (n=
ear atomic operations) count as both a write access and read access. Each =
access to a cache line is counted including the multiple accesses caused by=
 single instructions such as LDM or STM. Each access to other level 1 data =
or unified memory structures, for example refill buffers, write buffers, an=
d write-back buffers, are also counted."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_WB",
+        "PublicDescription": "Counts write-backs of dirty data from the L1=
 data cache to the L2 cache. This occurs when either a dirty cache line is =
evicted from L1 data cache and allocated in the L2 cache or dirty data is w=
ritten to the L2 and possibly to the next level of cache. This event counts=
 both victim cache line evictions and cache write-backs from snoops or cach=
e maintenance operations. The following cache operations are not counted:\n=
\n1. Invalidations which do not result in data being transferred out of the=
 L1 (such as evictions of clean data),\n2. Full line writes which write to =
L2 without writing L1, such as write streaming mode."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_LMISS_RD",
+        "PublicDescription": "Counts cache line refills into the level 1 d=
ata cache from any memory read operations, that incurred additional latency=
."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_RD",
+        "PublicDescription": "Counts level 1 data cache accesses from any =
load operation. Atomic load operations that resolve in the CPUs caches coun=
t as both a write access and read access."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_WR",
+        "PublicDescription": "Counts level 1 data cache accesses generated=
 by store operations. This event also counts accesses caused by a DC ZVA (d=
ata cache zero, specified by virtual address) instruction. Near atomic oper=
ations that resolve in the CPUs caches count as a write access and read acc=
ess."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_REFILL_INNER",
+        "PublicDescription": "Counts level 1 data cache refills where the =
cache line data came from caches inside the immediate cluster of the core."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_REFILL_OUTER",
+        "PublicDescription": "Counts level 1 data cache refills for which =
the cache line data came from outside the immediate cluster of the core, li=
ke an SLC in the system interconnect or DRAM."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_INVAL",
+        "PublicDescription": "Counts each explicit invalidation of a cache=
 line in the level 1 data cache caused by:\n\n- Cache Maintenance Operation=
s (CMO) that operate by a virtual address.\n- Broadcast cache coherency ope=
rations from another CPU in the system.\n\nThis event does not count for th=
e following conditions:\n\n1. A cache refill invalidates a cache line.\n2. =
A CMO which is executed on that CPU and invalidates a cache line specified =
by set/way.\n\nNote that CMOs that operate by set/way cannot be broadcast f=
rom one CPU to another."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_RW",
+        "PublicDescription": "Counts level 1 data demand cache accesses fr=
om any load or store operation. Near atomic operations that resolve in the =
CPUs caches count as both a write access and read access."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_PRF",
+        "BriefDescription": "This event counts fetch counted by either Lev=
el 1 data hardware prefetch or Level 1 data software prefetch."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_REFILL_PRF",
+        "BriefDescription": "This event counts hardware prefetch counted b=
y L1D_CACHE_PRF that causes a refill of the Level 1 data cache from outside=
 of the Level 1 data cache."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l1i_cache.jso=
n b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l1i_cache.json
new file mode 100644
index 000000000000..633f1030359d
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l1i_cache.json
@@ -0,0 +1,14 @@
+[
+    {
+        "ArchStdEvent": "L1I_CACHE_REFILL",
+        "PublicDescription": "Counts cache line refills in the level 1 ins=
truction cache caused by a missed instruction fetch. Instruction fetches ma=
y include accessing multiple instructions, but the single cache line alloca=
tion is counted once."
+    },
+    {
+        "ArchStdEvent": "L1I_CACHE",
+        "PublicDescription": "Counts instruction fetches which access the =
level 1 instruction cache. Instruction cache accesses caused by cache maint=
enance operations are not counted."
+    },
+    {
+        "ArchStdEvent": "L1I_CACHE_LMISS",
+        "PublicDescription": "Counts cache line refills into the level 1 i=
nstruction cache, that incurred additional latency."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l2_cache.json=
 b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l2_cache.json
new file mode 100644
index 000000000000..3806fef42b30
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l2_cache.json
@@ -0,0 +1,62 @@
+[
+    {
+        "ArchStdEvent": "L2D_CACHE",
+        "PublicDescription": "Counts accesses to the level 2 cache due to =
data accesses. Level 2 cache is a unified cache for data and instruction ac=
cesses. Accesses are for misses in the first level data cache or translatio=
n resolutions due to accesses. This event also counts write back of dirty d=
ata from level 1 data cache to the L2 cache."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_REFILL",
+        "PublicDescription": "Counts cache line refills into the level 2 c=
ache. Level 2 cache is a unified cache for data and instruction accesses. A=
ccesses are for misses in the level 1 data cache or translation resolutions=
 due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_WB",
+        "PublicDescription": "Counts write-backs of data from the L2 cache=
 to outside the CPU. This includes snoops to the L2 (from other CPUs) which=
 return data even if the snoops cause an invalidation. L2 cache line invali=
dations which do not write data outside the CPU and snoops which return dat=
a from an L1 cache are not counted. Data would not be written outside the c=
ache when invalidating a clean cache line."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_ALLOCATE",
+        "PublicDescription": "Counts level 2 cache line allocates that do =
not fetch data from outside the level 2 data or unified cache."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_RD",
+        "PublicDescription": "Counts level 2 data cache accesses due to me=
mory read operations. Level 2 cache is a unified cache for data and instruc=
tion accesses, accesses are for misses in the level 1 data cache or transla=
tion resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_WR",
+        "PublicDescription": "Counts level 2 cache accesses due to memory =
write operations. Level 2 cache is a unified cache for data and instruction=
 accesses, accesses are for misses in the level 1 data cache or translation=
 resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_REFILL_RD",
+        "PublicDescription": "Counts refills for memory accesses due to me=
mory read operation counted by L2D_CACHE_RD. Level 2 cache is a unified cac=
he for data and instruction accesses, accesses are for misses in the level =
1 data cache or translation resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_REFILL_WR",
+        "PublicDescription": "Counts refills for memory accesses due to me=
mory write operation counted by L2D_CACHE_WR. Level 2 cache is a unified ca=
che for data and instruction accesses, accesses are for misses in the level=
 1 data cache or translation resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_WB_VICTIM",
+        "PublicDescription": "Counts evictions from the level 2 cache beca=
use of a line being allocated into the L2 cache."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_WB_CLEAN",
+        "PublicDescription": "Counts write-backs from the level 2 cache th=
at are a result of either:\n\n1. Cache maintenance operations,\n\n2. Snoop =
responses or,\n\n3. Direct cache transfers to another CPU due to a forwardi=
ng snoop request."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_INVAL",
+        "PublicDescription": "Counts each explicit invalidation of a cache=
 line in the level 2 cache by cache maintenance operations that operate by =
a virtual address, or by external coherency operations. This event does not=
 count if either:\n\n1. A cache refill invalidates a cache line or,\n2. A C=
ache Maintenance Operation (CMO), which invalidates a cache line specified =
by set/way, is executed on that CPU.\n\nCMOs that operate by set/way cannot=
 be broadcast from one CPU to another."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_LMISS_RD",
+        "PublicDescription": "Counts cache line refills into the level 2 u=
nified cache from any memory read operations that incurred additional laten=
cy."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_RW",
+        "PublicDescription": "Counts level 2 cache demand accesses from an=
y load/store operations. Level 2 cache is a unified cache for data and inst=
ruction accesses, accesses are for misses in the level 1 data cache or tran=
slation resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_PRF",
+        "PublicDescription": "Counts level 2 data cache accesses from soft=
ware preload or prefetch instructions or hardware prefetcher."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_REFILL_PRF",
+        "PublicDescription": "Counts refills due to accesses generated as =
a result of prefetches."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l3_cache.json=
 b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l3_cache.json
new file mode 100644
index 000000000000..4a2e72fc5ada
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/l3_cache.json
@@ -0,0 +1,22 @@
+[
+    {
+        "ArchStdEvent": "L3D_CACHE_ALLOCATE",
+        "PublicDescription": "Counts level 3 cache line allocates that do =
not fetch data from outside the level 3 data or unified cache. For example,=
 allocates due to streaming stores."
+    },
+    {
+        "ArchStdEvent": "L3D_CACHE_REFILL",
+        "PublicDescription": "Counts level 3 accesses that receive data fr=
om outside the L3 cache."
+    },
+    {
+        "ArchStdEvent": "L3D_CACHE",
+        "PublicDescription": "Counts level 3 cache accesses. Level 3 cache=
 is a unified cache for data and instruction accesses. Accesses are for mis=
ses in the lower level caches or translation resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L3D_CACHE_RD",
+        "PublicDescription": "Counts level 3 cache accesses caused by any =
memory read operation. Level 3 cache is a unified cache for data and instru=
ction accesses. Accesses are for misses in the lower level caches or transl=
ation resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L3D_CACHE_LMISS_RD",
+        "PublicDescription": "Counts any cache line refill into the level =
3 cache from memory read operations that incurred additional latency."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/ll_cache.json=
 b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/ll_cache.json
new file mode 100644
index 000000000000..fd5a2e0099b8
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/ll_cache.json
@@ -0,0 +1,10 @@
+[
+    {
+        "ArchStdEvent": "LL_CACHE_RD",
+        "PublicDescription": "Counts read transactions that were returned =
from outside the core cluster. This event counts for external last level ca=
che when the system register CPUECTLR.EXTLLC bit is set, otherwise it coun=
ts for the L3 cache. This event counts read transactions returned from outs=
ide the core if those transactions are either hit in the system level cache=
 or missed in the SLC and are returned from any other external sources."
+    },
+    {
+        "ArchStdEvent": "LL_CACHE_MISS_RD",
+        "PublicDescription": "Counts read transactions that were returned =
from outside the core cluster but missed in the system level cache. This ev=
ent counts for external last level cache when the system register CPUECTLR.=
EXTLLC bit is set, otherwise it counts for L3 cache. This event counts read=
 transactions returned from outside the core if those transactions are miss=
ed in the system level cache. The data source of the transaction is indicat=
ed by a field in the CHI transaction returning to the CPU. This event does =
not count reads caused by cache maintenance operations."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/memory.json b=
/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/memory.json
new file mode 100644
index 000000000000..f19204a5faae
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/memory.json
@@ -0,0 +1,54 @@
+[
+    {
+        "ArchStdEvent": "MEM_ACCESS",
+        "PublicDescription": "Counts memory accesses issued by the CPU loa=
d store unit, where those accesses are issued due to load or store operatio=
ns. This event counts memory accesses no matter whether the data is receive=
d from any level of cache hierarchy or external memory. If memory accesses =
are broken up into smaller transactions than what were specified in the loa=
d or store instructions, then the event counts those smaller memory transac=
tions."
+    },
+    {
+        "ArchStdEvent": "REMOTE_ACCESS",
+        "PublicDescription": "Counts accesses to another chip, which is im=
plemented as a different CMN mesh in the system. If the CHI bus response ba=
ck to the core indicates that the data source is from another chip (mesh), =
then the counter is updated. If no data is returned, even if the system sno=
ops another chip/mesh, then the counter is not updated."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_RD",
+        "PublicDescription": "Counts memory accesses issued by the CPU due=
 to load operations. The event counts any memory load access, no matter whe=
ther the data is received from any level of cache hierarchy or external mem=
ory. The event also counts atomic load operations. If memory accesses are b=
roken up by the load/store unit into smaller transactions that are issued b=
y the bus interface, then the event counts those smaller transactions."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_WR",
+        "PublicDescription": "Counts memory accesses issued by the CPU due=
 to store operations. The event counts any memory store access, no matter w=
hether the data is located in any level of cache or external memory. The ev=
ent also counts atomic load and store operations. If memory accesses are br=
oken up by the load/store unit into smaller transactions that are issued by=
 the bus interface, then the event counts those smaller transactions."
+    },
+    {
+        "ArchStdEvent": "LDST_ALIGN_LAT",
+        "PublicDescription": "Counts the number of memory read and write a=
ccesses in a cycle that incurred additional latency, due to the alignment o=
f the address and the size of data being accessed, which results in store c=
rossing a single cache line."
+    },
+    {
+        "ArchStdEvent": "LD_ALIGN_LAT",
+        "PublicDescription": "Counts the number of memory read accesses in=
 a cycle that incurred additional latency, due to the alignment of the addr=
ess and size of data being accessed, which results in load crossing a singl=
e cache line."
+    },
+    {
+        "ArchStdEvent": "ST_ALIGN_LAT",
+        "PublicDescription": "Counts the number of memory write accesses i=
n a cycle that incurred additional latency, due to the alignment of the add=
ress and size of data being accessed."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_CHECKED",
+        "PublicDescription": "Counts the number of memory read and write a=
ccesses counted by MEM_ACCESS that are tag checked by the Memory Tagging Ex=
tension (MTE). This event is implemented as the sum of MEM_ACCESS_CHECKED_R=
D and MEM_ACCESS_CHECKED_WR."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_CHECKED_RD",
+        "PublicDescription": "Counts the number of memory read accesses in=
 a cycle that are tag checked by the Memory Tagging Extension (MTE)."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_CHECKED_WR",
+        "PublicDescription": "Counts the number of memory write accesses i=
n a cycle that are tag checked by the Memory Tagging Extension (MTE)."
+    },
+    {
+        "ArchStdEvent": "INST_FETCH_PERCYC",
+        "PublicDescription": "Counts the number of instruction fetches out=
standing per cycle, which will provide an average latency of instruction =
fetch."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_RD_PERCYC",
+        "PublicDescription": "Counts the number of outstanding loads or me=
mory read accesses per cycle."
+    },
+    {
+        "ArchStdEvent": "INST_FETCH",
+        "PublicDescription": "Counts instruction memory accesses that the =
PE makes."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/metrics.json =
b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/metrics.json
new file mode 100644
index 000000000000..d8e8b5155cfa
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/metrics.json
@@ -0,0 +1,436 @@
+[
+    {
+        "ArchStdEvent": "backend_bound"
+    },
+    {
+        "MetricName": "backend_busy_bound",
+        "MetricExpr": "STALL_BACKEND_BUSY / STALL_BACKEND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to issue queues being too full to accept operat=
ions for execution.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_cache_l1d_bound",
+        "MetricExpr": "STALL_BACKEND_L1D / (STALL_BACKEND_L1D + STALL_BACK=
END_MEM) * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to memory access latency issues caused by level=
 1 data cache misses.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_cache_l2d_bound",
+        "MetricExpr": "STALL_BACKEND_MEM / (STALL_BACKEND_L1D + STALL_BACK=
END_MEM) * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to memory access latency issues caused by level=
 2 data cache misses.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_core_bound",
+        "MetricExpr": "STALL_BACKEND_CPUBOUND / STALL_BACKEND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to backend core resource constraints not relate=
d to instruction fetch latency issues caused by memory access components.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_core_rename_bound",
+        "MetricExpr": "STALL_BACKEND_RENAME / STALL_BACKEND_CPUBOUND * 100=
",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend as the rename unit registers are unavailable.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_mem_bound",
+        "MetricExpr": "STALL_BACKEND_MEMBOUND / STALL_BACKEND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to backend core resource constraints related to=
 memory access latency issues caused by memory access components.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_mem_cache_bound",
+        "MetricExpr": "(STALL_BACKEND_L1D + STALL_BACKEND_MEM) / STALL_BAC=
KEND_MEMBOUND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to memory latency issues caused by data cache m=
isses.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_mem_store_bound",
+        "MetricExpr": "STALL_BACKEND_ST / STALL_BACKEND_MEMBOUND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to memory write pending caused by stores stalle=
d in the pre-commit stage.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_mem_tlb_bound",
+        "MetricExpr": "STALL_BACKEND_TLB / STALL_BACKEND_MEMBOUND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to memory access latency issues caused by data =
TLB misses.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_stalled_cycles",
+        "MetricExpr": "STALL_BACKEND / CPU_CYCLES * 100",
+        "BriefDescription": "This metric is the percentage of cycles that =
were stalled due to resource constraints in the backend unit of the process=
or.",
+        "MetricGroup": "Cycle_Accounting",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "ArchStdEvent": "bad_speculation",
+        "MetricExpr": "(1 - STALL_SLOT / (10 * CPU_CYCLES)) * (1 - OP_RETI=
RED / OP_SPEC) * 100 + STALL_FRONTEND_FLUSH / CPU_CYCLES * 100"
+    },
+    {
+        "MetricName": "barrier_percentage",
+        "MetricExpr": "(ISB_SPEC + DSB_SPEC + DMB_SPEC) / INST_SPEC * 100",
+        "BriefDescription": "This metric measures instruction and data bar=
rier operations as a percentage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "branch_direct_ratio",
+        "MetricExpr": "BR_IMMED_RETIRED / BR_RETIRED",
+        "BriefDescription": "This metric measures the ratio of direct bran=
ches retired to the total number of branches architecturally executed.",
+        "MetricGroup": "Branch_Effectiveness",
+        "ScaleUnit": "1per branch"
+    },
+    {
+        "MetricName": "branch_indirect_ratio",
+        "MetricExpr": "BR_IND_RETIRED / BR_RETIRED",
+        "BriefDescription": "This metric measures the ratio of indirect br=
anches retired, including function returns, to the total number of branches=
 architecturally executed.",
+        "MetricGroup": "Branch_Effectiveness",
+        "ScaleUnit": "1per branch"
+    },
+    {
+        "MetricName": "branch_misprediction_ratio",
+        "MetricExpr": "BR_MIS_PRED_RETIRED / BR_RETIRED",
+        "BriefDescription": "This metric measures the ratio of branches mi=
spredicted to the total number of branches architecturally executed. This g=
ives an indication of the effectiveness of the branch prediction unit.",
+        "MetricGroup": "Miss_Ratio;Branch_Effectiveness",
+        "ScaleUnit": "100percent of branches"
+    },
+    {
+        "MetricName": "branch_mpki",
+        "MetricExpr": "BR_MIS_PRED_RETIRED / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of branch mis=
predictions per thousand instructions executed.",
+        "MetricGroup": "MPKI;Branch_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "branch_return_ratio",
+        "MetricExpr": "BR_RETURN_RETIRED / BR_RETIRED",
+        "BriefDescription": "This metric measures the ratio of branches re=
tired that are function returns to the total number of branches architectur=
ally executed.",
+        "MetricGroup": "Branch_Effectiveness",
+        "ScaleUnit": "1per branch"
+    },
+    {
+        "MetricName": "crypto_percentage",
+        "MetricExpr": "CRYPTO_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures crypto operations as a p=
ercentage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "dtlb_mpki",
+        "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of data TLB W=
alks per thousand instructions executed.",
+        "MetricGroup": "MPKI;DTLB_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "dtlb_walk_ratio",
+        "MetricExpr": "DTLB_WALK / L1D_TLB",
+        "BriefDescription": "This metric measures the ratio of data TLB Wa=
lks to the total number of data TLB accesses. This gives an indication of t=
he effectiveness of the data TLB accesses.",
+        "MetricGroup": "Miss_Ratio;DTLB_Effectiveness",
+        "ScaleUnit": "100percent of TLB accesses"
+    },
+    {
+        "MetricName": "fp16_percentage",
+        "MetricExpr": "FP_HP_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures half-precision floating =
point operations as a percentage of operations speculatively executed.",
+        "MetricGroup": "FP_Precision_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "fp32_percentage",
+        "MetricExpr": "FP_SP_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures single-precision floatin=
g point operations as a percentage of operations speculatively executed.",
+        "MetricGroup": "FP_Precision_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "fp64_percentage",
+        "MetricExpr": "FP_DP_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures double-precision floatin=
g point operations as a percentage of operations speculatively executed.",
+        "MetricGroup": "FP_Precision_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "fp_ops_per_cycle",
+        "MetricExpr": "(FP_SCALE_OPS_SPEC + FP_FIXED_OPS_SPEC) / CPU_CYCLE=
S",
+        "BriefDescription": "This metric measures floating point operation=
s per cycle in any precision performed by any instruction. Operations are c=
ounted by computation and by vector lanes; for example, fused computations =
such as multiply-add count twice per vector lane.",
+        "MetricGroup": "FP_Arithmetic_Intensity",
+        "ScaleUnit": "1operations per cycle"
+    },
+    {
+        "MetricName": "frontend_cache_l1i_bound",
+        "MetricExpr": "STALL_FRONTEND_L1I / (STALL_FRONTEND_L1I + STALL_FR=
ONTEND_MEM) * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to memory access latency issues caused by leve=
l 1 instruction cache misses.",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_cache_l2i_bound",
+        "MetricExpr": "STALL_FRONTEND_MEM / (STALL_FRONTEND_L1I + STALL_FR=
ONTEND_MEM) * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to memory access latency issues caused by leve=
l 2 instruction cache misses.",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_core_bound",
+        "MetricExpr": "STALL_FRONTEND_CPUBOUND / STALL_FRONTEND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to frontend core resource constraints not rela=
ted to instruction fetch latency issues caused by memory access components.=
",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_core_flush_bound",
+        "MetricExpr": "STALL_FRONTEND_FLUSH / STALL_FRONTEND_CPUBOUND * 10=
0",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend as the processor is recovering from a pipeline flu=
sh caused by bad speculation or other machine resteers.",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_mem_bound",
+        "MetricExpr": "STALL_FRONTEND_MEMBOUND / STALL_FRONTEND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to frontend core resource constraints related =
to the instruction fetch latency issues caused by memory access components.=
",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_mem_cache_bound",
+        "MetricExpr": "(STALL_FRONTEND_L1I + STALL_FRONTEND_MEM) / STALL_F=
RONTEND_MEMBOUND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to instruction fetch latency issues caused by =
instruction cache misses.",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_mem_tlb_bound",
+        "MetricExpr": "STALL_FRONTEND_TLB / STALL_FRONTEND_MEMBOUND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to instruction fetch latency issues caused by =
instruction TLB misses.",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_stalled_cycles",
+        "MetricExpr": "STALL_FRONTEND / CPU_CYCLES * 100",
+        "BriefDescription": "This metric is the percentage of cycles that =
were stalled due to resource constraints in the frontend unit of the proces=
sor.",
+        "MetricGroup": "Cycle_Accounting",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "integer_dp_percentage",
+        "MetricExpr": "DP_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures scalar integer operation=
s as a percentage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "ipc",
+        "MetricExpr": "INST_RETIRED / CPU_CYCLES",
+        "BriefDescription": "This metric measures the number of instructio=
ns retired per cycle.",
+        "MetricGroup": "General",
+        "ScaleUnit": "1per cycle"
+    },
+    {
+        "MetricName": "itlb_mpki",
+        "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of instructio=
n TLB Walks per thousand instructions executed.",
+        "MetricGroup": "MPKI;ITLB_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "itlb_walk_ratio",
+        "MetricExpr": "ITLB_WALK / L1I_TLB",
+        "BriefDescription": "This metric measures the ratio of instruction=
 TLB Walks to the total number of instruction TLB accesses. This gives an i=
ndication of the effectiveness of the instruction TLB accesses.",
+        "MetricGroup": "Miss_Ratio;ITLB_Effectiveness",
+        "ScaleUnit": "100percent of TLB accesses"
+    },
+    {
+        "MetricName": "l1d_cache_miss_ratio",
+        "MetricExpr": "L1D_CACHE_REFILL / L1D_CACHE",
+        "BriefDescription": "This metric measures the ratio of level 1 dat=
a cache accesses missed to the total number of level 1 data cache accesses.=
 This gives an indication of the effectiveness of the level 1 data cache.",
+        "MetricGroup": "Miss_Ratio;L1D_Cache_Effectiveness",
+        "ScaleUnit": "100percent of cache accesses"
+    },
+    {
+        "MetricName": "l1d_cache_mpki",
+        "MetricExpr": "L1D_CACHE_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 1 da=
ta cache accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;L1D_Cache_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "l1d_tlb_miss_ratio",
+        "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
+        "BriefDescription": "This metric measures the ratio of level 1 dat=
a TLB accesses missed to the total number of level 1 data TLB accesses. Thi=
s gives an indication of the effectiveness of the level 1 data TLB.",
+        "MetricGroup": "Miss_Ratio;DTLB_Effectiveness",
+        "ScaleUnit": "100percent of TLB accesses"
+    },
+    {
+        "MetricName": "l1d_tlb_mpki",
+        "MetricExpr": "L1D_TLB_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 1 da=
ta TLB accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;DTLB_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "l1i_cache_miss_ratio",
+        "MetricExpr": "L1I_CACHE_REFILL / L1I_CACHE",
+        "BriefDescription": "This metric measures the ratio of level 1 ins=
truction cache accesses missed to the total number of level 1 instruction c=
ache accesses. This gives an indication of the effectiveness of the level 1=
 instruction cache.",
+        "MetricGroup": "Miss_Ratio;L1I_Cache_Effectiveness",
+        "ScaleUnit": "100percent of cache accesses"
+    },
+    {
+        "MetricName": "l1i_cache_mpki",
+        "MetricExpr": "L1I_CACHE_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 1 in=
struction cache accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;L1I_Cache_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "l1i_tlb_miss_ratio",
+        "MetricExpr": "L1I_TLB_REFILL / L1I_TLB",
+        "BriefDescription": "This metric measures the ratio of level 1 ins=
truction TLB accesses missed to the total number of level 1 instruction TLB=
 accesses. This gives an indication of the effectiveness of the level 1 ins=
truction TLB.",
+        "MetricGroup": "Miss_Ratio;ITLB_Effectiveness",
+        "ScaleUnit": "100percent of TLB accesses"
+    },
+    {
+        "MetricName": "l1i_tlb_mpki",
+        "MetricExpr": "L1I_TLB_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 1 in=
struction TLB accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;ITLB_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "l2_cache_miss_ratio",
+        "MetricExpr": "L2D_CACHE_REFILL / L2D_CACHE",
+        "BriefDescription": "This metric measures the ratio of level 2 cac=
he accesses missed to the total number of level 2 cache accesses. This give=
s an indication of the effectiveness of the level 2 cache, which is a unifi=
ed cache that stores both data and instruction. Note that cache accesses in=
 this cache are either data memory access or instruction fetch as this is a=
 unified cache.",
+        "MetricGroup": "Miss_Ratio;L2_Cache_Effectiveness",
+        "ScaleUnit": "100percent of cache accesses"
+    },
+    {
+        "MetricName": "l2_cache_mpki",
+        "MetricExpr": "L2D_CACHE_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 2 un=
ified cache accesses missed per thousand instructions executed. Note that c=
ache accesses in this cache are either data memory access or instruction fe=
tch as this is a unified cache.",
+        "MetricGroup": "MPKI;L2_Cache_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "l2_tlb_miss_ratio",
+        "MetricExpr": "L2D_TLB_REFILL / L2D_TLB",
+        "BriefDescription": "This metric measures the ratio of level 2 uni=
fied TLB accesses missed to the total number of level 2 unified TLB accesse=
s. This gives an indication of the effectiveness of the level 2 TLB.",
+        "MetricGroup": "Miss_Ratio;ITLB_Effectiveness;DTLB_Effectiveness",
+        "ScaleUnit": "100percent of TLB accesses"
+    },
+    {
+        "MetricName": "l2_tlb_mpki",
+        "MetricExpr": "L2D_TLB_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 2 un=
ified TLB accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;ITLB_Effectiveness;DTLB_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "ll_cache_read_hit_ratio",
+        "MetricExpr": "(LL_CACHE_RD - LL_CACHE_MISS_RD) / LL_CACHE_RD",
+        "BriefDescription": "This metric measures the ratio of last level =
cache read accesses hit in the cache to the total number of last level cach=
e accesses. This gives an indication of the effectiveness of the last level=
 cache for read traffic. Note that cache accesses in this cache are either =
data memory access or instruction fetch as this is a system level cache.",
+        "MetricGroup": "LL_Cache_Effectiveness",
+        "ScaleUnit": "100percent of cache accesses"
+    },
+    {
+        "MetricName": "ll_cache_read_miss_ratio",
+        "MetricExpr": "LL_CACHE_MISS_RD / LL_CACHE_RD",
+        "BriefDescription": "This metric measures the ratio of last level =
cache read accesses missed to the total number of last level cache accesses=
. This gives an indication of the effectiveness of the last level cache for=
 read traffic. Note that cache accesses in this cache are either data memor=
y access or instruction fetch as this is a system level cache.",
+        "MetricGroup": "Miss_Ratio;LL_Cache_Effectiveness",
+        "ScaleUnit": "100percent of cache accesses"
+    },
+    {
+        "MetricName": "ll_cache_read_mpki",
+        "MetricExpr": "LL_CACHE_MISS_RD / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of last level=
 cache read accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;LL_Cache_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "load_percentage",
+        "MetricExpr": "LD_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures load operations as a per=
centage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "nonsve_fp_ops_per_cycle",
+        "MetricExpr": "FP_FIXED_OPS_SPEC / CPU_CYCLES",
+        "BriefDescription": "This metric measures floating point operation=
s per cycle in any precision performed by an instruction that is not an SVE=
 instruction. Operations are counted by computation and by vector lanes; fo=
r example, fused computations such as multiply-add count twice per vector l=
ane.",
+        "MetricGroup": "FP_Arithmetic_Intensity",
+        "ScaleUnit": "1operations per cycle"
+    },
+    {
+        "MetricName": "scalar_fp_percentage",
+        "MetricExpr": "VFP_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures scalar floating point op=
erations as a percentage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "simd_percentage",
+        "MetricExpr": "ASE_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures advanced SIMD operations=
 as a percentage of total operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "store_percentage",
+        "MetricExpr": "ST_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures store operations as a pe=
rcentage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "sve_all_percentage",
+        "MetricExpr": "SVE_INST_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures scalable vector operatio=
ns, including loads and stores, as a percentage of operations speculatively=
 executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "sve_fp_ops_per_cycle",
+        "MetricExpr": "FP_SCALE_OPS_SPEC / CPU_CYCLES",
+        "BriefDescription": "This metric measures floating point operation=
s per cycle in any precision performed by SVE instructions. Operations are =
counted by computation and by vector lanes; for example, fused computation=
s such as multiply-add count twice per vector lane.",
+        "MetricGroup": "FP_Arithmetic_Intensity",
+        "ScaleUnit": "1operations per cycle"
+    },
+    {
+        "MetricName": "sve_predicate_empty_percentage",
+        "MetricExpr": "SVE_PRED_EMPTY_SPEC / SVE_PRED_SPEC * 100",
+        "BriefDescription": "This metric measures scalable vector operatio=
ns with no active predicates as a percentage of SVE predicated operations s=
peculatively executed.",
+        "MetricGroup": "SVE_Effectiveness",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "sve_predicate_full_percentage",
+        "MetricExpr": "SVE_PRED_FULL_SPEC / SVE_PRED_SPEC * 100",
+        "BriefDescription": "This metric measures scalable vector operatio=
ns with all active predicates as a percentage of SVE predicated operations =
speculatively executed.",
+        "MetricGroup": "SVE_Effectiveness",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "sve_predicate_partial_percentage",
+        "MetricExpr": "SVE_PRED_PARTIAL_SPEC / SVE_PRED_SPEC * 100",
+        "BriefDescription": "This metric measures scalable vector operatio=
ns with at least one active predicate as a percentage of SVE predicated ope=
rations speculatively executed.",
+        "MetricGroup": "SVE_Effectiveness",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "sve_predicate_percentage",
+        "MetricExpr": "SVE_PRED_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures scalable vector operatio=
ns with predicates as a percentage of operations speculatively executed.",
+        "MetricGroup": "SVE_Effectiveness",
+        "ScaleUnit": "1percent of operations"
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/pmu.json b/to=
ols/perf/pmu-events/arch/arm64/arm/cortex-a720/pmu.json
new file mode 100644
index 000000000000..d8b7b9f9e5fa
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/pmu.json
@@ -0,0 +1,8 @@
+[
+    {
+        "ArchStdEvent": "PMU_OVFS"
+    },
+    {
+        "ArchStdEvent": "PMU_HOVFS"
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/retired.json =
b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/retired.json
new file mode 100644
index 000000000000..69f9a0b0c7ff
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/retired.json
@@ -0,0 +1,90 @@
+[
+    {
+        "ArchStdEvent": "SW_INCR",
+        "PublicDescription": "Counts software writes to the PMSWINC_EL0 (s=
oftware PMU increment) register. The PMSWINC_EL0 register is a manually upd=
ated counter for use by application software.\n\nThis event could be used t=
o measure any user program event, such as accesses to a particular data str=
ucture (by writing to the PMSWINC_EL0 register each time the data structure=
 is accessed).\n\nTo use the PMSWINC_EL0 register and event, developers mus=
t insert instructions that write to the PMSWINC_EL0 register into the sourc=
e code.\n\nSince the SW_INCR event records writes to the PMSWINC_EL0 regist=
er, there is no need to do a read/increment/write sequence to the PMSWINC_E=
L0 register."
+    },
+    {
+        "ArchStdEvent": "INST_RETIRED",
+        "PublicDescription": "Counts instructions that have been architect=
urally executed."
+    },
+    {
+        "ArchStdEvent": "CID_WRITE_RETIRED",
+        "PublicDescription": "Counts architecturally executed writes to th=
e CONTEXTIDR_EL1 register, which usually contains the kernel PID and can =
be output with hardware trace."
+    },
+    {
+        "ArchStdEvent": "PC_WRITE_RETIRED",
+        "PublicDescription": "Counts branch instructions that caused a cha=
nge of Program Counter, which effectively causes a change in the control fl=
ow of the program."
+    },
+    {
+        "ArchStdEvent": "BR_IMMED_RETIRED",
+        "PublicDescription": "Counts architecturally executed direct branc=
hes."
+    },
+    {
+        "ArchStdEvent": "BR_RETURN_RETIRED",
+        "PublicDescription": "Counts architecturally executed procedure re=
turns."
+    },
+    {
+        "ArchStdEvent": "TTBR_WRITE_RETIRED",
+        "PublicDescription": "Counts architectural writes to TTBR0/1_EL1. =
If virtualization host extensions are enabled (by setting the HCR_EL2.E2H b=
it to 1), then accesses to TTBR0/1_EL1 that are redirected to TTBR0/1_EL2, =
or accesses to TTBR0/1_EL12, are counted. TTBRn registers are typically upd=
ated when the kernel is swapping user-space threads or applications."
+    },
+    {
+        "ArchStdEvent": "BR_RETIRED",
+        "PublicDescription": "Counts architecturally executed branches, wh=
ether the branch is taken or not. Instructions that explicitly write to the=
 PC are also counted. Note that exception generating instructions, exceptio=
n return instructions and context synchronization instructions are not coun=
ted."
+    },
+    {
+        "ArchStdEvent": "BR_MIS_PRED_RETIRED",
+        "PublicDescription": "Counts branches counted by BR_RETIRED which =
were mispredicted and caused a pipeline flush."
+    },
+    {
+        "ArchStdEvent": "OP_RETIRED",
+        "PublicDescription": "Counts micro-operations that are architectur=
ally executed. This is a count of the number of micro-operations retired fr=
om the commit queue in a single cycle."
+    },
+    {
+        "ArchStdEvent": "BR_IMMED_TAKEN_RETIRED",
+        "PublicDescription": "Counts architecturally executed immediate br=
anches that were taken."
+    },
+    {
+        "ArchStdEvent": "BR_INDNR_TAKEN_RETIRED",
+        "PublicDescription": "Counts architecturally executed indirect bra=
nches excluding procedure returns that were taken."
+    },
+    {
+        "ArchStdEvent": "BR_IMMED_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed direct branc=
hes that were correctly predicted."
+    },
+    {
+        "ArchStdEvent": "BR_IMMED_MIS_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed direct branc=
hes that were mispredicted and caused a pipeline flush."
+    },
+    {
+        "ArchStdEvent": "BR_IND_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed indirect bra=
nches including procedure returns that were correctly predicted."
+    },
+    {
+        "ArchStdEvent": "BR_IND_MIS_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed indirect bra=
nches including procedure returns that were mispredicted and caused a pipel=
ine flush."
+    },
+    {
+        "ArchStdEvent": "BR_RETURN_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed procedure re=
turns that were correctly predicted."
+    },
+    {
+        "ArchStdEvent": "BR_RETURN_MIS_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed procedure re=
turns that were mispredicted and caused a pipeline flush."
+    },
+    {
+        "ArchStdEvent": "BR_INDNR_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed indirect bra=
nches excluding procedure returns that were correctly predicted."
+    },
+    {
+        "ArchStdEvent": "BR_INDNR_MIS_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed indirect bra=
nches excluding procedure returns that were mispredicted and caused a pipel=
ine flush."
+    },
+    {
+        "ArchStdEvent": "BR_PRED_RETIRED",
+        "PublicDescription": "Counts branch instructions counted by BR_RET=
IRED which were correctly predicted."
+    },
+    {
+        "ArchStdEvent": "BR_IND_RETIRED",
+        "PublicDescription": "Counts architecturally executed indirect bra=
nches including procedure returns."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/spe.json b/to=
ols/perf/pmu-events/arch/arm64/arm/cortex-a720/spe.json
new file mode 100644
index 000000000000..ca0217fa4681
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/spe.json
@@ -0,0 +1,42 @@
+[
+    {
+        "ArchStdEvent": "SAMPLE_POP",
+        "PublicDescription": "Counts statistical profiling sample populati=
on, the count of all operations that could be sampled but may or may not be=
 chosen for sampling."
+    },
+    {
+        "ArchStdEvent": "SAMPLE_FEED",
+        "PublicDescription": "Counts statistical profiling samples taken f=
or sampling."
+    },
+    {
+        "ArchStdEvent": "SAMPLE_FILTRATE",
+        "PublicDescription": "Counts statistical profiling samples taken w=
hich are not removed by filtering."
+    },
+    {
+        "ArchStdEvent": "SAMPLE_COLLISION",
+        "PublicDescription": "Counts statistical profiling samples that ha=
ve collided with a previous sample and were therefore not taken."
+    },
+    {
+        "ArchStdEvent": "SAMPLE_FEED_BR",
+        "PublicDescription": "Counts statistical profiling samples taken w=
hich are branches."
+    },
+    {
+        "ArchStdEvent": "SAMPLE_FEED_LD",
+        "PublicDescription": "Counts statistical profiling samples taken w=
hich are loads or load atomic operations."
+    },
+    {
+        "ArchStdEvent": "SAMPLE_FEED_ST",
+        "PublicDescription": "Counts statistical profiling samples taken w=
hich are stores or store atomic operations."
+    },
+    {
+        "ArchStdEvent": "SAMPLE_FEED_OP",
+        "PublicDescription": "Counts statistical profiling samples taken w=
hich are matching any operation type filters supported."
+    },
+    {
+        "ArchStdEvent": "SAMPLE_FEED_EVENT",
+        "PublicDescription": "Counts statistical profiling samples taken w=
hich are matching event packet filter constraints."
+    },
+    {
+        "ArchStdEvent": "SAMPLE_FEED_LAT",
+        "PublicDescription": "Counts statistical profiling samples taken w=
hich are exceeding minimum latency set by operation latency filter constrai=
nts."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/spec_operatio=
n.json b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/spec_operation.js=
on
new file mode 100644
index 000000000000..f91eb18d683c
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/spec_operation.json
@@ -0,0 +1,90 @@
+[
+    {
+        "ArchStdEvent": "BR_MIS_PRED",
+        "PublicDescription": "Counts branches which are speculatively exec=
uted and mispredicted."
+    },
+    {
+        "ArchStdEvent": "BR_PRED",
+        "PublicDescription": "Counts all speculatively executed branches."
+    },
+    {
+        "ArchStdEvent": "INST_SPEC",
+        "PublicDescription": "Counts operations that have been speculative=
ly executed."
+    },
+    {
+        "ArchStdEvent": "OP_SPEC",
+        "PublicDescription": "Counts micro-operations speculatively execut=
ed. This is the count of the number of micro-operations dispatched in a cyc=
le."
+    },
+    {
+        "ArchStdEvent": "STREX_FAIL_SPEC",
+        "PublicDescription": "Counts store-exclusive operations that have =
been speculatively executed and have not successfully completed the store o=
peration."
+    },
+    {
+        "ArchStdEvent": "STREX_SPEC",
+        "PublicDescription": "Counts store-exclusive operations that have =
been speculatively executed."
+    },
+    {
+        "ArchStdEvent": "LD_SPEC",
+        "PublicDescription": "Counts speculatively executed load operation=
s including Single Instruction Multiple Data (SIMD) load operations."
+    },
+    {
+        "ArchStdEvent": "ST_SPEC",
+        "PublicDescription": "Counts speculatively executed store operatio=
ns including Single Instruction Multiple Data (SIMD) store operations."
+    },
+    {
+        "ArchStdEvent": "DP_SPEC",
+        "PublicDescription": "Counts speculatively executed logical or ari=
thmetic instructions such as MOV/MVN operations."
+    },
+    {
+        "ArchStdEvent": "ASE_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
operations excluding load, store and move micro-operations that move data t=
o or from SIMD (vector) registers."
+    },
+    {
+        "ArchStdEvent": "VFP_SPEC",
+        "PublicDescription": "Counts speculatively executed floating point=
 operations. This event does not count operations that move data to or from=
 floating point (vector) registers."
+    },
+    {
+        "ArchStdEvent": "PC_WRITE_SPEC",
+        "PublicDescription": "Counts speculatively executed operations whi=
ch cause software changes of the PC. Those operations include all taken bra=
nch operations."
+    },
+    {
+        "ArchStdEvent": "CRYPTO_SPEC",
+        "PublicDescription": "Counts speculatively executed cryptographic =
operations except for PMULL and VMULL operations."
+    },
+    {
+        "ArchStdEvent": "ISB_SPEC",
+        "PublicDescription": "Counts ISB operations that are executed."
+    },
+    {
+        "ArchStdEvent": "DSB_SPEC",
+        "PublicDescription": "Counts DSB operations that are speculatively=
 issued to Load/Store unit in the CPU."
+    },
+    {
+        "ArchStdEvent": "DMB_SPEC",
+        "PublicDescription": "Counts DMB operations that are speculatively=
 issued to the Load/Store unit in the CPU. This event does not count implie=
d barriers from load acquire/store release operations."
+    },
+    {
+        "ArchStdEvent": "RC_LD_SPEC",
+        "PublicDescription": "Counts any load acquire operations that are =
speculatively executed. For example: LDAR, LDARH, LDARB."
+    },
+    {
+        "ArchStdEvent": "RC_ST_SPEC",
+        "PublicDescription": "Counts any store release operations that are=
 speculatively executed. For example: STLR, STLRH, STLRB."
+    },
+    {
+        "ArchStdEvent": "ASE_INST_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
operations."
+    },
+    {
+        "ArchStdEvent": "CAS_NEAR_PASS",
+        "PublicDescription": "Counts compare and swap instructions that ex=
ecuted locally to the PE and updated the location accessed."
+    },
+    {
+        "ArchStdEvent": "CAS_NEAR_SPEC",
+        "PublicDescription": "Counts compare and swap instructions that ex=
ecuted locally to the PE."
+    },
+    {
+        "ArchStdEvent": "CAS_FAR_SPEC",
+        "PublicDescription": "Counts compare and swap instructions that di=
d not execute locally to the PE."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/stall.json b/=
tools/perf/pmu-events/arch/arm64/arm/cortex-a720/stall.json
new file mode 100644
index 000000000000..b1eae21bac07
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/stall.json
@@ -0,0 +1,82 @@
+[
+    {
+        "ArchStdEvent": "STALL_FRONTEND",
+        "PublicDescription": "Counts cycles when frontend could not send a=
ny micro-operations to the rename stage because of frontend resource stalls=
 caused by fetch memory latency or branch prediction flow stalls. STALL_FRO=
NTEND_SLOTS counts SLOTS during the cycle when this event counts."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND",
+        "PublicDescription": "Counts cycles whenever the rename unit is un=
able to send any micro-operations to the backend of the pipeline because of=
 backend resource constraints. Backend resource constraints can include iss=
ue stage fullness, execution stage fullness, or other internal pipeline res=
ource fullness. All the backend slots were empty during the cycle when this=
 event counts."
+    },
+    {
+        "ArchStdEvent": "STALL",
+        "PublicDescription": "Counts cycles when no operations are sent to=
 the rename unit from the frontend or from the rename unit to the backend f=
or any reason (either frontend or backend stall). This event is the sum of =
STALL_FRONTEND and STALL_BACKEND."
+    },
+    {
+        "ArchStdEvent": "STALL_SLOT_BACKEND",
+        "PublicDescription": "Counts slots per cycle in which no operation=
s are sent from the rename unit to the backend due to backend resource cons=
traints. STALL_BACKEND counts during the cycle when STALL_SLOT_BACKEND coun=
ts at least 1."
+    },
+    {
+        "ArchStdEvent": "STALL_SLOT_FRONTEND",
+        "PublicDescription": "Counts slots per cycle in which no operation=
s are sent to the rename unit from the frontend due to frontend resource co=
nstraints."
+    },
+    {
+        "ArchStdEvent": "STALL_SLOT",
+        "PublicDescription": "Counts slots per cycle in which no operation=
s are sent to the rename unit from the frontend or from the rename unit to =
the backend for any reason (either frontend or backend stall). STALL_SLOT i=
s the sum of STALL_SLOT_FRONTEND and STALL_SLOT_BACKEND."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_MEM",
+        "PublicDescription": "Counts cycles when the backend is stalled be=
cause there is a pending demand load request in progress in the last level =
core cache."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_MEMBOUND",
+        "PublicDescription": "Counts cycles when the frontend could not se=
nd any micro-operations to the rename stage due to resource constraints in =
the memory resources."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_L1I",
+        "PublicDescription": "Counts cycles when the frontend is stalled b=
ecause there is an instruction fetch request pending in the level 1 instruc=
tion cache."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_MEM",
+        "PublicDescription": "Counts cycles when the frontend is stalled b=
ecause there is an instruction fetch request pending in the last level core=
 cache."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_TLB",
+        "PublicDescription": "Counts cycles when the frontend is stalled o=
n any TLB misses being handled. This event also counts the TLB accesses mad=
e by hardware prefetches."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_CPUBOUND",
+        "PublicDescription": "Counts cycles when the frontend could not se=
nd any micro-operations to the rename stage due to resource constraints in =
the CPU resources excluding memory resources."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_FLUSH",
+        "PublicDescription": "Counts cycles when the frontend could not se=
nd any micro-operations to the rename stage as the frontend is recovering f=
rom a machine flush or resteer. Example scenarios that cause a flush includ=
e branch mispredictions, taken exceptions, micro-architectural flush etc."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_MEMBOUND",
+        "PublicDescription": "Counts cycles when the backend could not acc=
ept any micro-operations due to resource constraints in the memory resource=
s."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_L1D",
+        "PublicDescription": "Counts cycles when the backend is stalled be=
cause there is a pending demand load request in progress in the level 1 dat=
a cache."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_TLB",
+        "PublicDescription": "Counts cycles when the backend is stalled on=
 any demand TLB misses being handled."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_ST",
+        "PublicDescription": "Counts cycles when the backend is stalled an=
d there is a store that has not reached the pre-commit stage."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_CPUBOUND",
+        "PublicDescription": "Counts cycles when the backend could not acc=
ept any micro-operations due to any resource constraints in the CPU excludi=
ng memory resources."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_BUSY",
+        "PublicDescription": "Counts cycles when the backend could not acc=
ept any micro-operations because the issue queues are too full to take any =
operations for execution."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_RENAME",
+        "PublicDescription": "Counts cycles when backend is stalled even w=
hen operations are available from the frontend but at least one is not read=
y to be sent to the backend because no rename register is available."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/sve.json b/to=
ols/perf/pmu-events/arch/arm64/arm/cortex-a720/sve.json
new file mode 100644
index 000000000000..51dab48cb2ba
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/sve.json
@@ -0,0 +1,50 @@
+[
+    {
+        "ArchStdEvent": "SVE_INST_SPEC",
+        "PublicDescription": "Counts speculatively executed operations tha=
t are SVE operations."
+    },
+    {
+        "ArchStdEvent": "SVE_PRED_SPEC",
+        "PublicDescription": "Counts speculatively executed predicated SVE=
 operations."
+    },
+    {
+        "ArchStdEvent": "SVE_PRED_EMPTY_SPEC",
+        "PublicDescription": "Counts speculatively executed predicated SVE=
 operations with no active predicate elements."
+    },
+    {
+        "ArchStdEvent": "SVE_PRED_FULL_SPEC",
+        "PublicDescription": "Counts speculatively executed predicated SVE=
 operations with all predicate elements active."
+    },
+    {
+        "ArchStdEvent": "SVE_PRED_PARTIAL_SPEC",
+        "PublicDescription": "Counts speculatively executed predicated SVE=
 operations with at least one but not all active predicate elements."
+    },
+    {
+        "ArchStdEvent": "SVE_PRED_NOT_FULL_SPEC",
+        "PublicDescription": "Counts speculatively executed predicated SVE=
 operations with at least one non-active predicate element."
+    },
+    {
+        "ArchStdEvent": "SVE_LDFF_SPEC",
+        "PublicDescription": "Counts speculatively executed SVE first faul=
t or non-fault load operations."
+    },
+    {
+        "ArchStdEvent": "SVE_LDFF_FAULT_SPEC",
+        "PublicDescription": "Counts speculatively executed SVE first faul=
t or non-fault load operations that clear at least one bit in the FFR."
+    },
+    {
+        "ArchStdEvent": "ASE_SVE_INT8_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
or SVE integer operations with the largest data type an 8-bit integer."
+    },
+    {
+        "ArchStdEvent": "ASE_SVE_INT16_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
or SVE integer operations with the largest data type a 16-bit integer."
+    },
+    {
+        "ArchStdEvent": "ASE_SVE_INT32_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
or SVE integer operations with the largest data type a 32-bit integer."
+    },
+    {
+        "ArchStdEvent": "ASE_SVE_INT64_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
or SVE integer operations with the largest data type a 64-bit integer."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/tlb.json b/to=
ols/perf/pmu-events/arch/arm64/arm/cortex-a720/tlb.json
new file mode 100644
index 000000000000..c7aa89c2f19f
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/tlb.json
@@ -0,0 +1,74 @@
+[
+    {
+        "ArchStdEvent": "L1I_TLB_REFILL",
+        "PublicDescription": "Counts level 1 instruction TLB refills from =
any Instruction fetch. If there are multiple misses in the TLB that are res=
olved by the refill, then this event only counts once. This event will not =
count if the translation table walk results in a fault (such as a translati=
on or access fault), since there is no new translation created for the TLB."
+    },
+    {
+        "ArchStdEvent": "L1D_TLB_REFILL",
+        "PublicDescription": "Counts level 1 data TLB accesses that result=
ed in TLB refills. If there are multiple misses in the TLB that are resolve=
d by the refill, then this event only counts once. This event counts for re=
fills caused by preload instructions or hardware prefetch accesses. This ev=
ent counts regardless of whether the miss hits in L2 or results in a transl=
ation table walk. This event will not count if the translation table walk r=
esults in a fault (such as a translation or access fault), since there is n=
o new translation created for the TLB. This event will not count on an acce=
ss from an AT (address translation) instruction."
+    },
+    {
+        "ArchStdEvent": "L1D_TLB",
+        "PublicDescription": "Counts level 1 data TLB accesses caused by a=
ny memory load or store operation. Note that load or store instructions can=
 be broken up into multiple memory operations. This event does not count TL=
B maintenance operations."
+    },
+    {
+        "ArchStdEvent": "L1I_TLB",
+        "PublicDescription": "Counts level 1 instruction TLB accesses, whe=
ther the access hits or misses in the TLB. This event counts both demand ac=
cesses and prefetch or preload generated accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_TLB_REFILL",
+        "PublicDescription": "Counts level 2 TLB refills caused by memory =
operations from both data and instruction fetch, except for those caused by=
 TLB maintenance operations and hardware prefetches."
+    },
+    {
+        "ArchStdEvent": "L2D_TLB",
+        "PublicDescription": "Counts level 2 TLB accesses except those cau=
sed by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "DTLB_WALK",
+        "PublicDescription": "Counts number of demand data translation tab=
le walks caused by a miss in the L2 TLB and performing at least one memory =
access. Translation table walks are counted even if the translation ended u=
p taking a translation fault for reasons different than EPD, E0PD and NFD. =
Note that partial translations that cause a translation table walk are also=
 counted. Also note that this event counts walks triggered by software prel=
oads, but not walks triggered by hardware prefetchers, and that this event =
does not count walks triggered by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "ITLB_WALK",
+        "PublicDescription": "Counts number of instruction translation tab=
le walks caused by a miss in the L2 TLB and performing at least one memory =
access. Translation table walks are counted even if the translation ended u=
p taking a translation fault for reasons different than EPD, E0PD and NFD. =
Note that partial translations that cause a translation table walk are also=
 counted. Also note that this event does not count walks triggered by TLB m=
aintenance operations."
+    },
+    {
+        "ArchStdEvent": "DTLB_WALK_PERCYC",
+        "PublicDescription": "Counts the number of data translation table =
walks in progress per cycle."
+    },
+    {
+        "ArchStdEvent": "ITLB_WALK_PERCYC",
+        "PublicDescription": "Counts the number of instruction translation=
 table walks in progress per cycle."
+    },
+    {
+        "ArchStdEvent": "DTLB_HWUPD",
+        "PublicDescription": "Counts number of memory accesses triggered b=
y a data translation table walk and performing an update of a translation t=
able entry. Memory accesses are counted even if the translation ended up ta=
king a translation fault for reasons different than EPD, E0PD and NFD. Note=
 that this event counts accesses triggered by software preloads, but not ac=
cesses triggered by hardware prefetchers."
+    },
+    {
+        "ArchStdEvent": "ITLB_HWUPD",
+        "PublicDescription": "Counts number of memory accesses triggered b=
y an instruction translation table walk and performing an update of a trans=
lation table entry. Memory accesses are counted even if the translation end=
ed up taking a translation fault for reasons different than EPD, E0PD and N=
FD."
+    },
+    {
+        "ArchStdEvent": "DTLB_STEP",
+        "PublicDescription": "Counts number of memory accesses triggered b=
y a demand data translation table walk and performing a read of a translati=
on table entry. Memory accesses are counted even if the translation ended u=
p taking a translation fault for reasons different than EPD, E0PD and NFD. =
Note that this event counts accesses triggered by software preloads, but no=
t accesses triggered by hardware prefetchers."
+    },
+    {
+        "ArchStdEvent": "ITLB_STEP",
+        "PublicDescription": "Counts number of memory accesses triggered b=
y an instruction translation table walk and performing a read of a translat=
ion table entry. Memory accesses are counted even if the translation ended =
up taking a translation fault for reasons different than EPD, E0PD and NFD."
+    },
+    {
+        "ArchStdEvent": "DTLB_WALK_LARGE",
+        "PublicDescription": "Counts number of demand data translation tab=
le walks caused by a miss in the L2 TLB and yielding a large page. The set =
of large pages is defined as all pages with a final size higher than or equ=
al to 2MB. Translation table walks that end up taking a translation fault a=
re not counted, as the page size would be undefined in that case. If DTLB_W=
ALK_BLOCK is implemented, then it is an alias for this event in this family=
. Note that partial translations that cause a translation table walk are al=
so counted. Also note that this event counts walks triggered by software pr=
eloads, but not walks triggered by hardware prefetchers, and that this even=
t does not count walks triggered by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "ITLB_WALK_LARGE",
+        "PublicDescription": "Counts number of instruction translation tab=
le walks caused by a miss in the L2 TLB and yielding a large page. The set =
of large pages is defined as all pages with a final size higher than or equ=
al to 2MB. Translation table walks that end up taking a translation fault a=
re not counted, as the page size would be undefined in that case. In this f=
amily, this is equal to ITLB_WALK_BLOCK event. Note that partial translatio=
ns that cause a translation table walk are also counted. Also note that thi=
s event does not count walks triggered by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "DTLB_WALK_SMALL",
+        "PublicDescription": "Counts number of data translation table walk=
s caused by a miss in the L2 TLB and yielding a small page. The set of smal=
l pages is defined as all pages with a final size lower than 2MB. Translati=
on table walks that end up taking a translation fault are not counted, as t=
he page size would be undefined in that case. If DTLB_WALK_PAGE event is im=
plemented, then it is an alias for this event in this family. Note that par=
tial translations that cause a translation table walk are also counted. Als=
o note that this event counts walks triggered by software preloads, but not=
 walks triggered by hardware prefetchers, and that this event does not coun=
t walks triggered by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "ITLB_WALK_SMALL",
+        "PublicDescription": "Counts number of instruction translation tab=
le walks caused by a miss in the L2 TLB and yielding a small page. The set =
of small pages is defined as all pages with a final size lower than 2MB. Tr=
anslation table walks that end up taking a translation fault are not counte=
d, as the page size would be undefined in that case. In this family, this i=
s equal to ITLB_WALK_PAGE event. Note that partial translations that cause =
a translation table walk are also counted. Also note that this event does n=
ot count walks triggered by TLB maintenance operations."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/trace.json b/=
tools/perf/pmu-events/arch/arm64/arm/cortex-a720/trace.json
new file mode 100644
index 000000000000..33672a8711d4
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a720/trace.json
@@ -0,0 +1,32 @@
+[
+    {
+        "ArchStdEvent": "TRB_WRAP"
+    },
+    {
+        "ArchStdEvent": "TRB_TRIG"
+    },
+    {
+        "ArchStdEvent": "TRCEXTOUT0"
+    },
+    {
+        "ArchStdEvent": "TRCEXTOUT1"
+    },
+    {
+        "ArchStdEvent": "TRCEXTOUT2"
+    },
+    {
+        "ArchStdEvent": "TRCEXTOUT3"
+    },
+    {
+        "ArchStdEvent": "CTI_TRIGOUT4"
+    },
+    {
+        "ArchStdEvent": "CTI_TRIGOUT5"
+    },
+    {
+        "ArchStdEvent": "CTI_TRIGOUT6"
+    },
+    {
+        "ArchStdEvent": "CTI_TRIGOUT7"
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/mapfile.csv b/tools/perf/pmu-=
events/arch/arm64/mapfile.csv
index bb3fa8a33496..ccfcae375750 100644
--- a/tools/perf/pmu-events/arch/arm64/mapfile.csv
+++ b/tools/perf/pmu-events/arch/arm64/mapfile.csv
@@ -33,6 +33,7 @@
 0x00000000410fd4c0,v1,arm/cortex-x1,core
 0x00000000410fd460,v1,arm/cortex-a510,core
 0x00000000410fd470,v1,arm/cortex-a710,core
+0x00000000410fd810,v1,arm/cortex-a720,core
 0x00000000410fd480,v1,arm/cortex-x2,core
 0x00000000410fd490,v1,arm/neoverse-n2-v2,core
 0x00000000410fd4f0,v1,arm/neoverse-n2-v2,core
--=20
2.47.2
From nobody Sun Feb  8 18:15:08 2026
Received: from out162-62-58-216.mail.qq.com (out162-62-58-216.mail.qq.com
 [162.62.58.216])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 74C0770805
	for <linux-kernel@vger.kernel.org>; Thu, 13 Feb 2025 15:31:19 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=162.62.58.216
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1739460687; cv=none;
 b=CroIZedLQ+rBOk6ASlqrNG9bwaxYJ9vrdAzq+OMQA7bcRQT1TZPi4EJQwizubs0SVADqjjtQcWQGIogDpyJHLT/aRUKYznR9BDXN3ZN/Xaz+XOTKE4E8m70Aqj4gSzrJGK+dQXEU+MWEu1y7WEXRvB/ZiKNGlZ3oZpKR9jy5gLo=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1739460687; c=relaxed/simple;
	bh=75WtKV97demkDyvVvsBSVI/sHIJRofn2uIxOHJA6Cfk=;
	h=Message-ID:From:To:Cc:Subject:Date:In-Reply-To:References:
	 MIME-Version;
 b=igpcyienQXskxz7NBjhhn2WLQI0RKN6Xklp4C0u94IX/B58WclO/CylpizB8vYOqKHAuOEWPhJ66/MQ1ReuufHuOdlhaq30/EB+AE6YBWZmdGXuO99nZVNKZ6rqN/QChiCRINaiqDG3ainHaMOfs1ifcDH0EVTWQUrQvaRTQsWI=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=none (p=none dis=none) header.from=cyyself.name;
 spf=pass smtp.mailfrom=cyyself.name;
 dkim=pass (1024-bit key) header.d=qq.com header.i=@qq.com header.b=TZPiOWwp;
 arc=none smtp.client-ip=162.62.58.216
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=none (p=none dis=none) header.from=cyyself.name
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=cyyself.name
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (1024-bit key) header.d=qq.com header.i=@qq.com header.b="TZPiOWwp"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qq.com; s=s201512;
	t=1739460375; bh=Nzt2vQH7DSa3tUtAyFdTa0J3/Ncy+l1J5I0wQF068Hc=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References;
	b=TZPiOWwpGow/Bwpxw3TypvX22B4kZsojpNvHb6PdivTitrJ7luoM/9YVFP7sBApZr
	 0Y6JhkftpZRCqt4X2Bt2DXQHs9TqhjnW5fVFneQ+PhleDnVRcyJqjjMzQ2OJQ+b+0T
	 ZJ9WJShutVkztaRsgFII58N+45HZAw8MABKRso1Y=
Received: from cyy-pc.lan ([240e:379:2251:3600:f57b:26f9:9718:486c])
	by newxmesmtplogicsvrsza36-0.qq.com (NewEsmtp) with SMTP
	id 338A58E8; Thu, 13 Feb 2025 23:12:56 +0800
X-QQ-mid: xmsmtpt1739459576tgw6byg21
Message-ID: <tencent_E5686B3956E04AC548862C35FAF03909E20A@qq.com>
X-QQ-XMAILINFO: OIJV+wUmQOUAhsKvSrCboBKxoZ5TS2wuovy/4FSdvWKxhHFyehT7Mwr6nb1BVM
	 UGsdKQjSd1PANX6UJ+LvUiaI5Ve38sAMRV1nELAOdZKCXB6HAc2aJLpa3qdYAUZ+3/huaJiMNQY8
	 CIFAF/EDJGU+39weutf9O5zhpL9MGQPCzhDuPlBc9OSSicHrtM+0A67bIzNIHgCnJmYQM5Xu6h3O
	 c4lv7B5uUt5uoeT6EWnncKgZt8gSyh5W2Vqqw36qFp3hfGwNKn9PgsptjFsYkzUD7d3ffqwC5bu/
	 YKsYanM1qbZSVtfFgfCSGaSgDqpm5/jImpQk0IWI+ekCSpqyrTUJ+jEF38KzpFsUjwvYlVk5Xih4
	 VQ276N6xp80lUbBuEorLanWDG3nPmtlHPWBimaoUe/x5p5TGPKIVzgwaZQ2MHWXrCu75B7tJmGb3
	 uOJGJaHtgkRSnVcNWIKo/+bbLP2/0B2CrtcbMcWmRc3UfzNo4UXwJ6+FQ/RzwJ9pzrwdc00+G+eZ
	 3UBg/ONATwEN0AHv92nZPp54K0N7bHljJOVVEy3cRng5lUbkSn4VYnncL/OugO72pd7Ki1zUSBOI
	 sU0+4+Y0b7sWTnNScFfBt/iaNXuRJiY52Hw5HY7SzSixEppM2gTsKfbBTDkVzHoTcAmInwxu2bdJ
	 R1EnHIKts4WWRvngj1ljy+YRkL+7bN5mLZSkDvsMRvM+yipn0RTQQvlIYlOJfChPMK83AiksDLXo
	 cwKdPqct8AA6B8iWtmBUBO+s/YukeH8e+3Pu9qvkkJwHj6wCFoYTqql7WdqXZrcimzlc+92aIoEa
	 3QqVBkMQHANBhzN1STF4uSzOslw8eVfnKnhUoq/lMRx9nP1JGm0q5UBfq5OQACzrDprS1lmDkVYv
	 bQMLfTjgdvjoXOMJphY6fhE4fOWRYAYZvIHoe86+niC1JQkE2Odp4mCzftSFwKYotQR7GLp2brry
	 tNr8QZtLugy3M1MoACDtsNOeweUhRzdHQrT5n6nLf4VBJOk8aR9yGjB6II1U/gv09rqjOXlXF2b2
	 1XAwWUFg/TcwzR/mrNJjU6dtlPrzNvjPlkKqhMS8yEjF3sSVWQ
X-QQ-XMRINFO: NS+P29fieYNw95Bth2bWPxk=
From: Yangyu Chen <cyy@cyyself.name>
To: linux-perf-users@vger.kernel.org
Cc: John Garry <john.g.garry@oracle.com>,
	Will Deacon <will@kernel.org>,
	James Clark <james.clark@linaro.org>,
	Mike Leach <mike.leach@linaro.org>,
	Leo Yan <leo.yan@linux.dev>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Namhyung Kim <namhyung@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Liang Kan <kan.liang@linux.intel.com>,
	Yoshihiro Furudera <fj5100bi@fujitsu.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org,
	Yangyu Chen <cyy@cyyself.name>
Subject: [PATCH 2/2] perf vendor events arm64: Add Cortex-A520 events/metrics
Date: Thu, 13 Feb 2025 23:12:52 +0800
X-OQ-MSGID: <20250213151252.187475-1-cyy@cyyself.name>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <tencent_5360DA048EE5B8CF3104213F8D037C698208@qq.com>
References: <tencent_5360DA048EE5B8CF3104213F8D037C698208@qq.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add JSON files for Cortex-A520 events and metrics. Using the existing
Neoverse N3 JSON files as a template, I manually checked for missing and
extra events/metrics using my script [1] and modified the files according
to the Arm Cortex-A520 Core Technical Reference Manual [2].

[1] https://github.com/cyyself/arm-pmu-check/tree/1075bebeb3f1441067448251a=
387df35af15bf16
[2] https://developer.arm.com/documentation/102517/0004/Performance-Monitor=
s-Extension-support-/Performance-monitors-events/Common-event-PMU-events

Signed-off-by: Yangyu Chen <cyy@cyyself.name>
---
 .../arch/arm64/arm/cortex-a520/bus.json       |  26 ++
 .../arch/arm64/arm/cortex-a520/exception.json |  18 +
 .../arm64/arm/cortex-a520/fp_operation.json   |  14 +
 .../arch/arm64/arm/cortex-a520/general.json   |   6 +
 .../arch/arm64/arm/cortex-a520/l1d_cache.json |  50 +++
 .../arch/arm64/arm/cortex-a520/l1i_cache.json |  14 +
 .../arch/arm64/arm/cortex-a520/l2_cache.json  |  46 +++
 .../arch/arm64/arm/cortex-a520/l3_cache.json  |  21 +
 .../arch/arm64/arm/cortex-a520/ll_cache.json  |  10 +
 .../arch/arm64/arm/cortex-a520/memory.json    |  58 +++
 .../arch/arm64/arm/cortex-a520/metrics.json   | 373 ++++++++++++++++++
 .../arch/arm64/arm/cortex-a520/pmu.json       |   8 +
 .../arch/arm64/arm/cortex-a520/retired.json   |  90 +++++
 .../arm64/arm/cortex-a520/spec_operation.json |  70 ++++
 .../arch/arm64/arm/cortex-a520/stall.json     |  82 ++++
 .../arch/arm64/arm/cortex-a520/sve.json       |  22 ++
 .../arch/arm64/arm/cortex-a520/tlb.json       |  78 ++++
 .../arch/arm64/arm/cortex-a520/trace.json     |  32 ++
 .../arch/arm64/common-and-microarch.json      |  15 +
 tools/perf/pmu-events/arch/arm64/mapfile.csv  |   1 +
 20 files changed, 1034 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/bus.js=
on
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/except=
ion.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/fp_ope=
ration.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/genera=
l.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l1d_ca=
che.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l1i_ca=
che.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l2_cac=
he.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l3_cac=
he.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/ll_cac=
he.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/memory=
.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/metric=
s.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/pmu.js=
on
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/retire=
d.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/spec_o=
peration.json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/stall.=
json
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/sve.js=
on
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/tlb.js=
on
 create mode 100644 tools/perf/pmu-events/arch/arm64/arm/cortex-a520/trace.=
json

diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/bus.json b/to=
ols/perf/pmu-events/arch/arm64/arm/cortex-a520/bus.json
new file mode 100644
index 000000000000..884e42ab6a49
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/bus.json
@@ -0,0 +1,26 @@
+[
+    {
+        "ArchStdEvent": "BUS_ACCESS",
+        "PublicDescription": "Counts memory transactions issued by the CPU=
 to the external bus, including snoop requests and snoop responses. Each be=
at of data is counted individually."
+    },
+    {
+        "ArchStdEvent": "BUS_CYCLES",
+        "PublicDescription": "Counts bus cycles in the CPU. Bus cycles rep=
resent a clock cycle in which a transaction could be sent or received on th=
e interface from the CPU to the external bus. Since that interface is drive=
n at the same clock speed as the CPU, this event is a duplicate of CPU_CYCL=
ES."
+    },
+    {
+        "ArchStdEvent": "BUS_ACCESS_RD",
+        "PublicDescription": "Counts memory read transactions seen on the =
external bus. Each beat of data is counted individually."
+    },
+    {
+        "ArchStdEvent": "BUS_ACCESS_WR",
+        "PublicDescription": "Counts memory write transactions seen on the=
 external bus. Each beat of data is counted individually."
+    },
+    {
+        "ArchStdEvent": "BUS_REQ_RD_PERCYC",
+        "PublicDescription": "Bus read transactions in progress."
+    },
+    {
+        "ArchStdEvent": "BUS_REQ_RD",
+        "BriefDescription": "Bus request, read"
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/exception.jso=
n b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/exception.json
new file mode 100644
index 000000000000..fbe580e15c2e
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/exception.json
@@ -0,0 +1,18 @@
+[
+    {
+        "ArchStdEvent": "EXC_TAKEN",
+        "PublicDescription": "Counts any taken architecturally visible exc=
eptions such as IRQ, FIQ, SError, and other synchronous exceptions. Excepti=
ons are counted whether or not they are taken locally."
+    },
+    {
+        "ArchStdEvent": "EXC_RETURN",
+        "PublicDescription": "Counts any architecturally executed exceptio=
n return instructions. For example: AArch64: ERET"
+    },
+    {
+        "ArchStdEvent": "EXC_IRQ",
+        "PublicDescription": "Counts IRQ exceptions including the virtual =
IRQs that are taken locally."
+    },
+    {
+        "ArchStdEvent": "EXC_FIQ",
+        "PublicDescription": "Counts FIQ exceptions including the virtual =
FIQs that are taken locally."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/fp_operation.=
json b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/fp_operation.json
new file mode 100644
index 000000000000..da0c4b05ad5b
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/fp_operation.json
@@ -0,0 +1,14 @@
+[
+    {
+        "ArchStdEvent": "FP_HP_SPEC",
+        "PublicDescription": "Counts speculatively executed half precision=
 floating point operations."
+    },
+    {
+        "ArchStdEvent": "FP_SP_SPEC",
+        "PublicDescription": "Counts speculatively executed single precisi=
on floating point operations."
+    },
+    {
+        "ArchStdEvent": "FP_DP_SPEC",
+        "PublicDescription": "Counts speculatively executed double precisi=
on floating point operations."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/general.json =
b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/general.json
new file mode 100644
index 000000000000..20fada95ef97
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/general.json
@@ -0,0 +1,6 @@
+[
+    {
+        "ArchStdEvent": "CPU_CYCLES",
+        "PublicDescription": "Counts CPU clock cycles (not timer cycles). =
The clock measured by this event is defined as the physical clock driving t=
he CPU logic."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l1d_cache.jso=
n b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l1d_cache.json
new file mode 100644
index 000000000000..90e871c8986a
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l1d_cache.json
@@ -0,0 +1,50 @@
+[
+    {
+        "ArchStdEvent": "L1D_CACHE_REFILL",
+        "PublicDescription": "Counts level 1 data cache refills caused by =
speculatively executed load or store operations that missed in the level 1 =
data cache. This event only counts one event per cache line."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE",
+        "PublicDescription": "Counts level 1 data cache accesses from any =
load/store operations. Atomic operations that resolve in the CPUs caches (n=
ear atomic operations) count as both a write access and read access. Each =
access to a cache line is counted including the multiple accesses caused by=
 single instructions such as LDM or STM. Each access to other level 1 data =
or unified memory structures, for example refill buffers, write buffers, an=
d write-back buffers, are also counted."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_WB",
+        "PublicDescription": "Counts write-backs of dirty data from the L1=
 data cache to the L2 cache. This occurs when either a dirty cache line is =
evicted from L1 data cache and allocated in the L2 cache or dirty data is w=
ritten to the L2 and possibly to the next level of cache. This event counts=
 both victim cache line evictions and cache write-backs from snoops or cach=
e maintenance operations. The following cache operations are not counted:\n=
\n1. Invalidations which do not result in data being transferred out of the=
 L1 (such as evictions of clean data),\n2. Full line writes which write to =
L2 without writing L1, such as write streaming mode."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_LMISS_RD",
+        "PublicDescription": "Counts cache line refills into the level 1 d=
ata cache from any memory read operations, that incurred additional latency=
."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_RD",
+        "PublicDescription": "Counts level 1 data cache accesses from any =
load operation. Atomic load operations that resolve in the CPUs caches coun=
t as both a write access and read access."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_WR",
+        "PublicDescription": "Counts level 1 data cache accesses generated=
 by store operations. This event also counts accesses caused by a DC ZVA (d=
ata cache zero, specified by virtual address) instruction. Near atomic oper=
ations that resolve in the CPUs caches count as a write access and read acc=
ess."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_REFILL_RD",
+        "PublicDescription": "Counts level 1 data cache refills caused by =
speculatively executed load instructions where the memory read operation mi=
sses in the level 1 data cache. This event only counts one event per cache =
line."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_REFILL_WR",
+        "PublicDescription": "Counts level 1 data cache refills caused by =
speculatively executed store instructions where the memory write operation =
misses in the level 1 data cache. This event only counts one event per cach=
e line."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_REFILL_INNER",
+        "PublicDescription": "Counts level 1 data cache refills where the =
cache line data came from caches inside the immediate cluster of the core."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_REFILL_OUTER",
+        "PublicDescription": "Counts level 1 data cache refills for which =
the cache line data came from outside the immediate cluster of the core, li=
ke an SLC in the system interconnect or DRAM."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_HWPRF",
+        "PublicDescription": "Counts level 1 data cache accesses from any =
load/store operations generated by the hardware prefetcher."
+    },
+    {
+        "ArchStdEvent": "L1D_CACHE_REFILL_HWPRF",
+        "PublicDescription": "Counts level 1 data cache refills where the =
cache line is requested by a hardware prefetcher."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l1i_cache.jso=
n b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l1i_cache.json
new file mode 100644
index 000000000000..633f1030359d
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l1i_cache.json
@@ -0,0 +1,14 @@
+[
+    {
+        "ArchStdEvent": "L1I_CACHE_REFILL",
+        "PublicDescription": "Counts cache line refills in the level 1 ins=
truction cache caused by a missed instruction fetch. Instruction fetches ma=
y include accessing multiple instructions, but the single cache line alloca=
tion is counted once."
+    },
+    {
+        "ArchStdEvent": "L1I_CACHE",
+        "PublicDescription": "Counts instruction fetches which access the =
level 1 instruction cache. Instruction cache accesses caused by cache maint=
enance operations are not counted."
+    },
+    {
+        "ArchStdEvent": "L1I_CACHE_LMISS",
+        "PublicDescription": "Counts cache line refills into the level 1 i=
nstruction cache, that incurred additional latency."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l2_cache.json=
 b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l2_cache.json
new file mode 100644
index 000000000000..9874b1a7c94b
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l2_cache.json
@@ -0,0 +1,46 @@
+[
+    {
+        "ArchStdEvent": "L2D_CACHE",
+        "PublicDescription": "Counts accesses to the level 2 cache due to =
data accesses. Level 2 cache is a unified cache for data and instruction ac=
cesses. Accesses are for misses in the first level data cache or translatio=
n resolutions due to accesses. This event also counts write back of dirty d=
ata from level 1 data cache to the L2 cache."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_REFILL",
+        "PublicDescription": "Counts cache line refills into the level 2 c=
ache. Level 2 cache is a unified cache for data and instruction accesses. A=
ccesses are for misses in the level 1 data cache or translation resolutions=
 due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_WB",
+        "PublicDescription": "Counts write-backs of data from the L2 cache=
 to outside the CPU. This includes snoops to the L2 (from other CPUs) which=
 return data even if the snoops cause an invalidation. L2 cache line invali=
dations which do not write data outside the CPU and snoops which return dat=
a from an L1 cache are not counted. Data would not be written outside the c=
ache when invalidating a clean cache line."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_ALLOCATE",
+        "PublicDescription": "Counts level 2 cache line allocates that do =
not fetch data from outside the level 2 data or unified cache."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_RD",
+        "PublicDescription": "Counts level 2 data cache accesses due to me=
mory read operations. Level 2 cache is a unified cache for data and instruc=
tion accesses, accesses are for misses in the level 1 data cache or transla=
tion resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_WR",
+        "PublicDescription": "Counts level 2 cache accesses due to memory =
write operations. Level 2 cache is a unified cache for data and instruction=
 accesses, accesses are for misses in the level 1 data cache or translation=
 resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_REFILL_RD",
+        "PublicDescription": "Counts refills for memory accesses due to me=
mory read operation counted by L2D_CACHE_RD. Level 2 cache is a unified cac=
he for data and instruction accesses, accesses are for misses in the level =
1 data cache or translation resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_REFILL_WR",
+        "PublicDescription": "Counts refills for memory accesses due to me=
mory write operation counted by L2D_CACHE_WR. Level 2 cache is a unified ca=
che for data and instruction accesses, accesses are for misses in the level=
 1 data cache or translation resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_LMISS_RD",
+        "PublicDescription": "Counts cache line refills into the level 2 u=
nified cache from any memory read operations that incurred additional laten=
cy."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_HWPRF",
+        "PublicDescription": "Counts level 2 data cache accesses generated=
 by L2D hardware prefetchers."
+    },
+    {
+        "ArchStdEvent": "L2D_CACHE_REFILL_HWPRF",
+        "BriefDescription": "This event counts hardware prefetch counted b=
y L2D_CACHE_HWPRF that causes a refill of the Level 2 cache, or any Level 1=
 data and instruction cache of this PE, from outside of those caches."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l3_cache.json=
 b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l3_cache.json
new file mode 100644
index 000000000000..d5485d71babb
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/l3_cache.json
@@ -0,0 +1,21 @@
+[
+    {
+        "ArchStdEvent": "L3D_CACHE",
+        "PublicDescription": "Counts level 3 cache accesses. Level 3 cache=
 is a unified cache for data and instruction accesses. Accesses are for mis=
ses in the lower level caches or translation resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L3D_CACHE_RD",
+        "PublicDescription": "Counts level 3 cache accesses caused by any =
memory read operation. Level 3 cache is a unified cache for data and instru=
ction accesses. Accesses are for misses in the lower level caches or transl=
ation resolutions due to accesses."
+    },
+    {
+        "ArchStdEvent": "L3D_CACHE_REFILL_RD"
+    },
+    {
+        "ArchStdEvent": "L3D_CACHE_LMISS_RD",
+        "PublicDescription": "Counts any cache line refill into the level =
3 cache from memory read operations that incurred additional latency."
+    },
+    {
+        "ArchStdEvent": "L3D_CACHE_HWPRF",
+        "PublicDescription": "Level 3 data cache hardware prefetch."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/ll_cache.json=
 b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/ll_cache.json
new file mode 100644
index 000000000000..fd5a2e0099b8
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/ll_cache.json
@@ -0,0 +1,10 @@
+[
+    {
+        "ArchStdEvent": "LL_CACHE_RD",
+        "PublicDescription": "Counts read transactions that were returned =
from outside the core cluster. This event counts for external last level ca=
che when the system register CPUECTLR.EXTLLC bit is set, otherwise it coun=
ts for the L3 cache. This event counts read transactions returned from outs=
ide the core if those transactions are either hit in the system level cache=
 or missed in the SLC and are returned from any other external sources."
+    },
+    {
+        "ArchStdEvent": "LL_CACHE_MISS_RD",
+        "PublicDescription": "Counts read transactions that were returned =
from outside the core cluster but missed in the system level cache. This ev=
ent counts for external last level cache when the system register CPUECTLR.=
EXTLLC bit is set, otherwise it counts for L3 cache. This event counts read=
 transactions returned from outside the core if those transactions are miss=
ed in the System level Cache. The data source of the transaction is indicat=
ed by a field in the CHI transaction returning to the CPU. This event does =
not count reads caused by cache maintenance operations."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/memory.json b=
/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/memory.json
new file mode 100644
index 000000000000..e7f7914ecd2b
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/memory.json
@@ -0,0 +1,58 @@
+[
+    {
+        "ArchStdEvent": "MEM_ACCESS",
+        "PublicDescription": "Counts memory accesses issued by the CPU loa=
d store unit, where those accesses are issued due to load or store operatio=
ns. This event counts memory accesses no matter whether the data is receive=
d from any level of cache hierarchy or external memory. If memory accesses =
are broken up into smaller transactions than what were specified in the loa=
d or store instructions, then the event counts those smaller memory transac=
tions."
+    },
+    {
+        "ArchStdEvent": "MEMORY_ERROR",
+        "PublicDescription": "Counts any detected correctable or uncorrect=
able physical memory errors (ECC or parity) in protected CPUs RAMs. On the =
core, this event counts errors in the caches (including data and tag rams).=
 Any detected memory error (from either a speculative and abandoned access,=
 or an architecturally executed access) is counted. Note that errors are on=
ly detected when the actual protected memory is accessed by an operation."
+    },
+    {
+        "ArchStdEvent": "REMOTE_ACCESS_RD",
+        "PublicDescription": "Counts memory access to another socket in a =
multi-socket system, read."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_RD",
+        "PublicDescription": "Counts memory accesses issued by the CPU due=
 to load operations. The event counts any memory load access, no matter whe=
ther the data is received from any level of cache hierarchy or external mem=
ory. The event also counts atomic load operations. If memory accesses are b=
roken up by the load/store unit into smaller transactions that are issued b=
y the bus interface, then the event counts those smaller transactions."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_WR",
+        "PublicDescription": "Counts memory accesses issued by the CPU due=
 to store operations. The event counts any memory store access, no matter w=
hether the data is located in any level of cache or external memory. The ev=
ent also counts atomic load and store operations. If memory accesses are br=
oken up by the load/store unit into smaller transactions that are issued by=
 the bus interface, then the event counts those smaller transactions."
+    },
+    {
+        "ArchStdEvent": "LDST_ALIGN_LAT",
+        "PublicDescription": "Counts the number of memory read and write a=
ccesses in a cycle that incurred additional latency, due to the alignment o=
f the address and the size of data being accessed, which results in store c=
rossing a single cache line."
+    },
+    {
+        "ArchStdEvent": "LD_ALIGN_LAT",
+        "PublicDescription": "Counts the number of memory read accesses in=
 a cycle that incurred additional latency, due to the alignment of the addr=
ess and size of data being accessed, which results in load crossing a singl=
e cache line."
+    },
+    {
+        "ArchStdEvent": "ST_ALIGN_LAT",
+        "PublicDescription": "Counts the number of memory write accesses i=
n a cycle that incurred additional latency, due to the alignment of the add=
ress and size of data being accessed."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_CHECKED",
+        "PublicDescription": "Counts the number of memory read and write a=
ccesses counted by MEM_ACCESS that are tag checked by the Memory Tagging Ex=
tension (MTE). This event is implemented as the sum of MEM_ACCESS_CHECKED_R=
D and MEM_ACCESS_CHECKED_WR"
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_CHECKED_RD",
+        "PublicDescription": "Counts the number of memory read accesses in=
 a cycle that are tag checked by the Memory Tagging Extension (MTE)."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_CHECKED_WR",
+        "PublicDescription": "Counts the number of memory write accesses i=
n a cycle that are tag checked by the Memory Tagging Extension (MTE)."
+    },
+    {
+        "ArchStdEvent": "INST_FETCH_PERCYC",
+        "PublicDescription": "Counts number of instruction fetches outstan=
ding per cycle, which will provide an average latency of instruction fetch."
+    },
+    {
+        "ArchStdEvent": "MEM_ACCESS_RD_PERCYC",
+        "PublicDescription": "Counts the number of outstanding loads or me=
mory read accesses per cycle."
+    },
+    {
+        "ArchStdEvent": "INST_FETCH",
+        "PublicDescription": "Counts Instruction memory accesses that the =
PE makes."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/metrics.json =
b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/metrics.json
new file mode 100644
index 000000000000..62cb910c8945
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/metrics.json
@@ -0,0 +1,373 @@
+[
+    {
+        "ArchStdEvent": "backend_bound"
+    },
+    {
+        "MetricName": "backend_busy_bound",
+        "MetricExpr": "STALL_BACKEND_BUSY / STALL_BACKEND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to issue queues being full to accept operations=
 for execution.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_cache_l1d_bound",
+        "MetricExpr": "STALL_BACKEND_L1D / (STALL_BACKEND_L1D + STALL_BACK=
END_MEM) * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to memory access latency issues caused by level=
 1 data cache misses.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_cache_l2d_bound",
+        "MetricExpr": "STALL_BACKEND_MEM / (STALL_BACKEND_L1D + STALL_BACK=
END_MEM) * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to memory access latency issues caused by level=
 2 data cache misses.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_mem_bound",
+        "MetricExpr": "STALL_BACKEND_MEMBOUND / STALL_BACKEND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to backend core resource constraints related to=
 memory access latency issues caused by memory access components.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_mem_cache_bound",
+        "MetricExpr": "(STALL_BACKEND_L1D + STALL_BACKEND_MEM) / STALL_BAC=
KEND_MEMBOUND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to memory latency issues caused by data cache m=
isses.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_mem_store_bound",
+        "MetricExpr": "STALL_BACKEND_ST / STALL_BACKEND_MEMBOUND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to memory write pending caused by stores stall=
ed in the pre-commit stage.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_mem_tlb_bound",
+        "MetricExpr": "STALL_BACKEND_TLB / STALL_BACKEND_MEMBOUND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the backend due to memory access latency issues caused by data =
TLB misses.",
+        "MetricGroup": "Topdown_Backend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "backend_stalled_cycles",
+        "MetricExpr": "STALL_BACKEND / CPU_CYCLES * 100",
+        "BriefDescription": "This metric is the percentage of cycles that =
were stalled due to resource constraints in the backend unit of the process=
or.",
+        "MetricGroup": "Cycle_Accounting",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "ArchStdEvent": "bad_speculation",
+        "MetricExpr": "(1 - STALL_SLOT / (10 * CPU_CYCLES)) * (1 - OP_RETI=
RED / OP_SPEC) * 100 + STALL_FRONTEND_FLUSH / CPU_CYCLES * 100"
+    },
+    {
+        "MetricName": "branch_direct_ratio",
+        "MetricExpr": "BR_IMMED_RETIRED / BR_RETIRED",
+        "BriefDescription": "This metric measures the ratio of direct bran=
ches retired to the total number of branches architecturally executed.",
+        "MetricGroup": "Branch_Effectiveness",
+        "ScaleUnit": "1per branch"
+    },
+    {
+        "MetricName": "branch_indirect_ratio",
+        "MetricExpr": "BR_IND_RETIRED / BR_RETIRED",
+        "BriefDescription": "This metric measures the ratio of indirect br=
anches retired, including function returns, to the total number of branches=
 architecturally executed.",
+        "MetricGroup": "Branch_Effectiveness",
+        "ScaleUnit": "1per branch"
+    },
+    {
+        "MetricName": "branch_misprediction_ratio",
+        "MetricExpr": "BR_MIS_PRED_RETIRED / BR_RETIRED",
+        "BriefDescription": "This metric measures the ratio of branches mi=
spredicted to the total number of branches architecturally executed. This g=
ives an indication of the effectiveness of the branch prediction unit.",
+        "MetricGroup": "Miss_Ratio;Branch_Effectiveness",
+        "ScaleUnit": "100percent of branches"
+    },
+    {
+        "MetricName": "branch_mpki",
+        "MetricExpr": "BR_MIS_PRED_RETIRED / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of branch mis=
predictions per thousand instructions executed.",
+        "MetricGroup": "MPKI;Branch_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "branch_percentage",
+        "MetricExpr": "PC_WRITE_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures branch operations as a p=
ercentage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "branch_return_ratio",
+        "MetricExpr": "BR_RETURN_RETIRED / BR_RETIRED",
+        "BriefDescription": "This metric measures the ratio of branches re=
tired that are function returns to the total number of branches architectur=
ally executed.",
+        "MetricGroup": "Branch_Effectiveness",
+        "ScaleUnit": "1per branch"
+    },
+    {
+        "MetricName": "crypto_percentage",
+        "MetricExpr": "CRYPTO_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures crypto operations as a p=
ercentage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "dtlb_mpki",
+        "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of data TLB W=
alks per thousand instructions executed.",
+        "MetricGroup": "MPKI;DTLB_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "dtlb_walk_ratio",
+        "MetricExpr": "DTLB_WALK / L1D_TLB",
+        "BriefDescription": "This metric measures the ratio of data TLB Wa=
lks to the total number of data TLB accesses. This gives an indication of t=
he effectiveness of the data TLB accesses.",
+        "MetricGroup": "Miss_Ratio;DTLB_Effectiveness",
+        "ScaleUnit": "100percent of TLB accesses"
+    },
+    {
+        "MetricName": "fp16_percentage",
+        "MetricExpr": "FP_HP_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures half-precision floating =
point operations as a percentage of operations speculatively executed.",
+        "MetricGroup": "FP_Precision_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "fp32_percentage",
+        "MetricExpr": "FP_SP_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures single-precision floatin=
g point operations as a percentage of operations speculatively executed.",
+        "MetricGroup": "FP_Precision_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "fp64_percentage",
+        "MetricExpr": "FP_DP_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures double-precision floatin=
g point operations as a percentage of operations speculatively executed.",
+        "MetricGroup": "FP_Precision_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "frontend_cache_l1i_bound",
+        "MetricExpr": "STALL_FRONTEND_L1I / (STALL_FRONTEND_L1I + STALL_FR=
ONTEND_MEM) * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to memory access latency issues caused by leve=
l 1 instruction cache misses.",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_cache_l2i_bound",
+        "MetricExpr": "STALL_FRONTEND_MEM / (STALL_FRONTEND_L1I + STALL_FR=
ONTEND_MEM) * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to memory access latency issues caused by leve=
l 2 instruction cache misses.",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_core_bound",
+        "MetricExpr": "STALL_FRONTEND_CPUBOUND / STALL_FRONTEND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to frontend core resource constraints not rela=
ted to instruction fetch latency issues caused by memory access components.=
",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_core_flush_bound",
+        "MetricExpr": "STALL_FRONTEND_FLUSH / STALL_FRONTEND_CPUBOUND * 10=
0",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend as the processor is recovering from a pipeline flu=
sh caused by bad speculation or other machine resteers.",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_mem_bound",
+        "MetricExpr": "STALL_FRONTEND_MEMBOUND / STALL_FRONTEND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to frontend core resource constraints related =
to the instruction fetch latency issues caused by memory access components.=
",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_mem_cache_bound",
+        "MetricExpr": "(STALL_FRONTEND_L1I + STALL_FRONTEND_MEM) / STALL_F=
RONTEND_MEMBOUND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to instruction fetch latency issues caused by =
instruction cache misses.",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_mem_tlb_bound",
+        "MetricExpr": "STALL_FRONTEND_TLB / STALL_FRONTEND_MEMBOUND * 100",
+        "BriefDescription": "This metric is the percentage of total cycles=
 stalled in the frontend due to instruction fetch latency issues caused by =
instruction TLB misses.",
+        "MetricGroup": "Topdown_Frontend",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "frontend_stalled_cycles",
+        "MetricExpr": "STALL_FRONTEND / CPU_CYCLES * 100",
+        "BriefDescription": "This metric is the percentage of cycles that =
were stalled due to resource constraints in the frontend unit of the proces=
sor.",
+        "MetricGroup": "Cycle_Accounting",
+        "ScaleUnit": "1percent of cycles"
+    },
+    {
+        "MetricName": "integer_dp_percentage",
+        "MetricExpr": "DP_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures scalar integer operation=
s as a percentage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "ipc",
+        "MetricExpr": "INST_RETIRED / CPU_CYCLES",
+        "BriefDescription": "This metric measures the number of instructio=
ns retired per cycle.",
+        "MetricGroup": "General",
+        "ScaleUnit": "1per cycle"
+    },
+    {
+        "MetricName": "itlb_mpki",
+        "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of instructio=
n TLB Walks per thousand instructions executed.",
+        "MetricGroup": "MPKI;ITLB_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "itlb_walk_ratio",
+        "MetricExpr": "ITLB_WALK / L1I_TLB",
+        "BriefDescription": "This metric measures the ratio of instruction=
 TLB Walks to the total number of instruction TLB accesses. This gives an i=
ndication of the effectiveness of the instruction TLB accesses.",
+        "MetricGroup": "Miss_Ratio;ITLB_Effectiveness",
+        "ScaleUnit": "100percent of TLB accesses"
+    },
+    {
+        "MetricName": "l1d_cache_miss_ratio",
+        "MetricExpr": "L1D_CACHE_REFILL / L1D_CACHE",
+        "BriefDescription": "This metric measures the ratio of level 1 dat=
a cache accesses missed to the total number of level 1 data cache accesses.=
 This gives an indication of the effectiveness of the level 1 data cache.",
+        "MetricGroup": "Miss_Ratio;L1D_Cache_Effectiveness",
+        "ScaleUnit": "100percent of cache accesses"
+    },
+    {
+        "MetricName": "l1d_cache_mpki",
+        "MetricExpr": "L1D_CACHE_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 1 da=
ta cache accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;L1D_Cache_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "l1d_tlb_miss_ratio",
+        "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
+        "BriefDescription": "This metric measures the ratio of level 1 dat=
a TLB accesses missed to the total number of level 1 data TLB accesses. Thi=
s gives an indication of the effectiveness of the level 1 data TLB.",
+        "MetricGroup": "Miss_Ratio;DTLB_Effectiveness",
+        "ScaleUnit": "100percent of TLB accesses"
+    },
+    {
+        "MetricName": "l1d_tlb_mpki",
+        "MetricExpr": "L1D_TLB_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 1 da=
ta TLB accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;DTLB_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "l1i_cache_miss_ratio",
+        "MetricExpr": "L1I_CACHE_REFILL / L1I_CACHE",
+        "BriefDescription": "This metric measures the ratio of level 1 ins=
truction cache accesses missed to the total number of level 1 instruction c=
ache accesses. This gives an indication of the effectiveness of the level 1=
 instruction cache.",
+        "MetricGroup": "Miss_Ratio;L1I_Cache_Effectiveness",
+        "ScaleUnit": "100percent of cache accesses"
+    },
+    {
+        "MetricName": "l1i_cache_mpki",
+        "MetricExpr": "L1I_CACHE_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 1 in=
struction cache accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;L1I_Cache_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "l1i_tlb_miss_ratio",
+        "MetricExpr": "L1I_TLB_REFILL / L1I_TLB",
+        "BriefDescription": "This metric measures the ratio of level 1 ins=
truction TLB accesses missed to the total number of level 1 instruction TLB=
 accesses. This gives an indication of the effectiveness of the level 1 ins=
truction TLB.",
+        "MetricGroup": "Miss_Ratio;ITLB_Effectiveness",
+        "ScaleUnit": "100percent of TLB accesses"
+    },
+    {
+        "MetricName": "l1i_tlb_mpki",
+        "MetricExpr": "L1I_TLB_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 1 in=
struction TLB accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;ITLB_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "l2_cache_miss_ratio",
+        "MetricExpr": "L2D_CACHE_REFILL / L2D_CACHE",
+        "BriefDescription": "This metric measures the ratio of level 2 cac=
he accesses missed to the total number of level 2 cache accesses. This give=
s an indication of the effectiveness of the level 2 cache, which is a unifi=
ed cache that stores both data and instruction. Note that cache accesses in=
 this cache are either data memory access or instruction fetch as this is a=
 unified cache.",
+        "MetricGroup": "Miss_Ratio;L2_Cache_Effectiveness",
+        "ScaleUnit": "100percent of cache accesses"
+    },
+    {
+        "MetricName": "l2_cache_mpki",
+        "MetricExpr": "L2D_CACHE_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 2 un=
ified cache accesses missed per thousand instructions executed. Note that c=
ache accesses in this cache are either data memory access or instruction fe=
tch as this is a unified cache.",
+        "MetricGroup": "MPKI;L2_Cache_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "l2_tlb_miss_ratio",
+        "MetricExpr": "L2D_TLB_REFILL / L2D_TLB",
+        "BriefDescription": "This metric measures the ratio of level 2 uni=
fied TLB accesses missed to the total number of level 2 unified TLB accesse=
s. This gives an indication of the effectiveness of the level 2 TLB.",
+        "MetricGroup": "Miss_Ratio;ITLB_Effectiveness;DTLB_Effectiveness",
+        "ScaleUnit": "100percent of TLB accesses"
+    },
+    {
+        "MetricName": "l2_tlb_mpki",
+        "MetricExpr": "L2D_TLB_REFILL / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of level 2 un=
ified TLB accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;ITLB_Effectiveness;DTLB_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "ll_cache_read_hit_ratio",
+        "MetricExpr": "(LL_CACHE_RD - LL_CACHE_MISS_RD) / LL_CACHE_RD",
+        "BriefDescription": "This metric measures the ratio of last level =
cache read accesses hit in the cache to the total number of last level cach=
e accesses. This gives an indication of the effectiveness of the last level=
 cache for read traffic. Note that cache accesses in this cache are either =
data memory access or instruction fetch as this is a system level cache.",
+        "MetricGroup": "LL_Cache_Effectiveness",
+        "ScaleUnit": "100percent of cache accesses"
+    },
+    {
+        "MetricName": "ll_cache_read_miss_ratio",
+        "MetricExpr": "LL_CACHE_MISS_RD / LL_CACHE_RD",
+        "BriefDescription": "This metric measures the ratio of last level =
cache read accesses missed to the total number of last level cache accesses=
. This gives an indication of the effectiveness of the last level cache for=
 read traffic. Note that cache accesses in this cache are either data memor=
y access or instruction fetch as this is a system level cache.",
+        "MetricGroup": "Miss_Ratio;LL_Cache_Effectiveness",
+        "ScaleUnit": "100percent of cache accesses"
+    },
+    {
+        "MetricName": "ll_cache_read_mpki",
+        "MetricExpr": "LL_CACHE_MISS_RD / INST_RETIRED * 1000",
+        "BriefDescription": "This metric measures the number of last level=
 cache read accesses missed per thousand instructions executed.",
+        "MetricGroup": "MPKI;LL_Cache_Effectiveness",
+        "ScaleUnit": "1MPKI"
+    },
+    {
+        "MetricName": "load_percentage",
+        "MetricExpr": "LD_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures load operations as a per=
centage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "scalar_fp_percentage",
+        "MetricExpr": "VFP_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures scalar floating point op=
erations as a percentage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "simd_percentage",
+        "MetricExpr": "ASE_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures advanced SIMD operations=
 as a percentage of total operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "store_percentage",
+        "MetricExpr": "ST_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures store operations as a pe=
rcentage of operations speculatively executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    },
+    {
+        "MetricName": "sve_all_percentage",
+        "MetricExpr": "SVE_INST_SPEC / INST_SPEC * 100",
+        "BriefDescription": "This metric measures scalable vector operatio=
ns, including loads and stores, as a percentage of operations speculatively=
 executed.",
+        "MetricGroup": "Operation_Mix",
+        "ScaleUnit": "1percent of operations"
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/pmu.json b/to=
ols/perf/pmu-events/arch/arm64/arm/cortex-a520/pmu.json
new file mode 100644
index 000000000000..d8b7b9f9e5fa
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/pmu.json
@@ -0,0 +1,8 @@
+[
+    {
+        "ArchStdEvent": "PMU_OVFS"
+    },
+    {
+        "ArchStdEvent": "PMU_HOVFS"
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/retired.json =
b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/retired.json
new file mode 100644
index 000000000000..152f15c1253c
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/retired.json
@@ -0,0 +1,90 @@
+[
+    {
+        "ArchStdEvent": "SW_INCR",
+        "PublicDescription": "Counts software writes to the PMSWINC_EL0 (s=
oftware PMU increment) register. The PMSWINC_EL0 register is a manually upd=
ated counter for use by application software.\n\nThis event could be used t=
o measure any user program event, such as accesses to a particular data str=
ucture (by writing to the PMSWINC_EL0 register each time the data structure=
 is accessed).\n\nTo use the PMSWINC_EL0 register and event, developers mus=
t insert instructions that write to the PMSWINC_EL0 register into the sourc=
e code.\n\nSince the SW_INCR event records writes to the PMSWINC_EL0 regist=
er, there is no need to do a read/increment/write sequence to the PMSWINC_E=
L0 register."
+    },
+    {
+        "ArchStdEvent": "LD_RETIRED",
+        "PublicDescription": "Counts instruction architecturally executed,=
 Condition code check pass, load."
+    },
+    {
+        "ArchStdEvent": "ST_RETIRED",
+        "PublicDescription": "Counts instruction architecturally executed,=
 Condition code check pass, store."
+    },
+    {
+        "ArchStdEvent": "INST_RETIRED",
+        "PublicDescription": "Counts instructions that have been architect=
urally executed."
+    },
+    {
+        "ArchStdEvent": "CID_WRITE_RETIRED",
+        "PublicDescription": "Counts architecturally executed writes to th=
e CONTEXTIDR_EL1 register, which usually contains the kernel PID and can =
be output with hardware trace."
+    },
+    {
+        "ArchStdEvent": "PC_WRITE_RETIRED",
+        "PublicDescription": "Counts branch instructions that caused a cha=
nge of Program Counter, which effectively causes a change in the control fl=
ow of the program."
+    },
+    {
+        "ArchStdEvent": "BR_IMMED_RETIRED",
+        "PublicDescription": "Counts architecturally executed direct branc=
hes."
+    },
+    {
+        "ArchStdEvent": "BR_RETURN_RETIRED",
+        "PublicDescription": "Counts architecturally executed procedure re=
turns."
+    },
+    {
+        "ArchStdEvent": "TTBR_WRITE_RETIRED",
+        "PublicDescription": "Counts architectural writes to TTBR0/1_EL1. =
If virtualization host extensions are enabled (by setting the HCR_EL2.E2H b=
it to 1), then accesses to TTBR0/1_EL1 that are redirected to TTBR0/1_EL2, =
or accesses to TTBR0/1_EL12, are counted. TTBRn registers are typically upd=
ated when the kernel is swapping user-space threads or applications."
+    },
+    {
+        "ArchStdEvent": "BR_RETIRED",
+        "PublicDescription": "Counts architecturally executed branches, wh=
ether the branch is taken or not. Instructions that explicitly write to the=
 PC are also counted. Note that exception generating instructions, exceptio=
n return instructions and context synchronization instructions are not coun=
ted."
+    },
+    {
+        "ArchStdEvent": "BR_MIS_PRED_RETIRED",
+        "PublicDescription": "Counts branches counted by BR_RETIRED which =
were mispredicted and caused a pipeline flush."
+    },
+    {
+        "ArchStdEvent": "OP_RETIRED",
+        "PublicDescription": "Counts micro-operations that are architectur=
ally executed. This is a count of the number of micro-operations retired =
from the commit queue in a single cycle."
+    },
+    {
+        "ArchStdEvent": "SVE_INST_RETIRED",
+        "PublicDescription": "Counts architecturally executed SVE instruct=
ions."
+    },
+    {
+        "ArchStdEvent": "BR_INDNR_TAKEN_RETIRED",
+        "PublicDescription": "Counts architecturally executed indirect bra=
nches excluding procedure returns that were taken."
+    },
+    {
+        "ArchStdEvent": "BR_IMMED_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed direct branc=
hes that were correctly predicted."
+    },
+    {
+        "ArchStdEvent": "BR_IMMED_MIS_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed direct branc=
hes that were mispredicted and caused a pipeline flush."
+    },
+    {
+        "ArchStdEvent": "BR_RETURN_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed procedure re=
turns that were correctly predicted."
+    },
+    {
+        "ArchStdEvent": "BR_RETURN_MIS_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed procedure re=
turns that were mispredicted and caused a pipeline flush."
+    },
+    {
+        "ArchStdEvent": "BR_INDNR_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed indirect bra=
nches excluding procedure returns that were correctly predicted."
+    },
+    {
+        "ArchStdEvent": "BR_INDNR_MIS_PRED_RETIRED",
+        "PublicDescription": "Counts architecturally executed indirect bra=
nches excluding procedure returns that were mispredicted and caused a pipel=
ine flush."
+    },
+    {
+        "ArchStdEvent": "BR_PRED_RETIRED",
+        "PublicDescription": "Counts branch instructions counted by BR_RET=
IRED which were correctly predicted."
+    },
+    {
+        "ArchStdEvent": "BR_IND_RETIRED",
+        "PublicDescription": "Counts architecturally executed indirect bra=
nches including procedure returns."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/spec_operatio=
n.json b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/spec_operation.js=
on
new file mode 100644
index 000000000000..40c29be53cc0
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/spec_operation.json
@@ -0,0 +1,70 @@
+[
+    {
+        "ArchStdEvent": "BR_MIS_PRED",
+        "PublicDescription": "Counts branches which are speculatively exec=
uted and mispredicted."
+    },
+    {
+        "ArchStdEvent": "BR_PRED",
+        "PublicDescription": "Counts all speculatively executed branches."
+    },
+    {
+        "ArchStdEvent": "INST_SPEC",
+        "PublicDescription": "Counts operations that have been speculative=
ly executed."
+    },
+    {
+        "ArchStdEvent": "OP_SPEC",
+        "PublicDescription": "Counts micro-operations speculatively execut=
ed. This is the count of the number of micro-operations dispatched in a cyc=
le."
+    },
+    {
+        "ArchStdEvent": "STREX_FAIL_SPEC",
+        "PublicDescription": "Counts store-exclusive operations that have =
been speculatively executed and have not successfully completed the store o=
peration."
+    },
+    {
+        "ArchStdEvent": "STREX_SPEC",
+        "PublicDescription": "Counts store-exclusive operations that have =
been speculatively executed."
+    },
+    {
+        "ArchStdEvent": "LD_SPEC",
+        "PublicDescription": "Counts speculatively executed load operation=
s including Single Instruction Multiple Data (SIMD) load operations."
+    },
+    {
+        "ArchStdEvent": "ST_SPEC",
+        "PublicDescription": "Counts speculatively executed store operatio=
ns including Single Instruction Multiple Data (SIMD) store operations."
+    },
+    {
+        "ArchStdEvent": "LDST_SPEC",
+        "PublicDescription": "Counts speculatively executed load and store=
 operations."
+    },
+    {
+        "ArchStdEvent": "DP_SPEC",
+        "PublicDescription": "Counts speculatively executed logical or ari=
thmetic instructions such as MOV/MVN operations."
+    },
+    {
+        "ArchStdEvent": "ASE_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
operations excluding load, store and move micro-operations that move data t=
o or from SIMD (vector) registers."
+    },
+    {
+        "ArchStdEvent": "VFP_SPEC",
+        "PublicDescription": "Counts speculatively executed floating point=
 operations. This event does not count operations that move data to or from=
 floating point (vector) registers."
+    },
+    {
+        "ArchStdEvent": "PC_WRITE_SPEC",
+        "PublicDescription": "Counts speculatively executed operations whi=
ch cause software changes of the PC. Those operations include all taken bra=
nch operations."
+    },
+    {
+        "ArchStdEvent": "CRYPTO_SPEC",
+        "PublicDescription": "Counts speculatively executed cryptographic =
operations except for PMULL and VMULL operations."
+    },
+    {
+        "ArchStdEvent": "BR_IMMED_SPEC",
+        "PublicDescription": "Counts direct branch operations which are sp=
eculatively executed."
+    },
+    {
+        "ArchStdEvent": "BR_RETURN_SPEC",
+        "PublicDescription": "Counts procedure return operations (RET, RET=
AA and RETAB) which are speculatively executed."
+    },
+    {
+        "ArchStdEvent": "BR_INDIRECT_SPEC",
+        "PublicDescription": "Counts indirect branch operations including =
procedure returns, which are speculatively executed. This includes operatio=
ns that force a software change of the PC, other than exception-generating =
operations and direct branch instructions. Some examples of the instruction=
s counted by this event include BR Xn and RET."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/stall.json b/=
tools/perf/pmu-events/arch/arm64/arm/cortex-a520/stall.json
new file mode 100644
index 000000000000..d65aeb4b8808
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/stall.json
@@ -0,0 +1,82 @@
+[
+    {
+        "ArchStdEvent": "STALL_FRONTEND",
+        "PublicDescription": "Counts cycles when frontend could not send a=
ny micro-operations to the rename stage because of frontend resource stalls=
 caused by fetch memory latency or branch prediction flow stalls. STALL_SLO=
T_FRONTEND counts slots during the cycle when this event counts."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND",
+        "PublicDescription": "Counts cycles whenever the rename unit is un=
able to send any micro-operations to the backend of the pipeline because of=
 backend resource constraints. Backend resource constraints can include iss=
ue stage fullness, execution stage fullness, or other internal pipeline res=
ource fullness. All the backend slots were empty during the cycle when this=
 event counts."
+    },
+    {
+        "ArchStdEvent": "STALL",
+        "PublicDescription": "Counts cycles when no operations are sent to=
 the rename unit from the frontend or from the rename unit to the backend f=
or any reason (either frontend or backend stall). This event is the sum of =
STALL_FRONTEND and STALL_BACKEND."
+    },
+    {
+        "ArchStdEvent": "STALL_SLOT_BACKEND",
+        "PublicDescription": "Counts slots per cycle in which no operation=
s are sent from the rename unit to the backend due to backend resource cons=
traints. STALL_BACKEND counts during the cycle when STALL_SLOT_BACKEND coun=
ts at least 1."
+    },
+    {
+        "ArchStdEvent": "STALL_SLOT_FRONTEND",
+        "PublicDescription": "Counts slots per cycle in which no operation=
s are sent to the rename unit from the frontend due to frontend resource co=
nstraints."
+    },
+    {
+        "ArchStdEvent": "STALL_SLOT",
+        "PublicDescription": "Counts slots per cycle in which no operation=
s are sent to the rename unit from the frontend or from the rename unit to =
the backend for any reason (either frontend or backend stall). STALL_SLOT i=
s the sum of STALL_SLOT_FRONTEND and STALL_SLOT_BACKEND."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_MEM",
+        "PublicDescription": "Counts cycles when the backend is stalled be=
cause there is a pending demand load request in progress in the last level =
core cache."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_MEMBOUND",
+        "PublicDescription": "Counts cycles when the frontend could not se=
nd any micro-operations to the rename stage due to resource constraints in =
the memory resources."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_L1I",
+        "PublicDescription": "Counts cycles when the frontend is stalled b=
ecause there is an instruction fetch request pending in the level 1 instruc=
tion cache."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_MEM",
+        "PublicDescription": "Counts cycles when the frontend is stalled b=
ecause there is an instruction fetch request pending in the last level core=
 cache."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_TLB",
+        "PublicDescription": "Counts when the frontend is stalled on any T=
LB misses being handled. This event also counts the TLB accesses made by ha=
rdware prefetches."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_CPUBOUND",
+        "PublicDescription": "Counts cycles when the frontend could not se=
nd any micro-operations to the rename stage due to resource constraints in =
the CPU resources excluding memory resources."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_FLOW",
+        "PublicDescription": "Counts cycles when the frontend could not se=
nd any micro-operations to the rename stage due to resource constraints in =
the branch prediction unit."
+    },
+    {
+        "ArchStdEvent": "STALL_FRONTEND_FLUSH",
+        "PublicDescription": "Counts cycles when the frontend could not se=
nd any micro-operations to the rename stage as the frontend is recovering f=
rom a machine flush or resteer. Example scenarios that cause a flush includ=
e branch mispredictions, taken exceptions, micro-architectural flush etc."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_MEMBOUND",
+        "PublicDescription": "Counts cycles when the backend could not acc=
ept any micro-operations due to resource constraints in the memory resource=
s."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_L1D",
+        "PublicDescription": "Counts cycles when the backend is stalled be=
cause there is a pending demand load request in progress in the level 1 dat=
a cache."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_TLB",
+        "PublicDescription": "Counts cycles when the backend is stalled on=
 any demand TLB misses being handled."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_ST",
+        "PublicDescription": "Counts cycles when the backend is stalled an=
d there is a store that has not reached the pre-commit stage."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_BUSY",
+        "PublicDescription": "Counts cycles when the backend could not acc=
ept any micro-operations because the issue queues are too full to take =
any operations for execution."
+    },
+    {
+        "ArchStdEvent": "STALL_BACKEND_ILOCK",
+        "PublicDescription": "Counts cycles when the backend could not acc=
ept any micro-operations due to resource constraints imposed by input depen=
dency."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/sve.json b/to=
ols/perf/pmu-events/arch/arm64/arm/cortex-a520/sve.json
new file mode 100644
index 000000000000..21810ce5de8d
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/sve.json
@@ -0,0 +1,22 @@
+[
+    {
+        "ArchStdEvent": "SVE_INST_SPEC",
+        "PublicDescription": "Counts speculatively executed operations tha=
t are SVE operations."
+    },
+    {
+        "ArchStdEvent": "ASE_SVE_INT8_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
or SVE integer operations with the largest data type an 8-bit integer."
+    },
+    {
+        "ArchStdEvent": "ASE_SVE_INT16_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
or SVE integer operations with the largest data type a 16-bit integer."
+    },
+    {
+        "ArchStdEvent": "ASE_SVE_INT32_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
or SVE integer operations with the largest data type a 32-bit integer."
+    },
+    {
+        "ArchStdEvent": "ASE_SVE_INT64_SPEC",
+        "PublicDescription": "Counts speculatively executed Advanced SIMD =
or SVE integer operations with the largest data type a 64-bit integer."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/tlb.json b/to=
ols/perf/pmu-events/arch/arm64/arm/cortex-a520/tlb.json
new file mode 100644
index 000000000000..1de56300e581
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/tlb.json
@@ -0,0 +1,78 @@
+[
+    {
+        "ArchStdEvent": "L1I_TLB_REFILL",
+        "PublicDescription": "Counts level 1 instruction TLB refills from =
any Instruction fetch. If there are multiple misses in the TLB that are res=
olved by the refill, then this event only counts once. This event will not =
count if the translation table walk results in a fault (such as a translati=
on or access fault), since there is no new translation created for the TLB."
+    },
+    {
+        "ArchStdEvent": "L1D_TLB_REFILL",
+        "PublicDescription": "Counts level 1 data TLB accesses that result=
ed in TLB refills. If there are multiple misses in the TLB that are resolve=
d by the refill, then this event only counts once. This event counts for re=
fills caused by preload instructions or hardware prefetch accesses. This ev=
ent counts regardless of whether the miss hits in L2 or results in a transl=
ation table walk. This event will not count if the translation table walk r=
esults in a fault (such as a translation or access fault), since there is n=
o new translation created for the TLB. This event will not count on an acce=
ss from an AT (address translation) instruction."
+    },
+    {
+        "ArchStdEvent": "L1D_TLB",
+        "PublicDescription": "Counts level 1 data TLB accesses caused by a=
ny memory load or store operation. Note that load or store instructions can=
 be broken up into multiple memory operations. This event does not count TL=
B maintenance operations."
+    },
+    {
+        "ArchStdEvent": "L1I_TLB",
+        "PublicDescription": "Counts level 1 instruction TLB accesses, whe=
ther the access hits or misses in the TLB. This event counts both demand ac=
cesses and prefetch or preload generated accesses."
+    },
+    {
+        "ArchStdEvent": "L2D_TLB_REFILL",
+        "PublicDescription": "Counts level 2 TLB refills caused by memory =
operations from both data and instruction fetch, except for those caused by=
 TLB maintenance operations and hardware prefetches."
+    },
+    {
+        "ArchStdEvent": "L2D_TLB",
+        "PublicDescription": "Counts level 2 TLB accesses except those cau=
sed by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "DTLB_WALK",
+        "PublicDescription": "Counts number of demand data translation tab=
le walks caused by a miss in the L2 TLB and performing at least one memory =
access. Translation table walks are counted even if the translation ended u=
p taking a translation fault for reasons different than EPD, E0PD and NFD. =
Note that partial translations that cause a translation table walk are also=
 counted. Also note that this event counts walks triggered by software prel=
oads, but not walks triggered by hardware prefetchers, and that this event =
does not count walks triggered by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "ITLB_WALK",
+        "PublicDescription": "Counts number of instruction translation tab=
le walks caused by a miss in the L2 TLB and performing at least one memory =
access. Translation table walks are counted even if the translation ended u=
p taking a translation fault for reasons different than EPD, E0PD and NFD. =
Note that partial translations that cause a translation table walk are also=
 counted. Also note that this event does not count walks triggered by TLB m=
aintenance operations."
+    },
+    {
+        "ArchStdEvent": "DTLB_WALK_PERCYC",
+        "PublicDescription": "Counts the number of data translation table =
walks in progress per cycle."
+    },
+    {
+        "ArchStdEvent": "ITLB_WALK_PERCYC",
+        "PublicDescription": "Counts the number of instruction translation table walks in progress per cycle."
+    },
+    {
+        "ArchStdEvent": "DTLB_HWUPD",
+        "PublicDescription": "Counts number of memory accesses triggered by a data translation table walk and performing an update of a translation table entry. Memory accesses are counted even if the translation ended up taking a translation fault for reasons different than EPD, E0PD and NFD. Note that this event counts accesses triggered by software preloads, but not accesses triggered by hardware prefetchers."
+    },
+    {
+        "ArchStdEvent": "ITLB_HWUPD",
+        "PublicDescription": "Counts number of memory accesses triggered by an instruction translation table walk and performing an update of a translation table entry. Memory accesses are counted even if the translation ended up taking a translation fault for reasons different than EPD, E0PD and NFD."
+    },
+    {
+        "ArchStdEvent": "DTLB_STEP",
+        "PublicDescription": "Counts number of memory accesses triggered by a demand data translation table walk and performing a read of a translation table entry. Memory accesses are counted even if the translation ended up taking a translation fault for reasons different than EPD, E0PD and NFD. Note that this event counts accesses triggered by software preloads, but not accesses triggered by hardware prefetchers."
+    },
+    {
+        "ArchStdEvent": "ITLB_STEP",
+        "PublicDescription": "Counts number of memory accesses triggered by an instruction translation table walk and performing a read of a translation table entry. Memory accesses are counted even if the translation ended up taking a translation fault for reasons different than EPD, E0PD and NFD."
+    },
+    {
+        "ArchStdEvent": "DTLB_WALK_LARGE",
+        "PublicDescription": "Counts number of demand data translation table walks caused by a miss in the L2 TLB and yielding a large page. The set of large pages is defined as all pages with a final size higher than or equal to 2MB. Translation table walks that end up taking a translation fault are not counted, as the page size would be undefined in that case. If DTLB_WALK_BLOCK is implemented, then it is an alias for this event in this family. Note that partial translations that cause a translation table walk are also counted. Also note that this event counts walks triggered by software preloads, but not walks triggered by hardware prefetchers, and that this event does not count walks triggered by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "ITLB_WALK_LARGE",
+        "PublicDescription": "Counts number of instruction translation table walks caused by a miss in the L2 TLB and yielding a large page. The set of large pages is defined as all pages with a final size higher than or equal to 2MB. Translation table walks that end up taking a translation fault are not counted, as the page size would be undefined in that case. In this family, this is equal to ITLB_WALK_BLOCK event. Note that partial translations that cause a translation table walk are also counted. Also note that this event does not count walks triggered by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "DTLB_WALK_SMALL",
+        "PublicDescription": "Counts number of data translation table walks caused by a miss in the L2 TLB and yielding a small page. The set of small pages is defined as all pages with a final size lower than 2MB. Translation table walks that end up taking a translation fault are not counted, as the page size would be undefined in that case. If DTLB_WALK_PAGE event is implemented, then it is an alias for this event in this family. Note that partial translations that cause a translation table walk are also counted. Also note that this event counts walks triggered by software preloads, but not walks triggered by hardware prefetchers, and that this event does not count walks triggered by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "ITLB_WALK_SMALL",
+        "PublicDescription": "Counts number of instruction translation table walks caused by a miss in the L2 TLB and yielding a small page. The set of small pages is defined as all pages with a final size lower than 2MB. Translation table walks that end up taking a translation fault are not counted, as the page size would be undefined in that case. In this family, this is equal to ITLB_WALK_PAGE event. Note that partial translations that cause a translation table walk are also counted. Also note that this event does not count walks triggered by TLB maintenance operations."
+    },
+    {
+        "ArchStdEvent": "DTLB_WALK_RW",
+        "PublicDescription": "Counts number of demand data translation table walks caused by a miss in the L2 TLB and performing at least one memory access. Translation table walks are counted even if the translation ended up taking a translation fault for reasons different than EPD, E0PD and NFD. Note that partial translations that cause a translation table walk are also counted. Also note that this event does not count walks triggered by TLB maintenance operations."
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/trace.json b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/trace.json
new file mode 100644
index 000000000000..33672a8711d4
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/cortex-a520/trace.json
@@ -0,0 +1,32 @@
+[
+    {
+        "ArchStdEvent": "TRB_WRAP"
+    },
+    {
+        "ArchStdEvent": "TRB_TRIG"
+    },
+    {
+        "ArchStdEvent": "TRCEXTOUT0"
+    },
+    {
+        "ArchStdEvent": "TRCEXTOUT1"
+    },
+    {
+        "ArchStdEvent": "TRCEXTOUT2"
+    },
+    {
+        "ArchStdEvent": "TRCEXTOUT3"
+    },
+    {
+        "ArchStdEvent": "CTI_TRIGOUT4"
+    },
+    {
+        "ArchStdEvent": "CTI_TRIGOUT5"
+    },
+    {
+        "ArchStdEvent": "CTI_TRIGOUT6"
+    },
+    {
+        "ArchStdEvent": "CTI_TRIGOUT7"
+    }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/common-and-microarch.json b/tools/perf/pmu-events/arch/arm64/common-and-microarch.json
index e40be37addf8..3e774c1e1413 100644
--- a/tools/perf/pmu-events/arch/arm64/common-and-microarch.json
+++ b/tools/perf/pmu-events/arch/arm64/common-and-microarch.json
@@ -1339,6 +1339,11 @@
         "EventName": "INST_FETCH",
         "BriefDescription": "Instruction memory access"
     },
+    {
+        "EventCode": "0x8125",
+        "EventName": "BUS_REQ_RD_PERCYC",
+        "BriefDescription": "Bus read transactions in progress"
+    },
     {
         "EventCode": "0x8128",
         "EventName": "DTLB_WALK_PERCYC",
@@ -1539,6 +1544,11 @@
         "EventName": "L2D_CACHE_HWPRF",
         "BriefDescription": "Level 2 data cache hardware prefetch."
     },
+    {
+        "EventCode": "0x8156",
+        "EventName": "L3D_CACHE_HWPRF",
+        "BriefDescription": "Level 3 data cache hardware prefetch."
+    },
     {
         "EventCode": "0x8158",
         "EventName": "STALL_FRONTEND_MEMBOUND",
@@ -1674,6 +1684,11 @@
         "EventName": "DTLB_WALK_PAGE",
         "BriefDescription": "Data TLB page translation table walk."
     },
+    {
+        "EventCode": "0x818D",
+        "EventName": "BUS_REQ_RD",
+        "BriefDescription": "Bus request, read"
+    },
     {
         "EventCode": "0x818B",
         "EventName": "ITLB_WALK_PAGE",
diff --git a/tools/perf/pmu-events/arch/arm64/mapfile.csv b/tools/perf/pmu-events/arch/arm64/mapfile.csv
index ccfcae375750..6b98632636e1 100644
--- a/tools/perf/pmu-events/arch/arm64/mapfile.csv
+++ b/tools/perf/pmu-events/arch/arm64/mapfile.csv
@@ -32,6 +32,7 @@
 0x00000000410fd440,v1,arm/cortex-x1,core
 0x00000000410fd4c0,v1,arm/cortex-x1,core
 0x00000000410fd460,v1,arm/cortex-a510,core
+0x00000000410fd800,v1,arm/cortex-a520,core
 0x00000000410fd470,v1,arm/cortex-a710,core
 0x00000000410fd810,v1,arm/cortex-a720,core
 0x00000000410fd480,v1,arm/cortex-x2,core
--=20
2.47.2