-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Xen Security Advisory CVE-2025-10263 / XSA-493
version 2
Arm: Completion of memory accesses not guaranteed by completion of a TLBI
UPDATES IN VERSION 2
====================
Public release.
ISSUE DESCRIPTION
=================
A hardware issue has been identified in certain Arm CPU designs. A
broadcast TLBI on one PE may complete before affected memory accesses
on another PE are globally observed. This may permit bypass of Stage 1
translation, Stage 2 translation, or GPT protection.
The erratum occurs when all of the following conditions are met:
- A PE (PEx) executes a store.
- Another PE (PEy) executes a TLBI instruction which applies to
Stage 1 only information, Stage 1 and 2 information, or GPT
information (but not Stage 2 only information), applies to the
Inner Shareable or Outer Shareable domain containing PEx, and
affects at least one of the bytes accessed by PEx's store.
- PEy executes a DSB instruction which is sufficient to complete the
TLBI instruction.
- Complex micro-architectural conditions occur.
When all conditions are met, PEy's DSB may complete before the global
observation of a portion of PEx's store which was affected by the TLB
invalidation. This store may complete at a later time, after memory
accesses which are ordered after the DSB.
The relevant TLB entries are invalidated correctly before the
completion of the DSB. This erratum does not affect reads.
For more details, please refer to the Arm Security Center:
https://developer.arm.com/Arm%20Security%20Center
IMPACT
======
A malicious guest may be able to write to memory it no longer has
permission to write to, after Xen has modified Stage 2 translation to
forbid writes to that location. This could allow a guest to escalate
its privileges to that of the hypervisor.
VULNERABLE SYSTEMS
==================
Only systems running Xen on Arm are affected. x86 systems are not
vulnerable.
Only multi-core configurations are affected.
The following Arm CPUs are affected:
- Arm C1-Ultra, C1-Premium
- Neoverse V3 & V3AE, Neoverse V2, Neoverse V1, Neoverse N2,
Neoverse N1
- Cortex-X925, Cortex-X4, Cortex-X3, Cortex-X2, Cortex-X1 & X1C,
Cortex-A710, Cortex-A78, A78AE & A78C, Cortex-A77, Cortex-A76 &
A76AE
MITIGATION
==========
There is no known mitigation.
CREDITS
=======
This issue was reported by Arm.
RESOLUTION
==========
Applying the appropriate set of attached patches resolves this issue.
Note that patches for released versions are generally prepared to
apply to the stable branches, and may not apply cleanly to the most
recent release tarball. Downstreams are encouraged to update to the
tip of the stable branch before applying these patches.
xsa493/xsa493-??.patch xen-unstable
xsa493/xsa493-4.21-??.patch Xen 4.21.x
xsa493/xsa493-4.20-??.patch Xen 4.20.x
xsa493/xsa493-4.19-??.patch Xen 4.19.x
xsa493/xsa493-4.18-??.patch Xen 4.18.x
xsa493/xsa493-4.17-??.patch Xen 4.17.x
$ sha256sum xsa493*/*
b065245ad3e22d19a0a1f26af6978ebf52f1d59f4ddeb4aeb03eb198bc12f2fd xsa493/xsa493-01.patch
d8f3896d4916867aaefe340ce4d2bce0c3698c093e59ee863677d6524f43a000 xsa493/xsa493-02.patch
d77017101f424f792b560b37c82d75108b68ff9183a640fa680ba6f5fc9928aa xsa493/xsa493-03.patch
a1cd4eabe923d1d4197c95a9ce8f233a226a49cd4bf6c8651b7a11f89fccc0ed xsa493/xsa493-4.17-01.patch
7238d3bbfe6bfd96fac0da8fb36456c23519938fe694a9f90a9f7317ba1c8fdb xsa493/xsa493-4.17-02.patch
b561f4c7365fd6f39a35661bcc74330126abdf7f022e6340b56c6beaf5dad9c2 xsa493/xsa493-4.17-03.patch
84f818e5549cc48ca93cc7f153162881c825c51cc1da1d7e677ca1779db4e2a7 xsa493/xsa493-4.17-04.patch
1226029b0bdb4091979819bcbbe4480cb4dc4c8073758dcfa4c418dec5ff49e5 xsa493/xsa493-4.17-05.patch
59f49949a1cb27580e846cbc08402f496228de129607a90c84603c9961d7c51e xsa493/xsa493-4.17-06.patch
f6175dc3287d38ec7c225dee428e17d6dd66c2457668942fadbf5aff78cffa1b xsa493/xsa493-4.17-07.patch
da413bb5e5e3114e7cbbfa8ee26ffed61f902475d2ef809893a2b4002d41dd01 xsa493/xsa493-4.17-08.patch
3ef94e7a74c4e5c06655174245d004819ce6dfdc1d54f63c2463e5edf8ea182b xsa493/xsa493-4.17-09.patch
5d604ef4efffe2a199dbe8e4dcb46883e1ec294b71f7d2679bcbfa4a3d6ae168 xsa493/xsa493-4.18-01.patch
074fad2b5bf195337c0799d59493a621e1020d8cb9834ed2997997b208d498d3 xsa493/xsa493-4.18-02.patch
df6dcfc54ddfee83e2bfc00448d7a3dedda9c8c0858ea3258ebdaf674d9cf8a4 xsa493/xsa493-4.18-03.patch
cc3457e14c2b35afef35a9fd3cc3905f6e03b0f30333b56b963bc1577dbcf4a8 xsa493/xsa493-4.18-04.patch
4b523acb3b5904d649531f8c78e701ec9384e02045fc941d2ae061f28d9c5e73 xsa493/xsa493-4.18-05.patch
3511018842968d19e34e949800d638d648ddfaad7511f80f53acfb96af244750 xsa493/xsa493-4.19-01.patch
5e157dd88c71d10323f3102f555a069c1ded6ecb203a69d53c7e441ecaaa06fc xsa493/xsa493-4.19-02.patch
5da2ee837cb3bd151af442397c32bd5afca508b4d2f237fd6a395f20d41b740a xsa493/xsa493-4.19-03.patch
797955e752e4010b2df5dadf75bf210a00a8ad1bfe6ee8848b5b68734ec3cd2b xsa493/xsa493-4.19-04.patch
dfa9616895e9768b6f0d7c6efc903b00e2e51af4e0f5c38a29e79d17ea272b86 xsa493/xsa493-4.19-05.patch
0e50dae0a0dddeb2755f761f966a8d0a9186246504dacda4dd5994367f71ea8e xsa493/xsa493-4.20-01.patch
9d9911d02f5ca5aaaf9fe3700e0ff66371d1bb469471e4bf6c305a786329f3d1 xsa493/xsa493-4.20-02.patch
9058d6dfe2fcbedbb0b10d529e9e3d3e7635381d12b41383832e163aff156002 xsa493/xsa493-4.20-03.patch
eb81f949744f3e748a871dc81eb0774e58faeb3bcc6c486f2237b9f516fdad00 xsa493/xsa493-4.20-04.patch
d4eb81c40cedbdd425429c340da45d7bb344b63d71328d8cc978fc70f606804d xsa493/xsa493-4.20-05.patch
bd2e39066c4f9a9ed20a9214d6dd4cb71a5fa34349129398dba03b684ab49478 xsa493/xsa493-4.21-01.patch
b4b603075259fa6274b61a09133d59c8846910a29dd5b0d5af2d55a0adc67659 xsa493/xsa493-4.21-02.patch
721d339f1c18f6867d5a5a0d02e3edceb8d97ed08725787b3537969a656d74f6 xsa493/xsa493-4.21-03.patch
bfc9c9b005968f33f8a33116be7f8ce9918cd3020f35f8bd173727ac19bb0261 xsa493/xsa493-4.21-04.patch
738177c22c9b081165fb4500c05ddf53b7e9e1de68b3190462eb8cb66a5aa6a5 xsa493/xsa493-4.21-05.patch
0d6bca07e5177f4e13c572410224c5cea0c10b5004c370dd742c7c725d98a9be xsa493/xsa493-04.patch
$
DEPLOYMENT DURING EMBARGO
=========================
Deployment of the patches and/or mitigations described above (or
others which are substantially similar) is permitted during the
embargo, even on public-facing systems with untrusted guest users and
administrators.
But: Distribution of updated software is prohibited (except to other
members of the predisclosure list).
Predisclosure list members who wish to deploy significantly different
patches and/or mitigations, please contact the Xen Project Security
Team.
(Note: this during-embargo deployment notice is retained in
post-embargo publicly released Xen Project advisories, even though it
is then no longer applicable. This is to enable the community to have
oversight of the Xen Project Security Team's decisionmaking.)
For more information about permissible uses of embargoed information,
consult the Xen Project community's agreed Security Policy:
http://www.xenproject.org/security-policy.html
-----BEGIN PGP SIGNATURE-----
iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAmon+5EMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZl9sH/2gOt3FnPag044GT7tB/PZHNzVPNZKsqKv6TbKrh
Sd+3da3eoNX9Py4AJ25t/jUkuLoZLL1yc7Mo/6nXj3/YevWb0RgUIc8Z0nUSi17f
yBcbtAaOYlGmmDGlC/MY9H4xT2htYJXwA+XOztb7k7VS0j9g8xEv8q08RBM1Jibd
nelqwwKiDm7kJS7AtuA8bHWX+pNuvGKqKvt+AhHD6F6XXsFzZ7fU1F2Sin/Rxj2V
fi8EDJcaCBNWuyajpvQbpt3vZJX5cV4n6HnkSeUxEOyLbkSMk3oH3EIqPRbuV8V3
g2WFu5NEbyjwIUOHENQubIH1isSQ8ogx6e/JPR/fgLPJtU8=
=wDng
-----END PGP SIGNATURE-----
From f3a301903705a1f997f2d7346f41a69decbc76de Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:55 +0200
Subject: xen/arm: Sync missing definitions for Arm CPUs with Linux
Synchronize with Linux kernel 7.0 definitions for the following CPUs:
- Cortex-A76AE,
- Cortex-A78AE,
- Cortex-X1C,
- Cortex-X3,
- Neoverse-V2,
- Cortex-X4,
- Neoverse-V3AE,
- Neoverse-V3,
- Cortex-X925.
These will be used for errata detection in subsequent patches.
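As a rough illustration of how part-number definitions like these are typically consumed, the sketch below decodes a MIDR_EL1 value using the architectural field layout (Implementer in bits [31:24], PartNum in bits [15:4]) and compares it against a CPU model. The helper names (midr_implementer, midr_partnum, midr_is_model, midr_example) and the sample register value are hypothetical and are not taken from the Xen sources; only the part numbers come from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Part numbers copied from the patch; implementer 0x41 is Arm Ltd. */
#define ARM_CPU_IMP_ARM           0x41u
#define ARM_CPU_PART_NEOVERSE_V2  0xD4Fu
#define ARM_CPU_PART_CORTEX_X4    0xD82u

/* Hypothetical helper: Implementer field, MIDR_EL1 bits [31:24]. */
static inline uint32_t midr_implementer(uint32_t midr)
{
    return (midr >> 24) & 0xFFu;
}

/* Hypothetical helper: PartNum field, MIDR_EL1 bits [15:4]. */
static inline uint32_t midr_partnum(uint32_t midr)
{
    return (midr >> 4) & 0xFFFu;
}

/* True if midr matches the implementer and part, for any variant/revision. */
static inline bool midr_is_model(uint32_t midr, uint32_t imp, uint32_t part)
{
    return midr_implementer(midr) == imp && midr_partnum(midr) == part;
}

/* Self-check using a made-up MIDR value with the Neoverse V2 part number. */
static inline void midr_example(void)
{
    uint32_t midr = 0x410FD4F1u;

    assert(midr_is_model(midr, ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2));
    assert(!midr_is_model(midr, ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4));
}
```

Matching on implementer and part number only, while ignoring variant and revision, is one possible design; an errata table that needs revision ranges would have to compare those fields as well.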
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 895d7cd50244..9e2b6bd59766 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -89,13 +89,22 @@
#define ARM_CPU_PART_CORTEX_A76 0xD0B
#define ARM_CPU_PART_NEOVERSE_N1 0xD0C
#define ARM_CPU_PART_CORTEX_A77 0xD0D
+#define ARM_CPU_PART_CORTEX_A76AE 0xD0E
#define ARM_CPU_PART_NEOVERSE_V1 0xD40
#define ARM_CPU_PART_CORTEX_A78 0xD41
+#define ARM_CPU_PART_CORTEX_A78AE 0xD42
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_X2 0xD48
#define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define ARM_CPU_PART_CORTEX_A78C 0xD4B
+#define ARM_CPU_PART_CORTEX_X1C 0xD4C
+#define ARM_CPU_PART_CORTEX_X3 0xD4E
+#define ARM_CPU_PART_NEOVERSE_V2 0xD4F
+#define ARM_CPU_PART_CORTEX_X4 0xD82
+#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
+#define ARM_CPU_PART_NEOVERSE_V3 0xD84
+#define ARM_CPU_PART_CORTEX_X925 0xD85
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -110,13 +119,22 @@
#define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
#define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
#define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77)
+#define MIDR_CORTEX_A76AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76AE)
#define MIDR_NEOVERSE_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V1)
#define MIDR_CORTEX_A78 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78)
+#define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
#define MIDR_CORTEX_A78C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C)
+#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C)
+#define MIDR_CORTEX_X3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X3)
+#define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2)
+#define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4)
+#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
+#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
+#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 2de64fcde2f906fde2161cb9e662ec14d5956e40 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:56 +0200
Subject: xen/arm: Add C1-Ultra definitions
Add processor definitions for C1-Ultra. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Ultra TRM:
https://developer.arm.com/documentation/108014/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 9e2b6bd59766..955cb1a8a711 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -105,6 +105,7 @@
#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
+#define ARM_CPU_PART_C1_ULTRA 0xD8C
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -135,6 +136,7 @@
#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
+#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 907c605f7f9ad107803e77b115d5124053ab02cd Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:57 +0200
Subject: xen/arm: Add C1-Premium definitions
Add processor definitions for C1-Premium. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Premium TRM:
https://developer.arm.com/documentation/109416/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 955cb1a8a711..a3753c317fff 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -106,6 +106,7 @@
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
#define ARM_CPU_PART_C1_ULTRA 0xD8C
+#define ARM_CPU_PART_C1_PREMIUM 0xD90
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -137,6 +138,7 @@
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
+#define MIDR_C1_PREMIUM MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_PREMIUM)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 43aaa649d7c52a04768ff8abbbd461c2b4f2376a Mon Sep 17 00:00:00 2001
From: Julien Grall <jgrall@amazon.com>
Date: Tue, 24 Jan 2023 19:25:19 +0000
Subject: xen/arm64: flushtlb: Reduce scope of barrier for local TLB flush
Per D5-4929 in ARM DDI 0487H.a:
"A DSB NSH is sufficient to ensure completion of TLB maintenance
instructions that apply to a single PE. A DSB ISH is sufficient to
ensure completion of TLB maintenance instructions that apply to PEs
in the same Inner Shareable domain.
"
This means the barrier after local TLB flushes can be reduced to
non-shareable.
Note that the scope of the barrier in the workaround has not been
changed because Linux v6.1-rc8 is also using 'ish' and I couldn't
find anything in the Neoverse N1 suggesting that a 'nsh' would
be sufficient.
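To make the effect of the extra shareability argument concrete, here is a small host-side sketch that only builds the instruction string such a helper would pass to the assembler. TLB_SEQ and the two accessor functions are hypothetical names used for illustration, and the erratum alternative is deliberately left out; this is not the macro from the patch.

```c
#include <assert.h>
#include <string.h>

/*
 * Illustrative only: stringify the TLBI operation and the shareability
 * domain, then rely on adjacent string literal concatenation.
 */
#define TLB_SEQ(tlbop, sh)  \
    "dsb " # sh "st;"       \
    "tlbi " # tlbop ";"     \
    "dsb " # sh ";"         \
    "isb;"

/* Local flush: non-shareable barriers are used. */
static inline const char *local_guest_flush_seq(void)
{
    return TLB_SEQ(vmalls12e1, nsh);
}

/* Broadcast flush: inner-shareable barriers are used. */
static inline const char *broadcast_guest_flush_seq(void)
{
    return TLB_SEQ(vmalls12e1is, ish);
}

/* Self-check of the two expansions. */
static inline void tlb_seq_example(void)
{
    assert(strcmp(local_guest_flush_seq(),
                  "dsb nshst;tlbi vmalls12e1;dsb nsh;isb;") == 0);
    assert(strcmp(broadcast_guest_flush_seq(),
                  "dsb ishst;tlbi vmalls12e1is;dsb ish;isb;") == 0);
}
```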
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Tested-by: Henry Wang <Henry.Wang@arm.com>
(cherry picked from commit 7c438851475482bf73fcf451551d1cb718d4904c)
xen/arm: Remove stray semicolon at VREG_REG_HELPERS/TLB_HELPER* callers
This is inconsistent with the rest of the code where macros are used
to define functions, as it results in an empty declaration (i.e.
semicolon with nothing before it) after function definition. This is also
not allowed by C99.
Take the opportunity to undefine TLB_HELPER* macros after last use.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 6044b485ba5b0e4073a773402cedc2f2fae540ad)
diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
index 7c5431518741..fcc0788c3049 100644
--- a/xen/arch/arm/include/asm/arm64/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
@@ -12,8 +12,9 @@
* ARM64_WORKAROUND_REPEAT_TLBI:
* Modification of the translation table for a virtual address might lead to
* read-after-read ordering violation.
- * The workaround repeats TLBI+DSB operation for all the TLB flush operations.
- * While this is stricly not necessary, we don't want to take any risk.
+ * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
+ * operations. While this is strictly not necessary, we don't want to
+ * take any risk.
*
* For Xen page-tables the ISB will discard any instructions fetched
* from the old mappings.
@@ -21,12 +22,17 @@
* For the Stage-2 page-tables the ISB ensures the completion of the DSB
* (and therefore the TLB invalidation) before continuing. So we know
* the TLBs cannot contain an entry for a mapping we may have removed.
+ *
+ * Note that for local TLB flush, using non-shareable (nsh) is sufficient
+ * (see D5-4929 in ARM DDI 0487H.a). Although, the memory barrier in
+ * for the workaround is left as inner-shareable to match with Linux
+ * v6.1-rc8.
*/
-#define TLB_HELPER(name, tlbop) \
+#define TLB_HELPER(name, tlbop, sh) \
static inline void name(void) \
{ \
asm volatile( \
- "dsb ishst;" \
+ "dsb " # sh "st;" \
"tlbi " # tlbop ";" \
ALTERNATIVE( \
"nop; nop;", \
@@ -34,25 +40,25 @@ static inline void name(void) \
"tlbi " # tlbop ";", \
ARM64_WORKAROUND_REPEAT_TLBI, \
CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- "dsb ish;" \
+ "dsb " # sh ";" \
"isb;" \
: : : "memory"); \
}
/* Flush local TLBs, current VMID only. */
-TLB_HELPER(flush_guest_tlb_local, vmalls12e1);
+TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh)
/* Flush innershareable TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb, vmalls12e1is);
+TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish)
/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb_local, alle1);
+TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh)
/* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb, alle1is);
+TLB_HELPER(flush_all_guests_tlb, alle1is, ish)
/* Flush all hypervisor mappings from the TLB of the local processor. */
-TLB_HELPER(flush_xen_tlb_local, alle2);
+TLB_HELPER(flush_xen_tlb_local, alle2, nsh)
/* Flush TLB of local processor for address va. */
static inline void __flush_xen_tlb_one_local(vaddr_t va)
@@ -66,6 +72,8 @@ static inline void __flush_xen_tlb_one(vaddr_t va)
asm volatile("tlbi vae2is, %0;" : : "r" (va>>PAGE_SHIFT) : "memory");
}
+#undef TLB_HELPER
+
#endif /* __ASM_ARM_ARM64_FLUSHTLB_H__ */
/*
* Local variables:
From 418979c4ac782d37fc3190c4b17dbf7f5c629c56 Mon Sep 17 00:00:00 2001
From: Julien Grall <jgrall@amazon.com>
Date: Tue, 24 Jan 2023 19:25:50 +0000
Subject: xen/arm64: flushtlb: Implement the TLBI repeat workaround for TLB
flush by VA
Looking at the Neoverse N1 errata document, it is not clear to me
why the TLBI repeat workaround is not applied for TLB flush by VA.
The TLB flush by VA helpers are used in flush_xen_tlb_range_va_local()
and flush_xen_tlb_range_va(). So if the range size is fixed and smaller
than PAGE_SIZE, the compiler could remove the loop and therefore
replicate the sequence described in erratum 1286807.
So the TLBI repeat workaround should also be applied for the TLB flush
by VA helpers.
Fixes: 22e323d115d8 ("xen/arm: Add workaround for Cortex-A76/Neoverse-N1 erratum #1286807")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Tested-by: Henry Wang <Henry.Wang@arm.com>
(cherry picked from commit cbfaf6ccd2cb5d1b2bc6efe5c6f4ba5cccce5689)
xen/arm: Remove stray semicolon at VREG_REG_HELPERS/TLB_HELPER* callers
This is inconsistent with the rest of the code where macros are used
to define functions, as it results in an empty declaration (i.e.
semicolon with nothing before it) after function definition. This is also
not allowed by C99.
Take the opportunity to undefine TLB_HELPER* macros after last use.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 6044b485ba5b0e4073a773402cedc2f2fae540ad)
diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
index fcc0788c3049..56c6fc763b56 100644
--- a/xen/arch/arm/include/asm/arm64/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
@@ -45,6 +45,27 @@ static inline void name(void) \
: : : "memory"); \
}
+/*
+ * FLush TLB by VA. This will likely be used in a loop, so the caller
+ * is responsible to use the appropriate memory barriers before/after
+ * the sequence.
+ *
+ * See above about the ARM64_WORKAROUND_REPEAT_TLBI sequence.
+ */
+#define TLB_HELPER_VA(name, tlbop) \
+static inline void name(vaddr_t va) \
+{ \
+ asm volatile( \
+ "tlbi " # tlbop ", %0;" \
+ ALTERNATIVE( \
+ "nop; nop;", \
+ "dsb ish;" \
+ "tlbi " # tlbop ", %0;", \
+ ARM64_WORKAROUND_REPEAT_TLBI, \
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
+ : : "r" (va >> PAGE_SHIFT) : "memory"); \
+}
+
/* Flush local TLBs, current VMID only. */
TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh)
@@ -61,18 +82,13 @@ TLB_HELPER(flush_all_guests_tlb, alle1is, ish)
TLB_HELPER(flush_xen_tlb_local, alle2, nsh)
/* Flush TLB of local processor for address va. */
-static inline void __flush_xen_tlb_one_local(vaddr_t va)
-{
- asm volatile("tlbi vae2, %0;" : : "r" (va>>PAGE_SHIFT) : "memory");
-}
+TLB_HELPER_VA(__flush_xen_tlb_one_local, vae2)
/* Flush TLB of all processors in the inner-shareable domain for address va. */
-static inline void __flush_xen_tlb_one(vaddr_t va)
-{
- asm volatile("tlbi vae2is, %0;" : : "r" (va>>PAGE_SHIFT) : "memory");
-}
+TLB_HELPER_VA(__flush_xen_tlb_one, vae2is)
#undef TLB_HELPER
+#undef TLB_HELPER_VA
#endif /* __ASM_ARM_ARM64_FLUSHTLB_H__ */
/*
From cda79fee2afd422c2836f8ef3039fda144b05218 Mon Sep 17 00:00:00 2001
From: Julien Grall <jgrall@amazon.com>
Date: Tue, 24 Jan 2023 19:26:09 +0000
Subject: xen/arm32: flushtlb: Reduce scope of barrier for local TLB flush
Per G5-9224 in ARM DDI 0487I.a:
"A DSB NSH is sufficient to ensure completion of TLB maintenance
instructions that apply to a single PE. A DSB ISH is sufficient to
ensure completion of TLB maintenance instructions that apply to PEs
in the same Inner Shareable domain.
"
This is quoting the Armv8 specification because I couldn't find an
explicit statement in the Armv7 specification. Instead, I could find
bits in various places that confirm the same implementation.
Furthermore, Linux has been using 'nsh' since 2013 (62cbbc42e001
"ARM: tlb: reduce scope of barrier domains for TLB invalidation").
This means the barrier after local TLB flushes can be reduced to
non-shareable.
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Tested-by: Henry Wang <Henry.Wang@arm.com>
(cherry picked from commit d56c70b6e1fe2b4ee836ca4449a3277cbbeb0ddc)
xen/arm: Remove stray semicolon at VREG_REG_HELPERS/TLB_HELPER* callers
This is inconsistent with the rest of the code where macros are used
to define functions, as it results in an empty declaration (i.e.
semicolon with nothing before it) after function definition. This is also
not allowed by C99.
Take the opportunity to undefine TLB_HELPER* macros after last use.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit 6044b485ba5b0e4073a773402cedc2f2fae540ad)
diff --git a/xen/arch/arm/include/asm/arm32/flushtlb.h b/xen/arch/arm/include/asm/arm32/flushtlb.h
index 9085e6501153..22ee3b317b4d 100644
--- a/xen/arch/arm/include/asm/arm32/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm32/flushtlb.h
@@ -15,30 +15,35 @@
* For the Stage-2 page-tables the ISB ensures the completion of the DSB
* (and therefore the TLB invalidation) before continuing. So we know
* the TLBs cannot contain an entry for a mapping we may have removed.
+ *
+ * Note that for local TLB flush, using non-shareable (nsh) is sufficient
+ * (see G5-9224 in ARM DDI 0487I.a).
*/
-#define TLB_HELPER(name, tlbop) \
-static inline void name(void) \
-{ \
- dsb(ishst); \
- WRITE_CP32(0, tlbop); \
- dsb(ish); \
- isb(); \
+#define TLB_HELPER(name, tlbop, sh) \
+static inline void name(void) \
+{ \
+ dsb(sh ## st); \
+ WRITE_CP32(0, tlbop); \
+ dsb(sh); \
+ isb(); \
}
/* Flush local TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb_local, TLBIALL);
+TLB_HELPER(flush_guest_tlb_local, TLBIALL, nsh)
/* Flush inner shareable TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb, TLBIALLIS);
+TLB_HELPER(flush_guest_tlb, TLBIALLIS, ish)
/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb_local, TLBIALLNSNH);
+TLB_HELPER(flush_all_guests_tlb_local, TLBIALLNSNH, nsh)
/* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb, TLBIALLNSNHIS);
+TLB_HELPER(flush_all_guests_tlb, TLBIALLNSNHIS, ish)
/* Flush all hypervisor mappings from the TLB of the local processor. */
-TLB_HELPER(flush_xen_tlb_local, TLBIALLH);
+TLB_HELPER(flush_xen_tlb_local, TLBIALLH, nsh)
+
+#undef TLB_HELPER
/* Flush TLB of local processor for address va. */
static inline void __flush_xen_tlb_one_local(vaddr_t va)
From c5521dbadd1814aeab64378711a33ed52b978557 Mon Sep 17 00:00:00 2001
From: Julien Grall <jgrall@amazon.com>
Date: Tue, 24 Jan 2023 19:26:29 +0000
Subject: xen/arm: flushtlb: Reduce scope of barrier for the TLB range flush
At the moment, flush_xen_tlb_range_va{,_local}() use system-wide
memory barriers. This is quite expensive and unnecessary.
For the local version, a non-shareable barrier is sufficient.
For the SMP version, an inner-shareable barrier is sufficient.
Furthermore, the initial barrier only needs to be a store barrier.
For the full explanation of the sequence see asm/arm{32,64}/flushtlb.h.
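The shape of the range flush can be sketched on the host with counters standing in for the real instructions. Everything below is a hypothetical model (MODEL_PAGE_SIZE, struct flush_stats, model_barrier, model_tlbi_one and model_flush_range are invented names, and a 4KB page size is assumed); it only shows that one barrier precedes the loop, one follows it, and one invalidation is issued per page.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u   /* assumption: 4KB pages */

typedef uint64_t vaddr_t;

/* Counters standing in for the instructions actually executed. */
struct flush_stats {
    unsigned int barriers;
    unsigned int tlbis;
};

/* Stand-in for a dsb of whichever scope the caller needs. */
static inline void model_barrier(struct flush_stats *s)
{
    s->barriers++;
}

/* Stand-in for a TLB invalidation by VA; the address is not used. */
static inline void model_tlbi_one(struct flush_stats *s, vaddr_t va)
{
    (void)va;
    s->tlbis++;
}

/* Model of the range flush: barrier, one TLBI per page, barrier. */
static inline struct flush_stats model_flush_range(vaddr_t va,
                                                   unsigned long size)
{
    struct flush_stats s = { 0, 0 };
    vaddr_t end = va + size;

    model_barrier(&s);              /* prior page-table updates complete */
    while ( va < end )
    {
        model_tlbi_one(&s, va);
        va += MODEL_PAGE_SIZE;
    }
    model_barrier(&s);              /* TLB invalidation complete */

    return s;
}
```

The barriers stay outside the loop in this model, so their cost does not grow with the number of pages; only the scope of each barrier changes in the patch.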
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Henry Wang <Henry.Wang@arm.com>
(cherry picked from commit 5e5d1a43e18468399448ff8dec687342d48f56da)
diff --git a/xen/arch/arm/include/asm/flushtlb.h b/xen/arch/arm/include/asm/flushtlb.h
index 125a141975e0..e45fb6d97b02 100644
--- a/xen/arch/arm/include/asm/flushtlb.h
+++ b/xen/arch/arm/include/asm/flushtlb.h
@@ -37,13 +37,14 @@ static inline void flush_xen_tlb_range_va_local(vaddr_t va,
{
vaddr_t end = va + size;
- dsb(sy); /* Ensure preceding are visible */
+ /* See asm/arm{32,64}/flushtlb.h for the explanation of the sequence. */
+ dsb(nshst); /* Ensure prior page-tables updates have completed */
while ( va < end )
{
__flush_xen_tlb_one_local(va);
va += PAGE_SIZE;
}
- dsb(sy); /* Ensure completion of the TLB flush */
+ dsb(nsh); /* Ensure the TLB invalidation has completed */
isb();
}
@@ -56,13 +57,14 @@ static inline void flush_xen_tlb_range_va(vaddr_t va,
{
vaddr_t end = va + size;
- dsb(sy); /* Ensure preceding are visible */
+ /* See asm/arm{32,64}/flushtlb.h for the explanation of the sequence. */
+ dsb(ishst); /* Ensure prior page-tables updates have completed */
while ( va < end )
{
__flush_xen_tlb_one(va);
va += PAGE_SIZE;
}
- dsb(sy); /* Ensure completion of the TLB flush */
+ dsb(ish); /* Ensure the TLB invalidation has completed */
isb();
}
From 33d4834619cd4845cc98f2f944c335225c3caf8a Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Tue, 14 Apr 2026 10:11:24 +0200
Subject: xen/arm64: flushtlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI
The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several
errata where broadcast TLBI;DSB sequences don't provide all the
architecturally required synchronization. The workaround performs more
work than necessary, and can have significant overhead. This patch
optimizes the workaround, as explained below.
1. All relevant errata only affect the ordering and/or completion of
memory accesses which have been translated by an invalidated TLB
entry. The actual invalidation of TLB entries is unaffected.
2. The existing workaround is applied to both broadcast and local TLB
invalidation, whereas for all relevant errata it is only necessary to
apply a workaround for broadcast invalidation.
3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI
sequence, whereas for all relevant errata it is only necessary to
execute a single additional TLBI;DSB sequence after any number of
TLBIs are completed by a DSB.
For example, for a sequence of batched TLBIs:
TLBI <op1>[, <arg1>]
TLBI <op2>[, <arg2>]
TLBI <op3>[, <arg3>]
DSB ISH
... the existing workaround will expand this to:
TLBI <op1>[, <arg1>]
DSB ISH // additional
TLBI <op1>[, <arg1>] // additional
TLBI <op2>[, <arg2>]
DSB ISH // additional
TLBI <op2>[, <arg2>] // additional
TLBI <op3>[, <arg3>]
DSB ISH // additional
TLBI <op3>[, <arg3>] // additional
DSB ISH
... whereas it is sufficient to have:
TLBI <op1>[, <arg1>]
TLBI <op2>[, <arg2>]
TLBI <op3>[, <arg3>]
DSB ISH
TLBI <opX>[, <argX>] // additional
DSB ISH // additional
Using a single additional TLBI and DSB at the end of the sequence can
have significantly lower overhead as each DSB which completes a TLBI
must synchronize with other PEs in the system, with potential
performance effects both locally and system-wide.
4. The existing workaround repeats each specific TLBI operation, whereas
for all relevant errata it is sufficient for the additional TLBI to
use *any* operation which will be broadcast, regardless of which
translation regime or stage of translation the operation applies to.
For example, for a single TLBI:
TLBI ALLE2IS
DSB ISH
... the existing workaround will expand this to:
TLBI ALLE2IS
DSB ISH
TLBI ALLE2IS // additional
DSB ISH // additional
... whereas it is sufficient to have:
TLBI ALLE2IS
DSB ISH
TLBI VALE1IS, XZR // additional
DSB ISH // additional
As the additional TLBI doesn't have to match a specific earlier TLBI,
the additional TLBI can be implemented in separate code, with no
memory of the earlier TLBIs. The additional TLBI can also use a
cheaper TLBI operation.
5. The existing workaround is applied to both Stage-1 and Stage-2 TLB
invalidation, whereas for all relevant errata it is only necessary to
apply a workaround for Stage-1 invalidation.
Architecturally, TLBI operations which invalidate only Stage-2
information (e.g. IPAS2E1IS) are not required to invalidate TLB
entries which combine information from Stage-1 and Stage-2
translation table entries, and consequently may not complete memory
accesses translated by those combined entries. In these cases,
completion of memory accesses is only guaranteed after subsequent
invalidation of Stage-1 information (e.g. VMALLE1IS).
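The instruction counts implied by point 3 can be worked through with a small model. The sketch below assumes a batch of n >= 1 broadcast TLBIs completed by a single DSB, as in the examples above; struct op_count, existing_workaround and optimized_workaround are hypothetical names used only for this arithmetic.

```c
#include <assert.h>

/* Number of TLBI and DSB instructions executed for one batch. */
struct op_count {
    unsigned int tlbi;
    unsigned int dsb;
};

/*
 * Existing workaround: each TLBI becomes TLBI;DSB;TLBI, followed by the
 * final DSB, giving 2n TLBIs and n + 1 DSBs.
 */
static inline struct op_count existing_workaround(unsigned int n)
{
    struct op_count c = { 2 * n, n + 1 };
    return c;
}

/*
 * Optimized workaround: n TLBIs and a DSB, then a single additional
 * TLBI;DSB pair, giving n + 1 TLBIs and 2 DSBs.
 */
static inline struct op_count optimized_workaround(unsigned int n)
{
    struct op_count c = { n + 1, 2 };
    return c;
}
```

For the three-TLBI batch shown earlier this gives 6 TLBIs and 4 DSBs for the existing workaround against 4 TLBIs and 2 DSBs for the optimized one; the DSB count no longer grows with the batch size.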
Rework the workaround logic as follows:
- add TLB_HELPER_LOCAL() to be used for local TLB ops without a
workaround,
- modify TLB_HELPER() workaround to use tlbi vale2is, xzr as a second
TLBI,
- drop TLB_HELPER_VA(). It's used only by __flush_xen_tlb_one_local
which is local and does not need workaround and by
__flush_xen_tlb_one. In the latter case, since it's used in a loop,
we don't need a workaround in the middle. Add __tlb_repeat_sync with
a workaround to be used at the end after DSB and before final ISB,
- TLBI VALE2IS passing XZR is used as an additional TLBI. While there is
an identity mapping there, it's used very rarely. The performance
impact is therefore negligible. If things change in the future, we
can revisit the decision.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
(cherry picked from commit 7c502d7591519135765b8041cbd1c70e56e5a0b9)
diff --git a/xen/arch/arm/include/asm/arm32/flushtlb.h b/xen/arch/arm/include/asm/arm32/flushtlb.h
index 22ee3b317b4d..749a9b07da76 100644
--- a/xen/arch/arm/include/asm/arm32/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm32/flushtlb.h
@@ -57,6 +57,9 @@ static inline void __flush_xen_tlb_one(vaddr_t va)
asm volatile(STORE_CP32(0, TLBIMVAHIS) : : "r" (va) : "memory");
}
+/* Only for ARM64_WORKAROUND_REPEAT_TLBI */
+static inline void __tlb_repeat_sync(void) {}
+
#endif /* __ASM_ARM_ARM32_FLUSHTLB_H__ */
/*
* Local variables:
diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
index 56c6fc763b56..7ef6d3f331a0 100644
--- a/xen/arch/arm/include/asm/arm64/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
@@ -12,9 +12,14 @@
* ARM64_WORKAROUND_REPEAT_TLBI:
* Modification of the translation table for a virtual address might lead to
* read-after-read ordering violation.
- * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
- * operations. While this is strictly not necessary, we don't want to
- * take any risk.
+ * The workaround repeats TLBI+DSB ISH operation for broadcast TLB flush
+ * operations. The workaround is not needed for local operations.
+ *
+ * It is sufficient for the additional TLBI to use *any* operation which will
+ * be broadcast, regardless of which translation regime or stage of translation
+ * the operation applies to. TLBI VALE2IS is used passing XZR. While there is
+ * an identity mapping there, it's only used during suspend/resume, CPU on/off,
+ * so the impact (performance if any) is negligible.
*
* For Xen page-tables the ISB will discard any instructions fetched
* from the old mappings.
@@ -26,69 +31,90 @@
* Note that for local TLB flush, using non-shareable (nsh) is sufficient
* (see D5-4929 in ARM DDI 0487H.a). Although, the memory barrier in
* for the workaround is left as inner-shareable to match with Linux
- * v6.1-rc8.
+ * v6.19.
*/
-#define TLB_HELPER(name, tlbop, sh) \
+#define TLB_HELPER_LOCAL(name, tlbop) \
static inline void name(void) \
{ \
asm volatile( \
- "dsb " # sh "st;" \
+ "dsb nshst;" \
"tlbi " # tlbop ";" \
- ALTERNATIVE( \
- "nop; nop;", \
- "dsb ish;" \
- "tlbi " # tlbop ";", \
- ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- "dsb " # sh ";" \
+ "dsb nsh;" \
"isb;" \
: : : "memory"); \
}
-/*
- * FLush TLB by VA. This will likely be used in a loop, so the caller
- * is responsible to use the appropriate memory barriers before/after
- * the sequence.
- *
- * See above about the ARM64_WORKAROUND_REPEAT_TLBI sequence.
- */
-#define TLB_HELPER_VA(name, tlbop) \
-static inline void name(vaddr_t va) \
-{ \
- asm volatile( \
- "tlbi " # tlbop ", %0;" \
- ALTERNATIVE( \
- "nop; nop;", \
- "dsb ish;" \
- "tlbi " # tlbop ", %0;", \
- ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- : : "r" (va >> PAGE_SHIFT) : "memory"); \
+#define TLB_HELPER(name, tlbop) \
+static inline void name(void) \
+{ \
+ asm volatile ( \
+ "dsb ishst;" \
+ "tlbi " # tlbop ";" \
+ ALTERNATIVE( \
+ "nop; nop;", \
+ "dsb ish;" \
+ "tlbi vale2is, xzr;", \
+ ARM64_WORKAROUND_REPEAT_TLBI, \
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
+ "dsb ish;" \
+ "isb;" \
+ : : : "memory"); \
}
/* Flush local TLBs, current VMID only. */
-TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh)
+TLB_HELPER_LOCAL(flush_guest_tlb_local, vmalls12e1)
/* Flush innershareable TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish)
+TLB_HELPER(flush_guest_tlb, vmalls12e1is)
/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh)
+TLB_HELPER_LOCAL(flush_all_guests_tlb_local, alle1)
/* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb, alle1is, ish)
+TLB_HELPER(flush_all_guests_tlb, alle1is)
/* Flush all hypervisor mappings from the TLB of the local processor. */
-TLB_HELPER(flush_xen_tlb_local, alle2, nsh)
+TLB_HELPER_LOCAL(flush_xen_tlb_local, alle2)
+
+#undef TLB_HELPER_LOCAL
+#undef TLB_HELPER
+
+/*
+ * FLush TLB by VA. This will likely be used in a loop, so the caller
+ * is responsible to use the appropriate memory barriers before/after
+ * the sequence.
+ */
/* Flush TLB of local processor for address va. */
-TLB_HELPER_VA(__flush_xen_tlb_one_local, vae2)
+static inline void __flush_xen_tlb_one_local(vaddr_t va)
+{
+ asm volatile (
+ "tlbi vae2, %0" : : "r" (va >> PAGE_SHIFT) : "memory");
+}
/* Flush TLB of all processors in the inner-shareable domain for address va. */
-TLB_HELPER_VA(__flush_xen_tlb_one, vae2is)
+static inline void __flush_xen_tlb_one(vaddr_t va)
+{
+ asm volatile (
+ "tlbi vae2is, %0" : : "r" (va >> PAGE_SHIFT) : "memory");
+}
-#undef TLB_HELPER
-#undef TLB_HELPER_VA
+/*
+ * ARM64_WORKAROUND_REPEAT_TLBI:
+ * For all relevant erratas it is only necessary to execute a single
+ * additional TLBI;DSB sequence after any number of TLBIs are completed by DSB.
+ */
+static inline void __tlb_repeat_sync(void)
+{
+ asm volatile (
+ ALTERNATIVE(
+ "nop; nop;",
+ "tlbi vale2is, xzr;"
+ "dsb ish;",
+ ARM64_WORKAROUND_REPEAT_TLBI,
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)
+ : : : "memory");
+}
#endif /* __ASM_ARM_ARM64_FLUSHTLB_H__ */
/*
diff --git a/xen/arch/arm/include/asm/flushtlb.h b/xen/arch/arm/include/asm/flushtlb.h
index e45fb6d97b02..c292c3c00d29 100644
--- a/xen/arch/arm/include/asm/flushtlb.h
+++ b/xen/arch/arm/include/asm/flushtlb.h
@@ -65,6 +65,7 @@ static inline void flush_xen_tlb_range_va(vaddr_t va,
va += PAGE_SIZE;
}
dsb(ish); /* Ensure the TLB invalidation has completed */
+ __tlb_repeat_sync();
isb();
}
From fc4826ae9778036a32eb407c5fde9d0ba9b5149c Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:55 +0200
Subject: xen/arm: Sync missing definitions for Arm CPUs with Linux
Synchronize with Linux kernel 7.0 definitions for the following CPUs:
- Cortex-A76AE,
- Cortex-A78AE,
- Cortex-X1C,
- Cortex-X3,
- Neoverse-V2,
- Cortex-X4,
- Neoverse-V3AE,
- Neoverse-V3,
- Cortex-X925.
These will be used for errata detection in subsequent patches.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 1dd81d7d528f..af086ebe987b 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -74,13 +74,22 @@
#define ARM_CPU_PART_CORTEX_A76 0xD0B
#define ARM_CPU_PART_NEOVERSE_N1 0xD0C
#define ARM_CPU_PART_CORTEX_A77 0xD0D
+#define ARM_CPU_PART_CORTEX_A76AE 0xD0E
#define ARM_CPU_PART_NEOVERSE_V1 0xD40
#define ARM_CPU_PART_CORTEX_A78 0xD41
+#define ARM_CPU_PART_CORTEX_A78AE 0xD42
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_X2 0xD48
#define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define ARM_CPU_PART_CORTEX_A78C 0xD4B
+#define ARM_CPU_PART_CORTEX_X1C 0xD4C
+#define ARM_CPU_PART_CORTEX_X3 0xD4E
+#define ARM_CPU_PART_NEOVERSE_V2 0xD4F
+#define ARM_CPU_PART_CORTEX_X4 0xD82
+#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
+#define ARM_CPU_PART_NEOVERSE_V3 0xD84
+#define ARM_CPU_PART_CORTEX_X925 0xD85
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -95,13 +104,22 @@
#define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
#define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
#define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77)
+#define MIDR_CORTEX_A76AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76AE)
#define MIDR_NEOVERSE_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V1)
#define MIDR_CORTEX_A78 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78)
+#define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
#define MIDR_CORTEX_A78C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C)
+#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C)
+#define MIDR_CORTEX_X3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X3)
+#define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2)
+#define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4)
+#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
+#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
+#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 206ead6e1119a1a2e3153c9bcf78749921a6ff69 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:56 +0200
Subject: xen/arm: Add C1-Ultra definitions
Add processor definitions for C1-Ultra. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Ultra TRM:
https://developer.arm.com/documentation/108014/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index af086ebe987b..935dd617623b 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -90,6 +90,7 @@
#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
+#define ARM_CPU_PART_C1_ULTRA 0xD8C
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -120,6 +121,7 @@
#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
+#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 701b4906794ae9b2292971ce3910d23427bb6895 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:57 +0200
Subject: xen/arm: Add C1-Premium definitions
Add processor definitions for C1-Premium. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Premium TRM:
https://developer.arm.com/documentation/109416/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 935dd617623b..ea4cdfb3e71a 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -91,6 +91,7 @@
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
#define ARM_CPU_PART_C1_ULTRA 0xD8C
+#define ARM_CPU_PART_C1_PREMIUM 0xD90
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -122,6 +123,7 @@
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
+#define MIDR_C1_PREMIUM MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_PREMIUM)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 5eb60938ac321d6baab2d6d6497f5061cf899836 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:58 +0200
Subject: xen/arm: Mitigate TLBI errata on various Arm CPUs
A number of CPUs developed by Arm suffer from errata whereby a broadcast
TLBI + DSB sequence may complete before the global observation of writes
which are translated by an affected TLB entry. This can lead to memory
corruption and potential privilege escalation.
These errata ONLY affect the completion of memory accesses which have
been translated by an invalidated TLB entry, and these errata DO NOT
affect the actual invalidation of TLB entries. TLB entries are removed
correctly.
To mitigate this issue, Arm recommends that software follows each
TLBI+DSB sequence with an additional TLBI+DSB, which will ensure that
all memory write effects affected by the first TLBI have been globally
observed.
The ARM64_WORKAROUND_REPEAT_TLBI workaround is sufficient to mitigate the
issue. Enable this workaround for affected CPUs.
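For readers unfamiliar with how a part is identified, the following is
a minimal sketch of model matching on MIDR_EL1 that ignores variant and
revision, which is the effect the "all versions" entries in this patch
rely on. The SK_ names are hypothetical and only approximate Xen's
macros. The field positions are those of the architectural MIDR_EL1
layout, and the part numbers are taken from the definitions in this
series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers; not Xen's actual macros. */
#define SK_MIDR_IMPLEMENTER_SHIFT  24   /* bits [31:24] */
#define SK_MIDR_ARCHITECTURE_SHIFT 16   /* bits [19:16] */
#define SK_MIDR_PARTNUM_SHIFT      4    /* bits [15:4]  */

#define SK_IMP_ARM          0x41u
#define SK_PART_CORTEX_A76  0xD0Bu
#define SK_PART_NEOVERSE_N1 0xD0Cu

/* Implementer, architecture and part number; variant/revision excluded. */
#define SK_MIDR_MODEL_MASK                        \
    ((0xFFu  << SK_MIDR_IMPLEMENTER_SHIFT)  |     \
     (0xFu   << SK_MIDR_ARCHITECTURE_SHIFT) |     \
     (0xFFFu << SK_MIDR_PARTNUM_SHIFT))

#define SK_MIDR_MODEL(imp, part)                  \
    (((uint32_t)(imp)  << SK_MIDR_IMPLEMENTER_SHIFT)  | \
     (0xFu             << SK_MIDR_ARCHITECTURE_SHIFT) | \
     ((uint32_t)(part) << SK_MIDR_PARTNUM_SHIFT))

/* True if midr names the given model, whatever its variant and revision. */
static bool sk_midr_is_model(uint32_t midr, uint32_t model)
{
    return (midr & SK_MIDR_MODEL_MASK) == model;
}
```

For example, a MIDR value of 0x414FD0C1 (Neoverse N1 r4p1) matches
SK_MIDR_MODEL(SK_IMP_ARM, SK_PART_NEOVERSE_N1) and does not match the
Cortex-A76 model.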
This is XSA-493 / CVE-2025-10263.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 87d541c411cc..29076297addf 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -339,6 +339,27 @@ config ARM64_ERRATUM_1508412
If unsure, say Y.
+config ARM64_ERRATUM_CVE_2025_10263
+ bool "Cortex-*/Neoverse-*/C1-*: Completion of affected memory accesses might not be guaranteed by completion of a TLBI"
+ default y
+ depends on ARM_64
+ select ARM64_WORKAROUND_REPEAT_TLBI
+ help
+ This option adds a workaround for CVE-2025-10263.
+
+ A broadcast TLBI on another PE may complete before affected memory
+ accesses are globally observed. This may permit bypass of Stage 1
+ translation, Stage-2 translation, or GPT protection.
+
+ The workaround repeats the TLBI VALE2IS, XZR + DSB ISH operation for all
+ the broadcast TLB flush operations. A single additional TLBI and DSB are
+ sufficient regardless of how many TLBIs are completed by the DSB.
+
+ Note that software workarounds are required at all execution levels for
+ affected parts to fully mitigate this issue.
+
+ If unsure, say Y.
+
endmenu
config ARM64_HARDEN_BRANCH_PREDICTOR
diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
index ea680fac2e44..33736342fbed 100644
--- a/xen/arch/arm/cpuerrata.c
+++ b/xen/arch/arm/cpuerrata.c
@@ -534,6 +534,92 @@ static const struct arm_cpu_capabilities arm_errata[] = {
MIDR_RANGE(MIDR_NEOVERSE_N1, 0, 3 << MIDR_VARIANT_SHIFT),
},
#endif
+#ifdef CONFIG_ARM64_ERRATUM_CVE_2025_10263
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X4),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X925),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_ULTRA),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_PREMIUM),
+ },
+#endif
#ifdef CONFIG_ARM64_HARDEN_BRANCH_PREDICTOR
{
.capability = ARM_HARDEN_BRANCH_PREDICTOR,
From 3065266e45b475e62dd41ee194ec731f0cb80249 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Tue, 14 Apr 2026 10:11:24 +0200
Subject: xen/arm64: flushtlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI
The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several
errata where broadcast TLBI;DSB sequences don't provide all the
architecturally required synchronization. The workaround performs more
work than necessary, and can have significant overhead. This patch
optimizes the workaround, as explained below.
1. All relevant errata only affect the ordering and/or completion of
memory accesses which have been translated by an invalidated TLB
entry. The actual invalidation of TLB entries is unaffected.
2. The existing workaround is applied to both broadcast and local TLB
invalidation, whereas for all relevant errata it is only necessary to
apply a workaround for broadcast invalidation.
3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI
sequence, whereas for all relevant errata it is only necessary to
execute a single additional TLBI;DSB sequence after any number of
TLBIs are completed by a DSB.
For example, for a sequence of batched TLBIs:
TLBI <op1>[, <arg1>]
TLBI <op2>[, <arg2>]
TLBI <op3>[, <arg3>]
DSB ISH
... the existing workaround will expand this to:
TLBI <op1>[, <arg1>]
DSB ISH // additional
TLBI <op1>[, <arg1>] // additional
TLBI <op2>[, <arg2>]
DSB ISH // additional
TLBI <op2>[, <arg2>] // additional
TLBI <op3>[, <arg3>]
DSB ISH // additional
TLBI <op3>[, <arg3>] // additional
DSB ISH
... whereas it is sufficient to have:
TLBI <op1>[, <arg1>]
TLBI <op2>[, <arg2>]
TLBI <op3>[, <arg3>]
DSB ISH
TLBI <opX>[, <argX>] // additional
DSB ISH // additional
Using a single additional TLBI and DSB at the end of the sequence can
have significantly lower overhead as each DSB which completes a TLBI
must synchronize with other PEs in the system, with potential
performance effects both locally and system-wide.
4. The existing workaround repeats each specific TLBI operation, whereas
for all relevant errata it is sufficient for the additional TLBI to
use *any* operation which will be broadcast, regardless of which
translation regime or stage of translation the operation applies to.
For example, for a single TLBI:
TLBI ALLE2IS
DSB ISH
... the existing workaround will expand this to:
TLBI ALLE2IS
DSB ISH
TLBI ALLE2IS // additional
DSB ISH // additional
... whereas it is sufficient to have:
TLBI ALLE2IS
DSB ISH
TLBI VALE1IS, XZR // additional
DSB ISH // additional
As the additional TLBI doesn't have to match a specific earlier TLBI,
the additional TLBI can be implemented in separate code, with no
memory of the earlier TLBIs. The additional TLBI can also use a
cheaper TLBI operation.
5. The existing workaround is applied to both Stage-1 and Stage-2 TLB
invalidation, whereas for all relevant errata it is only necessary to
apply a workaround for Stage-1 invalidation.
Architecturally, TLBI operations which invalidate only Stage-2
information (e.g. IPAS2E1IS) are not required to invalidate TLB
entries which combine information from Stage-1 and Stage-2
translation table entries, and consequently may not complete memory
accesses translated by those combined entries. In these cases,
completion of memory accesses is only guaranteed after subsequent
invalidation of Stage-1 information (e.g. VMALLE1IS).
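The instruction counts implied by the two expansions in point 3 can be
written down as a small model. This is a sketch for illustration, with
hypothetical names that are not part of Xen or Linux. It assumes a
batch of n broadcast TLBIs completed by one DSB ISH, as in the examples
above.

```c
#include <assert.h>

/* Hypothetical cost model; not Xen code. */
struct sk_tlb_cost {
    unsigned int tlbi; /* TLBI instructions issued */
    unsigned int dsb;  /* DSB ISH instructions issued */
};

/*
 * Existing workaround: every TLBI becomes TLBI; DSB ISH; TLBI,
 * followed by the DSB ISH that closes the batch.
 */
static struct sk_tlb_cost sk_cost_existing(unsigned int n)
{
    struct sk_tlb_cost c;

    assert(n > 0);
    c.tlbi = 2 * n;
    c.dsb = n + 1;
    return c;
}

/*
 * Optimized workaround: n TLBIs, the closing DSB ISH, then a single
 * additional TLBI and DSB ISH.
 */
static struct sk_tlb_cost sk_cost_optimized(unsigned int n)
{
    struct sk_tlb_cost c;

    assert(n > 0);
    c.tlbi = n + 1;
    c.dsb = 2;
    return c;
}
```

For the three-TLBI batch shown in point 3, this gives 6 TLBIs and 4
DSBs for the existing workaround against 4 TLBIs and 2 DSBs for the
optimized one. For a single TLBI the counts are equal, and the saving
comes only from being able to pick a cheaper additional operation.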
Rework the workaround logic as follows:
- add TLB_HELPER_LOCAL() to be used for local TLB ops without a
workaround,
- modify TLB_HELPER() workaround to use tlbi vale2is, xzr as a second
TLBI,
- drop TLB_HELPER_VA(). It is used only by __flush_xen_tlb_one_local,
which is local and needs no workaround, and by __flush_xen_tlb_one.
The latter is called in a loop, so no workaround is needed between
iterations. Add __tlb_repeat_sync(), which applies the workaround
once at the end, after the DSB and before the final ISB,
- TLBI VALE2IS with XZR (i.e. VA 0) is used as the additional TLBI.
While the identity mapping lives at that address, it is used very
rarely (suspend/resume, CPU on/off), so the performance impact is
negligible. If that changes in the future, the decision can be
revisited.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
(cherry picked from commit 7c502d7591519135765b8041cbd1c70e56e5a0b9)
diff --git a/xen/arch/arm/include/asm/arm32/flushtlb.h b/xen/arch/arm/include/asm/arm32/flushtlb.h
index 61c25a318998..5483be08fbbe 100644
--- a/xen/arch/arm/include/asm/arm32/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm32/flushtlb.h
@@ -57,6 +57,9 @@ static inline void __flush_xen_tlb_one(vaddr_t va)
asm volatile(STORE_CP32(0, TLBIMVAHIS) : : "r" (va) : "memory");
}
+/* Only for ARM64_WORKAROUND_REPEAT_TLBI */
+static inline void __tlb_repeat_sync(void) {}
+
#endif /* __ASM_ARM_ARM32_FLUSHTLB_H__ */
/*
* Local variables:
diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
index 45642201d147..c1314be122d7 100644
--- a/xen/arch/arm/include/asm/arm64/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
@@ -12,9 +12,14 @@
* ARM64_WORKAROUND_REPEAT_TLBI:
* Modification of the translation table for a virtual address might lead to
* read-after-read ordering violation.
- * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
- * operations. While this is strictly not necessary, we don't want to
- * take any risk.
+ * The workaround repeats TLBI+DSB ISH operation for broadcast TLB flush
+ * operations. The workaround is not needed for local operations.
+ *
+ * It is sufficient for the additional TLBI to use *any* operation which will
+ * be broadcast, regardless of which translation regime or stage of translation
+ * the operation applies to. TLBI VALE2IS is used passing XZR. While there is
+ * an identity mapping there, it's only used during suspend/resume, CPU on/off,
+ * so the impact (performance if any) is negligible.
*
* For Xen page-tables the ISB will discard any instructions fetched
* from the old mappings.
@@ -26,69 +31,90 @@
* Note that for local TLB flush, using non-shareable (nsh) is sufficient
* (see D5-4929 in ARM DDI 0487H.a). Although, the memory barrier in
* for the workaround is left as inner-shareable to match with Linux
- * v6.1-rc8.
+ * v6.19.
*/
-#define TLB_HELPER(name, tlbop, sh) \
+#define TLB_HELPER_LOCAL(name, tlbop) \
static inline void name(void) \
{ \
asm volatile( \
- "dsb " # sh "st;" \
+ "dsb nshst;" \
"tlbi " # tlbop ";" \
- ALTERNATIVE( \
- "nop; nop;", \
- "dsb ish;" \
- "tlbi " # tlbop ";", \
- ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- "dsb " # sh ";" \
+ "dsb nsh;" \
"isb;" \
: : : "memory"); \
}
-/*
- * FLush TLB by VA. This will likely be used in a loop, so the caller
- * is responsible to use the appropriate memory barriers before/after
- * the sequence.
- *
- * See above about the ARM64_WORKAROUND_REPEAT_TLBI sequence.
- */
-#define TLB_HELPER_VA(name, tlbop) \
-static inline void name(vaddr_t va) \
-{ \
- asm volatile( \
- "tlbi " # tlbop ", %0;" \
- ALTERNATIVE( \
- "nop; nop;", \
- "dsb ish;" \
- "tlbi " # tlbop ", %0;", \
- ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- : : "r" (va >> PAGE_SHIFT) : "memory"); \
+#define TLB_HELPER(name, tlbop) \
+static inline void name(void) \
+{ \
+ asm volatile ( \
+ "dsb ishst;" \
+ "tlbi " # tlbop ";" \
+ ALTERNATIVE( \
+ "nop; nop;", \
+ "dsb ish;" \
+ "tlbi vale2is, xzr;", \
+ ARM64_WORKAROUND_REPEAT_TLBI, \
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
+ "dsb ish;" \
+ "isb;" \
+ : : : "memory"); \
}
/* Flush local TLBs, current VMID only. */
-TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh)
+TLB_HELPER_LOCAL(flush_guest_tlb_local, vmalls12e1)
/* Flush innershareable TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish)
+TLB_HELPER(flush_guest_tlb, vmalls12e1is)
/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh)
+TLB_HELPER_LOCAL(flush_all_guests_tlb_local, alle1)
/* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb, alle1is, ish)
+TLB_HELPER(flush_all_guests_tlb, alle1is)
/* Flush all hypervisor mappings from the TLB of the local processor. */
-TLB_HELPER(flush_xen_tlb_local, alle2, nsh)
+TLB_HELPER_LOCAL(flush_xen_tlb_local, alle2)
+
+#undef TLB_HELPER_LOCAL
+#undef TLB_HELPER
+
+/*
+ * FLush TLB by VA. This will likely be used in a loop, so the caller
+ * is responsible to use the appropriate memory barriers before/after
+ * the sequence.
+ */
/* Flush TLB of local processor for address va. */
-TLB_HELPER_VA(__flush_xen_tlb_one_local, vae2)
+static inline void __flush_xen_tlb_one_local(vaddr_t va)
+{
+ asm volatile (
+ "tlbi vae2, %0" : : "r" (va >> PAGE_SHIFT) : "memory");
+}
/* Flush TLB of all processors in the inner-shareable domain for address va. */
-TLB_HELPER_VA(__flush_xen_tlb_one, vae2is)
+static inline void __flush_xen_tlb_one(vaddr_t va)
+{
+ asm volatile (
+ "tlbi vae2is, %0" : : "r" (va >> PAGE_SHIFT) : "memory");
+}
-#undef TLB_HELPER
-#undef TLB_HELPER_VA
+/*
+ * ARM64_WORKAROUND_REPEAT_TLBI:
+ * For all relevant erratas it is only necessary to execute a single
+ * additional TLBI;DSB sequence after any number of TLBIs are completed by DSB.
+ */
+static inline void __tlb_repeat_sync(void)
+{
+ asm volatile (
+ ALTERNATIVE(
+ "nop; nop;",
+ "tlbi vale2is, xzr;"
+ "dsb ish;",
+ ARM64_WORKAROUND_REPEAT_TLBI,
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)
+ : : : "memory");
+}
#endif /* __ASM_ARM_ARM64_FLUSHTLB_H__ */
/*
diff --git a/xen/arch/arm/include/asm/flushtlb.h b/xen/arch/arm/include/asm/flushtlb.h
index e45fb6d97b02..c292c3c00d29 100644
--- a/xen/arch/arm/include/asm/flushtlb.h
+++ b/xen/arch/arm/include/asm/flushtlb.h
@@ -65,6 +65,7 @@ static inline void flush_xen_tlb_range_va(vaddr_t va,
va += PAGE_SIZE;
}
dsb(ish); /* Ensure the TLB invalidation has completed */
+ __tlb_repeat_sync();
isb();
}
diff --git a/xen/arch/arm/include/asm/mmu/layout.h b/xen/arch/arm/include/asm/mmu/layout.h
index da6be276ac5f..d5c58b49a248 100644
--- a/xen/arch/arm/include/asm/mmu/layout.h
+++ b/xen/arch/arm/include/asm/mmu/layout.h
@@ -23,6 +23,10 @@
*
* Reserved to identity map Xen
*
+ * Note: As part of ARM64_WORKAROUND_REPEAT_TLBI, VA 0 is used for an extra
+ * TLBI operation given its rare use (only identity mapping) and thus
+ * negligible performance impact.
+ *
* 0x0000020000000000 - 0x0000027fffffffff (512GB, L0 slot [4])
* (Relative offsets)
* 0 - 2M Unmapped
From a29d8723d95074a33ec841213e1267320c622ea7 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:55 +0200
Subject: xen/arm: Sync missing definitions for Arm CPUs with Linux
Synchronize with Linux kernel 7.0 definitions for the following CPUs:
- Cortex-A76AE,
- Cortex-A78AE,
- Cortex-X1C,
- Cortex-X3,
- Neoverse-V2,
- Cortex-X4,
- Neoverse-V3AE,
- Neoverse-V3,
- Cortex-X925.
These will be used for errata detection in subsequent patches.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 8e0241046504..e1620d579863 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -74,13 +74,22 @@
#define ARM_CPU_PART_CORTEX_A76 0xD0B
#define ARM_CPU_PART_NEOVERSE_N1 0xD0C
#define ARM_CPU_PART_CORTEX_A77 0xD0D
+#define ARM_CPU_PART_CORTEX_A76AE 0xD0E
#define ARM_CPU_PART_NEOVERSE_V1 0xD40
#define ARM_CPU_PART_CORTEX_A78 0xD41
+#define ARM_CPU_PART_CORTEX_A78AE 0xD42
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_X2 0xD48
#define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define ARM_CPU_PART_CORTEX_A78C 0xD4B
+#define ARM_CPU_PART_CORTEX_X1C 0xD4C
+#define ARM_CPU_PART_CORTEX_X3 0xD4E
+#define ARM_CPU_PART_NEOVERSE_V2 0xD4F
+#define ARM_CPU_PART_CORTEX_X4 0xD82
+#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
+#define ARM_CPU_PART_NEOVERSE_V3 0xD84
+#define ARM_CPU_PART_CORTEX_X925 0xD85
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -95,13 +104,22 @@
#define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
#define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
#define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77)
+#define MIDR_CORTEX_A76AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76AE)
#define MIDR_NEOVERSE_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V1)
#define MIDR_CORTEX_A78 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78)
+#define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
#define MIDR_CORTEX_A78C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C)
+#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C)
+#define MIDR_CORTEX_X3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X3)
+#define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2)
+#define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4)
+#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
+#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
+#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 0c8cda58788252ccc4fb6741c9f93730475fad08 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:56 +0200
Subject: xen/arm: Add C1-Ultra definitions
Add processor definitions for C1-Ultra. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Ultra TRM:
https://developer.arm.com/documentation/108014/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index e1620d579863..a2104c2caf50 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -90,6 +90,7 @@
#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
+#define ARM_CPU_PART_C1_ULTRA 0xD8C
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -120,6 +121,7 @@
#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
+#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 459dca05fc9ccb8acde1e72536a8acab546f31f8 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:57 +0200
Subject: xen/arm: Add C1-Premium definitions
Add processor definitions for C1-Premium. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Premium TRM:
https://developer.arm.com/documentation/109416/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index a2104c2caf50..3f086beed154 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -91,6 +91,7 @@
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
#define ARM_CPU_PART_C1_ULTRA 0xD8C
+#define ARM_CPU_PART_C1_PREMIUM 0xD90
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -122,6 +123,7 @@
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
+#define MIDR_C1_PREMIUM MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_PREMIUM)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From eed2bbb4e440f27352dfabd3f14dcb7135d3caac Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:58 +0200
Subject: xen/arm: Mitigate TLBI errata on various Arm CPUs
A number of CPUs developed by Arm suffer from errata whereby a broadcast
TLBI + DSB sequence may complete before the global observation of writes
which are translated by an affected TLB entry. This can lead to memory
corruption and potential privilege escalation.
These errata ONLY affect the completion of memory accesses which have
been translated by an invalidated TLB entry, and these errata DO NOT
affect the actual invalidation of TLB entries. TLB entries are removed
correctly.
To mitigate this issue, Arm recommends that software follows each
TLBI+DSB sequence with an additional TLBI+DSB, which will ensure that
all memory write effects affected by the first TLBI have been globally
observed.
The ARM64_WORKAROUND_REPEAT_TLBI workaround is sufficient to mitigate the
issue. Enable this workaround for affected CPUs.
This is XSA-493 / CVE-2025-10263.
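As an illustration of what matching "all versions" of an affected CPU
means, the following is a minimal sketch, not Xen's implementation. It
assumes only the architectural MIDR_EL1 layout (implementer in bits
[31:24], part number in bits [15:4]) and the Arm implementer code 0x41.
All sketch_* and SKETCH_* names are hypothetical, and unlike Xen's MIDR
helpers the sketch does not compare the architecture field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the MIDR_EL1 field positions are architectural. */
#define SKETCH_MIDR_IMPLEMENTER_SHIFT 24
#define SKETCH_MIDR_PARTNUM_SHIFT     4
#define SKETCH_MIDR_IMPLEMENTER_MASK  (0xffu << SKETCH_MIDR_IMPLEMENTER_SHIFT)
#define SKETCH_MIDR_PARTNUM_MASK      (0xfffu << SKETCH_MIDR_PARTNUM_SHIFT)

#define SKETCH_IMP_ARM          0x41u   /* Arm Limited implementer code */
#define SKETCH_PART_C1_ULTRA    0xD8Cu  /* part numbers taken from this series */
#define SKETCH_PART_C1_PREMIUM  0xD90u

/*
 * Match on implementer and part number only. The variant [23:20] and
 * revision [3:0] fields are masked out, so every version of the part matches.
 */
static bool sketch_midr_is_part(uint32_t midr, uint32_t imp, uint32_t part)
{
    uint32_t mask = SKETCH_MIDR_IMPLEMENTER_MASK | SKETCH_MIDR_PARTNUM_MASK;
    uint32_t want = (imp << SKETCH_MIDR_IMPLEMENTER_SHIFT) |
                    (part << SKETCH_MIDR_PARTNUM_SHIFT);

    return (midr & mask) == want;
}
```

With this sketch, a MIDR value read from a C1-Ultra core matches
SKETCH_PART_C1_ULTRA whatever its variant and revision fields contain.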
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 2939db429b78..3eddc9c1e6bc 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -383,6 +383,27 @@ config ARM64_ERRATUM_1508412
If unsure, say Y.
+config ARM64_ERRATUM_CVE_2025_10263
+ bool "Cortex-*/Neoverse-*/C1-*: Completion of affected memory accesses might not be guaranteed by completion of a TLBI"
+ default y
+ depends on ARM_64
+ select ARM64_WORKAROUND_REPEAT_TLBI
+ help
+ This option adds a workaround for CVE-2025-10263.
+
+ A broadcast TLBI on another PE may complete before affected memory
+ accesses are globally observed. This may permit bypass of Stage 1
+ translation, Stage 2 translation, or GPT protection.
+
+ The workaround repeats the TLBI VALE2IS, XZR + DSB ISH operation for all
+ the broadcast TLB flush operations. A single additional TLBI and DSB are
+ sufficient regardless of how many TLBIs are completed by the DSB.
+
+ Note that software workarounds are required at all execution levels for
+ affected parts to fully mitigate this issue.
+
+ If unsure, say Y.
+
endmenu
config ARM64_HARDEN_BRANCH_PREDICTOR
diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
index 9137958fb682..99517b5298cf 100644
--- a/xen/arch/arm/cpuerrata.c
+++ b/xen/arch/arm/cpuerrata.c
@@ -536,6 +536,92 @@ static const struct arm_cpu_capabilities arm_errata[] = {
MIDR_RANGE(MIDR_NEOVERSE_N1, 0, 3 << MIDR_VARIANT_SHIFT),
},
#endif
+#ifdef CONFIG_ARM64_ERRATUM_CVE_2025_10263
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X4),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X925),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_ULTRA),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_PREMIUM),
+ },
+#endif
#ifdef CONFIG_ARM64_HARDEN_BRANCH_PREDICTOR
{
.capability = ARM_HARDEN_BRANCH_PREDICTOR,
From 1efa09ba765b90a0f2f3f8063cc0be2316cada0d Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Tue, 14 Apr 2026 10:11:24 +0200
Subject: xen/arm64: flushtlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI
The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several
errata where broadcast TLBI;DSB sequences don't provide all the
architecturally required synchronization. The workaround performs more
work than necessary, and can have significant overhead. This patch
optimizes the workaround, as explained below.
1. All relevant errata only affect the ordering and/or completion of
memory accesses which have been translated by an invalidated TLB
entry. The actual invalidation of TLB entries is unaffected.
2. The existing workaround is applied to both broadcast and local TLB
invalidation, whereas for all relevant errata it is only necessary to
apply a workaround for broadcast invalidation.
3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI
sequence, whereas for all relevant errata it is only necessary to
execute a single additional TLBI;DSB sequence after any number of
TLBIs are completed by a DSB.
For example, for a sequence of batched TLBIs:
TLBI <op1>[, <arg1>]
TLBI <op2>[, <arg2>]
TLBI <op3>[, <arg3>]
DSB ISH
... the existing workaround will expand this to:
TLBI <op1>[, <arg1>]
DSB ISH // additional
TLBI <op1>[, <arg1>] // additional
TLBI <op2>[, <arg2>]
DSB ISH // additional
TLBI <op2>[, <arg2>] // additional
TLBI <op3>[, <arg3>]
DSB ISH // additional
TLBI <op3>[, <arg3>] // additional
DSB ISH
... whereas it is sufficient to have:
TLBI <op1>[, <arg1>]
TLBI <op2>[, <arg2>]
TLBI <op3>[, <arg3>]
DSB ISH
TLBI <opX>[, <argX>] // additional
DSB ISH // additional
Using a single additional TLBI and DSB at the end of the sequence can
have significantly lower overhead as each DSB which completes a TLBI
must synchronize with other PEs in the system, with potential
performance effects both locally and system-wide.
4. The existing workaround repeats each specific TLBI operation, whereas
for all relevant errata it is sufficient for the additional TLBI to
use *any* operation which will be broadcast, regardless of which
translation regime or stage of translation the operation applies to.
For example, for a single TLBI:
TLBI ALLE2IS
DSB ISH
... the existing workaround will expand this to:
TLBI ALLE2IS
DSB ISH
TLBI ALLE2IS // additional
DSB ISH // additional
... whereas it is sufficient to have:
TLBI ALLE2IS
DSB ISH
TLBI VALE1IS, XZR // additional
DSB ISH // additional
As the additional TLBI doesn't have to match a specific earlier TLBI,
the additional TLBI can be implemented in separate code, with no
memory of the earlier TLBIs. The additional TLBI can also use a
cheaper TLBI operation.
5. The existing workaround is applied to both Stage-1 and Stage-2 TLB
invalidation, whereas for all relevant errata it is only necessary to
apply a workaround for Stage-1 invalidation.
Architecturally, TLBI operations which invalidate only Stage-2
information (e.g. IPAS2E1IS) are not required to invalidate TLB
entries which combine information from Stage-1 and Stage-2
translation table entries, and consequently may not complete memory
accesses translated by those combined entries. In these cases,
completion of memory accesses is only guaranteed after subsequent
invalidation of Stage-1 information (e.g. VMALLE1IS).
Rework the workaround logic as follows:
- add TLB_HELPER_LOCAL() to be used for local TLB ops without a
workaround,
- modify TLB_HELPER() workaround to use tlbi vale2is, xzr as a second
TLBI,
- drop TLB_HELPER_VA(). It's used only by __flush_xen_tlb_one_local,
which is local and does not need a workaround, and by
__flush_xen_tlb_one. In the latter case, since it's used in a loop,
we don't need a workaround in the middle. Add __tlb_repeat_sync with
a workaround to be used at the end after DSB and before final ISB,
- TLBI VALE2IS passing XZR is used as an additional TLBI. While there is
an identity mapping there, it's used very rarely. The performance
impact is therefore negligible. If things change in the future, we
can revisit the decision.
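The saving described in point 3 can be put in numbers with a small
model. This is a sketch only: it counts the TLBI and DSB instructions
shown in the example sequences above, issues no real TLB maintenance,
and ignores any barrier placed before the first TLBI. The struct and
function names are hypothetical.

```c
/* Hypothetical model: instruction counts only, no TLBI is executed. */
struct seq_cost {
    unsigned int tlbi; /* number of TLBI instructions in the sequence */
    unsigned int dsb;  /* number of DSB ISH instructions in the sequence */
};

/* Existing scheme: each TLBI becomes TLBI; DSB ISH; TLBI, then a final DSB ISH. */
static struct seq_cost cost_repeat_each(unsigned int nr_tlbi)
{
    struct seq_cost c = { .tlbi = 2 * nr_tlbi, .dsb = nr_tlbi + 1 };

    return c;
}

/* Optimized scheme: N TLBIs; DSB ISH; one additional TLBI; one additional DSB ISH. */
static struct seq_cost cost_repeat_once(unsigned int nr_tlbi)
{
    struct seq_cost c = { .tlbi = nr_tlbi + 1, .dsb = 2 };

    return c;
}
```

For the three-TLBI batch above, the model gives 6 TLBIs and 4 DSBs for
the existing scheme against 4 TLBIs and 2 DSBs for the optimized one.
For a single TLBI both schemes cost the same, so the benefit comes from
batched flushes.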
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
(cherry picked from commit 7c502d7591519135765b8041cbd1c70e56e5a0b9)
diff --git a/xen/arch/arm/include/asm/arm32/flushtlb.h b/xen/arch/arm/include/asm/arm32/flushtlb.h
index 61c25a318998..5483be08fbbe 100644
--- a/xen/arch/arm/include/asm/arm32/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm32/flushtlb.h
@@ -57,6 +57,9 @@ static inline void __flush_xen_tlb_one(vaddr_t va)
asm volatile(STORE_CP32(0, TLBIMVAHIS) : : "r" (va) : "memory");
}
+/* Only for ARM64_WORKAROUND_REPEAT_TLBI */
+static inline void __tlb_repeat_sync(void) {}
+
#endif /* __ASM_ARM_ARM32_FLUSHTLB_H__ */
/*
* Local variables:
diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
index 45642201d147..c1314be122d7 100644
--- a/xen/arch/arm/include/asm/arm64/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
@@ -12,9 +12,14 @@
* ARM64_WORKAROUND_REPEAT_TLBI:
* Modification of the translation table for a virtual address might lead to
* read-after-read ordering violation.
- * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
- * operations. While this is strictly not necessary, we don't want to
- * take any risk.
+ * The workaround repeats TLBI+DSB ISH operation for broadcast TLB flush
+ * operations. The workaround is not needed for local operations.
+ *
+ * It is sufficient for the additional TLBI to use *any* operation which will
+ * be broadcast, regardless of which translation regime or stage of translation
+ * the operation applies to. TLBI VALE2IS is used passing XZR. While there is
+ * an identity mapping there, it's only used during suspend/resume, CPU on/off,
+ * so the impact (performance if any) is negligible.
*
* For Xen page-tables the ISB will discard any instructions fetched
* from the old mappings.
@@ -26,69 +31,90 @@
* Note that for local TLB flush, using non-shareable (nsh) is sufficient
* (see D5-4929 in ARM DDI 0487H.a). Although, the memory barrier in
* for the workaround is left as inner-shareable to match with Linux
- * v6.1-rc8.
+ * v6.19.
*/
-#define TLB_HELPER(name, tlbop, sh) \
+#define TLB_HELPER_LOCAL(name, tlbop) \
static inline void name(void) \
{ \
asm volatile( \
- "dsb " # sh "st;" \
+ "dsb nshst;" \
"tlbi " # tlbop ";" \
- ALTERNATIVE( \
- "nop; nop;", \
- "dsb ish;" \
- "tlbi " # tlbop ";", \
- ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- "dsb " # sh ";" \
+ "dsb nsh;" \
"isb;" \
: : : "memory"); \
}
-/*
- * FLush TLB by VA. This will likely be used in a loop, so the caller
- * is responsible to use the appropriate memory barriers before/after
- * the sequence.
- *
- * See above about the ARM64_WORKAROUND_REPEAT_TLBI sequence.
- */
-#define TLB_HELPER_VA(name, tlbop) \
-static inline void name(vaddr_t va) \
-{ \
- asm volatile( \
- "tlbi " # tlbop ", %0;" \
- ALTERNATIVE( \
- "nop; nop;", \
- "dsb ish;" \
- "tlbi " # tlbop ", %0;", \
- ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- : : "r" (va >> PAGE_SHIFT) : "memory"); \
+#define TLB_HELPER(name, tlbop) \
+static inline void name(void) \
+{ \
+ asm volatile ( \
+ "dsb ishst;" \
+ "tlbi " # tlbop ";" \
+ ALTERNATIVE( \
+ "nop; nop;", \
+ "dsb ish;" \
+ "tlbi vale2is, xzr;", \
+ ARM64_WORKAROUND_REPEAT_TLBI, \
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
+ "dsb ish;" \
+ "isb;" \
+ : : : "memory"); \
}
/* Flush local TLBs, current VMID only. */
-TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh)
+TLB_HELPER_LOCAL(flush_guest_tlb_local, vmalls12e1)
/* Flush innershareable TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish)
+TLB_HELPER(flush_guest_tlb, vmalls12e1is)
/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh)
+TLB_HELPER_LOCAL(flush_all_guests_tlb_local, alle1)
/* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb, alle1is, ish)
+TLB_HELPER(flush_all_guests_tlb, alle1is)
/* Flush all hypervisor mappings from the TLB of the local processor. */
-TLB_HELPER(flush_xen_tlb_local, alle2, nsh)
+TLB_HELPER_LOCAL(flush_xen_tlb_local, alle2)
+
+#undef TLB_HELPER_LOCAL
+#undef TLB_HELPER
+
+/*
+ * Flush TLB by VA. This will likely be used in a loop, so the caller
+ * is responsible for using the appropriate memory barriers before/after
+ * the sequence.
+ */
/* Flush TLB of local processor for address va. */
-TLB_HELPER_VA(__flush_xen_tlb_one_local, vae2)
+static inline void __flush_xen_tlb_one_local(vaddr_t va)
+{
+ asm volatile (
+ "tlbi vae2, %0" : : "r" (va >> PAGE_SHIFT) : "memory");
+}
/* Flush TLB of all processors in the inner-shareable domain for address va. */
-TLB_HELPER_VA(__flush_xen_tlb_one, vae2is)
+static inline void __flush_xen_tlb_one(vaddr_t va)
+{
+ asm volatile (
+ "tlbi vae2is, %0" : : "r" (va >> PAGE_SHIFT) : "memory");
+}
-#undef TLB_HELPER
-#undef TLB_HELPER_VA
+/*
+ * ARM64_WORKAROUND_REPEAT_TLBI:
+ * For all relevant errata it is only necessary to execute a single
+ * additional TLBI;DSB sequence after any number of TLBIs are completed by a DSB.
+ */
+static inline void __tlb_repeat_sync(void)
+{
+ asm volatile (
+ ALTERNATIVE(
+ "nop; nop;",
+ "tlbi vale2is, xzr;"
+ "dsb ish;",
+ ARM64_WORKAROUND_REPEAT_TLBI,
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)
+ : : : "memory");
+}
#endif /* __ASM_ARM_ARM64_FLUSHTLB_H__ */
/*
diff --git a/xen/arch/arm/include/asm/flushtlb.h b/xen/arch/arm/include/asm/flushtlb.h
index e45fb6d97b02..c292c3c00d29 100644
--- a/xen/arch/arm/include/asm/flushtlb.h
+++ b/xen/arch/arm/include/asm/flushtlb.h
@@ -65,6 +65,7 @@ static inline void flush_xen_tlb_range_va(vaddr_t va,
va += PAGE_SIZE;
}
dsb(ish); /* Ensure the TLB invalidation has completed */
+ __tlb_repeat_sync();
isb();
}
diff --git a/xen/arch/arm/include/asm/mmu/layout.h b/xen/arch/arm/include/asm/mmu/layout.h
index a3b546465b5a..74848725e9a3 100644
--- a/xen/arch/arm/include/asm/mmu/layout.h
+++ b/xen/arch/arm/include/asm/mmu/layout.h
@@ -23,6 +23,10 @@
*
* Reserved to identity map Xen
*
+ * Note: As part of ARM64_WORKAROUND_REPEAT_TLBI, VA 0 is used for an extra
+ * TLBI operation given its rare use (only identity mapping) and thus
+ * negligible performance impact.
+ *
* 0x00000a0000000000 - 0x00000a7fffffffff (512GB, L0 slot [20])
* (Relative offsets)
* 0 - 2M Unmapped
From b11cdde4229a0870c04c66c146b4ec9734dea512 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:55 +0200
Subject: xen/arm: Sync missing definitions for Arm CPUs with Linux
Synchronize with Linux kernel 7.0 definitions for the following CPUs:
- Cortex-A76AE,
- Cortex-A78AE,
- Cortex-X1C,
- Cortex-X3,
- Neoverse-V2,
- Cortex-X4,
- Neoverse-V3AE,
- Neoverse-V3,
- Cortex-X925.
These will be used for errata detection in subsequent patches.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 2ca2662c0248..5fda69344c89 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -74,13 +74,22 @@
#define ARM_CPU_PART_CORTEX_A76 0xD0B
#define ARM_CPU_PART_NEOVERSE_N1 0xD0C
#define ARM_CPU_PART_CORTEX_A77 0xD0D
+#define ARM_CPU_PART_CORTEX_A76AE 0xD0E
#define ARM_CPU_PART_NEOVERSE_V1 0xD40
#define ARM_CPU_PART_CORTEX_A78 0xD41
+#define ARM_CPU_PART_CORTEX_A78AE 0xD42
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_X2 0xD48
#define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define ARM_CPU_PART_CORTEX_A78C 0xD4B
+#define ARM_CPU_PART_CORTEX_X1C 0xD4C
+#define ARM_CPU_PART_CORTEX_X3 0xD4E
+#define ARM_CPU_PART_NEOVERSE_V2 0xD4F
+#define ARM_CPU_PART_CORTEX_X4 0xD82
+#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
+#define ARM_CPU_PART_NEOVERSE_V3 0xD84
+#define ARM_CPU_PART_CORTEX_X925 0xD85
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -95,13 +104,22 @@
#define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
#define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
#define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77)
+#define MIDR_CORTEX_A76AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76AE)
#define MIDR_NEOVERSE_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V1)
#define MIDR_CORTEX_A78 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78)
+#define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
#define MIDR_CORTEX_A78C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C)
+#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C)
+#define MIDR_CORTEX_X3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X3)
+#define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2)
+#define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4)
+#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
+#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
+#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 6cb1317f4c7c5ddd209e8b0663529c3cc6775694 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:56 +0200
Subject: xen/arm: Add C1-Ultra definitions
Add processor definitions for C1-Ultra. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Ultra TRM:
https://developer.arm.com/documentation/108014/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 5fda69344c89..85a7469eba2c 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -90,6 +90,7 @@
#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
+#define ARM_CPU_PART_C1_ULTRA 0xD8C
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -120,6 +121,7 @@
#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
+#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From e3667cb66078cc0cc6d612e2663c0a887fd7b2c8 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:57 +0200
Subject: xen/arm: Add C1-Premium definitions
Add processor definitions for C1-Premium. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Premium TRM:
https://developer.arm.com/documentation/109416/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 85a7469eba2c..f89b061d444c 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -91,6 +91,7 @@
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
#define ARM_CPU_PART_C1_ULTRA 0xD8C
+#define ARM_CPU_PART_C1_PREMIUM 0xD90
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -122,6 +123,7 @@
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
+#define MIDR_C1_PREMIUM MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_PREMIUM)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From d148e0b79ab2c7f2d5ac82b8ae9b5f918367dfb3 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:58 +0200
Subject: xen/arm: Mitigate TLBI errata on various Arm CPUs
A number of CPUs developed by Arm suffer from errata whereby a broadcast
TLBI + DSB sequence may complete before the global observation of writes
which are translated by an affected TLB entry. This can lead to memory
corruption and potential privilege escalation.
These errata ONLY affect the completion of memory accesses which have
been translated by an invalidated TLB entry, and these errata DO NOT
affect the actual invalidation of TLB entries. TLB entries are removed
correctly.
To mitigate this issue, Arm recommends that software follows each
TLBI+DSB sequence with an additional TLBI+DSB, which will ensure that
all memory write effects affected by the first TLBI have been globally
observed.
The ARM64_WORKAROUND_REPEAT_TLBI workaround is sufficient to mitigate the
issue. Enable this workaround for affected CPUs.
This is XSA-493 / CVE-2025-10263.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 21d03d9f4424..12aa8580c820 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -436,6 +436,27 @@ config ARM64_ERRATUM_1508412
If unsure, say Y.
+config ARM64_ERRATUM_CVE_2025_10263
+ bool "Cortex-*/Neoverse-*/C1-*: Completion of affected memory accesses might not be guaranteed by completion of a TLBI"
+ default y
+ depends on ARM_64
+ select ARM64_WORKAROUND_REPEAT_TLBI
+ help
+ This option adds a workaround for CVE-2025-10263.
+
+ A broadcast TLBI on another PE may complete before affected memory
+ accesses are globally observed. This may permit bypass of Stage 1
+ translation, Stage 2 translation, or GPT protection.
+
+ The workaround repeats the TLBI VALE2IS, XZR + DSB ISH operation for all
+ the broadcast TLB flush operations. A single additional TLBI and DSB are
+ sufficient regardless of how many TLBIs are completed by the DSB.
+
+ Note that software workarounds are required at all execution levels for
+ affected parts to fully mitigate this issue.
+
+ If unsure, say Y.
+
endmenu
config ARM64_HARDEN_BRANCH_PREDICTOR
diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
index 2b7101ea2524..1fb0cf599fb0 100644
--- a/xen/arch/arm/cpuerrata.c
+++ b/xen/arch/arm/cpuerrata.c
@@ -535,6 +535,92 @@ static const struct arm_cpu_capabilities arm_errata[] = {
MIDR_RANGE(MIDR_NEOVERSE_N1, 0, 3 << MIDR_VARIANT_SHIFT),
},
#endif
+#ifdef CONFIG_ARM64_ERRATUM_CVE_2025_10263
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X4),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X925),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_ULTRA),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_PREMIUM),
+ },
+#endif
#ifdef CONFIG_ARM64_HARDEN_BRANCH_PREDICTOR
{
.capability = ARM_HARDEN_BRANCH_PREDICTOR,
From 8e26ceb02a24d228b26226c5d6b82cb51fb1307c Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Tue, 14 Apr 2026 10:11:24 +0200
Subject: xen/arm64: flushtlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI
The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several
errata where broadcast TLBI;DSB sequences don't provide all the
architecturally required synchronization. The workaround performs more
work than necessary, and can have significant overhead. This patch
optimizes the workaround, as explained below.
1. All relevant errata only affect the ordering and/or completion of
memory accesses which have been translated by an invalidated TLB
entry. The actual invalidation of TLB entries is unaffected.
2. The existing workaround is applied to both broadcast and local TLB
invalidation, whereas for all relevant errata it is only necessary to
apply a workaround for broadcast invalidation.
3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI
sequence, whereas for all relevant errata it is only necessary to
execute a single additional TLBI;DSB sequence after any number of
TLBIs are completed by a DSB.
For example, for a sequence of batched TLBIs:
TLBI <op1>[, <arg1>]
TLBI <op2>[, <arg2>]
TLBI <op3>[, <arg3>]
DSB ISH
... the existing workaround will expand this to:
TLBI <op1>[, <arg1>]
DSB ISH // additional
TLBI <op1>[, <arg1>] // additional
TLBI <op2>[, <arg2>]
DSB ISH // additional
TLBI <op2>[, <arg2>] // additional
TLBI <op3>[, <arg3>]
DSB ISH // additional
TLBI <op3>[, <arg3>] // additional
DSB ISH
... whereas it is sufficient to have:
TLBI <op1>[, <arg1>]
TLBI <op2>[, <arg2>]
TLBI <op3>[, <arg3>]
DSB ISH
TLBI <opX>[, <argX>] // additional
DSB ISH // additional
Using a single additional TLBI and DSB at the end of the sequence can
have significantly lower overhead as each DSB which completes a TLBI
must synchronize with other PEs in the system, with potential
performance effects both locally and system-wide.
4. The existing workaround repeats each specific TLBI operation, whereas
for all relevant errata it is sufficient for the additional TLBI to
use *any* operation which will be broadcast, regardless of which
translation regime or stage of translation the operation applies to.
For example, for a single TLBI:
TLBI ALLE2IS
DSB ISH
... the existing workaround will expand this to:
TLBI ALLE2IS
DSB ISH
TLBI ALLE2IS // additional
DSB ISH // additional
... whereas it is sufficient to have:
TLBI ALLE2IS
DSB ISH
TLBI VALE1IS, XZR // additional
DSB ISH // additional
As the additional TLBI doesn't have to match a specific earlier TLBI,
the additional TLBI can be implemented in separate code, with no
memory of the earlier TLBIs. The additional TLBI can also use a
cheaper TLBI operation.
5. The existing workaround is applied to both Stage-1 and Stage-2 TLB
invalidation, whereas for all relevant errata it is only necessary to
apply a workaround for Stage-1 invalidation.
Architecturally, TLBI operations which invalidate only Stage-2
information (e.g. IPAS2E1IS) are not required to invalidate TLB
entries which combine information from Stage-1 and Stage-2
translation table entries, and consequently may not complete memory
accesses translated by those combined entries. In these cases,
completion of memory accesses is only guaranteed after subsequent
invalidation of Stage-1 information (e.g. VMALLE1IS).
Rework the workaround logic as follows:
- add TLB_HELPER_LOCAL() to be used for local TLB ops without a
workaround,
- modify TLB_HELPER() workaround to use tlbi vale2is, xzr as a second
TLBI,
- drop TLB_HELPER_VA(). It's used only by __flush_xen_tlb_one_local,
which is local and does not need a workaround, and by
__flush_xen_tlb_one. In the latter case, since it's used in a loop,
we don't need a workaround in the middle. Add __tlb_repeat_sync with
a workaround to be used at the end after DSB and before final ISB,
- TLBI VALE2IS passing XZR is used as an additional TLBI. While there is
an identity mapping there, it's used very rarely. The performance
impact is therefore negligible. If things change in the future, we
can revisit the decision.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
(cherry picked from commit 7c502d7591519135765b8041cbd1c70e56e5a0b9)
diff --git a/xen/arch/arm/include/asm/arm32/flushtlb.h b/xen/arch/arm/include/asm/arm32/flushtlb.h
index 61c25a318998..5483be08fbbe 100644
--- a/xen/arch/arm/include/asm/arm32/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm32/flushtlb.h
@@ -57,6 +57,9 @@ static inline void __flush_xen_tlb_one(vaddr_t va)
asm volatile(STORE_CP32(0, TLBIMVAHIS) : : "r" (va) : "memory");
}
+/* Only for ARM64_WORKAROUND_REPEAT_TLBI */
+static inline void __tlb_repeat_sync(void) {}
+
#endif /* __ASM_ARM_ARM32_FLUSHTLB_H__ */
/*
* Local variables:
diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
index 45642201d147..c1314be122d7 100644
--- a/xen/arch/arm/include/asm/arm64/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
@@ -12,9 +12,14 @@
* ARM64_WORKAROUND_REPEAT_TLBI:
* Modification of the translation table for a virtual address might lead to
* read-after-read ordering violation.
- * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
- * operations. While this is strictly not necessary, we don't want to
- * take any risk.
+ * The workaround repeats TLBI+DSB ISH operation for broadcast TLB flush
+ * operations. The workaround is not needed for local operations.
+ *
+ * It is sufficient for the additional TLBI to use *any* operation which will
+ * be broadcast, regardless of which translation regime or stage of translation
+ * the operation applies to. TLBI VALE2IS is used passing XZR. While there is
+ * an identity mapping there, it's only used during suspend/resume, CPU on/off,
+ * so the impact (performance if any) is negligible.
*
* For Xen page-tables the ISB will discard any instructions fetched
* from the old mappings.
@@ -26,69 +31,90 @@
* Note that for local TLB flush, using non-shareable (nsh) is sufficient
* (see D5-4929 in ARM DDI 0487H.a). Although, the memory barrier in
* for the workaround is left as inner-shareable to match with Linux
- * v6.1-rc8.
+ * v6.19.
*/
-#define TLB_HELPER(name, tlbop, sh) \
+#define TLB_HELPER_LOCAL(name, tlbop) \
static inline void name(void) \
{ \
asm volatile( \
- "dsb " # sh "st;" \
+ "dsb nshst;" \
"tlbi " # tlbop ";" \
- ALTERNATIVE( \
- "nop; nop;", \
- "dsb ish;" \
- "tlbi " # tlbop ";", \
- ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- "dsb " # sh ";" \
+ "dsb nsh;" \
"isb;" \
: : : "memory"); \
}
-/*
- * FLush TLB by VA. This will likely be used in a loop, so the caller
- * is responsible to use the appropriate memory barriers before/after
- * the sequence.
- *
- * See above about the ARM64_WORKAROUND_REPEAT_TLBI sequence.
- */
-#define TLB_HELPER_VA(name, tlbop) \
-static inline void name(vaddr_t va) \
-{ \
- asm volatile( \
- "tlbi " # tlbop ", %0;" \
- ALTERNATIVE( \
- "nop; nop;", \
- "dsb ish;" \
- "tlbi " # tlbop ", %0;", \
- ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- : : "r" (va >> PAGE_SHIFT) : "memory"); \
+#define TLB_HELPER(name, tlbop) \
+static inline void name(void) \
+{ \
+ asm volatile ( \
+ "dsb ishst;" \
+ "tlbi " # tlbop ";" \
+ ALTERNATIVE( \
+ "nop; nop;", \
+ "dsb ish;" \
+ "tlbi vale2is, xzr;", \
+ ARM64_WORKAROUND_REPEAT_TLBI, \
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
+ "dsb ish;" \
+ "isb;" \
+ : : : "memory"); \
}
/* Flush local TLBs, current VMID only. */
-TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh)
+TLB_HELPER_LOCAL(flush_guest_tlb_local, vmalls12e1)
/* Flush innershareable TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish)
+TLB_HELPER(flush_guest_tlb, vmalls12e1is)
/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh)
+TLB_HELPER_LOCAL(flush_all_guests_tlb_local, alle1)
/* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb, alle1is, ish)
+TLB_HELPER(flush_all_guests_tlb, alle1is)
/* Flush all hypervisor mappings from the TLB of the local processor. */
-TLB_HELPER(flush_xen_tlb_local, alle2, nsh)
+TLB_HELPER_LOCAL(flush_xen_tlb_local, alle2)
+
+#undef TLB_HELPER_LOCAL
+#undef TLB_HELPER
+
+/*
+ * Flush TLB by VA. This will likely be used in a loop, so the caller
+ * is responsible to use the appropriate memory barriers before/after
+ * the sequence.
+ */
/* Flush TLB of local processor for address va. */
-TLB_HELPER_VA(__flush_xen_tlb_one_local, vae2)
+static inline void __flush_xen_tlb_one_local(vaddr_t va)
+{
+ asm volatile (
+ "tlbi vae2, %0" : : "r" (va >> PAGE_SHIFT) : "memory");
+}
/* Flush TLB of all processors in the inner-shareable domain for address va. */
-TLB_HELPER_VA(__flush_xen_tlb_one, vae2is)
+static inline void __flush_xen_tlb_one(vaddr_t va)
+{
+ asm volatile (
+ "tlbi vae2is, %0" : : "r" (va >> PAGE_SHIFT) : "memory");
+}
-#undef TLB_HELPER
-#undef TLB_HELPER_VA
+/*
+ * ARM64_WORKAROUND_REPEAT_TLBI:
+ * For all relevant errata it is only necessary to execute a single
+ * additional TLBI;DSB sequence after any number of TLBIs are completed by DSB.
+ */
+static inline void __tlb_repeat_sync(void)
+{
+ asm volatile (
+ ALTERNATIVE(
+ "nop; nop;",
+ "tlbi vale2is, xzr;"
+ "dsb ish;",
+ ARM64_WORKAROUND_REPEAT_TLBI,
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)
+ : : : "memory");
+}
#endif /* __ASM_ARM_ARM64_FLUSHTLB_H__ */
/*
diff --git a/xen/arch/arm/include/asm/flushtlb.h b/xen/arch/arm/include/asm/flushtlb.h
index e45fb6d97b02..c292c3c00d29 100644
--- a/xen/arch/arm/include/asm/flushtlb.h
+++ b/xen/arch/arm/include/asm/flushtlb.h
@@ -65,6 +65,7 @@ static inline void flush_xen_tlb_range_va(vaddr_t va,
va += PAGE_SIZE;
}
dsb(ish); /* Ensure the TLB invalidation has completed */
+ __tlb_repeat_sync();
isb();
}
diff --git a/xen/arch/arm/include/asm/mmu/layout.h b/xen/arch/arm/include/asm/mmu/layout.h
index 19c0ec63a59a..feafc14ebfda 100644
--- a/xen/arch/arm/include/asm/mmu/layout.h
+++ b/xen/arch/arm/include/asm/mmu/layout.h
@@ -23,6 +23,10 @@
*
* Reserved to identity map Xen
*
+ * Note: As part of ARM64_WORKAROUND_REPEAT_TLBI, VA 0 is used for an extra
+ * TLBI operation given its rare use (only identity mapping) and thus
+ * negligible performance impact.
+ *
* 0x00000a0000000000 - 0x00000a7fffffffff (512GB, L0 slot [20])
* (Relative offsets)
* 0 - 2M Unmapped
From b2e4eef9a92f02ea180f0b30602359b165b8ed49 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:55 +0200
Subject: xen/arm: Sync missing definitions for Arm CPUs with Linux
Synchronize with Linux kernel 7.0 definitions for the following CPUs:
- Cortex-A76AE,
- Cortex-A78AE,
- Cortex-X1C,
- Cortex-X3,
- Neoverse-V2,
- Cortex-X4,
- Neoverse-V3AE,
- Neoverse-V3,
- Cortex-X925.
These will be used for errata detection in subsequent patches.
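For readers unfamiliar with how these part numbers are consumed, here is a
minimal sketch of decoding a MIDR_EL1 value, assuming the architectural field
layout (implementer in bits [31:24], variant in [23:20], architecture in
[19:16], part number in [15:4], revision in [3:0]). The helper names
make_midr, midr_implementer, midr_variant, midr_partnum, midr_revision and
is_arm_part are hypothetical and are not the macros Xen uses.

```c
#include <assert.h>
#include <stdint.h>

#define IMP_ARM          0x41U   /* Implementer code for Arm */
#define PART_NEOVERSE_V2 0xD4FU  /* Part number taken from this patch */

/* Build a MIDR value from its fields (architecture field fixed to 0xF). */
static uint32_t make_midr(uint32_t imp, uint32_t variant,
                          uint32_t part, uint32_t revision)
{
    assert(imp <= 0xffU && variant <= 0xfU);
    assert(part <= 0xfffU && revision <= 0xfU);
    return (imp << 24) | (variant << 20) | (0xfU << 16) |
           (part << 4) | revision;
}

static uint32_t midr_implementer(uint32_t midr)
{
    return (midr >> 24) & 0xffU;
}

static uint32_t midr_variant(uint32_t midr)
{
    return (midr >> 20) & 0xfU;
}

static uint32_t midr_partnum(uint32_t midr)
{
    return (midr >> 4) & 0xfffU;
}

static uint32_t midr_revision(uint32_t midr)
{
    return midr & 0xfU;
}

/* True if the MIDR identifies the given Arm part, whatever its version. */
static int is_arm_part(uint32_t midr, uint32_t part)
{
    return midr_implementer(midr) == IMP_ARM && midr_partnum(midr) == part;
}
```

The variant and revision fields identify the rNpM release of a part, which is
why an entry that should match every release of a CPU compares only the
implementer and part number.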
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index ecf7f2e859df..e3447150822b 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -89,13 +89,22 @@
#define ARM_CPU_PART_CORTEX_A76 0xD0B
#define ARM_CPU_PART_NEOVERSE_N1 0xD0C
#define ARM_CPU_PART_CORTEX_A77 0xD0D
+#define ARM_CPU_PART_CORTEX_A76AE 0xD0E
#define ARM_CPU_PART_NEOVERSE_V1 0xD40
#define ARM_CPU_PART_CORTEX_A78 0xD41
+#define ARM_CPU_PART_CORTEX_A78AE 0xD42
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_X2 0xD48
#define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define ARM_CPU_PART_CORTEX_A78C 0xD4B
+#define ARM_CPU_PART_CORTEX_X1C 0xD4C
+#define ARM_CPU_PART_CORTEX_X3 0xD4E
+#define ARM_CPU_PART_NEOVERSE_V2 0xD4F
+#define ARM_CPU_PART_CORTEX_X4 0xD82
+#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
+#define ARM_CPU_PART_NEOVERSE_V3 0xD84
+#define ARM_CPU_PART_CORTEX_X925 0xD85
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -110,13 +119,22 @@
#define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
#define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
#define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77)
+#define MIDR_CORTEX_A76AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76AE)
#define MIDR_NEOVERSE_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V1)
#define MIDR_CORTEX_A78 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78)
+#define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
#define MIDR_CORTEX_A78C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C)
+#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C)
+#define MIDR_CORTEX_X3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X3)
+#define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2)
+#define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4)
+#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
+#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
+#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From d14527d670d29b59fd960043527e256439c0fc7d Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:56 +0200
Subject: xen/arm: Add C1-Ultra definitions
Add processor definitions for C1-Ultra. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Ultra TRM:
https://developer.arm.com/documentation/108014/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index e3447150822b..1f8b2184fa44 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -105,6 +105,7 @@
#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
+#define ARM_CPU_PART_C1_ULTRA 0xD8C
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -135,6 +136,7 @@
#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
+#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 7c186d15acd9b17a610a5ae1fca85c6f99ccca0d Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:57 +0200
Subject: xen/arm: Add C1-Premium definitions
Add processor definitions for C1-Premium. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Premium TRM:
https://developer.arm.com/documentation/109416/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 1f8b2184fa44..2405616534ab 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -106,6 +106,7 @@
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
#define ARM_CPU_PART_C1_ULTRA 0xD8C
+#define ARM_CPU_PART_C1_PREMIUM 0xD90
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -137,6 +138,7 @@
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
+#define MIDR_C1_PREMIUM MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_PREMIUM)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 788fd7c006e2ab210ef66806e061b36ccdfb2813 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:58 +0200
Subject: xen/arm: Mitigate TLBI errata on various Arm CPUs
A number of CPUs developed by Arm suffer from errata whereby a broadcast
TLBI + DSB sequence may complete before the global observation of writes
which are translated by an affected TLB entry. This can lead to memory
corruption and potential privilege escalation.
These errata ONLY affect the completion of memory accesses which have
been translated by an invalidated TLB entry, and these errata DO NOT
affect the actual invalidation of TLB entries. TLB entries are removed
correctly.
To mitigate this issue, Arm recommends that software follows each
TLBI+DSB sequence with an additional TLBI+DSB, which ensures that
all writes affected by the first TLBI have been globally observed.
The ARM64_WORKAROUND_REPEAT_TLBI workaround is sufficient to mitigate the
issue. Enable this workaround for affected CPUs.
This is XSA-493 / CVE-2025-10263.
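A minimal sketch of what "enable for affected CPUs, all versions" amounts to,
assuming a match on implementer, architecture and part number only. The names
MODEL_MASK, ARM_MODEL, affected_models and needs_repeat_tlbi are hypothetical,
and the table holds only a handful of the part numbers defined earlier in this
series. The real detection goes through the arm_errata[] table and
MIDR_ALL_VERSIONS().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Implementer [31:24] | architecture [19:16] | part number [15:4]. */
#define MODEL_MASK 0xff0ffff0U

/* Model value for an Arm (implementer 0x41) part number. */
#define ARM_MODEL(part) \
    ((0x41U << 24) | (0xfU << 16) | ((uint32_t)(part) << 4))

/* Illustrative subset of the parts listed in the patch. */
static const uint32_t affected_models[] = {
    ARM_MODEL(0xD0B), /* Cortex-A76 */
    ARM_MODEL(0xD0C), /* Neoverse-N1 */
    ARM_MODEL(0xD4F), /* Neoverse-V2 */
    ARM_MODEL(0xD8C), /* C1-Ultra */
    ARM_MODEL(0xD90), /* C1-Premium */
};

/* Return 1 if the MIDR matches a table entry, ignoring variant/revision. */
static int needs_repeat_tlbi(uint32_t midr)
{
    size_t i;

    for ( i = 0; i < sizeof(affected_models) / sizeof(affected_models[0]); i++ )
        if ( (midr & MODEL_MASK) == affected_models[i] )
            return 1;

    return 0;
}

/* Example use: any release of a listed part matches, others do not. */
void sketch_self_check(void)
{
    assert(needs_repeat_tlbi(0x410FD4F0U));  /* Neoverse-V2 r0p0 */
    assert(needs_repeat_tlbi(0x412FD4F1U));  /* Neoverse-V2 r2p1 */
    assert(!needs_repeat_tlbi(0x410FD030U)); /* part 0xD03, not in this table */
}
```

Masking out the variant and revision fields is what makes an entry apply to
every release of a part.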
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index a26d3e11827c..a89995545f95 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -467,6 +467,27 @@ config ARM64_ERRATUM_1508412
If unsure, say Y.
+config ARM64_ERRATUM_CVE_2025_10263
+ bool "Cortex-*/Neoverse-*/C1-*: Completion of affected memory accesses might not be guaranteed by completion of a TLBI"
+ default y
+ depends on ARM_64
+ select ARM64_WORKAROUND_REPEAT_TLBI
+ help
+ This option adds a workaround for CVE-2025-10263.
+
+ A broadcast TLBI on another PE may complete before affected memory
+ accesses are globally observed. This may permit bypass of Stage 1
+ translation, Stage-2 translation, or GPT protection.
+
+ The workaround repeats the TLBI VALE2IS, XZR + DSB ISH operation for all
+ the broadcast TLB flush operations. A single additional TLBI and DSB are
+ sufficient regardless of how many TLBIs are completed by the DSB.
+
+ Note that software workarounds are required at all execution levels for
+ affected parts to fully mitigate this issue.
+
+ If unsure, say Y.
+
endmenu
config ARM64_HARDEN_BRANCH_PREDICTOR
diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
index 17cf134f1b0d..3a32183618dc 100644
--- a/xen/arch/arm/cpuerrata.c
+++ b/xen/arch/arm/cpuerrata.c
@@ -534,6 +534,92 @@ static const struct arm_cpu_capabilities arm_errata[] = {
MIDR_RANGE(MIDR_NEOVERSE_N1, 0, 3 << MIDR_VARIANT_SHIFT),
},
#endif
+#ifdef CONFIG_ARM64_ERRATUM_CVE_2025_10263
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X4),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X925),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_ULTRA),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_PREMIUM),
+ },
+#endif
#ifdef CONFIG_ARM64_HARDEN_BRANCH_PREDICTOR
{
.capability = ARM_HARDEN_BRANCH_PREDICTOR,
From 2e21b5301765de353c06081eee953255bf327176 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Tue, 14 Apr 2026 10:11:24 +0200
Subject: xen/arm64: flushtlb: Optimize ARM64_WORKAROUND_REPEAT_TLBI
The ARM64_WORKAROUND_REPEAT_TLBI workaround is used to mitigate several
errata where broadcast TLBI;DSB sequences don't provide all the
architecturally required synchronization. The workaround performs more
work than necessary, and can have significant overhead. This patch
optimizes the workaround, as explained below.
1. All relevant errata only affect the ordering and/or completion of
memory accesses which have been translated by an invalidated TLB
entry. The actual invalidation of TLB entries is unaffected.
2. The existing workaround is applied to both broadcast and local TLB
invalidation, whereas for all relevant errata it is only necessary to
apply a workaround for broadcast invalidation.
3. The existing workaround replaces every TLBI with a TLBI;DSB;TLBI
sequence, whereas for all relevant errata it is only necessary to
execute a single additional TLBI;DSB sequence after any number of
TLBIs are completed by a DSB.
For example, for a sequence of batched TLBIs:
TLBI <op1>[, <arg1>]
TLBI <op2>[, <arg2>]
TLBI <op3>[, <arg3>]
DSB ISH
... the existing workaround will expand this to:
TLBI <op1>[, <arg1>]
DSB ISH // additional
TLBI <op1>[, <arg1>] // additional
TLBI <op2>[, <arg2>]
DSB ISH // additional
TLBI <op2>[, <arg2>] // additional
TLBI <op3>[, <arg3>]
DSB ISH // additional
TLBI <op3>[, <arg3>] // additional
DSB ISH
... whereas it is sufficient to have:
TLBI <op1>[, <arg1>]
TLBI <op2>[, <arg2>]
TLBI <op3>[, <arg3>]
DSB ISH
TLBI <opX>[, <argX>] // additional
DSB ISH // additional
Using a single additional TLBI and DSB at the end of the sequence can
have significantly lower overhead as each DSB which completes a TLBI
must synchronize with other PEs in the system, with potential
performance effects both locally and system-wide.
4. The existing workaround repeats each specific TLBI operation, whereas
for all relevant errata it is sufficient for the additional TLBI to
use *any* operation which will be broadcast, regardless of which
translation regime or stage of translation the operation applies to.
For example, for a single TLBI:
TLBI ALLE2IS
DSB ISH
... the existing workaround will expand this to:
TLBI ALLE2IS
DSB ISH
TLBI ALLE2IS // additional
DSB ISH // additional
... whereas it is sufficient to have:
TLBI ALLE2IS
DSB ISH
TLBI VALE1IS, XZR // additional
DSB ISH // additional
As the additional TLBI doesn't have to match a specific earlier TLBI,
the additional TLBI can be implemented in separate code, with no
memory of the earlier TLBIs. The additional TLBI can also use a
cheaper TLBI operation.
5. The existing workaround is applied to both Stage-1 and Stage-2 TLB
invalidation, whereas for all relevant errata it is only necessary to
apply a workaround for Stage-1 invalidation.
Architecturally, TLBI operations which invalidate only Stage-2
information (e.g. IPAS2E1IS) are not required to invalidate TLB
entries which combine information from Stage-1 and Stage-2
translation table entries, and consequently may not complete memory
accesses translated by those combined entries. In these cases,
completion of memory accesses is only guaranteed after subsequent
invalidation of Stage-1 information (e.g. VMALLE1IS).
Rework the workaround logic as follows:
- add TLB_HELPER_LOCAL() to be used for local TLB ops without a
workaround,
- modify TLB_HELPER() workaround to use tlbi vale2is, xzr as a second
TLBI,
- drop TLB_HELPER_VA(). It's used only by __flush_xen_tlb_one_local,
which is local and does not need a workaround, and by
__flush_xen_tlb_one. In the latter case, since it's used in a loop,
we don't need a workaround after every TLBI. Add __tlb_repeat_sync(),
which applies the workaround once at the end, after the DSB and before
the final ISB,
- TLBI VALE2IS passing XZR is used as an additional TLBI. While there is
an identity mapping there, it's used very rarely. The performance
impact is therefore negligible. If things change in the future, we
can revisit the decision.
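To put numbers on point 3 above, the following sketch counts the instructions
in a batch of n broadcast TLBIs completed by one DSB, before and after the
optimization. It is plain arithmetic derived from the expansions shown in the
commit message. The names seq_cost, cost_existing and cost_optimized are
hypothetical.

```c
#include <assert.h>

/* Number of TLBI and DSB instructions in one flush sequence. */
struct seq_cost {
    unsigned int tlbi;
    unsigned int dsb;
};

/*
 * Existing workaround: every TLBI becomes TLBI;DSB;TLBI, and the batch
 * is still completed by one final DSB.
 */
static struct seq_cost cost_existing(unsigned int n)
{
    struct seq_cost c;

    assert(n >= 1); /* a batch contains at least one TLBI */
    c.tlbi = 2 * n;
    c.dsb = n + 1;
    return c;
}

/*
 * Optimized workaround: the batch is unchanged and is followed by a
 * single additional TLBI;DSB.
 */
static struct seq_cost cost_optimized(unsigned int n)
{
    struct seq_cost c;

    assert(n >= 1);
    c.tlbi = n + 1;
    c.dsb = 2;
    return c;
}
```

For the three-TLBI example above this gives 6 TLBIs and 4 DSBs with the
existing workaround against 4 TLBIs and 2 DSBs with the optimized one. The
DSB count, which is the expensive part, grows linearly with the batch size in
the first case and stays at 2 in the second.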
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
(cherry picked from commit 7c502d7591519135765b8041cbd1c70e56e5a0b9)
diff --git a/xen/arch/arm/include/asm/arm32/flushtlb.h b/xen/arch/arm/include/asm/arm32/flushtlb.h
index 61c25a318998..5483be08fbbe 100644
--- a/xen/arch/arm/include/asm/arm32/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm32/flushtlb.h
@@ -57,6 +57,9 @@ static inline void __flush_xen_tlb_one(vaddr_t va)
asm volatile(STORE_CP32(0, TLBIMVAHIS) : : "r" (va) : "memory");
}
+/* Only for ARM64_WORKAROUND_REPEAT_TLBI */
+static inline void __tlb_repeat_sync(void) {}
+
#endif /* __ASM_ARM_ARM32_FLUSHTLB_H__ */
/*
* Local variables:
diff --git a/xen/arch/arm/include/asm/arm64/flushtlb.h b/xen/arch/arm/include/asm/arm64/flushtlb.h
index 3b99c11b50d1..1606b26bf28a 100644
--- a/xen/arch/arm/include/asm/arm64/flushtlb.h
+++ b/xen/arch/arm/include/asm/arm64/flushtlb.h
@@ -12,9 +12,14 @@
* ARM64_WORKAROUND_REPEAT_TLBI:
* Modification of the translation table for a virtual address might lead to
* read-after-read ordering violation.
- * The workaround repeats TLBI+DSB ISH operation for all the TLB flush
- * operations. While this is strictly not necessary, we don't want to
- * take any risk.
+ * The workaround repeats TLBI+DSB ISH operation for broadcast TLB flush
+ * operations. The workaround is not needed for local operations.
+ *
+ * It is sufficient for the additional TLBI to use *any* operation which will
+ * be broadcast, regardless of which translation regime or stage of translation
+ * the operation applies to. TLBI VALE2IS is used passing XZR. While there is
+ * an identity mapping there, it's only used during suspend/resume, CPU on/off,
+ * so the impact (performance if any) is negligible.
*
* For Xen page-tables the ISB will discard any instructions fetched
* from the old mappings.
@@ -26,69 +31,90 @@
* Note that for local TLB flush, using non-shareable (nsh) is sufficient
* (see D5-4929 in ARM DDI 0487H.a). Although, the memory barrier in
* for the workaround is left as inner-shareable to match with Linux
- * v6.1-rc8.
+ * v6.19.
*/
-#define TLB_HELPER(name, tlbop, sh) \
+#define TLB_HELPER_LOCAL(name, tlbop) \
static inline void name(void) \
{ \
asm_inline volatile ( \
- "dsb " # sh "st;" \
+ "dsb nshst;" \
"tlbi " # tlbop ";" \
- ALTERNATIVE( \
- "nop; nop;", \
- "dsb ish;" \
- "tlbi " # tlbop ";", \
- ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- "dsb " # sh ";" \
+ "dsb nsh;" \
"isb;" \
: : : "memory"); \
}
-/*
- * FLush TLB by VA. This will likely be used in a loop, so the caller
- * is responsible to use the appropriate memory barriers before/after
- * the sequence.
- *
- * See above about the ARM64_WORKAROUND_REPEAT_TLBI sequence.
- */
-#define TLB_HELPER_VA(name, tlbop) \
-static inline void name(vaddr_t va) \
-{ \
- asm_inline volatile ( \
- "tlbi " # tlbop ", %0;" \
- ALTERNATIVE( \
- "nop; nop;", \
- "dsb ish;" \
- "tlbi " # tlbop ", %0;", \
- ARM64_WORKAROUND_REPEAT_TLBI, \
- CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
- : : "r" (va >> PAGE_SHIFT) : "memory"); \
+#define TLB_HELPER(name, tlbop) \
+static inline void name(void) \
+{ \
+ asm_inline volatile ( \
+ "dsb ishst;" \
+ "tlbi " # tlbop ";" \
+ ALTERNATIVE( \
+ "nop; nop;", \
+ "dsb ish;" \
+ "tlbi vale2is, xzr;", \
+ ARM64_WORKAROUND_REPEAT_TLBI, \
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI) \
+ "dsb ish;" \
+ "isb;" \
+ : : : "memory"); \
}
/* Flush local TLBs, current VMID only. */
-TLB_HELPER(flush_guest_tlb_local, vmalls12e1, nsh)
+TLB_HELPER_LOCAL(flush_guest_tlb_local, vmalls12e1)
/* Flush innershareable TLBs, current VMID only */
-TLB_HELPER(flush_guest_tlb, vmalls12e1is, ish)
+TLB_HELPER(flush_guest_tlb, vmalls12e1is)
/* Flush local TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb_local, alle1, nsh)
+TLB_HELPER_LOCAL(flush_all_guests_tlb_local, alle1)
/* Flush innershareable TLBs, all VMIDs, non-hypervisor mode */
-TLB_HELPER(flush_all_guests_tlb, alle1is, ish)
+TLB_HELPER(flush_all_guests_tlb, alle1is)
/* Flush all hypervisor mappings from the TLB of the local processor. */
-TLB_HELPER(flush_xen_tlb_local, alle2, nsh)
+TLB_HELPER_LOCAL(flush_xen_tlb_local, alle2)
+
+#undef TLB_HELPER_LOCAL
+#undef TLB_HELPER
+
+/*
+ * Flush TLB by VA. This will likely be used in a loop, so the caller
+ * is responsible to use the appropriate memory barriers before/after
+ * the sequence.
+ */
/* Flush TLB of local processor for address va. */
-TLB_HELPER_VA(__flush_xen_tlb_one_local, vae2)
+static inline void __flush_xen_tlb_one_local(vaddr_t va)
+{
+ asm_inline volatile (
+ "tlbi vae2, %0" : : "r" (va >> PAGE_SHIFT) : "memory");
+}
/* Flush TLB of all processors in the inner-shareable domain for address va. */
-TLB_HELPER_VA(__flush_xen_tlb_one, vae2is)
+static inline void __flush_xen_tlb_one(vaddr_t va)
+{
+ asm_inline volatile (
+ "tlbi vae2is, %0" : : "r" (va >> PAGE_SHIFT) : "memory");
+}
-#undef TLB_HELPER
-#undef TLB_HELPER_VA
+/*
+ * ARM64_WORKAROUND_REPEAT_TLBI:
+ * For all relevant errata it is only necessary to execute a single
+ * additional TLBI;DSB sequence after any number of TLBIs are completed by DSB.
+ */
+static inline void __tlb_repeat_sync(void)
+{
+ asm_inline volatile (
+ ALTERNATIVE(
+ "nop; nop;",
+ "tlbi vale2is, xzr;"
+ "dsb ish;",
+ ARM64_WORKAROUND_REPEAT_TLBI,
+ CONFIG_ARM64_WORKAROUND_REPEAT_TLBI)
+ : : : "memory");
+}
#endif /* __ASM_ARM_ARM64_FLUSHTLB_H__ */
/*
diff --git a/xen/arch/arm/include/asm/flushtlb.h b/xen/arch/arm/include/asm/flushtlb.h
index e45fb6d97b02..c292c3c00d29 100644
--- a/xen/arch/arm/include/asm/flushtlb.h
+++ b/xen/arch/arm/include/asm/flushtlb.h
@@ -65,6 +65,7 @@ static inline void flush_xen_tlb_range_va(vaddr_t va,
va += PAGE_SIZE;
}
dsb(ish); /* Ensure the TLB invalidation has completed */
+ __tlb_repeat_sync();
isb();
}
diff --git a/xen/arch/arm/include/asm/mmu/layout.h b/xen/arch/arm/include/asm/mmu/layout.h
index 19c0ec63a59a..feafc14ebfda 100644
--- a/xen/arch/arm/include/asm/mmu/layout.h
+++ b/xen/arch/arm/include/asm/mmu/layout.h
@@ -23,6 +23,10 @@
*
* Reserved to identity map Xen
*
+ * Note: As part of ARM64_WORKAROUND_REPEAT_TLBI, VA 0 is used for an extra
+ * TLBI operation given its rare use (only identity mapping) and thus
+ * negligible performance impact.
+ *
* 0x00000a0000000000 - 0x00000a7fffffffff (512GB, L0 slot [20])
* (Relative offsets)
* 0 - 2M Unmapped
From 7e70b87512c966248b1e8453d9ac54c643c06f44 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:55 +0200
Subject: xen/arm: Sync missing definitions for Arm CPUs with Linux
Synchronize with Linux kernel 7.0 definitions for the following CPUs:
- Cortex-A76AE,
- Cortex-A78AE,
- Cortex-X1C,
- Cortex-X3,
- Neoverse-V2,
- Cortex-X4,
- Neoverse-V3AE,
- Neoverse-V3,
- Cortex-X925.
These will be used for errata detection in subsequent patches.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index ec23fd098b63..907778683b08 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -89,13 +89,22 @@
#define ARM_CPU_PART_CORTEX_A76 0xD0B
#define ARM_CPU_PART_NEOVERSE_N1 0xD0C
#define ARM_CPU_PART_CORTEX_A77 0xD0D
+#define ARM_CPU_PART_CORTEX_A76AE 0xD0E
#define ARM_CPU_PART_NEOVERSE_V1 0xD40
#define ARM_CPU_PART_CORTEX_A78 0xD41
+#define ARM_CPU_PART_CORTEX_A78AE 0xD42
#define ARM_CPU_PART_CORTEX_X1 0xD44
#define ARM_CPU_PART_CORTEX_A710 0xD47
#define ARM_CPU_PART_CORTEX_X2 0xD48
#define ARM_CPU_PART_NEOVERSE_N2 0xD49
#define ARM_CPU_PART_CORTEX_A78C 0xD4B
+#define ARM_CPU_PART_CORTEX_X1C 0xD4C
+#define ARM_CPU_PART_CORTEX_X3 0xD4E
+#define ARM_CPU_PART_NEOVERSE_V2 0xD4F
+#define ARM_CPU_PART_CORTEX_X4 0xD82
+#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
+#define ARM_CPU_PART_NEOVERSE_V3 0xD84
+#define ARM_CPU_PART_CORTEX_X925 0xD85
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -110,13 +119,22 @@
#define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
#define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
#define MIDR_CORTEX_A77 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77)
+#define MIDR_CORTEX_A76AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76AE)
#define MIDR_NEOVERSE_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V1)
#define MIDR_CORTEX_A78 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78)
+#define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE)
#define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
#define MIDR_CORTEX_X2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X2)
#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
#define MIDR_CORTEX_A78C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78C)
+#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C)
+#define MIDR_CORTEX_X3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X3)
+#define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2)
+#define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4)
+#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
+#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
+#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From c0f7b40fdbb986b3cf470ed51f3878261e33f9cb Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:56 +0200
Subject: xen/arm: Add C1-Ultra definitions
Add processor definitions for C1-Ultra. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Ultra TRM:
https://developer.arm.com/documentation/108014/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 907778683b08..72745cca62bc 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -105,6 +105,7 @@
#define ARM_CPU_PART_NEOVERSE_V3AE 0xD83
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
+#define ARM_CPU_PART_C1_ULTRA 0xD8C
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -135,6 +136,7 @@
#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
+#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From 6af67aeca418bffb807424eb3415fab59e581733 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:57 +0200
Subject: xen/arm: Add C1-Premium definitions
Add processor definitions for C1-Premium. These will be used for errata
detection in subsequent patches.
These values can be found in the C1-Premium TRM:
https://developer.arm.com/documentation/109416/0100/
... in section A.5.1 ("MIDR_EL1, Main ID Register").
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 72745cca62bc..25c5762c6706 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -106,6 +106,7 @@
#define ARM_CPU_PART_NEOVERSE_V3 0xD84
#define ARM_CPU_PART_CORTEX_X925 0xD85
#define ARM_CPU_PART_C1_ULTRA 0xD8C
+#define ARM_CPU_PART_C1_PREMIUM 0xD90
#define MIDR_CORTEX_A12 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A12)
#define MIDR_CORTEX_A17 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A17)
@@ -137,6 +138,7 @@
#define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
#define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
#define MIDR_C1_ULTRA MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_ULTRA)
+#define MIDR_C1_PREMIUM MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_C1_PREMIUM)
/* MPIDR Multiprocessor Affinity Register */
#define _MPIDR_UP (30)
From ffbb964c4a3c60d52d8f0b2e448e1ba6b34a443f Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:58 +0200
Subject: xen/arm: Mitigate TLBI errata on various Arm CPUs
A number of CPUs developed by Arm suffer from errata whereby a broadcast
TLBI + DSB sequence may complete before the global observation of writes
which are translated by an affected TLB entry. This can lead to memory
corruption and potential privilege escalation.
These errata ONLY affect the completion of memory accesses which have
been translated by an invalidated TLB entry, and these errata DO NOT
affect the actual invalidation of TLB entries. TLB entries are removed
correctly.
To mitigate this issue, Arm recommends that software follows each
TLBI+DSB sequence with an additional TLBI+DSB, which will ensure that
all memory write effects affected by the first TLBI have been globally
observed.
The ARM64_WORKAROUND_REPEAT_TLBI workaround is sufficient to mitigate the
issue. Enable this workaround for affected CPUs.
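
As an illustration only, the shape of the mitigated sequence can be modelled in plain C. This sketch does not issue any real TLBI or DSB instructions and is not Xen's flush code. It only records the order of operations, assuming that one extra TLBI+DSB pair is appended after the DSB that completes the original TLBIs. All `ex_`/`EX_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation kinds recorded by the model. */
enum ex_op { EX_TLBI, EX_DSB };

#define EX_TRACE_MAX 16

struct ex_trace {
    enum ex_op ops[EX_TRACE_MAX];
    size_t len;
};

/* Append one operation to the trace. */
static void ex_emit(struct ex_trace *t, enum ex_op op)
{
    assert(t->len < EX_TRACE_MAX);
    t->ops[t->len++] = op;
}

/*
 * Model a broadcast flush of n_entries TLB entries.
 * With repeat_tlbi set, one additional TLBI+DSB pair follows the first
 * DSB, independent of how many TLBIs that DSB completed.
 */
void ex_flush_broadcast(struct ex_trace *t, size_t n_entries, bool repeat_tlbi)
{
    for (size_t i = 0; i < n_entries; i++)
        ex_emit(t, EX_TLBI);   /* original broadcast invalidations */
    ex_emit(t, EX_DSB);        /* completes the TLBIs above */

    if (repeat_tlbi) {
        ex_emit(t, EX_TLBI);   /* single extra TLBI */
        ex_emit(t, EX_DSB);    /* extra DSB */
    }
}
```

In this model, flushing three entries records four operations without the workaround and six with it, so the added cost is constant per flush rather than per invalidated entry.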
This is XSA-493 / CVE-2025-10263.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index cf6af68299f6..dad922c51b75 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -467,6 +467,27 @@ config ARM64_ERRATUM_1508412
If unsure, say Y.
+config ARM64_ERRATUM_CVE_2025_10263
+ bool "Cortex-*/Neoverse-*/C1-*: Completion of affected memory accesses might not be guaranteed by completion of a TLBI"
+ default y
+ depends on ARM_64
+ select ARM64_WORKAROUND_REPEAT_TLBI
+ help
+ This option adds a workaround for CVE-2025-10263.
+
+ A broadcast TLBI on another PE may complete before affected memory
+ accesses are globally observed. This may permit bypass of Stage 1
+ translation, Stage-2 translation, or GPT protection.
+
+ The workaround repeats the TLBI VALE2IS, XZR + DSB ISH operation for all
+ the broadcast TLB flush operations. A single additional TLBI and DSB are
+ sufficient regardless of how many TLBIs are completed by the DSB.
+
+ Note that software workarounds are required at all execution levels for
+ affected parts to fully mitigate this issue.
+
+ If unsure, say Y.
+
endmenu
config ARM64_HARDEN_BRANCH_PREDICTOR
diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
index 17cf134f1b0d..3a32183618dc 100644
--- a/xen/arch/arm/cpuerrata.c
+++ b/xen/arch/arm/cpuerrata.c
@@ -534,6 +534,92 @@ static const struct arm_cpu_capabilities arm_errata[] = {
MIDR_RANGE(MIDR_NEOVERSE_N1, 0, 3 << MIDR_VARIANT_SHIFT),
},
#endif
+#ifdef CONFIG_ARM64_ERRATUM_CVE_2025_10263
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X4),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X925),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_ULTRA),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_PREMIUM),
+ },
+#endif
#ifdef CONFIG_ARM64_HARDEN_BRANCH_PREDICTOR
{
.capability = ARM_HARDEN_BRANCH_PREDICTOR,
From 9411b10c994c39b660fb295236ace806551ba959 Mon Sep 17 00:00:00 2001
From: Michal Orzel <michal.orzel@amd.com>
Date: Fri, 22 May 2026 09:35:58 +0200
Subject: xen/arm: Mitigate TLBI errata on various Arm CPUs
A number of CPUs developed by Arm suffer from errata whereby a broadcast
TLBI + DSB sequence may complete before the global observation of writes
which are translated by an affected TLB entry. This can lead to memory
corruption and potential privilege escalation.
These errata ONLY affect the completion of memory accesses which have
been translated by an invalidated TLB entry, and these errata DO NOT
affect the actual invalidation of TLB entries. TLB entries are removed
correctly.
To mitigate this issue, Arm recommends that software follows each
TLBI+DSB sequence with an additional TLBI+DSB, which will ensure that
all memory write effects affected by the first TLBI have been globally
observed.
The ARM64_WORKAROUND_REPEAT_TLBI workaround is sufficient to mitigate the
issue. Enable this workaround for affected CPUs.
This is XSA-493 / CVE-2025-10263.
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Julien Grall <julien@xen.org>
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 79622b46a10d..5fa89fcb2428 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -468,6 +468,27 @@ config ARM64_ERRATUM_1508412
If unsure, say Y.
+config ARM64_ERRATUM_CVE_2025_10263
+ bool "Cortex-*/Neoverse-*/C1-*: Completion of affected memory accesses might not be guaranteed by completion of a TLBI"
+ default y
+ depends on ARM_64
+ select ARM64_WORKAROUND_REPEAT_TLBI
+ help
+ This option adds a workaround for CVE-2025-10263.
+
+ A broadcast TLBI on another PE may complete before affected memory
+ accesses are globally observed. This may permit bypass of Stage 1
+ translation, Stage-2 translation, or GPT protection.
+
+ The workaround repeats the TLBI VALE2IS, XZR + DSB ISH operation for all
+ the broadcast TLB flush operations. A single additional TLBI and DSB are
+ sufficient regardless of how many TLBIs are completed by the DSB.
+
+ Note that software workarounds are required at all execution levels for
+ affected parts to fully mitigate this issue.
+
+ If unsure, say Y.
+
endmenu
config ARM64_HARDEN_BRANCH_PREDICTOR
diff --git a/xen/arch/arm/cpuerrata.c b/xen/arch/arm/cpuerrata.c
index 17cf134f1b0d..3a32183618dc 100644
--- a/xen/arch/arm/cpuerrata.c
+++ b/xen/arch/arm/cpuerrata.c
@@ -534,6 +534,92 @@ static const struct arm_cpu_capabilities arm_errata[] = {
MIDR_RANGE(MIDR_NEOVERSE_N1, 0, 3 << MIDR_VARIANT_SHIFT),
},
#endif
+#ifdef CONFIG_ARM64_ERRATUM_CVE_2025_10263
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A76AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X4),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_X925),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3AE),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_ULTRA),
+ },
+ {
+ .capability = ARM64_WORKAROUND_REPEAT_TLBI,
+ MIDR_ALL_VERSIONS(MIDR_C1_PREMIUM),
+ },
+#endif
#ifdef CONFIG_ARM64_HARDEN_BRANCH_PREDICTOR
{
.capability = ARM_HARDEN_BRANCH_PREDICTOR,